NLP & LLM Data Scientist - Healthcare & Life Sciences

Job Summary
Location: Boise, ID 83708
Job Type: Contract
Visa: Any Valid Visa
Salary: PayRate
Qualification: BCA
Experience: 2 to 10 years
Posted: 15 Mar 2025
Job Description

Company: Norstella

Location: Remote, United States

Date Posted: Feb 21, 2025

Employment Type: Full Time

Job ID: R-881

Description

At Norstella, our mission is simple: to help our clients bring life-saving therapies to market quicker—and help patients in need.

Founded in 2022, Norstella unites best-in-class brands to help clients navigate the complexities at each step of the drug development life cycle.

Job Description:

Norstella Real World Data (RWD) is seeking a skilled NLP Data Scientist with a clinical background and a focus on language models to join our AI & Life Sciences Solutions team. Your expertise in processing and understanding natural language data, along with your knowledge of Electronic Health Records (EHR) and laboratory report analysis, will be instrumental in driving our data science initiatives and innovations.

Responsibilities:

  1. Apply NLP techniques and open-source Large Language Models (LLMs) such as Llama 2, Mixtral, Qwen, and BERT to extract, process, and interpret unstructured medical data from diverse sources like EHRs, medical notes, and laboratory reports.
  2. Collaborate with clinical scientists and data scientists to create efficient NLP models for healthcare.
  3. Conduct data cleaning, preprocessing, and validation to maintain the accuracy and reliability of insights gathered from NLP processes.
  4. Validate and present data findings to stakeholders, exhibiting clear and effective communication skills.

Qualifications:

  1. Master's or Ph.D. degree in Computational Biology, Computer Science, Data Science, Computational Linguistics, Machine Learning, or a related analytical field.
  2. Deep understanding and direct experience (2+ years) in handling and interpreting Electronic Health Records (EHR) and laboratory test results or genetic test results.
  3. Proven experience (2+ years) in NLP, with strong knowledge of NLP techniques such as Named Entity Recognition (NER), text summarization, and topic modeling.
  4. Expert-level understanding and practical experience (1+ years) with open-source Large Language Models (Llama 2/3, Mixtral, etc.), including prompt engineering, inference, and fine-tuning.
  5. Proficiency in Python and SQL, with strong experience in NLP libraries such as NLTK, spaCy, and Hugging Face Transformers, and deep learning libraries such as PyTorch and TensorFlow.
  6. Experience working in the AWS cloud environment and with large databases (e.g., AWS Redshift).
  7. Excellent verbal and written communication skills, with the ability to present complex data to a non-technical audience.

Preferred Qualifications:

  1. Experience dealing with protected health information (PHI) and familiarity with healthcare-related data privacy laws such as HIPAA.
  2. Familiarity with standard healthcare codes and terminologies such as ICD-10, CPT, LOINC, and SNOMED CT.
  3. Experience with Retrieval-Augmented Generation (RAG) and vector stores for storing large volumes of unstructured healthcare documents.

Salary: The expected base salary for this position ranges from $140,000 to $200,000. Salary offers are based on relevant skills, training, experience, education, and organizational factors.

Norstella is an equal opportunity employer. We do not discriminate on the grounds of gender, sexual orientation, marital status, race, or any other protected characteristic.
