WaveAccess is looking for a Data Scientist to strengthen its team. The project works with real-world pharmaceutical data and involves characterizing patient treatment patterns and outcomes using electronic health records and health insurance claims data.
About the Company
WaveAccess is an international results-driven company that provides high-quality custom software development services for hundreds of emerging and established companies globally. By supporting customers with talented software engineers and vast experience in advanced technologies, WaveAccess builds innovative software solutions while minimizing development risks and costs.
Throughout its 22-year history, the company’s highly skilled specialists have implemented over 500 successful projects for market leaders, ambitious startups, and government institutions.
Responsibilities
- Engineer features from sequence data for downstream tasks
- Build sequence models using techniques like RNNs, CNNs, attention, transformers
- Fine-tune transformer models and implement custom model architectures as needed
- Build NLP pipelines for tasks like text classification, named entity recognition, question answering
- Monitor model performance and implement improvements to increase accuracy
- Incorporate multi-modal data such as text, numerics, images, and genomics
- Work with large language models, including prompt tuning
Requirements
- Experience using Microsoft Office, such as PowerPoint, Excel, and Word
- Ability to communicate effectively in both verbal and written formats
- Ability to work on multiple projects while executing on key deliverables within required timeframes
- Ability to write and debug scripts in Python, as well as familiarity with common Python packages (e.g., NumPy, pandas, scikit-learn, TensorFlow/Keras/PyTorch)
- Deep knowledge of neural networks and architectures for working with sequences, particularly RNNs, LSTMs, Transformers, CNNs, and attention
- Experience with AWS (EC2, S3, SageMaker)
Optional Requirements
- Knowledge of general Machine Learning approaches
- Knowledge of mathematical statistics
- Understanding of CI/CD
Platforms
Windows or Linux with knowledge of shell commands
Programming Languages
Python
Methods/Frameworks
- NumPy
- Transformers
- Matplotlib / Seaborn / Yellowbrick
- Flask / Django
- imbalanced-learn (imblearn)
- SciPy
Tools (Infrastructure)
- Git (GitHub, Bitbucket, GitLab)
- Confluence (or alternative - GitHub Wiki / BookStack / Document360)
- IDE (e.g., PyCharm / VSCode)
- Jupyter (Jupyter Notebook, JupyterLab, JupyterHub)
- Basic knowledge of SQL
Optional Tools
- Flake8 or other code linter
- Docker
- pre-commit
- Snowflake
Working Conditions
- High, fully official salary, indexed annually
- Official employment in accordance with labor law, with sick leave and vacation paid at 100%
- Voluntary medical insurance (VMI) with dental coverage
- Flexible start of the working day
- Weekly seminars, participation in conferences and meetups, and payment for certification exams