A well-funded seed-stage business is looking to hire exceptional Artificial Intelligence Researchers to work on its multi-modal foundation models. Aiming to improve the usability of AI models by breaking out of the text box and into human-like interactions, this team is working at the leading edge of human-computer interaction.
Currently the focus is on adding speech expertise to the research team. This could be in areas such as speech understanding, speaker diarization, text-to-speech (TTS), speech synthesis, voice conversion, emotion detection, and speech generation.
Role:
- Research multi-modal generative AI models to create more natural interactions with LLMs
- Train voice and vision models (currently in the pre-training stage)
- Refine model quality and improve efficiency
Experience required:
- Very strong research background with multiple first-author publications
- Experience with LLMs and Generative AI models on a foundational level
- Extensive knowledge of speech or vision models
- Expert in Python/PyTorch
This role would suit someone with a solid AI/ML background in a large organization who is now looking to move to a start-up, or a recently graduated MSc or PhD candidate with very current experience in speech, vision, or LLM research.
Salary is dependent on experience but, to remain competitive, will be in the range of $250k - $350k plus early-stage equity.
On-site in Seattle; remote considered if you can convince me they can't afford not to hire you.