Research Scientist/Engineer, Alignment Finetuning

  • Anthropic
Job Summary
Location
San Francisco, CA 94199
Job Type
Contract
Visa
Any Valid Visa
Salary
PayRate
Qualification
BCA
Experience
2 Years - 10 Years
Posted
15 Mar 2025
Job Description

About the role:

As a Research Scientist/Engineer on the Alignment Finetuning team at Anthropic, you'll lead the development and implementation of techniques for training language models that are more aligned with human values: models that demonstrate better moral reasoning, improved honesty, and good character. You'll develop novel finetuning techniques and use them to demonstrably improve model behavior.

Responsibilities:

  • Develop and implement novel finetuning techniques using synthetic data generation and advanced training pipelines
  • Use these techniques to train models with better alignment properties, including honesty, character, and harmlessness
  • Create and maintain evaluation frameworks to measure alignment properties in models
  • Collaborate across teams to integrate alignment improvements into production models
  • Develop processes to help automate and scale the work of the team

You may be a good fit if you:

  • Have an MS/PhD in Computer Science, ML, or related field, or equivalent experience
  • Possess strong programming skills, especially in Python
  • Have experience with ML model training and experimentation
  • Have a track record of implementing ML research
  • Demonstrate strong analytical skills for interpreting experimental results
  • Have experience with ML metrics and evaluation frameworks
  • Excel at turning research ideas into working code
  • Can identify and resolve practical implementation challenges

Strong candidates may also have:

  • Experience with language model finetuning
  • Background in AI alignment research
  • Published work in ML or alignment
  • Experience with synthetic data generation
  • Familiarity with techniques like RLHF, constitutional AI, and reward modeling
  • Track record of designing and implementing novel training approaches
  • Experience with model behavior evaluation and improvement