Research Engineer, Production Model Post-Training - London

Anthropic • London, UK

Posted 3w ago • Mid-Level • AI Research Engineer • 📍 London

Skills & Technologies

Python, Constitutional AI, RLHF

Overview

Anthropic is hiring a Research Engineer for its Production Model Post-Training team to enhance AI capabilities and safety. You'll implement and optimize post-training techniques in Python, drawing on methods such as Constitutional AI and RLHF. This role requires experience in AI research and engineering.

Job Description

Who you are

You have a strong background in AI research and engineering, with experience in implementing and optimizing post-training techniques at scale. Your expertise in Python allows you to develop robust pipelines for model fine-tuning and evaluation. You are familiar with methodologies such as Constitutional AI and Reinforcement Learning from Human Feedback (RLHF), which are crucial for improving production model quality.

You thrive in collaborative environments, working closely with research teams to translate emerging techniques into production-ready implementations. Your ability to conduct research and develop innovative post-training recipes directly impacts the safety and capabilities of AI systems. You are proactive and can respond to incidents on short notice, demonstrating your commitment to maintaining high standards in AI production.

Desirable

Experience with large-scale AI models and a deep understanding of alignment methodologies will set you apart. Familiarity with tools for measuring and improving model performance across various dimensions is a plus. You are eager to contribute to a mission-driven organization focused on creating beneficial AI systems.

What you'll do

As a Research Engineer on the Post-Training team, you will implement and optimize sophisticated post-training techniques to enhance the capabilities of Anthropic's production models. Your work will involve conducting research to develop and refine post-training recipes that improve model quality and safety. You will design, build, and run efficient pipelines for model fine-tuning and evaluation, ensuring that the models meet high-performance standards.

Collaboration is key in this role, as you will work alongside research teams to translate cutting-edge techniques into practical applications. Your contributions will directly impact the quality and safety of the AI systems that users interact with. You will also develop tools to measure and improve model performance, ensuring that the production models align with Anthropic's mission of creating reliable and interpretable AI systems.

What we offer

At Anthropic, we provide competitive compensation and benefits, including optional equity donation matching and generous vacation and parental leave. You will enjoy flexible working hours and a collaborative office environment in London. Our mission-driven culture encourages innovation and teamwork, allowing you to make a meaningful impact in the field of AI. We believe in the importance of creating AI systems that are safe and beneficial for society, and we invite you to be part of this journey.

