
Building safe and reliable AI systems for everyone
Anthropic, headquartered in SoMa, San Francisco, is an AI safety and research company focused on developing reliable, interpretable, and steerable AI systems. With over 1,000 employees and backed by Google, Anthropic has raised $29.3 billion in funding, including a Series F round of $13 billion.
Anthropic offers comprehensive health, dental, and vision insurance for employees and their dependents, along with inclusive fertility benefits.
Anthropic's culture is rooted in AI safety and reliability, with a focus on producing less harmful outputs than existing AI systems.

Anthropic • San Francisco, CA | New York City, NY
Anthropic is seeking an Applied Safety Research Engineer to develop methods for evaluating AI safety. You'll work with machine learning and Python to design experiments that improve model evaluations. This role requires a research-oriented mindset and experience in applied ML.
You have a strong background in applied machine learning and engineering, with experience designing experiments that improve evaluation quality. You understand the importance of creating representative test data and simulating realistic user behavior to ensure model safety, and your analytical skills let you identify gaps in evaluation coverage and inform the improvements needed. You are comfortable working at the intersection of research and engineering, and you thrive in collaborative environments where you can contribute to meaningful AI safety initiatives.
Experience with safety evaluations in AI systems is a plus, as well as familiarity with user behavior analysis and grading accuracy validation. You are passionate about ensuring AI systems are safe and beneficial for users and society.
In this role, you will design and run experiments aimed at improving the quality of AI safety evaluations. You will develop methods to generate representative test data and simulate realistic user behavior, which are crucial for validating grading accuracy. Your work will involve analyzing how various factors impact model safety behavior, including multi-turn conversations and user diversity. You will also be responsible for productionizing successful research into evaluation pipelines that run during model training and launch, directly influencing how Anthropic understands and enhances the safety of its models.
Anthropic provides a collaborative work environment with a focus on building beneficial AI systems. You will have access to competitive compensation and benefits, including optional equity donation matching, generous vacation and parental leave, and flexible working hours. Our office in San Francisco is designed to foster collaboration among colleagues, and we are committed to creating a supportive workplace culture that values your contributions.