Member of Technical Staff - Safety Lead

Reflection • SF

Posted 7h ago🏛️ On-Site Senior Ai research engineer 📍 San francisco

Apply Now →

Skills & Technologies

Machine learning Python Reinforcement learning Ai safety

Overview

Reflection is seeking a Senior AI Research Engineer to lead safety evaluations for their AI models. You'll work with advanced methodologies in AI safety and contribute to the development of automated evaluation pipelines. This role requires a graduate degree in Computer Science or related fields and deep technical expertise in LLM safety.

Job Description

Who you are

You hold a graduate degree (MS or PhD) in Computer Science, Machine Learning, or a related discipline, or possess equivalent practical experience in AI Safety. Your deep technical understanding of LLM safety includes adversarial attacks, red-teaming methodologies, and interpretability, making you a valuable asset in ensuring model reliability. You have strong software engineering capabilities, with experience in building automated evaluation pipelines or large-scale ML systems. Your familiarity with Reinforcement Learning (RLHF/RLAIF) and its impact on model safety further enhances your qualifications. You thrive in collaborative environments, working closely with cross-functional teams to translate safety findings into actionable insights.

Desirable

Experience with state-of-the-art jailbreaking techniques and defenses is a plus, as is a background in developing scalable safety benchmarks that adapt to evolving model capabilities. You are proactive in researching potential vulnerabilities and are committed to staying ahead of the curve in AI safety practices.

What you'll do

In this role, you will own the red-teaming and adversarial evaluation pipeline for Reflection’s models, continuously probing for failure modes across security, misuse, and alignment gaps. You will work hand-in-hand with the Alignment team to translate safety findings into concrete guardrails, ensuring that models behave reliably under stress and adhere to deployment policies. Your responsibilities will include validating that every release meets the lab’s risk thresholds before it ships, serving as a critical gatekeeper for open weight releases. You will develop scalable, automated safety benchmarks that evolve alongside model capabilities, moving beyond static datasets to dynamic adversarial testing. Additionally, you will research and implement state-of-the-art jailbreaking techniques and defenses to stay ahead of potential vulnerabilities in the wild.

What we offer

At Reflection, we prioritize work-life balance and offer fully paid parental leave for all new parents, including adoptive and surrogate journeys. We provide financial support for family planning and ensure that you have paid time off when you need it. Our relocation support and various perks are designed to optimize your time and enhance your work experience. Opportunities to connect with teammates are abundant, as we provide lunch and dinner daily, along with regular off-sites and team celebrations. Join us in our mission to build open superintelligence and make it accessible to all.

Interested in this role?

Apply now or save it for later. Get alerts for similar jobs at Reflection.

Apply Now →Get Job Alerts

About Reflection

Key Highlights

🎁 Benefits

🌟 Culture