
Building safe and reliable AI systems for everyone
Anthropic, headquartered in SoMa, San Francisco, is an AI safety and research company focused on developing reliable, interpretable, and steerable AI systems. With over 1,000 employees and backed by Google, Anthropic has raised $29.3 billion in funding, including a monumental Series F round of $13 b...
Anthropic offers comprehensive health, dental, and vision insurance for employees and their dependents, along with inclusive fertility benefits via Ca...
Anthropic's culture is rooted in AI safety and reliability, with a focus on producing less harmful outputs compared to existing AI systems. The compan...

Anthropic • San Francisco, CA
Anthropic is seeking a Research Manager focused on interpretability to contribute to AI safety research. You'll work with a team dedicated to understanding neural networks and their mechanisms. This role is ideal for those passionate about AI interpretability.
You are excited about interpretability research and have a strong interest in AI safety. You may have experience in research roles, particularly in understanding complex systems and their mechanisms. You are open to working as an individual contributor in a growing team focused on AI interpretability. You understand the importance of creating reliable and interpretable AI systems that benefit users and society.
Experience in AI research or related fields is a plus. Familiarity with mechanistic interpretability and neural networks will help you thrive in this role. You are a collaborative team player who enjoys working with researchers, engineers, and policy experts to advance the mission of AI safety.
As a Research Manager, you will engage in interpretability research, focusing on reverse-engineering how trained models work. You will collaborate with a diverse team to build a solid scientific foundation for understanding neural networks. Your work will contribute to making advanced AI systems safe and beneficial. You will be involved in discussions about the implications of AI interpretability and how it can enhance user trust in AI technologies.
Anthropic offers a competitive compensation package and benefits, including generous vacation and parental leave. You will have flexible working hours and the opportunity to work in a collaborative office space in San Francisco. We are committed to creating a supportive environment where you can grow and contribute to meaningful AI research.