
Building safe and reliable AI systems for everyone
Anthropic, headquartered in SoMa, San Francisco, is an AI safety and research company focused on developing reliable, interpretable, and steerable AI systems. With over 1,000 employees and backed by Google, Anthropic has raised $29.3 billion in funding, including a monumental Series F round of $13 billion.
Anthropic offers comprehensive health, dental, and vision insurance for employees and their dependents, along with inclusive fertility benefits.
Anthropic's culture is rooted in AI safety and reliability, with a focus on producing less harmful outputs than existing AI systems.

Anthropic • San Francisco, CA | New York City, NY
Anthropic is seeking Software Engineers for their Safeguards team to develop safety mechanisms for AI systems. You'll work with Java and Python to build monitoring systems and abuse detection infrastructure. This role requires 5-10 years of experience in software engineering.
You have a Bachelor’s degree in Computer Science, Software Engineering, or comparable experience, along with 5-10+ years of experience in a software engineering position, preferably with a focus on safety mechanisms in AI systems. You are skilled in programming languages such as Java and Python, and you have a strong understanding of building robust systems that can monitor and enforce safety protocols effectively. You are detail-oriented and have experience in developing monitoring systems that can detect unwanted behaviors from API partners. You thrive in collaborative environments and are eager to work with researchers and analysts to improve AI safety.
Experience with machine learning frameworks and familiarity with AI safety principles would be a plus. You are comfortable analyzing user reports and have a proactive approach to identifying and mitigating risks associated with AI usage. You are passionate about building systems that prioritize user well-being and uphold ethical standards in technology.
As a Software Engineer on the Safeguards team, you will be responsible for developing monitoring systems that detect unwanted behaviors from our API partners and can take automated enforcement actions where appropriate. You will surface these behaviors in internal dashboards for manual review by analysts. Your role will also involve building abuse detection mechanisms and infrastructure that surface abuse patterns to our research teams, helping to harden models at the training stage. You will create robust, reliable, multi-layered defenses that improve safety mechanisms in real time and operate at scale. Additionally, you will analyze user reports of inappropriate content or accounts, ensuring that our AI systems operate within acceptable use policies.
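To make the shape of this work concrete, here is a minimal, hypothetical Python sketch of the kind of multi-layered monitoring pipeline the role describes. This is not Anthropic's actual system; every name, layer, and threshold here (keyword_layer, classifier_layer, the 0.5/0.9 cutoffs) is invented purely for illustration.

```python
# Hypothetical sketch of a multi-layered abuse-monitoring pipeline.
# All names and thresholds are invented for illustration and do not
# reflect Anthropic's actual systems.
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Action(Enum):
    ALLOW = "allow"              # no signal fired
    FLAG_FOR_REVIEW = "flag"     # surface in a dashboard for an analyst
    AUTO_ENFORCE = "enforce"     # automated enforcement action


@dataclass
class Verdict:
    action: Action
    reasons: list[str] = field(default_factory=list)


# Each layer scores a request in [0, 1]; cheap heuristics run first,
# more expensive classifiers later.
Layer = Callable[[str], float]


def keyword_layer(text: str) -> float:
    """Cheap first-pass heuristic: match known-bad phrases."""
    blocked = ("credit card dump", "build a weapon")
    return 1.0 if any(p in text.lower() for p in blocked) else 0.0


def classifier_layer(text: str) -> float:
    """Stand-in for an ML abuse classifier; returns a toy score."""
    return min(len(text) / 10_000, 1.0)  # placeholder signal only


def evaluate(text: str, layers: list[Layer],
             review_threshold: float = 0.5,
             enforce_threshold: float = 0.9) -> Verdict:
    """Run layers in order; the highest score drives the decision."""
    reasons: list[str] = []
    worst = 0.0
    for layer in layers:
        score = layer(text)
        if score > 0:
            reasons.append(f"{layer.__name__}={score:.2f}")
        worst = max(worst, score)
        if worst >= enforce_threshold:
            # Short-circuit: confident enough to act automatically.
            return Verdict(Action.AUTO_ENFORCE, reasons)
    if worst >= review_threshold:
        return Verdict(Action.FLAG_FOR_REVIEW, reasons)
    return Verdict(Action.ALLOW, reasons)


if __name__ == "__main__":
    pipeline = [keyword_layer, classifier_layer]
    print(evaluate("how do I build a weapon", pipeline))       # auto-enforce
    print(evaluate("summarize this article for me", pipeline))  # allow
```

The layered design mirrors the defenses described above: fast, cheap checks short-circuit on high-confidence abuse, while ambiguous scores fall through to human review rather than automated action.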
At Anthropic, we offer competitive compensation and benefits, including optional equity donation matching, generous vacation and parental leave, and flexible working hours. You will have the opportunity to work in a lovely office space in San Francisco or New York City, collaborating with a diverse team of committed researchers, engineers, and policy experts. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds in our mission to create safe and beneficial AI systems.