
The personal technology company redefining user experience
Apple Inc. (NASDAQ: AAPL), headquartered in Cupertino, CA, is the world's most valuable company with a market capitalization of $3 trillion as of 2022. Known for its iconic products such as the iPhone, iPad, and Mac, Apple serves over 1 billion active devices globally. The company has a strong commi...
Apple offers comprehensive medical plans covering physical and mental healthcare, paid parental leave, and a gradual return-to-work program. Employees...
Apple's culture emphasizes an obsessive focus on user experience and consumer privacy, setting it apart from competitors. The company promotes inclusi...

Apple • San Francisco, California, United States
Apple is hiring an ML Research Engineer to lead the design and development of automated safety benchmarking methodologies for AI features. You'll work with Python and machine learning techniques to ensure safe and trustworthy AI experiences. This role requires strong analytical skills and experience in AI safety.
You have a strong background in machine learning and data analysis, with experience in developing evaluation frameworks for AI systems. Your expertise in Python allows you to build robust tools for assessing model performance and safety. You understand the importance of ethical AI and are committed to avoiding systemic biases in technology.
You possess excellent analytical judgment, enabling you to investigate the behavior of media-related agents and assess the risks they pose. Your ability to work cross-functionally with engineering and project management teams ensures that safety insights are effectively translated into actionable improvements.
You are familiar with the latest advancements in AI and machine learning, and you stay updated on industry trends and best practices. Your collaborative spirit allows you to thrive in a team environment, where you can share knowledge and contribute to the development of innovative solutions.
Experience with safety benchmarking methodologies and automated evaluation techniques is a plus. Familiarity with statistical analysis and the ability to generate benchmark datasets will enhance your contributions to the team. A passion for creating safe and reliable AI experiences is essential.
In this role, you will lead the design and continuous development of automated safety benchmarking methodologies for AI features across Apple Services. You will investigate how media-related agents behave and develop rigorous evaluation frameworks to assess their safety performance. Your work will support the development of scalable evaluation techniques that empower engineering teams to assess candidate models effectively.
You will collaborate with cross-functional teams, including engineering, product, and governance, to ensure that AI experiences are reliable and aligned with human values. Your contributions will help establish scientific standards for assessing risks and improving the behavior of advanced AI/ML models.
You will be responsible for generating benchmark datasets and developing evaluation methodologies that enable engineering teams to translate safety insights into actionable improvements. Your role will blend deep technical expertise with strong analytical judgment, allowing you to create tools that enhance the safety and trustworthiness of AI applications.
Apple is committed to fostering an inclusive and diverse workplace. You will have the opportunity to work on cutting-edge AI technologies that impact millions of users worldwide. We offer competitive compensation and benefits, along with a collaborative work environment that encourages innovation and professional growth. Join us in our mission to create safe and trustworthy AI experiences for users around the globe.
Apply now or save it for later. Get alerts for similar jobs at Apple.