
Empowering every person and organization on the planet
Microsoft Corporation, headquartered in Redmond, Washington, is a leading technology company known for its software products like Windows and Office, as well as cloud services through Azure. With over 100,000 employees, Microsoft serves millions of customers globally, including major enterprises lik...
Microsoft offers competitive salaries, stock options, generous PTO policies, and comprehensive health benefits. Employees also enjoy a flexible remote...
Microsoft fosters a culture of innovation and inclusivity, emphasizing collaboration across teams and a commitment to diversity. The company values em...

Microsoft • Mountain View, California; Redmond, Washington; New York, New York; Boulder, Colorado (United States)
Microsoft is hiring a Member of Technical Staff for LLM Evaluation to develop methodologies for evaluating AI performance. You'll work with machine learning and natural language processing to enhance Copilot's effectiveness. This role requires experience in social sciences and AI evaluation.
You have a strong background in machine learning and natural language processing, with experience in developing evaluation methodologies for AI systems. Your analytical skills are complemented by a creative problem-solving approach, allowing you to tackle complex challenges in evaluating AI performance. You are comfortable collaborating with user researchers and product leaders, ensuring that the evaluation frameworks you build are effective and user-centric. Your experience in the social sciences gives you a unique perspective on understanding user needs and behaviors, which is critical for this role. You are passionate about leveraging technology to improve user experiences and are eager to contribute to a culture of innovation and collaboration at Microsoft.
Experience with automated evaluation frameworks and real-time performance signals would be a plus. Familiarity with user research methodologies and data collection techniques will enhance your ability to succeed in this role. A background in software development or engineering principles can also be beneficial as you work closely with cross-functional teams.
In this role, you will be responsible for developing and implementing cutting-edge methodologies to evaluate how well Copilot performs in real-world usage scenarios. You will work on training classifiers and experimenting with various data collection techniques to gather insights on user interactions with Copilot. Your work will directly impact how Microsoft ensures its AI systems effectively meet user needs, focusing on both task completion and the overall user experience. You will collaborate with user researchers to understand user behaviors and preferences, translating these insights into actionable evaluation strategies. Additionally, you will contribute to building automated evaluation frameworks that provide real-time signals on Copilot's performance, driving continuous improvements in the product. Your role will involve analyzing data and presenting findings to stakeholders, ensuring that the evaluation processes align with Microsoft's mission to empower every person and organization.
Microsoft fosters a culture of inclusion and collaboration, where every employee is encouraged to thrive and innovate. You will have the opportunity to work with cutting-edge technologies and contribute to impactful projects that shape the future of AI. The company offers competitive compensation and benefits, along with opportunities for professional growth and development. You will be part of a diverse team that values respect, integrity, and accountability, working together to achieve shared goals. Microsoft is committed to empowering its employees and providing a supportive environment for career advancement.