Member of Technical Staff, LLM Evaluation-MAI

Microsoft • United States: Mountain View, California; Redmond, Washington; New York, New York; Boulder, Colorado

Posted 3w ago • Mid-Level • AI Engineer

Skills & Technologies

Machine learning, Natural language processing

Overview

Microsoft is hiring a Member of Technical Staff for LLM Evaluation to develop methodologies for evaluating AI performance. You'll work with machine learning and natural language processing to enhance Copilot's effectiveness. This role requires experience in social sciences and AI evaluation.

Job Description

Who you are

You have a strong background in machine learning and natural language processing, with experience in developing evaluation methodologies for AI systems. Your analytical skills are complemented by a creative problem-solving approach, allowing you to tackle complex challenges in evaluating AI performance. You are comfortable collaborating with user researchers and product leaders, ensuring that the evaluation frameworks you build are effective and user-centric. Your experience in the social sciences gives you a unique perspective on understanding user needs and behaviors, which is critical for this role. You are passionate about leveraging technology to improve user experiences and are eager to contribute to a culture of innovation and collaboration at Microsoft.

Desirable

Experience with automated evaluation frameworks and real-time performance signals would be a plus. Familiarity with user research methodologies and data collection techniques will enhance your ability to succeed in this role. A background in software development or engineering principles can also be beneficial as you work closely with cross-functional teams.

What you'll do

In this role, you will be responsible for developing and implementing cutting-edge methodologies to evaluate how well Copilot performs in real-world usage scenarios. You will work on training classifiers and experimenting with various data collection techniques to gather insights on user interactions with Copilot. Your work will directly impact how Microsoft ensures its AI systems effectively meet user needs, focusing on both task completion and the overall user experience. You will collaborate with user researchers to understand user behaviors and preferences, translating these insights into actionable evaluation strategies. Additionally, you will contribute to building automated evaluation frameworks that provide real-time signals on Copilot's performance, driving continuous improvements in the product. Your role will involve analyzing data and presenting findings to stakeholders, ensuring that the evaluation processes align with Microsoft's mission to empower every person and organization.

What we offer

Microsoft fosters a culture of inclusion and collaboration, where every employee is encouraged to thrive and innovate. You will have the opportunity to work with cutting-edge technologies and contribute to impactful projects that shape the future of AI. The company offers competitive compensation and benefits, along with opportunities for professional growth and development. You will be part of a diverse team that values respect, integrity, and accountability, working together to achieve shared goals. Microsoft is committed to empowering its employees and providing a supportive environment for career advancement.
