About Handshake

Handshake is the career network for the AI economy. 20 million knowledge workers, 1,600 educational institutions, 1 million employers (including 100% of the Fortune 50), and every foundational AI lab trust Handshake to power career discovery, hiring, and upskilling, from freelance AI training gigs to first internships to full-time careers and beyond. This unique value is leading to unparalleled growth; in 2025, we tripled our ARR at scale.

Why join Handshake now:

Shape how every career evolves in the AI economy, at global scale, with impact your friends, family and peers can see and feel
Work hand-in-hand with world-class AI labs, Fortune 500 partners and the world’s top educational institutions
Join a team with leadership from Scale AI, Meta, xAI, Notion, Coinbase, and Palantir, among others
Build a massive, fast-growing business with billions in revenue

About the Role

As a Staff Research Scientist, you will drive frontier research on how we define the intelligence of frontier models, developing benchmarks and measurements that help the research community understand how large language models (LLMs) understand, reason, and interact with human knowledge. You will:

Lead teams of researchers to produce original research in LLM evaluation methodologies, interpretability, and human-AI knowledge alignment.
Develop novel frameworks and assessment techniques that reveal deep insights into model capabilities, limitations, and emergent behaviors.
Collaborate with engineers to translate research breakthroughs into scalable benchmarks, evaluation systems, and standards.
Pioneer new approaches to measuring reasoning, alignment, and trustworthiness in frontier AI systems.
Author high-quality code to enable large-scale experimentation, reproducible evaluation, and knowledge assessment workflows.
Publish in top-tier conferences and journals, establishing new directions in the science of AI evaluation.
Work cross-functionally with leadership, engineers, and external partners to set industry standards for responsible AI evaluation and alignment.

Desired Capabilities

PhD or equivalent research experience in machine learning, computer science, cognitive science, or related fields with a focus on AI evaluation, interpretability, or model understanding.
6+ years of post-doctorate academic or industry experience in a research-first environment.
Strong background in LLM research, evaluation methodologies, and/or foundational AI assessment techniques.
Proven ability to independently design, lead, and execute evaluation research programs with novel data types end-to-end.
Deep proficiency in Python and PyTorch for large-scale model analysis, benchmarking, and evaluation.
Experience building or leading novel benchmark development, systematic model assessment, or interpretability studies.
Strong publication record in post-training, evaluation, or interpretability that demonstrates field-defining contributions.
Ability to clearly communicate complex insights and influence both technical and non-technical stakeholders.

Extra Credit

Experience with RLHF, agent modeling, or AI alignment research.
Familiarity with data-centric AI approaches, synthetic data generation, or human-in-the-loop systems.
Understanding of challenges in scaling foundation models (training stability, safety, inference efficiency).
Contributions to open-source libraries or research tooling.
Interest in the societal impact, deployment ethics, and governance of frontier AI systems.

We Offer

Handshake delivers benefits that help you feel supported and thrive at work and in life.

The benefits below are for full-time US employees.

🎯 Ownership: Equity in a fast-growing company

💰 Financial Wellness: 401(k) match, competitive compensation, financial coaching

🍼 Family Support: Paid parental leave, fertility benefits, parental coaching

💝 Wellbeing: Medical, dental, and vision, mental health support, $500 wellness stipend

📚 Growth: $2,000 learning stipend, ongoing development

💻 Remote & Office: Internet, commuting, and free lunch/gym in our SF office

🏝 Time Off: Flexible PTO, 15 holidays + 2 flex days

🤝 Connection: Team outings & referral bonuses

Explore our mission, values, and comprehensive US benefits at joinhandshake.com/careers.

Staff AI Research Scientist - Evaluation, Handshake AI
