Apple

About Apple

The personal technology company redefining user experience

🏢 Tech, Hardware · 👥 1001+ employees · 📅 Founded 1976 · 📍 Cupertino, CA · ⭐ 4.2
B2C · B2B · Hardware · SaaS · Telecommunications · eCommerce

Key Highlights

  • Market cap of $3 trillion as of 2022
  • Over 1 billion active devices worldwide
  • Comprehensive medical plans including mental healthcare
  • Paid parental leave and gradual return-to-work program

Apple Inc. (NASDAQ: AAPL), headquartered in Cupertino, CA, is the world's most valuable company with a market capitalization of $3 trillion as of 2022. Known for its iconic products such as the iPhone, iPad, and Mac, Apple serves over 1 billion active devices globally. The company has a strong commi...

🎁 Benefits

Apple offers comprehensive medical plans covering physical and mental healthcare, paid parental leave, and a gradual return-to-work program. Employees...

🌟 Culture

Apple's culture emphasizes an obsessive focus on user experience and consumer privacy, setting it apart from competitors. The company promotes inclusi...

AI Research Intern - Foundation Models & Multimodal Intelligence

Apple • Zurich, Zurich, Switzerland

Posted 3 months ago · 🏛️ On-Site · Entry-Level · AI Research Engineer · 📍 Zurich

Job Description

Imagine building the next generation of AI-powered experiences at Apple. We are advancing the state of the art in foundation models, applying them across language, vision, and multimodal understanding to power features used by millions of people worldwide.

Description

As part of the Multimodal Intelligence Team (MINT), which has a track record of delivering innovations ranging from the Apple Foundation Model to real-world applications like Visual Intelligence, you will tackle the practical challenges of scaling, optimizing, and building large models, as well as integrating such models and agents into Apple products. You'll collaborate with world-class engineers and scientists to push the boundaries of foundation models and agentic systems while delivering real-world impact.

Minimum Qualifications

  • Currently pursuing a PhD degree, or equivalent experience, in Machine Learning, Computer Vision, Natural Language Processing, Data Science, Statistics, or related areas.
  • Experience with large language models or vision-language models and their application in agentic systems.
  • Proficient programming skills in Python and experience with at least one modern deep learning framework (PyTorch, JAX, or TensorFlow).

Preferred Qualifications

  • Demonstrated publication record at relevant conferences (e.g., NeurIPS, ICML, ICLR, CVPR).
  • Experience with foundation models (language, vision-language, or multimodal).
  • Experience with post-training (SFT or RL) to optimize large models for agentic systems.
  • Availability for a 6-12 month internship.

Responsibilities

You will work on advancing the post-training capabilities of multimodal foundation models for agentic applications. This includes researching and developing methods to improve how these models understand, reason about, and interact with complex environments through techniques like supervised fine-tuning and reinforcement learning from various reward signals. You will design evaluation frameworks to assess agent performance on realistic tasks and experiment with training strategies that enhance the model's ability to perceive multimodal inputs, understand intent, and execute complex autonomous behaviors.

A key part of your role will be staying current with emerging research in foundation model agents and identifying techniques that could advance the state of the art in autonomous systems. Your work will involve large-scale multimodal datasets combining visual, textual, and interaction data to push the boundaries of agent capabilities.

A primary goal of this internship is to produce novel research suitable for publication at a top-tier conference. You will collaborate with researchers and engineers passionate about creating more capable autonomous systems, contributing to cutting-edge research in this rapidly evolving field.

Equal Employment Opportunity

At Apple, we’re not all the same. And that’s our greatest strength. We draw on the differences in who we are, what we’ve experienced, and how we think. Because to create products that serve everyone, we believe in including everyone. Therefore, we are committed to treating all applicants fairly and equally. We will work with applicants to make any reasonable accommodations.
