Research Intern, Inference (Summer 2026)

Together AI • San Francisco

Posted 1d ago • Entry-Level • AI Research Engineer 📍 San Francisco

Skills & Technologies

Python, TensorFlow, PyTorch, Machine learning, Deep learning

Overview

Together AI is hiring a Research Intern for its Inference team to work on scalable serving systems for large foundation models. You'll focus on distributed inference and optimization strategies using Python and machine learning frameworks. This is an entry-level position based in San Francisco.

Job Description

Who you are

You are an aspiring AI researcher with a strong foundation in Python and machine learning frameworks like TensorFlow and PyTorch. You have a keen interest in distributed systems and optimization techniques, and you are eager to learn and contribute to cutting-edge AI research. You thrive in collaborative environments and are excited about the opportunity to work on projects that advance open and transparent AI systems. You understand the importance of efficient and scalable systems in the context of modern AI applications and are motivated to explore innovative solutions.

Desirable

Familiarity with deep learning concepts and architectures is a plus. Experience with compiler-aware optimization or large-scale serving architectures is beneficial but not required. A background in computer science or a related field will help you navigate the complexities of the role.

What you'll do

As a Research Intern at Together AI, you will dive into the complexities of distributed inference and contribute to the development of efficient serving systems for large foundation models. You will work closely with the Inference Research team to co-design and implement cross-layer optimizations across models, systems, and hardware. Your projects will focus on areas such as KV cache design and novel inference-time computation strategies, including speculative decoding and phase-aware execution. You will have the opportunity to collaborate with experienced researchers and engineers, gaining hands-on experience in the field of AI.

You will participate in brainstorming sessions and contribute to the design of innovative algorithms and models that aim to lower the cost and latency of modern AI systems. Your work will directly impact the performance and scalability of AI applications, and you will be encouraged to share your ideas and insights with the team. This internship will provide you with valuable exposure to the latest advancements in AI research and the opportunity to develop your skills in a supportive environment.

What we offer

Together AI is committed to fostering a diverse and inclusive workplace. As an intern, you will receive mentorship from experienced professionals in the field and have access to resources that will help you grow your skills. We offer a collaborative and innovative work environment where you can contribute to meaningful projects that have a real impact on society. You will also have the chance to participate in team-building activities and networking events, enhancing your professional development. This internship is a unique opportunity to be part of a mission-driven company that is shaping the future of AI.
