
Empowering corporate mentorship for effective learning
Together is a corporate mentorship management platform founded in 2018, headquartered in CityPlace, Toronto, ON. The platform streamlines the mentorship lifecycle, facilitating connections among employees at companies like Heineken, Reddit, and 7-Eleven. With $1.7 million in seed funding, Together a...
Together offers competitive salaries and equity packages, 4 weeks of paid vacation, and a comprehensive health, dental, and vision plan through Honeyb...
Together fosters a culture of autonomy and impact, allowing employees to take on significant responsibilities without bureaucratic constraints. The fo...

Together AI • San Francisco
Together AI is hiring a Research Intern for its Inference team to work on scalable serving systems for large foundation models. You'll focus on distributed inference and optimization strategies using Python and machine learning frameworks. This is an entry-level position based in San Francisco.
You are an aspiring AI researcher with a strong foundation in Python and machine learning frameworks like TensorFlow and PyTorch. You have a keen interest in distributed systems and optimization techniques, and you are eager to learn and contribute to cutting-edge AI research. You thrive in collaborative environments and are excited about the opportunity to work on projects that advance open and transparent AI systems. You understand the importance of efficient and scalable systems in the context of modern AI applications and are motivated to explore innovative solutions.
Familiarity with deep learning concepts and architectures is a plus. Experience with compiler-aware optimization or large-scale serving architectures would be beneficial but is not required. A background in computer science or a related field will help you navigate the complexities of the role.
As a Research Intern at Together AI, you will dive into the complexities of distributed inference and contribute to the development of efficient serving systems for large foundation models. You will work closely with the Inference Research team to co-design and implement cross-layer optimizations across models, systems, and hardware. Your projects will focus on areas such as KV cache design and novel inference-time computation strategies, including speculative decoding and phase-aware execution. You will have the opportunity to collaborate with experienced researchers and engineers, gaining hands-on experience in the field of AI.
You will participate in brainstorming sessions and contribute to the design of innovative algorithms and models that aim to lower the cost and latency of modern AI systems. Your work will directly impact the performance and scalability of AI applications, and you will be encouraged to share your ideas and insights with the team. This internship will provide you with valuable exposure to the latest advancements in AI research and the opportunity to develop your skills in a supportive environment.
Together AI is committed to fostering a diverse and inclusive workplace. As an intern, you will receive mentorship from experienced professionals in the field and have access to resources that will help you grow your skills. We offer a collaborative and innovative work environment where you can contribute to meaningful projects that have a real impact on society. You will also have the chance to participate in team-building activities and networking events, enhancing your professional development. This internship is a unique opportunity to be part of a mission-driven company that is shaping the future of AI.