Leethub

Curated tech jobs from FAANG and top companies worldwide.

© 2026 Leethub LLC. All rights reserved.
Together AI

About Together AI

Empowering corporate mentorship for effective learning

👥 21-100 employees • 📍 CityPlace, Toronto, ON • 💰 $1.7m
B2B • HR • Learning • SaaS • Community

Key Highlights

  • Founded in 2018, headquartered in Toronto, ON
  • Raised $1.7 million in seed funding
  • Partnerships with Heineken, Reddit, and 7-Eleven
  • 4 weeks paid vacation and competitive equity packages

Together is a corporate mentorship management platform founded in 2018, headquartered in CityPlace, Toronto, ON. The platform streamlines the mentorship lifecycle, facilitating connections among employees at companies like Heineken, Reddit, and 7-Eleven. With $1.7 million in seed funding, Together a...

🎁 Benefits

Together offers competitive salaries and equity packages, 4 weeks of paid vacation, and a comprehensive health, dental, and vision plan through Honeyb...

🌟 Culture

Together fosters a culture of autonomy and impact, allowing employees to take on significant responsibilities without bureaucratic constraints. The fo...


Machine Learning, Platform Engineer

Together AI • San Francisco

Posted 4d ago • 🏛️ On-Site • Senior • Machine learning engineer • Platform engineer • 📍 San Francisco

Skills & Technologies

CUDA • PyTorch • Kubernetes • APIs • Docker

Overview

Together AI is hiring a Senior Machine Learning Platform Engineer to build and optimize a container platform for custom models and inference. You'll work with technologies like CUDA, PyTorch, and Kubernetes in San Francisco.

Job Description

Who you are

You have over 5 years of experience building large-scale, fault-tolerant distributed systems, and you've tackled challenges in optimizing performance and ensuring robustness in complex environments. Your expertise includes working with serverless inference platforms, and you are familiar with the intricacies of model bring-up and cloud operations.

You have a strong understanding of container orchestration, particularly Kubernetes. You know how to manage multi-cluster scheduling and can identify and resolve machine learning bottlenecks effectively. Your background in profiling and optimization allows you to improve system performance and developer experience.

You are skilled in writing clear, maintainable software and infrastructure as code (IaC), and you understand the importance of documentation and testing strategies for robust, fault-tolerant solutions. You thrive in collaborative environments, partnering with product teams to translate functional requirements into technical solutions.

Desirable

Experience with video or audio generation technologies is a plus. You have a keen interest in the latest advancements in machine learning and are eager to apply them in practical scenarios. Familiarity with queueing theory and inference engines will further strengthen your contributions to the team.

What you'll do

In this role, you will focus on enabling custom models and dedicated inference on Together's platform. Your responsibilities will include building a container platform that optimizes autoscaling and minimizes cold starts. You will analyze and improve end-to-end model performance, ensuring a best-in-class developer experience with great tooling.

You will work on multi-cluster orchestration and predictive autoscaling, and your insights will help in the development of control planes and model optimization strategies. You will also write APIs for managing deployments and develop inference worker SDKs and CLI tools.

You will conduct design and code reviews, create developer documentation, and develop testing strategies that enhance the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure. You will collaborate closely with product teams to understand their needs and deliver solutions that meet business objectives.

What we offer

Together AI provides a dynamic work environment where innovation thrives. You will be part of a team dedicated to pushing the boundaries of machine learning technology. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.

You will have opportunities for professional growth and development. We believe in fostering talent and providing the resources needed to succeed in your career. Join us in shaping the future of AI and making a significant impact in the industry.
