
Empowering corporate mentorship for effective learning
Together is a corporate mentorship management platform founded in 2018, headquartered in CityPlace, Toronto, ON. The platform streamlines the mentorship lifecycle, facilitating connections among employees at companies like Heineken, Reddit, and 7-Eleven. With $1.7 million in seed funding, Together a...
Together offers competitive salaries and equity packages, 4 weeks of paid vacation, and a comprehensive health, dental, and vision plan through Honeyb...
Together fosters a culture of autonomy and impact, allowing employees to take on significant responsibilities without bureaucratic constraints. The fo...

Together AI • San Francisco
Together AI is hiring a Senior Machine Learning Platform Engineer to build and optimize a container platform for custom models and inference. The role is based in San Francisco and involves technologies such as CUDA, PyTorch, and Kubernetes.
You have over 5 years of experience building large-scale, fault-tolerant distributed systems, and you have tackled challenges in optimizing performance and ensuring robustness in complex environments. Your expertise includes serverless inference platforms, and you are familiar with the intricacies of model bring-up and cloud operations.
You have a strong understanding of container orchestration, particularly Kubernetes. You know how to manage multi-cluster scheduling and can identify and resolve machine learning bottlenecks. Your background in profiling and optimization allows you to improve both system performance and developer experience.
You write clear, maintainable software and infrastructure as code (IaC), and you understand how documentation and testing strategies contribute to robustness and fault tolerance. You thrive in collaborative environments, partnering with product teams to translate functional requirements into technical solutions.
Experience with video or audio generation technologies is a plus, as is a keen interest in applying the latest advances in machine learning to practical scenarios. Familiarity with queueing theory and inference engines will further strengthen your contributions to the team.
In this role, you will focus on enabling custom models and dedicated inference on Together's platform. Your responsibilities will include building a container platform that optimizes autoscaling and minimizes cold starts. You will analyze and improve end-to-end model performance and ensure a best-in-class developer experience with great tooling.
You will work on multi-cluster orchestration and predictive autoscaling, and your insights will inform the development of control planes and model optimization strategies. You will also write APIs for managing deployments and develop inference worker SDKs and CLI tools.
You will conduct design and code reviews, create developer documentation, and develop testing strategies that improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure. You will collaborate closely with product teams to understand their needs and deliver solutions that meet business objectives.
Together AI provides a dynamic work environment where innovation thrives, and you will be part of a team dedicated to pushing the boundaries of machine learning technology. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.
You will have opportunities for professional growth and development. We believe in fostering talent and providing the resources needed to succeed in your career. Join us in shaping the future of AI and making a significant impact in the industry.