
Leethub

Curated tech jobs from FAANG and top companies worldwide.

Together AI

About Together AI

Empowering corporate mentorship for effective learning

👥 21-100 employees · 📍 CityPlace, Toronto, ON · 💰 $1.7M
B2B · HR · Learning · SaaS · Community

Key Highlights

  • Founded in 2018, headquartered in Toronto, ON
  • Raised $1.7 million in seed funding
  • Partnerships with Heineken, Reddit, and 7-Eleven
  • 4 weeks paid vacation and competitive equity packages

Together is a corporate mentorship management platform founded in 2018, headquartered in CityPlace, Toronto, ON. The platform streamlines the mentorship lifecycle, facilitating connections among employees at companies like Heineken, Reddit, and 7-Eleven. With $1.7 million in seed funding, Together a...

🎁 Benefits

Together offers competitive salaries and equity packages, 4 weeks of paid vacation, and a comprehensive health, dental, and vision plan through Honeyb...

🌟 Culture

Together fosters a culture of autonomy and impact, allowing employees to take on significant responsibilities without bureaucratic constraints. The fo...

🌐 Website · All 51 jobs →
Together AI

Senior Backend Engineer, Inference Platform

Together AI • San Francisco

Posted 4d ago · 🏛️ On-Site · Senior · Backend Engineer · 📍 San Francisco
Apply Now →

Skills & Technologies

Python · Docker · Kubernetes · AWS · Machine Learning · NVIDIA Dynamo · OpenAI API

Overview

Together AI is seeking a Senior Backend Engineer to build and optimize their Inference Platform for advanced generative AI models. You'll work with technologies like Python, Docker, and AWS to enhance performance and scalability. This role requires strong experience in backend engineering and machine learning.

Job Description

Who you are

You have 5+ years of backend engineering experience, particularly in building production systems that leverage advanced AI models. Your expertise includes optimizing performance and scalability, ensuring that applications run efficiently on a large scale. You thrive in environments where you can take deep technical ownership and make impactful contributions to the team.

Your technical skills include proficiency in Python and experience with containerization technologies like Docker and orchestration tools such as Kubernetes. You understand the intricacies of cloud platforms, particularly AWS, and how to leverage them for high-performance applications. You are also familiar with machine learning concepts and have experience working with generative AI models, which allows you to collaborate effectively with research teams.

You are passionate about contributing to the open-source community and have experience with projects that enhance inference performance and efficiency. You enjoy solving complex problems related to global request routing, load balancing, and resource allocation, and you have a knack for optimizing latency to ensure the best user experience.
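The routing problems mentioned above can be illustrated with a minimal sketch. The replica names and the load metric below are hypothetical, not Together AI's implementation: a least-loaded router simply sends each request to the replica with the fewest in-flight requests.

```python
class LeastLoadedRouter:
    """Toy load balancer for inference replicas.

    Routes each request to the replica with the fewest in-flight
    requests. Real routers also weigh queue depth, KV-cache pressure,
    and geographic locality; this sketch shows only the core idea.
    """

    def __init__(self, replicas):
        # In-flight request count per replica, all starting at zero.
        self._loads = {name: 0 for name in replicas}

    def acquire(self):
        # Pick the replica with the smallest in-flight count.
        name = min(self._loads, key=self._loads.get)
        self._loads[name] += 1
        return name

    def release(self, name):
        # Call when a request finishes on `name`.
        self._loads[name] -= 1
```

With two replicas, the third request goes to whichever replica freed up first, which is exactly the behavior that keeps latency flat under uneven request durations.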

Desirable

Experience with NVIDIA Dynamo and OpenAI API is a plus, as it aligns with the technologies used in the Inference Platform. Familiarity with large-scale GPU utilization and performance optimization techniques will set you apart.
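In practice, OpenAI API familiarity means knowing the shape of a chat-completions request. The model name below is a placeholder, and this sketch only builds the JSON payload rather than calling a live endpoint:

```python
import json

def chat_completion_request(model, user_message, max_tokens=256):
    """Build an OpenAI-style chat-completions payload (no network call)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

# "example-model" is a hypothetical name for illustration only.
payload = chat_completion_request("example-model", "Hello!")
print(json.dumps(payload, indent=2))
```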

What you'll do

In this role, you will be responsible for shaping the core inference backbone that powers Together AI's frontier models. You will work hands-on with cutting-edge hardware, including tens of thousands of GPUs, to optimize their performance and ensure they are fully utilized. Your work will directly impact the efficiency and accessibility of generative AI models for developers, enterprises, and researchers.

You will collaborate closely with world-class researchers to bring new model architectures into production, ensuring that the latest advancements in AI are effectively integrated into the platform. Your role will involve addressing performance-critical challenges, such as optimizing global request routing and load balancing, to enhance the overall user experience.

You will also contribute to and leverage open-source projects like SGLang and vLLM, pushing the boundaries of inference performance and efficiency. Your contributions will help shape the tools that advance the industry and make generative AI more accessible to a wider audience.
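Much of the throughput in engines like vLLM and SGLang comes from continuous batching: finished sequences leave the batch each decode step and waiting requests join immediately, instead of the whole batch draining before new work starts. A toy scheduler, not taken from either project, can sketch the idea:

```python
from collections import deque

def continuous_batching(requests, batch_size):
    """Toy continuous-batching scheduler.

    `requests` maps request id -> number of decode steps it needs.
    Returns the step at which each request finishes. A finished
    request's slot is refilled from the queue on the next step,
    keeping the batch full.
    """
    queue = deque(requests.items())
    active = {}          # request id -> remaining decode steps
    finished_at = {}
    step = 0
    while queue or active:
        # Admit waiting requests into any free batch slots.
        while queue and len(active) < batch_size:
            rid, steps = queue.popleft()
            active[rid] = steps
        step += 1
        # One decode step for every active request.
        for rid in list(active):
            active[rid] -= 1
            if active[rid] == 0:
                del active[rid]
                finished_at[rid] = step
    return finished_at
```

With a batch size of 2 and requests needing 1, 3, and 2 steps, the short request finishes at step 1 and its slot is immediately reused, so everything completes by step 3 rather than waiting out a static batch.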

What we offer

Together AI offers competitive compensation, equity, and benefits, reflecting the value of your contributions to the team. You will be part of a culture that emphasizes deep technical ownership and high impact, where your work will make a significant difference in the field of AI. Join us in our mission to bring the most advanced generative AI models to the world and help shape the future of technology.

Interested in this role?

Apply now or save it for later. Get alerts for similar jobs at Together AI.

Apply Now → · Get Job Alerts