Cohere

About Cohere

AI solutions built for enterprise trust and security

🏢 Tech · 👥 501-1000 employees · 📅 Founded 2019 · 📍 Grange Park, Toronto, ON · 💰 $1.5B · ⭐ 4
B2B · Artificial Intelligence · Machine Learning · SaaS

Key Highlights

  • Headquartered in Grange Park, Toronto, ON
  • $1.5 billion in funding from top investors
  • Clients include Royal Bank of Canada, Fujitsu, and Oracle
  • Focus on AI solutions for regulated industries

Cohere, headquartered in Grange Park, Toronto, ON, specializes in enterprise-grade AI solutions tailored for regulated industries such as banking and telecom. With $1.5 billion in funding, Cohere has secured contracts with major clients including Royal Bank of Canada, Fujitsu, and Oracle, providing ...

🎁 Benefits

Cohere offers comprehensive benefits including 100% coverage for health, dental, and vision insurance premiums, a $2,000 annual education benefit, six...

🌟 Culture

Cohere's culture emphasizes security and trust in AI adoption, focusing on enterprise needs rather than consumer trends. The company prioritizes a sup...

Staff Software Engineer, Inference Infrastructure

Cohere • San Francisco

Posted 1d ago · 🏢 Hybrid · Senior · Staff Engineer · 📍 San Francisco
Skills & Technologies

Python · Machine Learning · Natural Language Processing · Docker · Kubernetes

Overview

Cohere is hiring a Staff Software Engineer for its Inference Infrastructure team to build high-performance AI platforms. You'll work with technologies like Python and Docker to deploy optimized NLP models. This role requires experience in machine learning and scalable systems.

Job Description

Who you are

You have a strong background in software engineering with a focus on building scalable and reliable machine learning systems. Your experience includes deploying models in production environments with low latency and high throughput. You are proficient in Python and have a solid understanding of machine learning principles, particularly in natural language processing (NLP). Your familiarity with containerization technologies like Docker and orchestration tools such as Kubernetes enables you to create efficient deployment pipelines. You thrive in collaborative environments, working closely with cross-functional teams to deliver impactful AI solutions.

You are passionate about the potential of AI to transform industries and improve user experiences. Your previous projects have involved developing APIs that serve machine learning models, and you understand the importance of optimizing performance and reliability. You are comfortable interfacing with customers to gather requirements and provide tailored solutions that meet their needs. Your problem-solving skills are complemented by a keen attention to detail, ensuring that the systems you build are robust and maintainable.

Desirable

Experience with large language models and familiarity with AI frameworks such as TensorFlow or PyTorch would be a plus. You may also have knowledge of cloud platforms like AWS or GCP, which can enhance your ability to deploy and scale applications effectively. A background in research or a strong understanding of the latest advancements in AI and NLP will set you apart in this role.

What you'll do

As a Staff Software Engineer at Cohere, you will play a crucial role in shaping the future of AI by developing and deploying the infrastructure that powers our large language models. You will work on optimizing the performance of our AI platform, ensuring that it can handle high volumes of requests with minimal latency. Collaborating with researchers and product teams, you will help define the architecture and design of our model serving systems, focusing on scalability and reliability.

You will be responsible for implementing best practices in software development, including code reviews, testing, and documentation. Your contributions will directly impact the quality and efficiency of our AI services. You will also engage with customers to understand their needs and provide support in integrating our models into their applications. Your role will involve continuous learning and adaptation as you stay updated on the latest trends and technologies in AI and machine learning.

What we offer

Cohere provides a supportive and inclusive work environment where you can thrive both personally and professionally. We offer a competitive salary and benefits package, including a generous vacation policy and parental leave. Our commitment to mental health and well-being is reflected in our personal enrichment benefits, which support your interests in arts, culture, and fitness. With flexible remote work options and offices in major cities, including San Francisco, you can choose the work environment that suits you best. Join us in our mission to scale intelligence and make a meaningful impact on the future of AI.
