Network Architect

Together AI • San Francisco

Posted 2w ago • On-Site • Senior • Network engineer • San Francisco

Skills & Technologies

Cisco, Cloud, Traffic engineering, Load balancing, Capacity planning, BGP, Network architecture, Data center, Latency, Fault tolerance

Job Description

About the Role

Together AI is building the next-generation AI compute platform, and networking is at the center of that mission. As a Network Architect, you will define and evolve the global network architecture that powers our AI training, inference, and research platforms. This is a deeply technical and strategic role: you will own the end-to-end routing, topology, traffic engineering, and control-plane strategy for a global network spanning self-built data centers, partner colo, cloud environments, and high-capacity backbone fabrics.

You will collaborate closely with infrastructure engineering, compute systems, hardware, and operations teams to design architectures that deliver massive east–west bandwidth, low latency, high resiliency, and predictable performance at multi-terabit scale. Your work directly influences how we build, scale, and operate the physical and logical networks that underpin cutting-edge AI workloads.

This is a role for architects who are hands-on enough to validate designs in production, experienced enough to reason about systems at huge scale, and creative enough to develop architectures that don’t exist yet.

Responsibilities

  • Define and evolve Together AI’s global routing and backbone architecture, spanning self-built data centers, partner colocation sites, PoPs, cloud regions, and interconnect fabrics.
  • Establish the end-to-end topology strategy for high-bandwidth AI workloads: east–west fabrics, spine/superspine/core, DCI, and cross-region interconnect.
  • Design traffic engineering, load balancing, and capacity planning models to ensure low latency, deterministic performance, and fault tolerance at scale.
  • Develop the multicloud interconnect and peering strategy, including BGP policy frameworks, route leak mitigation, and security posture across heterogeneous networks.
  • Architect the control-plane stack for programmability, stability, and automation—including routing design, provisioning, configuration management, and state consistency.
  • Establish foundational observability primitives for a global backbone (telemetry, flow sampling, path validation, synthetic testing, health models).
  • Work closely with compute, storage, hardware, and data platform teams to ensure network design meets the performance demands of distributed AI training workloads.
  • Collaborate with operations and NOC teams to ensure designs are supportable, debuggable, and resilient under real-world failure conditions.
  • Provide architectural direction and mentorship to engineers across the org, influencing long-term strategy for both physical and virtual network domains.
  • Model evolving topologies for next-generation workloads (multi-Tbps east–west, high fan-in/fan-out distributed systems, GPU cluster fabrics).
  • Evaluate and guide the adoption of emerging technologies: advanced optical transport, RoCEv2, high-speed Ethernet fabrics, InfiniBand overlays, EVPN/VXLAN, SR-MPLS/SRv6, programmable data planes, and hardware offload.

Requirements

  • Have deep experience designing and operating large-scale GPU clusters or HPC-style compute fabrics, and understand the unique demands these workloads place on network design (east–west dominance, congestion behavior, fan-in/fan-out patterns, loss sensitivity).
  • Are fluent in building high-throughput data center fabrics (leaf–spine/superspine/core) that support tens of thousands of GPUs, multi-terabit east–west traffic, and strict performance SLAs.
  • Have architected or operated RoCEv2 or lossless Ethernet environments at scale—including PFC/ECN tuning, congestion control, and end-to-end stability considerations.
  • Are experienced designing backbone and DCI architectures that support GPU training clusters across multiple regions, interconnect exotic fabrics, and handle high-volume synchronization traffic.
  • Have led architecture for networks spanning multiple clouds, private backbones, and diverse PoPs, and understand how AI workloads behave across these domains.
  • Design with operational realities in mind: observability, capacity modeling, automation, telemetry, and failure-mode analysis for GPU-heavy environments.
  • Are comfortable setting architectural direction in fast-moving environments where compute, storage, and network evolution are tightly coupled.

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $250,000 - $280,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy  


