Leethub

Curated tech jobs from FAANG and top companies worldwide.

Rackspace

About Rackspace

Your partner in managed cloud solutions

🏢 Tech · 👥 5K-10K · 📅 Founded 1998 · 📍 San Antonio, Texas, United States

Key Highlights

  • Headquartered in San Antonio, Texas
  • Over 200,000 customers including BMW and NASA
  • $1.5B+ raised in funding
  • Approximately 7,000 employees worldwide

Rackspace Technology, Inc., headquartered in San Antonio, Texas, is a leading managed cloud computing company providing services such as cloud migration, managed hosting, and multi-cloud solutions. It serves over 200,000 customers, including major brands like BMW and NASA, and has raised over $1.5 billion in funding.

🎁 Benefits

Employees enjoy competitive salaries, stock options, generous PTO policies, remote work flexibility, and comprehensive health benefits.

🌟 Culture

Rackspace fosters a customer-centric culture with a strong emphasis on service excellence and innovation in cloud technology.

🌐 Website · 💼 LinkedIn · 𝕏 Twitter · All 90 jobs →

AI Model Serving Specialist

Rackspace • United States - Remote

Posted 2w ago · 🏠 Remote · AI Engineer · 📍 United States · 💰 USD 82,300 - 140,580 / year
Apply Now →

Skills & Technologies

Kubernetes · NVIDIA Triton · vLLM · KServe · RBAC · API · GPU · Helm · Python

Job Description

Role Purpose

Enable enterprise customers to operationalize AI workloads by deploying and optimizing model-serving platforms (e.g., NVIDIA Triton, vLLM, KServe) within Rackspace’s Private Cloud and Hybrid environments. This role bridges AI engineering and platform operations, ensuring secure, scalable, and cost-efficient inference services.

Key Responsibilities

Model Deployment & Optimization
  • Package and deploy ML/LLM models on Triton, vLLM, or KServe within Kubernetes clusters.
  • Tune performance (batching, KV-cache, TensorRT optimizations) to meet latency and throughput SLAs.
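The batching tuning mentioned above is fundamentally a latency/throughput trade-off: larger batches amortize per-pass overhead but make each request wait for the batch to fill. A toy numerical sketch of that trade-off (illustrative numbers only, not how Triton's or vLLM's actual schedulers work):

```python
# Toy model of the dynamic-batching trade-off: a larger batch amortizes
# per-batch overhead (higher throughput), but each request waits longer
# for the batch to fill (higher latency). All numbers are illustrative.

def batch_stats(batch_size, per_item_ms=2.0, overhead_ms=10.0, arrival_rate_rps=100.0):
    """Return (throughput_rps, avg_latency_ms) for a given batch size."""
    compute_ms = overhead_ms + per_item_ms * batch_size      # one forward pass
    # Average wait for the batch to fill at the given arrival rate:
    fill_wait_ms = (batch_size - 1) / 2 / arrival_rate_rps * 1000.0
    throughput = batch_size / (compute_ms / 1000.0)          # requests per second
    latency = fill_wait_ms + compute_ms
    return throughput, latency

for bs in (1, 8, 32):
    tput, lat = batch_stats(bs)
    print(f"batch={bs:2d}  throughput={tput:7.1f} rps  latency={lat:6.1f} ms")
```

Picking a max batch size and queue delay against a latency SLA is essentially choosing a point on this curve.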

Platform Integration
  • Work with VMware VCF 9, NSX-T, and vSAN ESA to ensure GPU resource allocation and multi-tenancy.
  • Implement RBAC, encryption, and compliance controls for sovereign/private cloud customers.
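In a multi-tenant Kubernetes cluster, the RBAC work described here typically means scoping each tenant to its own namespace with a least-privilege Role. A minimal sketch of such a Role built as a Python dict (the namespace and role names are hypothetical; in practice these would be templated via Helm):

```python
# Sketch: per-tenant Kubernetes RBAC for an inference namespace.
# Tenants get read-only access to their own pods/services; no
# create/delete verbs. Names here are hypothetical examples.

def tenant_role(namespace):
    """Build a namespaced, read-only Role manifest for one tenant."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "inference-viewer", "namespace": namespace},
        "rules": [{
            "apiGroups": [""],                      # core API group
            "resources": ["pods", "services"],
            "verbs": ["get", "list", "watch"],      # deliberately no write verbs
        }],
    }

role = tenant_role("tenant-a")
print(role["metadata"])
```

A matching RoleBinding per tenant group, plus encryption at rest and audit logging, would round out the controls for sovereign-cloud customers.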

API & Service Enablement
  • Integrate models with Rackspace’s Unified Inference API and API Gateway for multi-tenant routing.
  • Support RAG and agentic workflows by connecting to vector databases and context stores.
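The RAG side of this work centers on a retrieval step: embed the query, rank stored chunks by similarity, and feed the top hits to the model as context. A self-contained sketch with toy vectors (a real deployment would call an embedding model and a vector database rather than an in-memory list):

```python
import math

# Minimal sketch of RAG retrieval: rank stored chunks by cosine
# similarity to the query embedding and return the top-k texts.
# The "embeddings" below are toy 3-d vectors for illustration.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, store, k=2):
    """store: list of (embedding, chunk_text); returns the k most similar texts."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

store = [
    ([1.0, 0.0, 0.0], "GPU quota policy"),
    ([0.9, 0.1, 0.0], "GPU scheduling guide"),
    ([0.0, 1.0, 0.0], "PTO policy"),
]
print(retrieve([1.0, 0.05, 0.0], store, k=2))
```

The retrieved chunks are then prepended to the prompt; agentic workflows add tool calls and a context store on top of the same loop.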

Observability & FinOps
  • Configure telemetry for GPU utilization, request tracing, and error monitoring.
  • Collaborate with FinOps to enable usage metering and chargeback reporting.
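Chargeback reporting usually reduces to aggregating per-tenant GPU-hours from telemetry and pricing them. A minimal sketch with a flat hypothetical rate (real metering would come from GPU utilization exporters, and pricing would follow the actual rate card):

```python
from collections import defaultdict

# Sketch of usage metering for chargeback: sum GPU-hours per tenant
# and price them at a flat rate. The rate and record format are
# hypothetical; production records would come from telemetry.

RATE_PER_GPU_HOUR = 2.50  # hypothetical flat rate, USD

def chargeback(records):
    """records: iterable of (tenant, gpu_hours) -> {tenant: cost_usd}."""
    usage = defaultdict(float)
    for tenant, gpu_hours in records:
        usage[tenant] += gpu_hours
    return {tenant: round(hours * RATE_PER_GPU_HOUR, 2) for tenant, hours in usage.items()}

records = [("tenant-a", 10.0), ("tenant-b", 4.0), ("tenant-a", 6.0)]
print(chargeback(records))  # tenant-a billed for 16 GPU-hours, tenant-b for 4
```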

Customer Engineering Support
  • Assist solution architects with customer onboarding, creating reference patterns for BFSI, Healthcare, and other verticals.
  • Provide troubleshooting and performance-benchmarking guidance.
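Performance benchmarking for inference services is usually reported as latency percentiles (p50/p95/p99), which are then checked against the SLAs. A small sketch of a nearest-rank percentile summary over a synthetic run:

```python
import math

# Sketch: summarize a benchmark run of request latencies into the
# percentiles typically compared against latency SLAs. The sample
# latencies are synthetic.

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12, 14, 15, 15, 16, 18, 21, 35, 80, 120]  # synthetic run
for pct in (50, 95, 99):
    print(f"p{pct} = {percentile(latencies_ms, pct)} ms")
```

The gap between p50 and the tail percentiles is often the first troubleshooting signal: a healthy median with a long p99 tail points at queueing or batching effects rather than raw compute.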

Continuous Improvement
  • Stay current with emerging model-serving frameworks and GPU acceleration techniques.
  • Contribute to reusable Helm charts, operators, and automation scripts.

Interested in this role?

Apply now or save it for later. Get alerts for similar jobs at Rackspace.

Apply Now → · Get Job Alerts