
Empowering every person and organization on the planet
Microsoft Corporation, headquartered in Redmond, Washington, is a leading technology company known for its software products like Windows and Office, as well as cloud services through Azure. With over 100,000 employees, Microsoft serves millions of customers globally, including major enterprises lik...
Microsoft offers competitive salaries, stock options, generous PTO policies, and comprehensive health benefits. Employees also enjoy a flexible remote...
Microsoft fosters a culture of innovation and inclusivity, emphasizing collaboration across teams and a commitment to diversity. The company values em...

Microsoft • Redmond, Washington, United States; Mountain View, California, United States
Microsoft Advertising is seeking a Senior Software Engineer to join our Ads Engineering Platform team and advance the core capabilities of our ad-serving infrastructure, the engine that powers advertising across Bing Search, MSN, Microsoft Start, and shopping experiences in the Edge browser. Our serving stack operates at massive global scale, handling millions of ad requests per second through a geo-distributed, low-latency system that combines large-scale GPU/CPU inference, real-time bidding, and intelligent ranking pipelines. This role focuses on advancing the performance, efficiency, and scalability of the next generation of model serving and inference platforms for Ads.
As a senior technical leader, you’ll design and optimize high-performance serving systems and GPU inference frameworks that drive measurable latency improvements and cost efficiency across Microsoft’s ad ecosystem. You’ll work across the stack, from CUDA kernel tuning and NUMA-aware threading to large-scale distributed orchestration and model deployment for deep learning and LLM workloads. This is a rare opportunity to shape the architecture of one of the world’s most advanced, mission-critical online serving platforms, collaborating with world-class engineers to deliver innovation at Internet scale.
Design and lead the development of large-scale, distributed online serving systems—including GPU-accelerated and CPU-based ranking/inference pipelines—to process millions of ad requests per second with ultra-low latency, high throughput, and strong reliability.
Architect and optimize end-to-end inference infrastructure, including model serving, batching/streaming, caching, scheduling, and resource orchestration across heterogeneous hardware (GPU, CPU, and memory tiers).
Profile and optimize performance across the full stack—from CUDA kernels and GPU pipelines to CPU threads and OS-level scheduling—identifying bottlenecks, tuning latency tails, and improving cost efficiency through advanced profiling and instrumentation.
Own live-site reliability as a Designated Responsible Individual (DRI): design telemetry, alerting, and fault-tolerance mechanisms; drive rapid diagnosis and mitigation of performance regressions or outages in globally distributed systems.
Collaborate across teams: drive architecture reviews, enforce engineering excellence, promote system-level optimization practices, and mentor others in deep debugging, profiling, and performance engineering.
Required/minimum qualifications
Additional or preferred qualifications
Software Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. A different range applies to specific work locations within the San Francisco Bay area and New York City metropolitan area, where the base pay range for this role is USD $158,400 - $258,000 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
https://careers.microsoft.com/us/en/us-corporate-pay
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.