
Empowering every person and organization on the planet
Microsoft Corporation, headquartered in Redmond, Washington, is a leading technology company known for its software products like Windows and Office, as well as cloud services through Azure. With over 100,000 employees, Microsoft serves millions of customers globally, including major enterprises lik...
Microsoft offers competitive salaries, stock options, generous PTO policies, and comprehensive health benefits. Employees also enjoy a flexible remote...
Microsoft fosters a culture of innovation and inclusivity, emphasizing collaboration across teams and a commitment to diversity. The company values em...

Microsoft • Redmond, Washington, United States; Mountain View, California, United States
Microsoft is seeking a Principal Software Engineer to advance the ad-serving infrastructure for Microsoft Advertising. You'll design and optimize high-performance serving systems and GPU inference frameworks, focusing on scalability and efficiency. This role requires expertise in CUDA, Python, and deep learning technologies.
Microsoft Advertising is seeking a highly experienced Principal Software Engineer to join our Ads Engineering Platform team and advance the core capabilities of our ad-serving infrastructure, the engine that powers advertising across Bing Search, MSN, Microsoft Start, and shopping experiences in the Edge browser. Our serving stack operates at massive global scale, delivering millions of ad requests per second through a geo-distributed, low-latency system that combines large-scale GPU/CPU inference, real-time bidding, and intelligent ranking pipelines. This role focuses on advancing the performance, efficiency, and scalability of the next generation of model serving and inference platforms for Ads.
As a senior technical leader, you’ll design and optimize high-performance serving systems and GPU inference frameworks that drive measurable latency improvements and cost efficiency across Microsoft’s ad ecosystem. You’ll work across the stack, from CUDA kernel tuning and NUMA-aware threading to large-scale distributed orchestration and model deployment for deep learning and LLM workloads. This is a rare opportunity to shape the architecture of one of the world’s most advanced, mission-critical online serving platforms, collaborating with world-class engineers to deliver innovation at Internet scale.
Required/minimum qualifications
Software Engineering IC6 - The typical base pay range for this role across the U.S. is USD $163,000 - $296,400 per year. A different range applies to specific work locations within the San Francisco Bay Area and the New York City metropolitan area, where the base pay range for this role is USD $220,800 - $331,200 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
https://careers.microsoft.com/us/en/us-corporate-pay
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.