Overview
The Product
The Copilot, Agents, and Platform (CAP) organization powers mission-critical Microsoft 365 and Dynamics 365 services, including BizChat, Work IQ, Agent 365, and the agent runtime for the next era of work. Together, we are building the platform that will power Microsoft's AI experiences for decades to come.
The Team
We are the M365 Core AI Inferencing Engineering team within CAP. Every major technology shift, from PCs to mobile and from on-premises to cloud, has fundamentally changed how products are built. AI is driving the next transformation, and GPU management has become critical infrastructure for this new era.
Our mission is to build and operate mission-critical, hyperscale, high-performance, cost-efficient, and compliant AI infrastructure that powers Microsoft's Large Language Model (LLM) services across Microsoft 365, Business & Industry Copilot, and other AI-powered products.
We are a growing team of software engineers, architects, applied scientists, and product managers from diverse backgrounds and experiences. We collaborate across Microsoft to solve challenging problems at scale and help teams deliver innovative AI experiences. We embrace AI in every aspect of our work and are driven by customer focus, technical excellence, creativity, teamwork, accountability, agility, and continuous learning.
The Candidate
We are looking for a Software Engineer II to help build the future of AI infrastructure at Microsoft. This is an opportunity to work on cutting-edge AI technologies at a scale and impact that is difficult to find anywhere else.
You will partner with teams across CAP and Microsoft to build the AI inferencing platform that enables engineers and applied scientists to rapidly develop AI-powered innovations using the latest models and technologies in a compliant, reliable, and cost-efficient manner.
Are you passionate about shaping the AI-first era? Do you enjoy solving complex technical challenges that impact millions of users? Do you thrive in a fast-paced environment where engineers take end-to-end ownership and collaborate closely to deliver results?
Join us and help build the AI inferencing platform that enables Microsoft's engineers and data scientists to do their best work while powering the next generation of AI experiences.
Responsibilities
- Lead the design, implementation, and delivery of an LLM API management service for millions of customers
- Rigorously manage cost and availability to set the benchmark for the industry
- Coach your team in building and running large-scale platforms and experiences used by hundreds of millions of users every day
- Work independently and collaboratively with other product teams across Power Platform, BizApps, and Microsoft
Qualifications
Required Qualifications:
- Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.
Other Requirements:
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.
- Solid coding skills in Python and/or C# and .NET.
- Solid technical expertise, demonstrated by proficiency in two or more of the following or equivalent areas:
- 4+ years building and running high-scale services.
- Experience with LLMs, orchestrators, embedding models, and vector databases.
- Implementation of distributed systems.
- Implementation of message-based architecture.
- High-scale OLTP or OLAP storage implementations.
- Experience leading and delivering projects that span multiple engineering organizations.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.