Google

Staff Software Engineer, GPU Performance

Google, $207K - $300K *
Consumer Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's degree or equivalent practical experience.
  • 8 years of experience in software development.
  • 5 years of experience in testing and launching software products, with 3 years in software design and architecture.
  • Proficiency with modern GPU architectures (NVIDIA, AMD, etc.) and understanding of memory hierarchies.
  • Experience with LLMs and their deployment on AI accelerators.
  • Low-level GPU programming expertise (CUDA, Triton, etc.) and performance engineering knowledge.

Responsibilities

  • Identify and maintain benchmarks for LLM training and serving to drive performance improvements.
  • Collaborate with teams like DeepMind to address complex ML model performance issues.
  • Conduct architecture-level simulations on GPU designs and perform roofline analysis for strategic guidance.
  • Evaluate performance and efficiency metrics to identify bottlenecks and engineer solutions on a large scale.
  • Execute performance benchmarks on GPU hardware using tools such as TRT-LLM and vLLM.

Benefits

  • Health, dental, vision, life, and disability insurance.
  • 401(k) retirement plan with company matching.
  • 20 days paid vacation, accruing at 6.15 hours per pay period in the first five years.
  • 40 hours of sick leave per year (69 hours in Seattle), including 5 discretionary sick days per instance.
  • Maternity leave of 28-30 weeks and baby bonding leave of 18 weeks.
  • 13 paid holidays each year.
Full Job Description
In accordance with Washington state law, we are highlighting our comprehensive benefits package, which is available to all eligible US-based employees. Benefits for this role include:
  • Health, dental, vision, life, disability insurance
  • Retirement Benefits: 401(k) with company match
  • Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
  • Sick Time: 40 hours/year (increased to 69 hours/year for Seattle) including 5 discretionary sick days per instance
  • Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
  • Baby Bonding Leave: 18 weeks
  • Holidays: 13 paid days per year
Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Sunnyvale, CA, USA; Kirkland, WA, USA; New York, NY, USA.

Minimum qualifications:
  • Bachelor's degree or equivalent practical experience.
  • 8 years of experience in software development.
  • 5 years of experience testing and launching software products, and 3 years of experience with software design and architecture.
  • Experience with modern GPU architectures (NVIDIA, AMD, or other AI accelerators), memory hierarchies, and performance bottlenecks.
  • Experience with modern LLMs and their deployment on AI accelerators.
  • Experience with low-level GPU programming (CUDA, Triton, CUTLASS, etc.) and performance engineering techniques.

Preferred qualifications:
  • Master's degree or PhD in Engineering, Computer Science, or a related technical field.
  • 8 years of experience with data structures and algorithms.
  • 3 years of experience in a technical leadership role leading project teams and setting technical direction.
  • 3 years of experience working in a structured organization involving cross-functional or cross-business projects.
  • Experience with compiler optimization, code generation, and runtime systems for GPU architectures (OpenXLA, MLIR, Triton, etc.).


Responsibilities
  • Identify and maintain LLM training and serving benchmarks, and use them to identify performance opportunities and drive XLA:GPU/Triton performance toward XLA releases.
  • Engage with various teams, like DeepMind, to solve challenging ML model performance problems.
  • Run architecture-level simulations on GPU designs and perform roofline analysis to guide partner teams.
  • Analyze performance and efficiency metrics to identify bottlenecks and then design and implement solutions at Google fleet-wide scale.
  • Run performance benchmarks on GPU hardware using internal and external tools such as TRT-LLM, vLLM, and SGLang.


About Google

Google is a multinational technology company that specializes in Internet-related services and products. These include online advertising technologies, a search engine, cloud computing, software, and hardware. Google was founded in 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University. The company has since grown into one of the most valuable companies in the world. Google's mission is to organize the world's information and make it universally accessible and useful.
Size: 156,500 employees
Market Cap: $1,115.4 billion
Net Income: $40.2 billion
Founded: 1998
5 Year Trend: +23.3%
Revenue: $182.5 billion
NASDAQ
