Note: By applying to this position you will have an opportunity to share your preferred working location from the following:
New York, NY, USA; Sunnyvale, CA, USA.
Minimum qualifications:
- Bachelor's degree or equivalent practical experience.
- 8 years of experience programming in C or Python.
- 5 years of experience testing and launching software products.
- 5 years of experience with performance, large-scale systems data analysis, visualization tools, or debugging.
- 3 years of experience with software design and architecture.
Preferred qualifications:
- Experience with state-of-the-art ML compilers and their internals, and experience writing compiler optimization passes.
- Experience with debugging correctness and performance issues at all levels of the ML software stack.
- Familiarity with accelerator HW architectures (TPUs/GPUs).
About the job
With your technical expertise, you will manage project priorities, deadlines, and deliverables. You will design, develop, test, deploy, maintain, and enhance software solutions.
As part of this team, you will build the Accelerated Linear Algebra (XLA) compiler, which enables Tensor Processing Units (TPUs), Google's in-house custom-designed processors, to accelerate machine learning and other scientific computing workloads for both internal Google customers and external Cloud customers.
You will need to support new workloads, optimize for new models and new characteristics, as well as support new TPU hardware across multiple generations.
In this role, you will be working on a state-of-the-art TPU compiler with opportunities to work up and down the compiler stack, as well as on end-user ML models and on Hardware (HW)/Software (SW) co-design.
The AI and Infrastructure team is redefining what's possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability, and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.
We're behind Google's groundbreaking innovations, empowering the development of AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.
Individual pay is determined by factors including job-related skills, experience, and relevant education or training.
US: $207,000 - $301,000 (USD); 20% bonus target; equity; benefits
Learn more about benefits at Google.
Responsibilities
- Contribute to the compiler for a novel processor designed to accelerate machine learning workloads.
- Target and compile high-performance implementations of operations at distributed scale.
- Design and implement new compiler passes that extract more performance out of current and next-generation TPUs, directly impacting fleet efficiency.
- Collaborate closely with hardware designers to co-design future processors.
- Research high-level representations to effectively program large-scale, distributed, and heterogeneous systems.