Google

Software Engineering Manager, Fault Tolerance Testing

$207K - $301K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's degree or equivalent practical experience
  • 8 years in software development
  • 5 years creating product roadmaps and collaborating with cross-functional teams
  • 3 years in reliability engineering
  • 3 years in a technical leadership role
  • 2 years in people management or team leadership

Responsibilities

  • Set and communicate team priorities aligned with organizational goals
  • Establish clear individual expectations and provide regular feedback
  • Develop and evolve mid-term technical goals and roadmaps
  • Design and assess systems solutions to complex problems
  • Review and provide feedback on code to ensure best practices

Benefits

  • Health, dental, vision, life, disability insurance
  • 401(k) retirement plan with company match
  • 20 days of vacation per year, with accrual for the first five years
  • 40 hours of sick time/year, increasing to 69 hours/year in Seattle
  • 28-30 weeks of maternity leave
  • 18 weeks of baby bonding leave
  • 13 paid holidays per year
Full Job Description
In accordance with Washington state law, we are highlighting our comprehensive benefits package, which is available to all eligible US-based employees.

Benefits for this role include:
  • Health, dental, vision, life, disability insurance
  • Retirement Benefits: 401(k) with company match
  • Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
  • Sick Time: 40 hours/year (increased to 69 hours/year for Seattle) including 5 discretionary sick days per instance
  • Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
  • Baby Bonding Leave: 18 weeks
  • Holidays: 13 paid days per year


Minimum qualifications:
  • Bachelor's degree, or equivalent practical experience.
  • 8 years of experience in software development.
  • 5 years of experience creating product roadmaps, and working with cross-functional teams.
  • 3 years of experience in reliability engineering.
  • 3 years of experience in a technical leadership role.
  • 2 years of experience in a people management or team leadership role.

Preferred qualifications:
  • Master's degree or PhD in Computer Science or related technical field.
  • Experience designing, building, or operating highly available, fault-tolerant distributed systems. Direct experience with chaos engineering, fault-injection testing frameworks, or large-scale disaster recovery simulations.
  • Background in designing developer-facing products, APIs, SDKs, or self-service automation tools, with a strong emphasis on reducing friction and improving developer velocity.
  • Experience with Google's server frameworks (Pod), gRPC/Stubby-based RPC layers, or container orchestrators is highly desirable.
  • Experience defining and driving organizational key metrics (SLIs/SLOs, adoption rates, platform health) to measure the success of infrastructure initiatives.


About the job

Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers not only have the technical expertise to take on and provide technical leadership to major projects, but also manage a team of Engineers. You not only optimize your own code but also make sure Engineers are able to optimize theirs. As a Software Engineering Manager, you manage your project goals, contribute to product strategy, and help develop your team. Teams work all across the company, in areas such as information retrieval, artificial intelligence, natural language processing, distributed computing, large-scale system design, networking, security, data compression, and user interface design; the list goes on and is growing every day. Operating with scale and speed, our exceptional software engineers are just getting started -- and as a manager, you guide the way.

With technical and leadership expertise, you manage engineers across multiple teams and locations, manage a large product budget, and oversee the deployment of large-scale projects across multiple sites internationally.

In this role, you will drive the technical goal to transition Fault Tolerance Testing (FTT) from a set of manual compliance verification tools to a proactive, AI-driven, and autonomous resilience platform. This position sits at the intersection of large-scale distributed systems, developer velocity, and cloud reliability, offering immense visibility and the opportunity to directly safeguard Google Cloud Platform's (GCP) global infrastructure.

Individual pay is determined by factors including job-related skills, experience, and relevant education or training.

US: $207,000 - $301,000 (USD) + 20% bonus target + equity + benefits

Learn more about benefits at Google.

Responsibilities
  • Set and communicate team priorities that support the broader organization's goals. Align strategy, processes, and decision-making across teams.
  • Set clear expectations with individuals based on their level and role and aligned to the broader organization's goals. Meet regularly with individuals to discuss performance and development and provide feedback and coaching.
  • Develop the mid-term technical goal and roadmap within the scope of your (often multiple) team(s). Evolve the roadmap to meet anticipated future requirements and infrastructure needs.
  • Design, guide and vet systems designs within the scope of the broader area, and write product or system development code to solve ambiguous problems.
  • Review code developed by other engineers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency).

About Google

Google is a multinational technology company that specializes in Internet-related services and products. These include online advertising technologies, a search engine, cloud computing, software, and hardware. Google was founded in 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University. The company has grown tremendously since then and has become one of the most valuable companies in the world. Google's mission is to organize the world's information and make it universally accessible and useful.
  • Size: 156,500 employees
  • Market Cap: $1,115.4 billion
  • Net Income: $40.2 billion
  • Founded: 1998
  • 5 Year Trend: +23.3%
  • Revenue: $182.5 billion
  • Exchange: NASDAQ
