CloudZero

Senior CloudOps Engineer

CloudZero
$120K to $150K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 3 to 5+ years of experience in building and operating distributed systems on AWS.
  • Strong skills in Python and Infrastructure as Code, specifically with Pulumi or Terraform.
  • Familiarity with frontier AI models (Claude, Codex, or Gemini).
  • Experience using monitoring tools like Prometheus or Datadog.
  • Proven capability to debug production issues calmly under pressure.
  • Emphasis on thoughtful system design over reactive troubleshooting.
  • Strong documentation skills for team clarity and system stability.
  • Ability to communicate complex technical matters to non-technical stakeholders.

Responsibilities

  • Design and maintain Pulumi modules for reliable cloud resources.
  • Own the entire infrastructure process without manual console interventions.
  • Instrument systems for quick failure detection and data-driven debugging.
  • Embed observability into all systems to preempt customer-reported issues.
  • Automate deployments, scaling, backups, and changes intelligently.
  • Collaborate with Product Engineering to design resilient services and optimize deployment pipelines.
  • Ensure cost and performance efficiency to exemplify best practices in cloud usage.

Benefits

  • Work in a fast-growing company with a real impact on customer decisions.
  • Engage in meaningful operational challenges at scale.
  • Opportunity to shape performance and reliability of innovative cloud infrastructure.
  • Collaborative environment with Product Engineering teams that supports skill development and knowledge sharing.

Full Job Description

About the Role

CloudZero is growing fast. Our customer base is expanding, the data challenges we're solving are getting more complex, and the platform is scaling to match. As a CloudOps Engineer, you'll be a force multiplier for our engineering organization, owning the performance, reliability, and observability of CloudZero's infrastructure and empowering teams to ship features that help customers understand and optimize their cloud spend.

This is real infrastructure work at real scale, not a ticket-closing role or a console-clicking job. CloudZero processes billions of events daily across AWS, Azure, and GCP. Our customers rely on real-time, accurate cost data to make business-critical decisions, and any instability in our system impacts their planning. Built entirely on a unique serverless architecture with no EC2 instances or containers, our platform demands infrastructure that scales gracefully, fails predictably, and recovers automatically.

If you thrive on hard operational problems, care deeply about reliability and performance, and want to see your work matter to customers in direct and measurable ways, this role was built for you.

What You'll Do

Infrastructure as Code
  • Design and maintain Pulumi modules that provision reliable, cost-efficient cloud resources
  • Own infrastructure end to end with no clicking through consoles

Observability
  • Instrument systems so that failures surface quickly and debugging happens with data, not guesswork
  • Build observability into everything so you know about problems before customers do

Automation
  • Automate deployments, scaling, backups, and limit changes; if humans are doing it repeatedly, build a system to do it instead
  • Balance automation intelligently, building solutions to real problems rather than automating for its own sake

Partner with Product Engineering
  • Help teams design resilient services, review architectures for operational complexity, and build deployment pipelines that enable safe and fast shipping
  • Optimize for cost and performance; CloudZero's business is helping others optimize cloud costs, and we should be exemplars of efficient cloud usage ourselves

What You Bring

  • 3 to 5+ years of experience building and operating distributed systems in AWS
  • Strong skills in Python and Infrastructure as Code using Pulumi or Terraform
  • Experience with frontier AI models such as Claude, Codex, or Gemini
  • Hands-on experience with monitoring tools such as Prometheus or Datadog
  • Proven ability to debug production issues under pressure
  • A preference for thoughtful, reliable system design over reactive hero efforts
  • Strong documentation habits to support long-term team clarity and system stability
  • Ability to clearly explain complex technical issues to non-technical stakeholders
  • Excitement about taking ownership of infrastructure and solving operational challenges at scale

About CloudZero

CloudZero is a cloud cost intelligence platform that helps companies optimize their cloud spending. The company's platform provides real-time visibility into cloud costs and usage, allowing companies to identify areas where they can reduce costs and improve efficiency. CloudZero's software integrates with a variety of cloud providers, including Amazon Web Services, Microsoft Azure, and Google Cloud Platform. The company was founded in 2016 and is headquartered in Cambridge, Massachusetts.
Size: 50 employees
Net Income: -$3 million
Founded: 2016
5 Year Trend: +80%
Revenue: $2 million
