Engineer, Production Engineering

Guild.ai, Inc

• $120K — $150K *

San Francisco, CA 94112

In-Person

Technical Services

5 - 7 years of experience

Today

Job Overview by Ladders

Qualifications

5+ years in Production Engineering, Platform Engineering, or a security-focused infrastructure role
Strong hands-on experience with Kubernetes and GCP in production
Proficient in Terraform for infrastructure management
Strong programming skills in Python, Go, TypeScript, etc.
Experience with compliance frameworks like SOC2 and secure system design

Responsibilities

Manage and evolve production and staging infrastructure on GCP using Terraform
Deploy and operate within customer VPCs across AWS, Azure, and GCP
Build and maintain Kubernetes-based sandboxing for agent execution
Own observability stack with OpenTelemetry and integrations like New Relic and Splunk
Lead work for SOC2 compliance, including audits and control implementations
Manage HackerOne engagement for penetration testing and bug bounty
Design and maintain automated CI/CD workflows for deployment

Benefits

Hybrid/Onsite work model in the San Francisco Bay Area
Opportunity to work in an early-stage startup environment
Direct contribution to product development
High autonomy in decision-making and tool selection
Engagement with cutting-edge infrastructure for AI agents

Full Job Description

Engineer - Production Engineering

Location: San Francisco Bay Area (Hybrid/Onsite)
Type: Full-time
Stage: Early-stage startup

About the Role

We are building the control plane for AI agents in teams and companies.

As a Production Engineer, you will own the infrastructure, security, and compliance systems that allow our platform to ship fast and run reliably at scale. This is not a traditional ops role - you will write real code, contribute directly to the product, and own the full security and compliance surface of an early-stage company.

You'll work across Kubernetes infrastructure, cloud delivery, agent sandboxing, SOC2 compliance, IT systems, and production observability - and you'll contribute to the product itself, building security-sensitive features and auditing application code for vulnerabilities.

If you want to own the production backbone for the agent-native era - from a Terraform module to a pentest to an API key implementation - we want to talk.

What You'll Own

1. Cloud & Kubernetes Infrastructure

Our Stack: Manage and evolve our production and staging infrastructure on GCP (GKE) using Terraform. Own DNS, networking, and environment configuration end-to-end.
Customer Environments: Deploy and operate within customer VPCs across AWS, Azure, and GCP - adapting to varied infrastructure constraints, security requirements, and enterprise networking configurations.
Agent Sandboxing: Build and maintain Kubernetes-based sandboxing for agent execution - ensuring agents operate within strict network boundaries and route all traffic through our API gateway rather than having unfettered internet access.
Observability: Own our observability stack, including OpenTelemetry instrumentation and integrations with New Relic and Splunk, to give the team deep visibility into system performance and agent runtime behavior.

2. Security, Compliance & IT

SOC2 & Audits: Lead infrastructure and operational work to support SOC2 compliance, including audit preparation, evidence collection, and control implementation.
Penetration Testing & Bug Bounty: Manage our HackerOne engagement - coordinating pentests, triaging incoming bug bounty reports, and driving remediation.
Product Security: Audit application code for security vulnerabilities, contribute security-sensitive product features (e.g., API key management), and ensure product and infrastructure security are coherent end-to-end.
IT & Identity: Own our IT stack - Okta, device management, and access controls - keeping the company secure as we scale.

3. CI/CD & Progressive Delivery

Deployment Pipelines: Design and maintain safe, automated CI/CD workflows supporting rollout strategies like canary and blue-green deployments.
Release Velocity: Make shipping to production a routine, boring, highly automated non-event.

What We're Looking For

Strong Fit

Experience: 5+ years in Production Engineering, Platform Engineering, or a security-focused infrastructure role, ideally at a fast-growing startup or SaaS company.
Our Stack: Strong hands-on experience with Kubernetes and GCP in production; comfortable with Terraform for managing real infrastructure.
Code over Click: Strong programming skills (Python, Go, TypeScript, etc.) with a passion for automating away toil.
Security Depth: Hands-on experience with compliance frameworks (SOC2), vulnerability management, and secure system design.

Bonus Points

Background with multi-tenant SaaS or enterprise security and procurement requirements.
Exposure to AI/ML infrastructure, particularly agent runtimes.
Experience building security-sensitive product features alongside infrastructure work.
Experience supporting pentests and bug bounty programs.
Experience deploying and operating in customer VPCs or other external cloud environments across AWS, Azure, and/or GCP - navigating enterprise networking, security, and access constraints.

Why This Role is Unique

Broad Ownership: You'll own the full security and compliance surface of an early-stage company - from SOC2 to sandboxed agent execution to IT - while also contributing directly to the product.
Agent Infrastructure: You'll design infrastructure for autonomous AI agents, not just traditional web services - introducing unique sandboxing, observability, and security challenges.
Our Infra and Theirs: You'll operate across both our own production environment and customer cloud environments, requiring you to be fluent across AWS, Azure, and GCP.
High Autonomy: As an early hire, you'll have a seat at the table to choose the tools and define the architecture that carries us to scale.

Who Thrives Here

Engineers who are as comfortable reading application code for vulnerabilities as they are writing a Terraform module.
People who enjoy owning the full security and compliance surface, not just one layer of it.
Builders who can navigate the constraints of customer enterprise environments without losing velocity.
Those who are energized - not overwhelmed - by the breadth of an early-stage technical operations role.

* Ladders Estimates
