Pinterest

Sr. Production Engineer, Solutions Engineering

Pinterest
$139K to $287K *
US-Anywhere
Remote in United States
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of experience in large-scale distributed systems
  • Bachelor's degree in Computer Science or equivalent
  • Strong programming skills in Python or Go
  • Deep understanding of Linux/Unix systems
  • Experience with Infrastructure as Code tools
  • Experience deploying web applications to cloud infrastructure
  • Preferred: Experience with AI agents in infrastructure automation

Responsibilities

  • Design and build AI agents to enhance production reliability
  • Lead infrastructure modernization with AI initiatives
  • Transform consulting patterns into scalable platforms
  • Develop knowledge infrastructure for operational agents
  • Create software solutions for reliability in distributed systems
  • Build tools and automation to reduce operational overhead
  • Develop meaningful service level indicators for system health

Benefits

  • Flexible work model with occasional in-office collaboration
  • Opportunity to work remotely with minimal in-office requirements
  • Equity eligibility based on performance
  • Inclusive and equitable workplace culture
  • Access to Pinterest's resources and technology for personal development

Full Job Description

The Production Engineering organization at Pinterest is accountable for ensuring overall Pinterest availability as well as enhancing Engineering teams' capability to design, build and operate robust systems at scale. Pinterest's applications and infrastructure handle billions of monthly page views and petabytes of data as Pinterest continues to grow and scale.

As a Senior Production Engineer on Solutions Engineering, you will design and build AI agents, platforms, tools, frameworks and methodologies to assure the reliability of our large-scale distributed systems serving hundreds of millions of monthly active users, handling hundreds of thousands of requests per second, and managing tens of petabytes of data. You'll lead infrastructure modernization initiatives, build intelligent automation that eliminates operational toil and amplifies engineering productivity, and transform successful consulting patterns into reusable platforms that democratize reliability expertise across Pinterest's 2500+ engineers.

What you'll do:
  • Design and build AI agents that augment production reliability work - Develop agents that assist engineers with service health analysis, reliability recommendations, migration playbook generation, and risk identification, enabling faster decision-making while keeping humans in the loop for critical judgment calls
  • Drive large-scale infrastructure modernization with AI-accelerated execution - Lead Kubernetes adoption and platform transitions using AI to generate automation, accelerate delivery, and create patterns that enable self-service adoption for standard use cases while tackling novel architecture challenges
  • Transform consulting patterns into scalable platforms - Execute scoped reliability engagements with engineering teams, then encode successful approaches into AI-assisted tools, automation, and self-serve documentation that enable teams to handle similar problems independently while escalating complex challenges to experts
  • Build the knowledge infrastructure that powers Pinterest's operational agent ecosystem - Create migration playbooks, operational runbooks, incident patterns, and best practices that democratize reliability expertise and raise the baseline capabilities of all Pinterest engineers
  • Develop software solutions to enable reliability and operability of large-scale distributed systems - Build a deep understanding of how Pinterest's systems behave, scale, interact and fail, and use that insight to identify risks and opportunities for remediation through automation
  • Build tools and automation to eliminate toil and reduce operational overhead - Create frameworks, processes and best practices that encode reliability expertise into software, making operational excellence accessible to all engineers while freeing experts to tackle harder problems
  • Build meaningful, insightful and actionable SLIs - Develop service level indicators that provide clear signals of system health and enable data-driven reliability decisions across Pinterest Engineering
  • Automate critical portions of Pinterest's engineering processes - Build automation that minimizes risk and maximizes the speed of innovation, enabling safe, rapid deployment and operational changes at scale
  • Manage capacity and performance to help scale our infrastructure - Partner with teams to plan and optimize capacity across public and private clouds around the world, ensuring efficient resource utilization as Pinterest grows


What we're looking for:
  • 5+ years of industry experience building and operating large-scale, high-performance distributed systems
  • Bachelor's degree in Computer Science or related field, or equivalent experience
  • Strong programming skills in Python or Go - ability to build production-grade platforms, agents, and automation
  • Deep knowledge of Linux/Unix internals and experience with open source infrastructure (MySQL, Kafka, Envoy, Hadoop, etc.)
  • Infrastructure as Code experience (Terraform, Puppet, Chef, Ansible, Docker, Kubernetes)
  • Experience deploying web applications to cloud infrastructure (AWS, GCP, or Azure) and working with distributed, service-oriented architecture
  • Preferred:
    • Experience developing AI agents for infrastructure automation, operational decision-making, or reliability workflows
    • AI/ML infrastructure experience (LLM-based systems, model serving, agentic workflows)
    • Technical consulting or embedded SRE experience with cross-functional engineering teams


In-Office Requirement Statement:
  • We let the type of work you do guide the collaboration style. That means we're not always working in an office, but we continue to gather for key moments of collaboration and connection.
  • This role will need to be in the office for in-person collaboration 1-2 times every 6 months and therefore can be situated anywhere in the country.


Relocation Statement:
  • This position is not eligible for relocation assistance. Visit our PinFlex page to learn more about our working model.


At Pinterest, we believe the workplace should be equitable, inclusive, and inspiring for every employee. In an effort to provide greater transparency, we are sharing the base salary range for this position. The position is also eligible for equity. Final salary is based on a number of factors, including location, travel, relevant prior experience, and particular skills and expertise.

Information regarding the culture at Pinterest and benefits available for this position can be found here.

US-based applicants only

$139,764-$287,749 USD

About Pinterest

Pinterest is a social media platform that allows users to discover and save ideas for recipes, home decor, fashion, and more. The company was founded in 2010 and is headquartered in San Francisco, California. Pinterest has over 400 million monthly active users and is available in over 30 languages. The company's mission is to help people discover and do what they love.

Size: 3,225 employees
Market Cap: $16 billion
Industry:
Net Income: -$128.3 million
Founded: 2009
5 Year Trend: +53.9%
Revenue: $1.6 billion
NASDAQ
