Staff Data Platform Engineer

Tatari

• $190K — $240K *

San Francisco, CA 94112

In-Person

Information Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

3+ years in cloud infrastructure, SRE, or platform engineering (AWS preferred)
Operational instinct honed through experience with production systems
Strong fundamentals in Linux and scripting languages (Bash, Python, etc.)
Experience with high availability architecture techniques (blue/green deployments, load balancing)
Familiarity with workflow orchestration tools (e.g., Airflow)
Knowledge of distributed data processing frameworks (e.g., Spark)
Experience with containerization and orchestration tools (e.g., Kubernetes, Docker)
Competence in infrastructure-as-code techniques (e.g., Terraform)

Responsibilities

Own the reliability and availability of the data platform across all environments
Enforce and improve environmental promotion discipline
Define and uphold SOPs for deployments and maintenance
Instrument and monitor platform health using observability tooling
Participate in architecture discussions, ensuring readiness of systems
Collaborate with cross-functional teams on infrastructure needs
Identify and remediate reliability risks before they escalate

Benefits

Health insurance coverage for you and your dependents
401K, FSA, and commuter benefits
$150 monthly spending account
$1,000 annual continued education benefit
$500 Newbie Productivity Perk
Unlimited PTO and sick days
Monthly Company Wellness Day Off
Snacks, drinks, and catered lunches at the office
Team building events
Hybrid RTO of 2 days per week in office.

Full Job Description

This is a systems and infrastructure position first. As a Data Platform Engineer, you will be responsible for the reliability, stability, and operational health of our data platform - including how it is deployed, monitored, maintained, and promoted across environments. Data engineering skills are a plus and will be developed on the job; what we cannot teach is operational discipline.

If you have spent your career keeping production systems alive, know what it feels like to break prod and never want to do it again, and treat lower environments as non-negotiable gates rather than suggestions - we want to talk to you.

This is not a data engineering role. You will not spend most of your time writing jobs or consuming the platform. You will be administering, scaling, hardening, and evolving it.

Responsibilities

Own the reliability and availability of our data platform infrastructure across all environments
Enforce and improve environment promotion discipline - staging is not prod, and prod is sacred
Define and uphold SOPs around deployments, maintenance windows, and change management
Instrument and monitor platform health using observability tooling; build alerting that means something
Participate in architecture and deployment discussions; push back when something isn't ready
Collaborate with data scientists, engineers, and product managers on infrastructure needs - as a partner, not an order-taker
Identify and remediate reliability risks before they become incidents
Support customer-facing and internal systems with a bias toward stability over velocity

Qualifications

The right candidate leans SRE. Data platform experience is additive - we will train the right person. Bullets marked with * are strongly preferred; all others are meaningful signal.

Operational instinct - "the fear" - you've been burned by prod, you respect it, and you've built habits around it. You know what a proper maintenance window looks like, you communicate before you touch production, and you don't spin up new initiatives while something critical is still burning in.
3+ years in cloud infrastructure, SRE, or platform engineering (AWS preferred; GCP/Azure experience translates)
High Availability architecture: blue/green deployments, data replication, load balancing
Experience with workflow orchestration (Airflow or similar DAG-based schedulers - or general job scheduling/cron systems at scale)
Strong Linux fundamentals and scripting (Bash, Python, or similar)
Distributed data processing (Spark, PySpark, or similar big data frameworks - or experience managing clusters that run them)
Containerization and orchestration (Kubernetes, Docker, or similar)
Data ingestion, ETL, or streaming systems (Kafka, Flink, or similar - or experience operating message queues and pipelines)
Infrastructure-as-code and provisioning (Terraform, Helm, or similar)
OLAP and OLTP databases (Clickhouse, Postgres, Redshift, or similar - query patterns, indexing, and operational care)
Monitoring, logging, and observability (Datadog, Prometheus, Kibana, or similar)
Managed data platforms (Databricks or similar - administering and scaling, not just consuming)
Network infrastructure fundamentals: load balancers, DNS, auto-scaling, multi-region topologies, proxies
Security and access management: least-privilege, secrets management, controls for data systems
MLOps concepts or tooling - a plus

What we value above technical skills

We are explicitly willing to trade depth in data tooling for the right operational character. Specifically:

Humility - you don't know everything, you say so, and you ask before acting in unfamiliar territory
Methodical execution - you minimize variables, you don't optimize prematurely, you finish what you started before starting something new
Communication - you tell the team what you're doing before you do it, especially in shared or production environments
Ownership - when something goes wrong, you look inward first
Independence - you can drive projects end-to-end, from ambiguous requirements to high-quality deliverables, but you aren't afraid to ask for help.

Benefits:

Total compensation ($190,000 - $240,000)
Equity compensation
Health insurance coverage for you and your dependents
401K, FSA, and commuter benefits
$150 monthly spending account
$1,000 annual continued education benefit
$500 Newbie Productivity Perk
Unlimited PTO and sick days
Monthly Company Wellness Day Off
Snacks, drinks, and catered lunches at the office
Team building events
Hybrid RTO of 2 days per week in office.

* Ladders Estimates
