Software Engineer, Multimodal Storage Infrastructure

Eventual Computing

$130K to $180K *

San Francisco, CA 94112

In-Person

Information Technology

Less than 5 years of experience

Today


Job Overview by Ladders

Qualifications

5-7 years of experience in storage infrastructure and databases
Strong understanding of indexing techniques like B+ trees and LSM trees
Familiarity with modern storage formats such as Parquet, Iceberg, and Delta
Experience with cloud object storage, NVMe, and other storage hierarchies
Passionate about databases, with a habit of reading related academic papers

Responsibilities

Design and build the storage and indexing layer for multimodal datasets
Optimize query engines through predicate and projection pushdown
Select and extend modern open formats for storage efficiency
Develop versioning and schema evolution for reproducible customer data
Collaborate with the Dataloading team to minimize data transfer
Work with the Visual Understanding team to integrate model outputs directly into the index

Benefits

In-person, tight-knit team culture, with 4 days per week in the office
Catered meals for SF-based employees
Commuter benefits for daily travel
Health, vision, and dental insurance coverage
Flexible paid time off policy
Up-to-date Apple equipment for work
401(k) retirement plan with matching contributions

Full Job Description

Your Role

As a Storage Infrastructure Engineer, you'll take everything we know about modern databases and apply it to the world of Physical AI. Our warehouse co-indexes video, sensors, embeddings, and sim outputs on the same row, versioned, with a third query layer (not row/column, not vector/semantic) - content-aware queries over what's inside clips. Your job is to make that layer fast: the right indices for petabyte-scale video, predicate pushdowns that elide whole files, file formats that respect random access into clips, and a query path that turns "left-arm grasp failures on deformable objects" into the smallest possible read.

You should believe, in your bones, that the best read is the read elided.

Key Responsibilities

Design and build the storage and indexing layer: row groups, column chunks, secondary indices, vector indices, and the metadata that lets queries skip everything that doesn't matter.
Push the query engine harder - predicate pushdown, projection pushdown, late materialization - across multimodal columns including video, embeddings, and sensor streams.
Choose, extend, or build on top of modern open formats (Parquet, Iceberg, Delta, etc.), and build our own or contribute upstream where it makes sense.
Build versioning and schema evolution for multimodal datasets so customer data stays reproducible across months of experimentation.
Partner with the Dataloading team on the format-to-loader boundary so an iceberg.scan(...) translates into the absolute minimum of bytes hitting NVMe.
Partner with the Visual Understanding team to land model outputs in the index without an external glue layer.
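
For readers less familiar with the idea behind "predicate pushdowns that elide whole files", the following is a minimal sketch of statistics-based skipping, not a description of Eventual's actual implementation. The names `RowGroupStats` and `groups_to_read`, and the sample metadata values, are hypothetical and exist only for illustration. The sketch assumes each row group records the minimum and maximum of one column, so a range filter can rule out groups before any data is read.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group metadata: min/max of a single column."""
    group_id: int
    min_value: float
    max_value: float


def groups_to_read(stats, lo, hi):
    """Return ids of row groups whose [min, max] range overlaps [lo, hi].

    Any group whose range falls entirely outside the filter cannot contain
    a matching row, so it is skipped without touching its data.
    """
    return [s.group_id for s in stats if s.max_value >= lo and s.min_value <= hi]


# Illustrative (made-up) metadata for a timestamp column, in seconds.
footer = [
    RowGroupStats(0, 0.0, 99.9),
    RowGroupStats(1, 100.0, 199.9),
    RowGroupStats(2, 200.0, 299.9),
]

# Filter: 120.0 <= timestamp <= 180.0. Only the middle group can match.
selected = groups_to_read(footer, 120.0, 180.0)
print(selected)
```

In this toy case two of the three row groups are never read. The same overlap test applies one level up, at file or manifest granularity, which is where whole files get skipped.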

What we look for

You love thinking about indices. B+ trees, LSM trees, bitmap indices, vector indices, learned indices - you have favorites and you have grudges.
You love thinking about query engines. Predicate pushdown makes you happy. Late materialization makes you happier.
Strong familiarity with the storage hierarchy: cloud object stores, NVMe, block storage, spinning disk, RAM, GPU memory - and the latency and cost of moving between them.
Strong opinions about Parquet - love it or hate it, you've earned the opinion. Same for Iceberg, Delta, Lance, and the other lakehouse formats.
A real love for databases and query systems. You read database papers for fun.
You believe the best read is the read elided.

Nice to have

Background from a storage or table-format team - Lance, Iceberg, Delta, Hudi, Spiral, Snowflake, BigQuery, Databricks Photon, DuckDB, ClickHouse, or similar.
You've attempted to build your own database before. Or, at minimum, fantasized about it in detail.
Experience with Rust or modern C++ for storage engines.
Hands-on time with vector indices (HNSW, IVF, ScaNN) or hybrid retrieval systems.
Comfort with the OLAP/lakehouse ecosystem: catalogs, file layout, compaction, manifest formats, time travel.

Perks & Benefits

In-person, tight-knit team - 4 days/week in our SF Mission office.
Competitive comp and meaningful startup equity.
Catered lunches and dinners for SF employees.
Commuter benefit.
Team-building events and poker nights.
Health, vision, and dental coverage.
Flexible PTO.
Latest Apple equipment.
401(k) plan with match.

If you've ever read a Parquet footer for fun and thought "this is so close to what video needs, and yet so far" - we should talk.

* Ladders Estimates
