Software Engineer, Multimodal Storage Infrastructure

Eventual Computing

$130K — $180K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in storage infrastructure and databases
  • Strong understanding of indexing techniques like B+ trees and LSM trees
  • Familiarity with modern storage formats such as Parquet, Iceberg, and Delta
  • Experience with cloud object storage, NVMe, and other storage hierarchies
  • Passionate about databases, with a habit of reading related academic papers

Responsibilities

  • Design and build the storage and indexing layer for multimodal datasets
  • Optimize query engines through predicate and projection pushdown
  • Select and extend modern open formats for storage efficiency
  • Develop versioning and schema evolution for reproducible customer data
  • Collaborate with the Dataloading team to minimize data transfer
  • Work with the Visual Understanding team to integrate model outputs directly into the index

Benefits

  • In-person, tight-knit team culture, with 4 days per week in the office
  • Catered meals for SF-based employees
  • Commuter benefits for daily travel
  • Health, vision, and dental insurance coverage
  • Flexible paid time off policy
  • Up-to-date Apple equipment for work
  • 401(k) retirement plan with matching contributions

Full Job Description

Your Role

As a Storage Infrastructure Engineer, you'll take everything we know about modern databases and apply it to the world of Physical AI. Our warehouse co-indexes video, sensors, embeddings, and sim outputs on the same row, versioned, with a third query layer (not row/column, not vector/semantic) - content-aware queries over what's inside clips. Your job is to make that layer fast: the right indices for petabyte-scale video, predicate pushdowns that elide whole files, file formats that respect random access into clips, and a query path that turns "left-arm grasp failures on deformable objects" into the smallest possible read.

You should believe, in your bones, that the best read is the read elided.

Key Responsibilities
  • Design and build the storage and indexing layer: row groups, column chunks, secondary indices, vector indices, and the metadata that lets queries skip everything that doesn't matter.
  • Push the query engine harder - predicate pushdown, projection pushdown, late materialization - across multimodal columns including video, embeddings, and sensor streams.
  • Choose, extend, or build on top of modern open formats (Parquet, Iceberg, Delta, etc.), and build our own or contribute upstream where it makes sense.
  • Build versioning and schema evolution for multimodal datasets so customer data stays reproducible across months of experimentation.
  • Partner with the Dataloading team on the format-to-loader boundary so an iceberg.scan(...) translates into the absolute minimum of bytes hitting NVMe.
  • Partner with the Visual Understanding team to land model outputs in the index without an external glue layer.


What we look for
  • You love thinking about indices. B+ trees, LSM trees, bitmap indices, vector indices, learned indices - you have favorites and you have grudges.
  • You love thinking about query engines. Predicate pushdown makes you happy. Late materialization makes you happier.
  • Strong familiarity with the storage hierarchy: cloud object stores, NVMe, block storage, spinning disk, RAM, GPU memory - and the latency and cost of moving between them.
  • Strong opinions about Parquet - love it or hate it, you've earned the opinion. Same for Iceberg, Delta, Lance, and the other lakehouse formats.
  • A real love for databases and query systems. You read database papers for fun.
  • You believe the best read is the read elided.


Nice to have
  • Background from a storage or table-format team - Lance, Iceberg, Delta, Hudi, Spiral, Snowflake, BigQuery, Databricks Photon, DuckDB, ClickHouse, or similar.
  • You've attempted to build your own database before. Or, at minimum, fantasized about it in detail.
  • Experience with Rust or modern C++ for storage engines.
  • Hands-on time with vector indices (HNSW, IVF, ScaNN) or hybrid retrieval systems.
  • Comfort with the OLAP/lakehouse ecosystem: catalogs, file layout, compaction, manifest formats, time travel.


Perks & Benefits
  • In-person, tight-knit team - 4 days/week in our SF Mission office.
  • Competitive comp and meaningful startup equity.
  • Catered lunches and dinners for SF employees.
  • Commuter benefit.
  • Team-building events and poker nights.
  • Health, vision, and dental coverage.
  • Flexible PTO.
  • Latest Apple equipment.
  • 401(k) plan with match.


If you've ever read a Parquet footer for fun and thought "this is so close to what video needs, and yet so far" - we should talk.
