Torc Robotics

Senior, ML Engineer - Auto Tagger

Torc Robotics
$177K - $212K *
US-Anywhere
Remote in Ann Arbor, MI
Transportation
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • BS or MS in Computer Science, Robotics, Engineering, or a STEM field, with 6+ years in data engineering, ML systems, or autonomous data curation.
  • Strong Python and SQL skills, with heavy experience processing massive time-series or unstructured datasets.
  • Hands-on machine learning and dataset curation experience, with a track record of improving downstream model performance.
  • Experience using Databricks (or similar) for large-scale analytics and indexing large vehicle datasets.
  • Expertise in distributed compute frameworks (Ray, Spark, Beam) and cloud platforms (AWS, GCP, Azure) for heavy data workloads.
  • Experience applying scenario-description standards like Pegasus layers.
  • Exceptional communication skills to convey complex engineering challenges to stakeholders.
  • Demonstrated technical leadership through mentoring and defining engineering roadmaps.

Responsibilities

  • Architect and optimize distributed data pipelines that process massive multi-sensor logs and extract driving events.
  • Develop and fine-tune heuristic-based and ML-assisted algorithms for scenario classification.
  • Extract and structure scenario data using the Pegasus layer standard to ensure consistency.
  • Manage ingestion of tagged events into the observations database for efficient querying and retrieval.
  • Drive cross-functional alignment to define interesting scenarios and operationalize data loops.
  • Mentor less-experienced engineers and foster a culture of technical excellence.

Benefits

  • 100% paid medical, dental, and vision premiums for full-time employees.
  • 401K plan with a 6% employer match.
  • Flexibility in schedule and generous paid vacation available immediately after start date.
  • Company-wide holiday office closures.
  • AD&D and Life Insurance.
Full Job Description
Meet The Team:

The Auto Tagger team is the engine behind our data flywheel, responsible for translating petabytes of raw, multi-modal vehicle data into a highly curated library of critical driving scenarios. By mining driving logs for long-tail events, we provide the foundational data required for safe autonomous trucking. Leveraging Pegasus logical layers, this team structures and catalogs findings into an observations database that directly accelerates development across autonomous perception, sensor fusion, and generative simulation testing.

What You'll Do:
  • Scenario Mining at Scale: Architect and optimize distributed data pipelines to process massive multi-sensor logs (camera, LiDAR, radar, kinematics), automatically extracting and cataloging safety-critical and long-tail driving events.
  • Advanced Event Tagging: Develop and tune both heuristic-based and ML-assisted algorithms (including exploring Vision-Language Models or semantic vector search) to automatically classify and describe complex environmental and behavioral scenarios.
  • Standardized Data Structuring: Extract and format scenario data utilizing the Pegasus layer standard (alongside open-source frameworks) to ensure semantic consistency and rigorous metadata integrity.
  • Data Flywheel Integration: Manage the ingestion of tagged events into the observations database, enabling high-speed querying and retrieval for ML training, regression testing, and system validation.
  • Cross-Functional Alignment: Operate with broad autonomy to drive consensus across organizational boundaries. Collaborate closely with downstream consumers in perception, simulation, and systems engineering to define what constitutes an "interesting scenario" and operationalize a continuous data loop.
  • Mentorship & Team Growth: Guide, mentor, and elevate less-experienced engineers. Lead design reviews, establish coding standards, and foster a culture of technical excellence and collaborative problem-solving.

What You'll Need to Succeed:
  • BS or MS in Computer Science, Robotics, Engineering, or a STEM field, with 6+ years in data engineering, ML systems, or autonomous data curation.
  • Core Languages: Strong Python and SQL skills, with heavy experience processing massive time-series or unstructured datasets.
  • ML & Dataset Curation: Hands-on machine learning and dataset curation experience, with a demonstrated history of implementing targeted datasets that measurably improve downstream model performance.
  • Data Exploration: Hands-on experience using Databricks (or similar platforms) for large-scale analytics, interactive querying, and making massive vehicle datasets searchable.
  • Cloud & Compute: Expertise in distributed compute frameworks (Ray, Spark, Beam) and cloud platforms (AWS, GCP, or Azure) for executing heavy data workloads.
  • AV Standards: Experience parsing complex data formats and applying scenario-description standards like Pegasus layers.
  • Communication: Exceptional ability to translate complex data engineering challenges into clear strategies for cross-functional stakeholders.
  • Technical Leadership: Proven track record of mentoring teams, driving system architecture, and defining engineering roadmaps.

Bonus Points!
  • Auto-labeling & VLMs: Familiarity with foundation models, auto-labeling pipelines, or zero-shot classification for scenario extraction.
  • Model Serving: Experience with vLLM, SGLang, or similar frameworks for highly optimized, high-throughput model serving and inference.
  • Semantic Inference: Experience with semantic extraction and attribute mapping to help build out a robust semantic inference engine, moving beyond standard bounding-box object detection.
  • Data Tooling: Familiarity with parsing robotics formats (ROS bags, MCAP) and optimizing high-performance columnar storage formats (Parquet, Arrow).
  • Downstream Integration: Knowledge of how scenario data feeds into generative simulation workflows, neural rendering, or sensor fusion validation.
  • Advanced Retrieval: Experience building semantic retrieval systems or vector databases for automotive data.

Perks of Being a Torc'r

Torc cares about our team members, and we strive to provide benefits and resources that support their health, work/life balance, and future. Our culture is collaborative, energetic, and team-focused. Torc offers:
  • A competitive compensation package that includes a bonus component and stock options
  • 100% paid medical, dental, and vision premiums for full-time employees
  • 401K plan with a 6% employer match
  • Flexibility in schedule and generous paid vacation (available immediately after start date)
  • Company-wide holiday office closures
  • AD&D and Life Insurance

Job ID: R-102717

Hiring Range for Job Opening

US Pay Range

$177,300-$212,800 USD

About Torc Robotics

Torc Robotics develops autonomous vehicle technology. Founded in 2005 in Blacksburg, Virginia, it has since become a leader in self-driving vehicles. The company has developed autonomous technology for a variety of applications, including military vehicles, mining trucks, and consumer cars, and it has partnered with major manufacturers, including Daimler Trucks North America and Caterpillar. In 2019, Torc Robotics was acquired by Daimler Trucks North America, and it continues to operate as a subsidiary of that company.
Size: 200 employees
Industry: Transportation
Founded: 2005
