Position Overview
As a Software Engineer, Data Infra, you are the architect of the "Laboratory" where Dyna's robotic intelligence is refined. You won't just move data; you will build the interactive systems that bridge the gap between raw multimodal sensor streams and production-ready ML models.
This is a high-impact, hands-on role where you will design the infrastructure to visualize model performance, automate data labeling, and manage the "human-in-the-loop" workflows - such as teleoperation and skill capture - that are critical to our scaling. If you enjoy solving complex geometric problems and building the low-friction tools that empower ML researchers, you'll thrive here.
What You'll Do
- Interactive Data Systems: Architect the engines and interfaces that unify raw robot logs, video, and 3D sensor data. You will enable seamless "human-in-the-loop" workflows, from episode annotation to analyzing manual interventions.
- Signal Extraction & Geometry: Design and implement algorithms to extract structured signals (trajectories, events, 3D poses) from raw captures. Strong mathematical intuition is required to turn pixels and point clouds into ground-truth insights.
- Evaluation & Benchmarking: Build high-performance tools to compare model-driven motion against human-captured data, helping the team quantify model progress across diverse tasks.
- Scalable ML Pipelines: Build and operate distributed data pipelines (using Python, GCP/AWS, and Kubernetes) for the ingestion, transformation, and validation of terabytes of multimodal data.
- Observability & Debugging: Develop visualization tools that make complex model behaviors and sensor data easy to interpret, reducing the time from "data collected" to "model trained."
- Startup Fluidity: Collaborate across ML, Robotics, and Product teams. As an early member of the data team, you will help define the roadmap where no blueprint yet exists.
What You'll Bring
- The Experience: 5+ years of professional software experience, ideally with a focus on data-intensive or "human-in-the-loop" platforms.
- Technical Stack: Proficiency in Python (NumPy, Pandas) and a solid understanding of modern backend architectures. Experience with React/TypeScript is a major plus for building internal observability tools.
- Mathematical Core: Strong skills in algorithms and geometric calculations (e.g., coordinate transformations, 3D spatial reasoning).
- Data Mastery: Hands-on experience with relational and NoSQL databases (PostgreSQL, Redis) and cloud-native infrastructure (GCP/AWS).
- Problem-Solving: The ability to debug real-world data issues - from sensor drift to pipeline bottlenecks - independently.
Bonus Points For:
- Experience with multimodal data (video, LiDAR, time-series) in robotics or autonomous systems.
- Familiarity with Airflow, Kubeflow, or similar distributed batch processing systems.
- Experience with experiment tracking frameworks like Weights & Biases or MLflow.
- Experience as an early hire in a fast-paced startup environment.
Don't let a checklist stop you. Data shows that people from underrepresented groups often apply only if they meet 100% of the criteria. We value problem-solving and grit over keyword matching. If you're passionate about the intersection of geometry and robotics, we want to hear from you, even if you don't check every box.