Senior Storage Engineer at Hydra Host

Hydra Host

$120K - $160K
Miami, FL 33186 (In-Person)
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 8+ years of experience in high-performance storage systems design and implementation for compute clusters
  • Demonstrated ability to build storage infrastructure from scratch
  • Expertise in block storage solutions like NVMe, SAN, and Ceph
  • Proficient in object storage technologies such as S3 and MinIO
  • Strong knowledge of parallel file systems including WekaIO and Lustre
  • Solid experience with Linux systems engineering and automation
  • Understanding of AI/ML data pipelines for storage optimization

Responsibilities

  • Define and architect Hydra Host's first production-grade storage platform
  • Lead technical decisions related to storage stack design and performance tuning
  • Select and maintain block and object storage solutions
  • Design for high-throughput, low-latency access to large datasets
  • Integrate and optimize parallel file systems for maximum performance
  • Ensure compatibility with diverse GPU/OEM ecosystems
  • Develop automation and management tooling focusing on reliability and scalability
  • Collaborate with cross-functional teams to integrate storage with compute and network layers

Benefits

  • Opportunity to build innovative, cutting-edge storage solutions from the ground up
  • Hands-on role with significant technical ownership
  • Exposure to large-scale AI and HPC workloads
  • Collaborative environment with cross-functional teams
  • Potential for mentorship and leadership opportunities

Full Job Description

Job Title: Storage Engineer

The role:

As a Storage Engineer, you will be responsible for designing and building Hydra Host's first production-grade storage platform from the ground up, supporting the company's rapidly expanding network of bare-metal GPU clusters.

You'll own the architecture, technology selection, implementation, and evolution of this platform, defining how Hydra Host manages data for large-scale, distributed AI workloads across global data centers.

This is a senior, hands-on role for an engineer who has built storage systems for GPU clusters before, with deep expertise in both block and object storage and a strong understanding of parallel file systems, performance optimization, and large-scale orchestration.

Key Responsibilities
• Define, architect, and implement Hydra Host's first production storage platform tailored for bare-metal GPU clusters and AI/HPC workloads.
• Lead all technical decisions around storage stack design, from hardware infrastructure to parallel file system orchestration and performance tuning.
• Select, build, and maintain storage solutions spanning both block (NVMe, SAN, Ceph, etc.) and object storage (S3-compatible, custom, or Ceph Object Gateway) layers.
• Design for high-throughput, low-latency access, supporting large datasets, rapid checkpointing, and parallel access for distributed AI training workloads.
• Integrate and optimize parallel file systems such as Lustre, BeeGFS, Spectrum Scale, WekaIO, or CephFS, ensuring maximum performance and fault tolerance.
• Ensure compatibility across Hydra Host's diverse GPU/OEM ecosystem, accounting for unique firmware, BMC/Redfish APIs, and hardware configurations.
• Develop automation, observability, and management tooling for storage, focusing on reliability, scalability, and efficiency.
• Act as a builder and architect: deeply hands-on in deployment, troubleshooting, and optimization, while guiding the long-term storage roadmap.
• Collaborate cross-functionally with GPU, HPC, and platform engineering teams to integrate storage with compute and network layers.
• Interface with customers and product leadership to define feature priorities, performance benchmarks, and future enhancements.

Must-Have Qualifications
• 8+ years of progressive, hands-on experience designing and implementing high-performance storage systems for compute clusters in HPC, AI, or bare-metal cloud environments.
• Proven track record of building storage infrastructure from scratch, not just operating existing systems.
• Deep expertise in block storage (NVMe, SAN, Ceph, distributed block systems) and object storage (S3, MinIO, Ceph Object Gateway, etc.).
• Strong background in parallel file systems (WekaIO, BeeGFS, Lustre, Spectrum Scale, or similar) supporting GPU or AI cluster workloads.
• Solid foundation in Linux systems engineering, automation, and scripting for distributed environments.
• Familiarity with BMC, Redfish APIs, and OEM server firmware for bare-metal management.
• Deep understanding of AI/ML data pipelines: model checkpointing, data locality, and multi-tiered storage optimization.
• Excellent problem-solving, debugging, and communication skills, with the ability to translate technical decisions into clear architectural direction.

Preferred Qualifications
• Experience building storage solutions for large-scale GPU or HPC infrastructure.
• History of technical leadership or mentorship, such as growing teams or owning a product roadmap.
• Experience evaluating and managing vendor relationships and negotiating storage hardware/software contracts.
• Contributions to open-source HPC or storage projects (Ceph, Lustre, BeeGFS, etc.).
• Familiarity with confidential computing, secure data handling, or high-availability architectures.
