Staff Engineer in Test

Data Direct Networks

$145K — $185K *
US-Anywhere, Remote in California, US
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10+ years in software quality engineering, focused on distributed systems or infrastructure platforms.
  • Expertise in test automation using Python and Bash, plus CI/CD tools like Git and Jenkins.
  • Strong understanding of distributed concurrency, file systems, and I/O stack behavior.
  • Experience in large-scale performance testing and reliability validation.
  • Skilled in diagnosing complex system issues through logs, traces, and profiling tools.

Responsibilities

  • Design and implement test strategies for networking and security features in a distributed system.
  • Create automated test suites for validating concurrency, data consistency, and system performance.
  • Build and maintain robust automation frameworks using tools like Pytest and Docker.
  • Execute performance tests for system latency, throughput, and workload stability.
  • Collaborate with cross-functional teams to influence quality-focused design decisions.

Benefits

  • Opportunity to work in a flat organizational structure that promotes hands-on involvement.
  • Encouragement for proactive contributions to the mission and leadership through initiative.
  • Emphasis on strong communication skills for team success.
  • Dynamic and driven work environment focused on engineering excellence.
Full Job Description
Overview

We are seeking a highly skilled and technically strong Staff Engineer in Test to lead system-level quality engineering efforts for networking, security, and other enterprise-readiness aspects of Infinia, DDN’s large-scale distributed data platform.

 

In this role, you will be a senior technical authority responsible for planning and implementing test strategies and test infrastructures to ensure correctness, stability, performance, and resilience of Infinia’s distributed architecture. You will work across core subsystems, including the I/O path, memory management, networking stack, scheduling layers, multi-tenant services, and NVMe-backed storage patterns, to ensure platform quality at scale.

 

This is a hands-on, high-impact IC role for someone who can solve hard problems, automate at scale, leverage AI to improve velocity, and elevate quality engineering across the organization.

 

Key Responsibilities:

 

Quality Engineering & System Validation

  • Design detailed test strategies and validation plans for the networking and security features of a distributed system.
  • Create scalable, automated test suites that validate multi-tenant behavior, concurrency, data consistency, and system-level performance.

Automation Frameworks & Tooling

  • Build and maintain robust automation using tools such as Pytest and container-based environments leveraging Docker, Jenkins, and Kubernetes.
  • Develop reusable automation templates, harnesses, and utilities to accelerate test creation and reduce engineering overhead.

Performance, Reliability & Scale Testing

  • Construct and execute performance tests covering I/O throughput, system latency, NVMe access patterns, concurrency limits, and long-running workload stability.
  • Use advanced tools (profilers, fuzzers, failure-injection frameworks, trace analyzers) to uncover issues in distributed workflows.
  • Analyze CPU, memory, disk, and network utilization to diagnose performance bottlenecks and identify regression risks.

Cross-Functional Quality Leadership

  • Work closely with architects, developers, release engineering, DevOps, and customer engineering to drive quality-first design decisions.
  • Participate in feature design reviews, ensuring testability, observability, and resilience are built into system components.
  • Lead root cause analysis (RCA) for complex issues and propose long-term improvements to engineering practices and platform stability.

Documentation & Quality Standards

  • Produce clear, detailed test plans, automation guides, design-review feedback, and quality metrics reports.
  • Contribute to the development and maintenance of internal QA standards, best practices, and onboarding materials.

Required Qualifications

  • 10+ years of experience in software quality engineering, with a strong focus on distributed systems, system-level testing, or infrastructure platforms.
  • Hands-on expertise in test automation using Python, Bash, and modern CI/CD tooling (Git, Jenkins, etc.).
  • Strong understanding of:
    • Distributed concurrency
    • File systems and I/O stack behavior
    • Storage performance analysis (NVMe, SPDK)
    • Networking, tracing, and system observability
  • Experience with large-scale performance testing, stress testing, and reliability validation.
  • Demonstrated skill in diagnosing complex system issues across logs, traces, network captures, and profiling tools.
  • ISTQB or equivalent certification preferred.

Preferred Qualifications

  • Experience validating large-scale data platforms, storage engines, or distributed scheduling systems.
  • Experience with AI technologies in the context of quality engineering, such as issue triage, test generation, and automation.
  • Familiarity with observability technologies such as OpenTelemetry, Grafana, and Prometheus.
  • Background in compliance or security testing (e.g., access control, backup/restore workflows, Section 508/HIPAA/PCI).
  • Contributions to open-source test frameworks or distributed systems validation tools.

Salary Range for this role: $145,000 - $185,000

DDN

Join our dynamic and driven team, where engineering excellence is at the heart of everything we do. We seek individuals who love to challenge themselves and are fueled by curiosity. Here, you'll have the opportunity to work across various areas of the company, thanks to our flat organizational structure that encourages hands-on involvement and direct contributions to our mission. Leadership is earned by those who take initiative and consistently deliver outstanding results, both in their work ethic and deliverables, making strong prioritization skills essential. Additionally, we value strong communication skills in all our engineers and researchers, as they are crucial for the success of our teams and the company as a whole.

 

Interview Process: After submitting your application, one of our recruiters will review your resume. If your application passes this stage, you will be invited to a 30-minute interview during which a member of our team will ask some basic questions. If you clear the interview, you will enter the main process, which can consist of up to four interviews in total:

 

  • Coding assessment: Often in a language of your choice.
  • Systems design: Translate high-level requirements into a scalable, fault-tolerant service (depending on role).
  • Real-time problem-solving: Demonstrate practical skills in a live problem-solving session.
  • Meet and greet with the wider team.

Our goal is to finish the main process in 2-3 weeks at most.

 
