ECS

Senior ML Serving Engineer

$120K - $160K *
Aerospace & Defense
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10-12 years of experience in secure DoW or equivalent model serving environments
  • Current Secret security clearance with the ability to obtain a Top Secret (TS) clearance
  • CNCF-Certified Kubernetes Administrator (CKA) or equivalent certification
  • Expertise in Kubernetes, Helm, Docker, GitLab CI, VMware, Prometheus, Grafana, and Elastic Stack
  • Strong problem-solving and decision-making skills
  • Excellent interpersonal and communication abilities

Responsibilities

  • Implement model-runtime deployment patterns for WDP while supporting DoW missions
  • Develop service templates and deployment specifications for enterprise pipelines
  • Conduct performance tuning and optimize resource allocation for operational stability
  • Validate runtime patterns across secure data environments
  • Support automated workflows and API configurations for model serving readiness
  • Produce critical documentation including performance reports and configuration packages
  • Collaborate with multi-national teams to enhance deployment consistency and operational readiness

Benefits

  • Comprehensive health insurance
  • 401(k) retirement plan with employer matching
  • Generous PTO and paid holidays
  • Professional development and certification reimbursement
  • Flexible work arrangements

Full Job Description

Everforth ECS is seeking a Senior ML Serving Engineer to work in the National Capital Region, covering the Pentagon, Falls Church, and Fairfax. Please note: this position is contingent upon contract award.

The War Data Platform (WDP) is a key initiative within the U.S. Department of War's (DoW) AI-First strategy introduced in early 2026. The WDP focuses on operational warfighting data and aims to accelerate the deployment of artificial intelligence (AI) on the battlefield. The WDP extends to Unclassified, Secret, and Top Secret environments, and supports collaboration between Combatant Commands, Joint Staff directorates, Senior Executive Service leaders, and operational analysts.

This role implements the model-runtime deployment pattern used across WDP Core Integration AI and machine learning serving environments, ensuring consistent, secure, and high-performance model delivery to DoW missions and senior leaders.
• Implements the model-runtime deployment pattern used across WDP Core Integration artificial intelligence and machine learning serving environments supporting DoW missions, Joint Staff analysts, Combatant Command elements, and Senior Executive Service leadership.
• Develops service templates, runtime configurations, scaling behaviors, and deployment specifications consumed by enterprise pipelines and API access patterns.
• Applies Kubernetes, Helm, Docker, GitLab Continuous Integration, VMware environments, Prometheus, Grafana, Elastic Stack, and hardened deployment workflows to establish consistent runtime behavior for production-ready model artifacts.
• Conducts performance tuning, latency optimization, and resource-allocation refinement to maintain operational stability across serving surfaces.
• Validates runtime patterns and operational readiness across higher-domain enclaves, including SIPR and JWICS, by resolving enclave-specific runtime constraints, adapting deployment templates, and aligning runtime behavior with cross-domain security architectures.
• Supports automated scanning workflows, cross-domain transfer validation, and API endpoint configuration activities to maintain readiness for model serving operations.
• Produces mission-critical deliverables including runtime configuration packages, deployment templates, performance reports, operational readiness assessments, and enclave-specific runtime documentation.
• Collaborates with Platform One, Cloud One, multi-national engineering teams, and cross-service mission partners to strengthen operational readiness, reinforce deployment consistency, and advance program value commitments across all enclaves.
• Participates in Tier-4 incident response actions to maintain service-level agreements, operational continuity, and mission performance for enterprise AI model serving capabilities.
• Performs other duties as assigned.

Qualifications

• Current Secret security clearance with the ability to obtain and maintain a Top Secret (TS) security clearance with Sensitive Compartmented Information (SCI).
• 10-12 years of experience implementing and managing model serving and runtime environments in secure DoW or equivalent settings.
• CNCF-Certified Kubernetes Administrator (CKA) or equivalent Kubernetes certification.
• Proven proficiency with Kubernetes, Helm, Docker, GitLab CI, VMware, Prometheus, Grafana, and Elastic Stack for hardened deployment workflows.
• Successful track record of performance tuning, latency optimization, and resource-allocation refinement for production-grade AI/ML models.
• Strong problem-solving and decision-making capabilities, with a proven ability to weigh the relative costs and benefits of potential actions and identify the most appropriate solution.
• Highly developed interpersonal and oral/written communication skills, with the ability to effectively and professionally interact with a diverse set of stakeholders (from peers to end-users to executive management).

About ECS

ECS is a leading provider of digital solutions and services to the federal government. The company was founded in 1993 by Roy Kapani and has since grown to become a trusted partner to a wide range of government agencies. ECS offers a broad range of services, including cloud computing, cybersecurity, and artificial intelligence. The company has been recognized for its innovative solutions and has won numerous awards, including the AWS Public Sector Partner of the Year award.
Size
2,000 employees