ECS

Senior ML Serving Engineer

ECS
$120K to $160K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10-12 years of experience in secure model serving and runtime environments, particularly within U.S. Department of War (DoW) or similar settings.
  • Current Secret security clearance, with the ability to obtain and maintain a Top Secret (TS) clearance with Sensitive Compartmented Information (SCI) access.
  • CNCF Certified Kubernetes Administrator (CKA) certification or an equivalent Kubernetes certification required.
  • Proficiency in Kubernetes, Helm, Docker, GitLab CI, VMware, Prometheus, Grafana, and Elastic Stack for deployment workflows.
  • Strong problem-solving skills with the ability to make effective decisions based on cost-benefit analysis.
  • Excellent interpersonal and communication skills for interaction across various organizational levels.

Responsibilities

  • Implement model-runtime deployment patterns for AI/ML environments supporting DoW missions.
  • Develop service templates, runtime configurations, and deployment specifications for enterprise pipelines.
  • Apply Kubernetes and related technologies to ensure consistent runtime behavior of model artifacts.
  • Conduct performance tuning and resource allocation to maintain operational stability.
  • Validate runtime patterns across higher-domain enclaves, including SIPR and JWICS, by resolving enclave-specific deployment constraints.
  • Support cross-domain transfer validation and maintain readiness for model serving operations.
  • Produce essential documentation including configuration packages and operational assessments.
  • Collaborate with multi-national teams to enhance operational readiness and deployment consistency.

Benefits

  • Professional development and continuing education opportunities.
  • Exposure to cutting-edge technologies in an AI-first strategy.
  • Collaboration with senior leaders and cross-service mission partners.
  • Opportunity to work in a mission-critical environment that directly impacts national security.

Full Job Description

Everforth ECS is seeking a Senior ML Serving Engineer to work in the National Capital Region, covering the Pentagon, Falls Church, and Fairfax. Please note: this position is contingent upon contract award.

The War Data Platform (WDP) is a key initiative within the U.S. Department of War's (DoW) AI-First strategy introduced in early 2026. The WDP focuses on operational warfighting data and aims to accelerate the deployment of artificial intelligence (AI) on the battlefield. The WDP extends to Unclassified, Secret, and Top Secret environments, and supports collaboration between Combatant Commands, Joint Staff directorates, Senior Executive Service leaders, and operational analysts.

This role implements the model-runtime deployment pattern used across WDP Core Integration AI and machine learning serving environments, ensuring consistent, secure, and high-performance model delivery to DoW missions and senior leaders.

Responsibilities

• Implements the model-runtime deployment pattern used across WDP Core Integration artificial intelligence and machine learning serving environments supporting DoW missions, Joint Staff analysts, Combatant Command elements, and Senior Executive Service leadership.
• Develops service templates, runtime configurations, scaling behaviors, and deployment specifications consumed by enterprise pipelines and API access patterns.
• Applies Kubernetes, Helm, Docker, GitLab Continuous Integration, VMware environments, Prometheus, Grafana, Elastic Stack, and hardened deployment workflows to establish consistent runtime behavior for production-ready model artifacts.
• Conducts performance tuning, latency optimization, and resource-allocation refinement to maintain operational stability across serving surfaces.
• Validates runtime patterns and operational readiness across higher-domain enclaves, including SIPR and JWICS, by resolving enclave-specific runtime constraints, adapting deployment templates, and aligning runtime behavior with cross-domain security architectures.
• Supports automated scanning workflows, cross-domain transfer validation, and API endpoint configuration activities to maintain readiness for model serving operations.
• Produces mission-critical deliverables including runtime configuration packages, deployment templates, performance reports, operational readiness assessments, and enclave-specific runtime documentation.
• Collaborates with Platform One, Cloud One, multi-national engineering teams, and cross-service mission partners to strengthen operational readiness, reinforce deployment consistency, and advance program value commitments across all enclaves.
• Participates in Tier-4 incident response actions to maintain service-level agreements, operational continuity, and mission performance for enterprise AI model serving capabilities.
• Performs other duties as assigned.

Qualifications

• Current Secret security clearance with the ability to obtain and maintain a Top Secret (TS) security clearance with Sensitive Compartmented Information (SCI).
• 10-12 years of experience implementing and managing model serving and runtime environments in secure DoW or equivalent settings.
• CNCF Certified Kubernetes Administrator (CKA) certification or an equivalent Kubernetes certification.
• Proven proficiency with Kubernetes, Helm, Docker, GitLab CI, VMware, Prometheus, Grafana, and Elastic Stack for hardened deployment workflows.
• Successful track record of performance tuning, latency optimization, and resource-allocation refinement for production-grade AI/ML models.
• Strong problem-solving and decision-making capabilities, with a proven ability to weigh the relative costs and benefits of potential actions and identify the most appropriate solution.
• Highly developed interpersonal and oral/written communication skills, with the ability to effectively and professionally interact with a diverse set of stakeholders (from peers to end-users to executive management).

About ECS

ECS is a leading provider of digital solutions and services to the federal government. The company was founded in 2001 by Roy Kapani and has since grown to become a trusted partner to a wide range of government agencies. ECS offers a broad range of services, including cloud computing, cybersecurity, and artificial intelligence. The company has been recognized for its innovative solutions and has won numerous awards, including the AWS Public Sector Partner of the Year award.
Size
2,000 employees