HPC Systems Engineer

SAIC • $100K — $130K *

Charlottesville, VA 22903 • In-Person

Aerospace & Defense

8 - 10 years of experience

Reposted Today

Job Overview by Ladders

Qualifications

Bachelor's degree in science/technology or 10 years of relevant experience
8+ years of experience in HPC or distributed compute environments
6+ years administering Linux systems
Active Top Secret clearance required
Experience with workload schedulers like Slurm or PBS
Scripting or automation skills in Bash or Python
Ability to obtain DoD 8140 (8570) IAT Level II certification

Responsibilities

Support deployment and sustainment of Linux-based HPC clusters
Manage cluster platform configuration and scheduler administration
Troubleshoot distributed compute workloads
Conduct performance analysis across compute, storage, and network layers
Provide support for GPU compute workload operations
Develop automation scripts and operational tooling

Benefits

Ongoing professional development opportunities
Access to cutting-edge technology in a mission-critical environment
On-site support in a collaborative research environment
Opportunity to work with high-performance computing architectures
Contribute to national security and defense initiatives

Full Job Description

SAIC is looking for a highly qualified HPC Systems Engineer to support the Army's Golden Dome initiative. The engineer will support the deployment and sustainment of Linux-based High Performance Computing (HPC) cluster environments used for distributed compute workloads, simulation environments, and GPU-enabled processing.

The environment will include:

multi-node Linux compute clusters
workload scheduling platforms such as Slurm or PBS
cluster provisioning frameworks (e.g., xCAT, Warewulf)
high-performance networking technologies including RDMA / InfiniBand
distributed parallel compute workloads utilizing MPI or OpenMP
GPU-enabled compute resources supporting CUDA-based processing

The system will be used to support scientific computing, simulation workloads, and other distributed compute operations within a secure research environment.

Candidates should be comfortable working within cluster-scale computing environments where performance, scheduler configuration, and distributed workload execution are critical operational factors.

The HPC Systems Engineer will support the build-out, configuration, and sustainment of HPC cluster platforms.

The role focuses on:

cluster platform configuration
scheduler administration
distributed compute troubleshooting
performance analysis across compute, storage, and network layers
GPU compute workload support
automation and operational tooling

Candidates should have experience working with multi-node Linux cluster environments and distributed compute workloads.

Core Technical Capabilities

Candidates should demonstrate capability in most of the following areas.

HPC Cluster Platforms

Experience supporting multi-node Linux compute clusters, including node integration, configuration, and operational sustainment.

Experience with cluster provisioning tools such as xCAT, Warewulf, or similar node deployment systems is beneficial.

Workload Scheduling Platforms

Experience supporting distributed compute workloads using schedulers such as:

Slurm
PBS / PBS Pro
Torque
Grid Engine

Candidates should understand queue configuration, job submission workflows, and scheduler troubleshooting.

Candidates should understand how workload schedulers interact with distributed compute workloads and containerized execution environments.

Linux Systems Administration

Strong Linux administration experience including:

command-line system administration
server and compute node configuration
system troubleshooting in distributed compute environments

Experience with RHEL-based environments is preferred.

Distributed and Containerized Workloads

Experience supporting distributed compute workloads utilizing parallel computing frameworks such as:

MPI
OpenMP
GPU compute frameworks

Candidates should understand how workload schedulers interact with distributed compute workloads and containerized execution environments within HPC clusters.

Familiarity with container technologies commonly used in HPC environments such as:

Docker
Podman
Singularity / Apptainer

Candidates should understand how containerized workloads interact with schedulers, GPU resources, and distributed compute environments.

Experience supporting containerized HPC workloads or integrating container platforms with cluster infrastructure is desirable.

HPC Networking

Familiarity with high-performance networking technologies including:

RDMA networking
InfiniBand
high-throughput cluster networking architectures

Candidates should be comfortable assisting with troubleshooting cluster communication or performance issues.

GPU Compute Environments

Experience supporting GPU-enabled compute environments and workloads utilizing CUDA frameworks is desirable.

Automation and Operational Tooling

Experience writing scripts or operational tooling using languages such as:

Bash
Python

Automation experience supporting system administration or cluster operations is beneficial.

Qualifications

Candidates must meet the following requirements:

Bachelor's degree in science/technology; 10 additional years of experience may be substituted for the degree
8+ years of experience in HPC or distributed compute environments is required
Minimum 6 years of experience administering Linux systems in enterprise, research computing, or distributed compute environments
An active Top Secret clearance is required; an active TS/SCI clearance must be obtained prior to beginning work.
100% onsite support in Charlottesville, VA
Experience supporting distributed compute environments or HPC cluster platforms
Experience working with workload schedulers such as Slurm, PBS, Torque, or similar systems
Experience administering Linux systems through command-line interfaces
Experience with scripting or automation tools (Bash, Python, or similar)
Ability to obtain required DoD 8140 (8570) IAT Level II certification
Candidates must have direct experience with HPC or distributed compute environments.

Candidates with the following experience are strongly preferred:

Administration of multi-node HPC cluster environments
Experience with parallel or distributed file systems such as Lustre, BeeGFS, or GPFS
Experience supporting GPU-enabled compute environments and CUDA workloads
Experience with configuration management tools such as Ansible or Puppet
Experience supporting research, laboratory, or mission computing environments
Experience supporting systems within DoD/DoW or IC environments

Overview

SAIC accepts applications on an ongoing basis and there is no deadline.

About SAIC

Science Applications International Corporation (SAIC) is a technology integrator in the technical, engineering, intelligence, and enterprise information technology markets. SAIC has approximately 26,000 employees and operates in more than 70 countries. The company was founded in 1969 and is headquartered in Reston, Virginia. SAIC provides services to the U.S. government, including the Department of Defense, the intelligence community, and civilian agencies. The company also serves commercial customers in the healthcare, energy, and financial services sectors.

Size: 26,000 employees
Market Cap: $6 billion
Industry: Information Technology
Net Income: $206 million
Founded: 1969
5 Year Trend: +10.7%
Revenue: $6.8 billion
NASDAQ: SAIC

* Ladders Estimates
