HPC Administrator

Jahnel Group

$90K — $130K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years of experience in administering Linux systems within enterprise or HPC environments
  • Proven support for High Performance Computing (HPC) infrastructure and clustered computing environments
  • Hands-on experience with workload scheduling tools like PBS Professional or Slurm
  • Strong knowledge of Red Hat Enterprise Linux (RHEL) or similar distributions
  • Experience with GPU-based computing platforms, notably NVIDIA and CUDA
  • Proficiency in scripting and automation using Python or Bash
  • Familiarity with container technologies such as Docker or Kubernetes

Responsibilities

  • Administer and maintain Altair Access, PBS Professional, and Slurm workload environments
  • Manage user onboarding, account provisioning, and resource allocation policies
  • Install and support engineering applications like ANSYS and STAR-CCM+
  • Monitor cluster health, job execution, and system capacity
  • Maintain GPU infrastructure, including NVIDIA drivers and CUDA toolkits
  • Perform system patching, upgrades, and security remediation
  • Troubleshoot hardware, software, and application-related issues
  • Support containerized workloads and AI/ML environments

Benefits

  • Work hours of approximately 8:00 am to 5:00 pm EST, with occasional late nights when deadlines require
  • Background checks are required, as the company works with security-conscious clients
Full Job Description
We are seeking an experienced HPC Administrator with a strong background in Linux systems administration, HPC environments, and workload scheduling platforms such as PBS Professional or Slurm. The ideal candidate is comfortable supporting engineering and research teams, managing GPU-enabled infrastructure, automating operational processes, and maintaining highly available, secure computing environments that support mission-critical workloads.
Responsibilities
  • Administer and maintain Altair Access, PBS Professional, and Slurm workload scheduling environments
  • Manage user onboarding, account provisioning, queue access, and resource allocation policies
  • Install, configure, and support engineering applications including ANSYS, STAR-CCM+, SIMPACK, Simufact, Tacoma, and related software platforms
  • Monitor cluster health, performance, job execution, storage utilization, and overall system capacity
  • Maintain and support GPU infrastructure, including NVIDIA drivers, CUDA toolkits, and fabric management services
  • Perform operating system patching, firmware upgrades, security remediation, and vulnerability management activities
  • Troubleshoot hardware, software, application, networking, storage, and job scheduling issues
  • Support containerized workloads and AI/ML environments utilizing GPU resources
  • Coordinate support activities with technology vendors including Dell, NVIDIA, Altair, Ansys, and other strategic partners
  • Develop automation solutions and infrastructure-as-code practices to improve operational efficiency and consistency
  • Ensure compliance with cybersecurity and regulatory requirements, including CMMC, NIST 800-171, and export control standards
  • Provide technical guidance to engineering and research teams regarding application performance, resource utilization, and HPC best practices
  • Participate in capacity planning, architecture reviews, hardware refreshes, and future HPC expansion initiatives
  • Collaborate with cross-functional teams to maintain a reliable, secure, and high-performing computing environment
Required Skills & Qualifications
  • 7+ years of experience administering Linux systems in enterprise or HPC environments
  • Experience supporting High Performance Computing (HPC) infrastructure and clustered computing environments
  • Hands-on experience with PBS Professional, Slurm, or similar workload schedulers
  • Strong knowledge of Red Hat Enterprise Linux (RHEL) or comparable Linux distributions
  • Experience supporting engineering and simulation applications in a research or engineering environment
  • Experience administering GPU-based computing platforms, including NVIDIA GPUs and CUDA
  • Proficiency with scripting and automation using Python, Bash, or similar technologies
  • Experience with container technologies such as Docker, Podman, or Kubernetes
  • Strong understanding of storage, networking, system performance tuning, and troubleshooting
  • Familiarity with infrastructure automation and configuration management practices
  • Knowledge of cybersecurity principles, vulnerability remediation, and enterprise compliance requirements
  • Strong analytical, troubleshooting, and problem-solving skills
  • Ability to communicate effectively with technical and non-technical stakeholders
  • Experience working collaboratively within engineering, research, or scientific computing environments
Other Information

Work hours are approximately 8:00 am to 5:00 pm EST, depending on workload, with the occasional late night when a tight deadline calls for it. We work for security-conscious clients, so background checks are required. Salary depends on experience.
