Systems Engineer, HPC (US & Canada)

Mistral AI

$90K — $130K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in Linux systems administration
  • Experience in large-scale HPC clusters or cloud infrastructure
  • Familiarity with job schedulers such as Slurm
  • Strong troubleshooting skills across systems, hardware, and networks
  • Proficiency with automation tools such as Ansible or Terraform

Responsibilities

  • Operate and maintain large-scale Linux environments including bare metal and cloud
  • Monitor system health and ensure high availability
  • Support various production and research workloads
  • Help scale clusters toward hundreds to thousands of nodes
  • Automate operational tasks to improve system management
  • Contribute to system design and architecture decisions
  • Collaborate with cross-functional teams and act as a liaison between users and infrastructure

Benefits

  • Hybrid work environment offering flexibility
  • Opportunity to work on cutting-edge AI infrastructure
  • Collaborative work culture with multiple teams
  • Focus on professional development and learning new skills
Full Job Description
About the Role

We are looking for Systems Engineers / System Administrators to help design, operate, and scale the infrastructure behind Mistral's AI platforms.

This is a hands-on, hybrid role combining:

  • Systems administration (operating and troubleshooting large-scale Linux environments)
  • Systems engineering (automation, scalability, and performance improvements)

You'll work closely with infrastructure, HPC, and research teams to ensure our clusters and platforms run reliably at scale.

What You'll Work On
Core Systems Operations
  • Operate and maintain large-scale Linux environments (bare metal, clusters, cloud)
  • Monitor system health, troubleshoot incidents, and ensure high availability
  • Support production and research workloads across multiple environments
Scaling Infrastructure
  • Help scale clusters toward hundreds to thousands of nodes
  • Work on systems handling petabyte-scale storage
  • Improve performance, reliability, and resource utilisation
Automation & Engineering
  • Automate operational tasks using tools like Python, Bash, Ansible, or Terraform
  • Improve deployment, provisioning, and system lifecycle management
  • Contribute to system design and architecture decisions
Cross-Functional Collaboration
  • Work closely with:
    • HPC / infrastructure teams
    • Platform / DevOps engineers
    • Research teams
  • Act as a bridge between users and infrastructure
What We're Looking For
Must-have
  • Strong Linux systems administration experience (core requirement)
  • Experience working in large-scale environments:
    • HPC clusters or cloud infrastructure
  • Experience with job schedulers (e.g. Slurm)
  • Solid troubleshooting skills across systems, hardware, and networks
Nice-to-have (any of these)

We are not expecting everything; strong depth in one area is valuable.
  • Containers / orchestration (e.g. Kubernetes)
  • Storage systems (e.g. Ceph, Lustre, NFS)
  • Networking fundamentals (Ethernet; InfiniBand is a plus)
  • Infrastructure as Code / automation tooling
  • GPU or AI/ML experience
Profile We Value
  • Pragmatic problem solver who can operate in fast-scaling environments
  • Comfortable working across multiple domains ("Swiss army knife" mindset)
  • Able to go deep in one area while learning others
  • Low-ego, collaborative, and hands-on

