KLA Tencor

Sr. HPC Systems Architect (Linux)

KLA Tencor, $129K to $220K
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 8+ years of experience with HPC systems or large-scale Linux infrastructure, or equivalent education and experience.
  • Proven hands-on expertise in HPC storage and Linux environments.
  • Strong knowledge of various Linux distributions including Rocky, RHEL, SUSE, and Ubuntu.
  • Experience crafting and operating large-scale parallel storage systems.
  • Thorough understanding of HPC hardware platforms including servers, GPUs, and networking.
  • Advanced Linux systems knowledge, particularly with PXE/netboot and high availability.
  • Proficient in networking fundamentals such as TCP/IP and DNS.

Responsibilities

  • Lead design and implementation of high-performance compute (HPC) clusters, ensuring performance and reliability.
  • Act as a technical authority for HPC storage with expertise in parallel file systems like Lustre and BeeGFS.
  • Deliver balanced, high-performance solutions by applying systems knowledge across CPU and GPU architectures.
  • Create hardware BOMs for HPC clusters while coordinating vendor activities.
  • Design and optimize Linux operating systems specifically for HPC environments.
  • Translate project specifications into subsystem designs and drive their execution.
  • Support transition of systems to manufacturing, providing quality documentation and procedures.

Benefits

  • Medical, dental, and vision insurance coverage.
  • 401(K) with company matching contributions.
  • Employee stock purchase program (ESPP).
  • Tuition reimbursement and student debt assistance.
  • Development opportunities and career growth programs.
  • Wellness benefits including an employee assistance program (EAP).
  • Generous paid time off and company holidays.
Full Job Description
Job Description/Preferred Qualifications

This role provides senior technical leadership for the architecture, deployment, and long-term scalability of large-scale HPC storage and compute platforms. It owns systems end to end, from early architectural definition through full production, partnering across engineering, manufacturing, and strategic vendors to deliver highly available, high-performance infrastructure at scale.

The scope emphasizes deep technical ownership, architectural decision-making, and solving sophisticated infrastructure challenges in live production environments. This work directly develops critically important HPC platforms built for adaptability, scale, and operational excellence, driving real-world impact across core products and technologies.

Job duties include, but are not limited to:
  • Lead the design, implementation, and ongoing support of high-performance compute (HPC) clusters, taking accountability for system performance, reliability, and scalability
  • Serve as a technical authority for HPC storage, with deep hands-on expertise in parallel file systems such as Lustre, GPFS, and BeeGFS
  • Apply sophisticated systems knowledge across CPU and GPU architectures, high-bandwidth interconnects, and robust storage subsystems to deliver balanced, high-performance solutions
  • Lead the creation of hardware BOMs for HPC clusters, working directly with vendors and coordinating hardware release activities
  • Design, configure, and optimize Linux operating systems for HPC environments
  • Translate project specifications and performance requirements into subsystem and system-level designs, driving execution while meeting technical and schedule commitments
  • Support the design, release, and transition of new systems to manufacturing and customers, providing high-quality golden images, procedures, scripts, and documentation
  • Lead EOL part re-qualification activities to ensure long-term system viability and supportability
Qualifications include, but are not limited to:
  • Proven experience with HPC systems and Linux platforms
  • Strong, distro-agnostic Linux experience (Rocky, RHEL, SUSE, Ubuntu)
  • Strong scripting skills in Shell and Python
  • Strong understanding of HPC hardware platforms (servers, GPUs, networking, storage, BIOS/BMC)
  • Advanced Linux systems knowledge (PXE/netboot, systemd, HA concepts)
  • Solid networking fundamentals (TCP/IP, DNS, DHCP, LDAP, HTTP)
  • Experience with configuration management and automation (Salt, Ansible, Puppet, Chef, etc.)
  • Interest in HPC storage
Preferred Qualifications:
  • Strong DevOps and automation mentality (CI/CD pipelines, Git, infrastructure as code)
  • Experience with containers for HPC (Singularity, Docker)
  • Monitoring and observability experience (Prometheus, Grafana)
  • Familiarity with Apache/Nginx and supporting infrastructure services


Minimum Qualifications

Requires a minimum of 8 years of related experience with a Bachelor's degree; or 6 years and a Master's degree; or a PhD with 3 years of experience; or equivalent experience.

Base Pay Range: $129,600.00 - $220,300.00 Annually

Primary Location: USA-MI-Ann Arbor-KLA

KLA's total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits including but not limited to: medical, dental, vision, life, and other voluntary benefits, 401(K) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave.

Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level, and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum wage rates, location, job-related skills, experience, and relevant education level or training. We are committed to complying with all applicable federal and state minimum wage requirements. If applicable, your recruiter can share more about the specific pay range for your preferred location during the hiring process.

About KLA Tencor

KLA Corporation is a global capital equipment company that provides process control solutions for semiconductor and related industries. The Company's products are also used in a number of other high technology industries, including the packaging, light emitting diode (LED), power device and compound semiconductor markets. Its products and services are used by bare wafer, integrated circuit (IC), lithography reticle (reticle or mask) and disk manufacturers around the world. The Company's inspection and metrology products and related offerings are categorized in various groups, including Chip Manufacturing, Wafer Manufacturing, Reticle Manufacturing, LED, Power Device and Compound Semiconductor Manufacturing, Data Storage Media/Head Manufacturing, Microelectromechanical Systems (MEMS) Manufacturing, and General Purpose/Lab Applications.
Size: 11,300 employees
Market Cap: $52 billion
Net Income: $1.3 billion
Founded: 1997
5 Year Trend: +21.5%
Revenue: $6 billion
Listed on: NASDAQ
