PyTorch Engineer

LTM

$120K — $150K *
Technical Services
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Proficiency in PyTorch and Triton with practical experience in developing stress workloads.
  • Strong understanding of computational memory management, DMA, and execution patterns.
  • Experience with performance analysis and optimization on both simulator and real hardware.
  • Ability to design scalable test harnesses for various workloads and configurations.
  • Familiarity with cross-functional collaboration in a technical environment.

Responsibilities

  • Design and implement high-intensity stress workloads using PyTorch and Triton.
  • Identify and troubleshoot performance issues and system bottlenecks in simulator and real setups.
  • Develop complex PyTorch workloads that push model-level execution limits.
  • Create custom Triton kernels to assess hardware performance under stress.
  • Document and streamline processes for integrating workloads into CI and monitoring tools.
  • Maintain and update a library of reusable PyTorch stress workloads.
  • Collaborate with firmware and SDK teams to address risk areas and refine stress tests.

Benefits

  • Opportunity to work on cutting-edge machine learning and hardware integration projects.
  • Collaborative work environment with cross-functional teams.
  • Access to advanced tools for performance testing and optimization.
  • Possibility for innovation in stress testing methodologies and changing the tech landscape.

Full Job Description

Role description

  • Design and implement high-intensity stress workloads using PyTorch and Triton. Exercise core MAIA execution paths, including compute, memory, DMA, and collectives.
  • Enable early detection of performance cliffs, stability issues, and system bottlenecks across simulator and real hardware. Improve platform maturity, reduce late-stage escapes, and increase confidence for broader internal and external adoption.
  • Develop PyTorch workloads stressing model-level execution, such as large GEMMs, attention patterns, MoE-like behavior, mixed precision, and long-running loops.
  • Author custom Triton kernels to stress hardware execution units, memory hierarchies, and synchronization paths.
  • Build parameterized stress harnesses scalable by problem size, number of devices, and runtime duration. Integrate workloads with existing profiling, monitoring, and failure triage tooling.
  • Collaborate with platform, firmware, and SDK teams to target known risk areas and emerging issues.
  • Document usage patterns and provide reproducible scripts for lab and continuous integration (CI) usage.
  • Develop and maintain a library of reusable PyTorch stress workloads.
  • Create Triton-based micro- and macro-kernels designed specifically for stress and saturation testing.
  • Build and support test harnesses and scripts for single-device and multi-device execution.
  • Ensure workload designs align with platform risk areas and emerging hardware-software issues.
  • Collaborate cross-functionally with platform, firmware, and SDK teams to refine stress tests.
  • Provide comprehensive documentation describing workload intent, configuration options, and expected stress characteristics. Support profiling, monitoring, and failure triage by integrating stress workloads with existing tools.
  • Deliver reproducible and scalable testing solutions for lab and CI environments.
