Research Engineer, Inference

Normal Computing

$120K — $150K *
Consumer Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience with large model inference technologies
  • Expertise in stochastic systems and probabilistic methods
  • Ability to optimize inference performance through techniques like quantization and kernel fusion
  • Strong programming skills in Python and systems programming languages
  • Demonstrated ability to translate theoretical concepts into practical hardware implementations

Responsibilities

  • Develop algorithms for transformer inference on analog processing hardware
  • Collaborate with hardware teams to influence chip architecture
  • Design numerical methods that leverage the hardware's thermal noise
  • Create benchmarks for algorithm performance on real or simulated hardware
  • Translate model workload insights into hardware design requirements
  • Rapidly prototype and iterate as hardware evolves from design to production

Benefits

  • Collaborative work environment across hardware and software teams
  • Opportunity to influence hardware development decisions
  • Hands-on experience with cutting-edge technology
  • Platform to publish and share work in the field of efficient AI inference
  • Potential for professional growth in a pioneering tech landscape

Full Job Description

The Role

As a Research Engineer focused on inference, you will develop the computational methods that make AI inference run efficiently on Normal's thermodynamic hardware. The core challenge is not adapting standard GPU kernels to a new chip. It is rethinking how operations like attention, memory access, and long-context decoding behave when the underlying substrate uses stochastic analog computation in memory rather than conventional digital logic.

Normal's ASICs run the heaviest operations of large model inference inside memory itself. Your job is to develop the algorithms that exploit this natively: understanding which transformer workloads are well suited to stochastic analog execution, designing numerical methods that map onto the hardware's physical dynamics, and validating them against real silicon or high-fidelity simulation.

This is a co-design role. The hardware and the algorithms are developed in parallel, which means you will influence architectural decisions, not just implement against a fixed spec. The strongest candidates have a deep understanding of both large model inference and the mathematics of stochastic systems, and have built things that run on real hardware, not just in theory.

What You'll Own

  • Algorithm Development: Develop algorithms for transformer inference workloads running on stochastic analog processing-with-memory hardware.
  • Hardware Co-Design: Work directly with hardware and architecture teams to shape what the chip can and should compute natively.
  • Numerical Methods: Design numerical methods that exploit thermal noise and analog dynamics rather than working around them.
  • Evaluation & Benchmarks: Build evaluation frameworks and benchmarks that characterize algorithm behavior on real hardware or simulation.
  • Workload Translation: Translate insights about model workloads into constraints and opportunities for hardware design.
  • Rapid Prototyping: Prototype and iterate rapidly as hardware evolves from simulation to silicon.


What Makes You a Great Fit

  • Deep understanding of large model inference: attention mechanisms, KV cache, long-context decoding, memory bandwidth constraints
  • Experience with inference optimization: quantization, sparsity, kernel fusion, or memory-efficient attention
  • Familiarity with stochastic systems, probabilistic methods, numerical analysis, or analog computation
  • Experience implementing algorithms close to hardware, not just in high-level frameworks
  • Comfort reasoning from first principles about what a novel substrate can do efficiently
  • Track record of taking ideas from theory to working implementation on real hardware
  • Strong programming skills in Python and at least one systems language
  • Collaborative instinct and ability to work across hardware, architecture, and software teams


Bonus Points

  • PhD in machine learning, applied mathematics, physics, electrical engineering, or a related field
  • Exposure to analog or mixed-signal systems, in-memory compute, or non-von-Neumann architectures
  • Experience working on hardware that did not yet exist when you joined
  • Publications or open-source work in efficient inference, stochastic algorithms, or novel computing
