Member of Technical Staff, Inference

Radical Numerics, Inc

$130K — $180K *
Pharmaceuticals & Biotech
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in AI inference systems with a focus on large language models or foundation models.
  • Strong performance engineering and kernel development background with a focus on GPU architectures, particularly using technologies like CUDA or Triton.
  • Ability to analyze and solve complex bottlenecks across multi-layered infrastructures including model, storage, memory, and network operations.
  • Proficient in Python and modern machine learning frameworks, such as PyTorch, with a solid software engineering foundation.
  • Experience collaborating with users or stakeholders to build and maintain production-level AI systems.
  • Excellent communication skills to interact with diverse teams spanning research, engineering, and biology.

Responsibilities

  • Drive performance improvements by identifying and resolving bottlenecks in the inference stack.
  • Develop high-performance GPU kernels and numerical operators for optimized model serving.
  • Partner with external clients to effectively deploy models and address significant technical challenges.
  • Build scalable deployment infrastructures for managing AI model operations across various environments.
  • Collaborate with research teams to facilitate the integration of new AI models into existing production systems.

Benefits

  • The opportunity to build infrastructure for next-gen biological AI models with significant real-world applications.
  • Engagement in transformative projects within therapeutics, diagnostics, synthetic biology, and biodefense sectors.
  • Collaboration with top-tier experts in distributed systems, AI architecture, and biological sciences.
  • Work alongside leading researchers from various prestigious labs, biotechs, and government programs.
Full Job Description
About the Role

As a Member of Technical Staff, Inference at Radical Numerics, you will build and optimize the systems that bring frontier biological AI models into production. Your work will focus on delivering state-of-the-art inference performance for large-scale genome and multimodal biological models across a wide range of real-world applications, including therapeutics, diagnostics, synthetic biology, and biodefense.

This is a highly technical role at the intersection of AI systems, distributed computing, and model deployment. You will work closely with research, infrastructure, and external partners to ensure our models can be efficiently deployed, scaled, and integrated into production environments. Success in this role requires deep expertise in large language model inference, kernel optimization, GPU systems, and performance engineering.

You should be excited by questions such as: How do we reduce inference latency for 100B-parameter MoE models? How do we maximize throughput across heterogeneous hardware environments? How do we optimize custom kernels for emerging hybrid model architectures? How do we deploy foundation models reliably across cloud, on-premise, and highly regulated environments? How do we enable our partners to transform biological research and development through production-grade AI systems?

What You'll Do

Drive end-to-end performance improvements. Identify and eliminate bottlenecks across the inference stack, from model execution and memory management to networking, scheduling, and hardware utilization.

Develop high-performance inference primitives. Build and optimize GPU kernels, numerical operators, and serving infrastructure to maximize throughput and efficiency and minimize latency on modern accelerator platforms.

Partner with external customers and collaborators. Work directly with pharmaceutical companies, biotech organizations, research institutions, and government partners to deploy models in production environments and solve challenging technical problems.

Build scalable deployment infrastructure. Create systems for serving, monitoring, benchmarking, and operating foundation models reliably across cloud, enterprise, and secure environments.

Collaborate with research and platform teams. Ensure new model architectures can be efficiently deployed at scale and help translate frontier AI research into real-world impact.

What We're Looking For

Expertise in large-scale AI inference systems. Proven experience optimizing, deploying, and operating LLMs or other foundation models in production environments.

Strong performance engineering and kernel development skills. Deep understanding of GPU architectures and experience with CUDA, Triton, or equivalent technologies for building high-performance numerical software.

Systems-level thinking. Ability to diagnose and solve bottlenecks across the full stack, including model architectures, serving systems, networking, memory management, and distributed infrastructure.

Hands-on builder. Strong software engineering fundamentals with proficiency in Python and modern ML frameworks such as PyTorch.

Customer and deployment orientation. Experience working closely with users, customers, or cross-functional stakeholders to deliver production AI systems that solve real-world problems.

Excellent technical communication. Ability to collaborate effectively across research, engineering, infrastructure, and scientific teams.

Nice to Have
  • Experience with inference frameworks such as vLLM, TensorRT-LLM, SGLang, DeepSpeed, or similar systems.
  • Contributions to open-source AI infrastructure, inference frameworks, compilers, or kernel libraries.
  • Experience with distributed systems, cloud infrastructure, and large-scale GPU clusters.
  • Familiarity with biological foundation models, computational biology, genomics, or scientific AI applications.
  • Experience operating AI systems in regulated, secure, or mission-critical environments.
Why Radical Numerics

Help build the infrastructure that powers the next generation of biological AI models and deploy them into some of the world's most important scientific and healthcare applications.

Work on some of the largest and most capable open biological AI models, helping transform breakthroughs in AI research into real-world impact across therapeutics, diagnostics, synthetic biology, and biodefense.

Join a team that brings together expertise in distributed systems, model architecture, numerics, AI safety, and biology.

Collaborate with leading researchers across AI labs, biotechs, pharmaceutical companies, hospital systems, government programs, and scientific institutions.
