Please Note: To provide the best candidate experience amidst our high application volumes, each candidate is limited to 10 applications across all open jobs within a 6-month period.
The AGI (Artificial General Intelligence) Computing Lab is dedicated to solving the complex system-level challenges posed by the growing demands of future AI/ML workloads. Our team is committed to designing and developing scalable platforms that can effectively handle the computational and memory requirements of these workloads while minimizing energy consumption and maximizing performance. To achieve this goal, we collaborate closely with both hardware and software engineers to identify and address the unique challenges posed by AI/ML workloads and to explore new computing abstractions that can provide a better balance between the hardware and software components of our systems. Additionally, we continuously conduct research and development in emerging technologies and trends across memory, computing, interconnect, and AI/ML, ensuring that our platforms are always equipped to handle the most demanding workloads of the future. By working together as a dedicated and passionate team, we aim to revolutionize the way AI/ML applications are deployed and executed, ultimately contributing to the advancement of AGI in an affordable and sustainable manner. Join us in our passion to shape the future of computing!
Location: Daily onsite presence at our San Jose, CA office / U.S. headquarters in alignment with our Flexible Work policy.
What You'll Do
- Lead the co-design of software and hardware solutions that optimize AI model inference performance, with a focus on overcoming memory bottlenecks.
- Analyze and optimize LLM and agentic AI workloads across the full software stack, identifying opportunities for hardware-aware acceleration.
- Profile and characterize model execution to expose memory wall limitations and guide architectural decisions for HBM and memory-centric compute.
- Collaborate with hardware teams to influence memory architecture, acceleration strategies, and compute placement based on real workload behavior.
- Develop, optimize, and benchmark inference and serving solutions using frameworks such as PyTorch and vLLM.
- Define best practices and provide technical mentorship across software-hardware co-design efforts.
What You Bring
- Bachelor's degree with 15+ years, Master's degree with 13+ years, or PhD with 10+ years of industry experience.
- Strong experience developing high-performance AI framework software for GPUs or other accelerators.
- Strong, end-to-end understanding of AI infrastructure and the AI software stack, from model definition through deployment and serving.
- Solid understanding of LLM architectures and workflows, including modern transformer-based designs.
- Solid understanding of agentic AI architecture and workflows.
- Hands-on expertise with the PyTorch framework.
- Practical experience with vLLM for high-throughput model inference and serving.
- Solid understanding of the memory wall problem and its impact on AI system performance.
- Strong knowledge of memory architecture, including High Bandwidth Memory (HBM), and familiarity with memory-centric acceleration and compute approaches.
- Proficiency working in a Linux development environment.
- Solid command of development tooling, including agentic coding tools, GitHub, and Jira.
What We Offer
The pay range below is for all roles at this level across all US locations and functions. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. We also offer incentive opportunities that reward employees based on individual and company performance.
This is in addition to our diverse package of benefits centered around the wellbeing of our employees and their loved ones. In addition to the usual Medical/Dental/Vision/401k, our inclusive rewards plan empowers our people to care for their whole selves. An investment in your future is an investment in ours.
Give Back: With a charitable giving match and frequent opportunities to get involved, we take an active role in supporting the community.
Enjoy Time Away: You'll start with 4+ weeks of paid time off a year, plus holidays and sick leave, to rest and recharge.
Care for Family: Whatever family means to you, we want to support you along the way, including a stipend for fertility care or adoption, medical travel support, and virtual vet care for your fur babies.
Prioritize Emotional Wellness: With on-demand apps and free confidential therapy sessions, you'll have support no matter where you are.
Stay Fit: Eating well and being active are important parts of a healthy life. Our onsite Café and gym, plus virtual classes, make it easier.
Embrace Flexibility: Benefits are best when you have the space to use them. That's why we facilitate a flexible environment so you can find the right balance for you.
Base Pay Range
$189,000-$301,000 USD