Red Hat

Senior Principal Machine Learning Engineer, vLLM

Red Hat, $206K - $351K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Strong understanding of machine learning and deep learning fundamentals, especially LLM optimizations, Computer Vision, NLP, and reinforcement learning
  • Proficiency with tensor math libraries like PyTorch and NumPy
  • Solid programming skills, particularly in Python for machine learning solutions
  • Ability to articulate and execute research ideas and algorithms
  • Familiar with mathematical tools, notably in linear algebra
  • Knowledgeable in Linear Algebra, Gradients, Probability, and Graph Theory
  • Exceptional communication skills for diverse team interactions
  • BS or MS in Computer Science/Engineering; a PhD in an ML-related domain is a plus

Responsibilities

  • Design, develop, and test various inference optimization algorithms in projects like vLLM and LLM-compressor
  • Manage inference deployment pipelines from inception to execution
  • Benchmark and evaluate various performance approaches on specific hardware
  • Engage in technical discussions to propose innovative solutions
  • Stay current on advancements in open-source LLM model architecture and quantization
  • Monitor latest CPU/GPU architectures to enhance AI inference performance
  • Conduct prompt and thoughtful code reviews
  • Mentor engineers and promote a culture of innovation and learning

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Accounts for healthcare and dependent care
  • Health Savings Accounts for high deductible plans
  • 401(k) retirement plan with employer matching
  • Paid time off, holidays, and parental leave
  • Disability leave and paid family medical leave
  • Tuition reimbursement and employee stock purchase plans
Full Job Description
Job Summary

As a Senior Principal Machine Learning Engineer focused on model optimization algorithms, you will work closely with our product and research teams to develop state-of-the-art deep learning software. You will collaborate with our technical and research teams to develop LLM training and deployment pipelines, implement model compression algorithms, and productize deep learning research. If you want to help solve challenging technical problems at the forefront of deep learning in the open source way, this is the role for you.

Join us in shaping the future of AI!

What you will do
  • Contribute to the design, development, and testing of various inference optimization algorithms in vLLM and related projects such as llm-d, LLM-compressor, and speculators
  • Create and manage inference serving deployment pipelines
  • Benchmark, profile, and evaluate different parallelization, quantization, and sparsification approaches to determine the best performance for specific hardware and models
  • Participate in technical design discussions and provide innovative solutions to complex problems
  • Stay up to date with the latest advancements in open source LLM architectures, LLM inference parallelization and optimization techniques, and quantization research
  • Stay up to date on the latest CPU and GPU hardware architectures and features to boost AI inference performance
  • Give thoughtful and prompt code reviews
  • Mentor and guide other engineers and foster a culture of continuous learning and innovation
  • Collaborate continuously with internal and external open source committers and contributors while contributing to vLLM and related projects


What you will bring
  • Strong understanding of machine learning and deep learning fundamentals, with experience in one or more of the following: LLM inference optimization, computer vision, NLP, or reinforcement learning
  • Experience with tensor math libraries such as PyTorch and NumPy
  • Strong programming skills with proven experience implementing Python-based machine learning solutions
  • Ability to develop and implement research ideas and algorithms
  • Experience with mathematical software, especially linear algebra
  • Understanding of Linear Algebra, Gradients, Probability, and Graph Theory
  • Strong communication skills with both technical and non-technical team members
  • BS or MS in computer science, computer engineering, or a related field. A PhD in an ML-related domain is considered a strong plus.


The salary range for this position is $206,600.00 - $351,050.00. Actual offer will be based on your qualifications.

Pay Transparency

Red Hat determines compensation based on several factors including but not limited to job location, experience, applicable skills and training, external market value, and internal pay equity. Annual salary is one component of Red Hat's compensation package. This position may also be eligible for bonus, commission, and/or equity. For positions with Remote-US locations, the actual salary range for the position may differ based on location but will be commensurate with job duties and relevant work experience.

Benefits
• Comprehensive medical, dental, and vision coverage
• Flexible Spending Account - healthcare and dependent care
• Health Savings Account - high deductible medical plan
• Retirement 401(k) with employer match
• Paid time off and holidays
• Paid parental leave plans for all new parents
• Leave benefits including disability, paid family medical leave, and paid military leave
• Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!

Note: These benefits are only applicable to full time, permanent associates at Red Hat located in the United States.

About Red Hat

Red Hat, Inc. is a leading provider of open source software solutions, including Linux, Kubernetes, and Ansible. The company was founded in 1993 and is headquartered in Raleigh, North Carolina. Red Hat operates in over 100 countries and has more than 13,000 employees worldwide. The company is committed to open source innovation and has a strong community of developers and partners. Red Hat was acquired by IBM in 2019 and is now part of IBM's Hybrid Cloud division.