Senior Machine Learning Engineer

TetraMem Inc

$200K - $280K
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of experience or PhD in Computer Science, Electrical Engineering, or related fields.
  • Strong expertise in machine learning, especially with edge AI and lightweight model deployment.
  • Proficient in ML frameworks like PyTorch, TensorFlow, and JAX.
  • Skilled in programming languages including C/C++ and Python, particularly for optimizing ML models.
  • Self-motivated, with the ability to thrive in a fast-paced startup environment.

Responsibilities

  • Develop, optimize, and deploy efficient machine learning models for audio processing in edge AI applications.
  • Implement ML models on embedded platforms, focusing on FPGA and custom ASIC solutions.
  • Collaborate with hardware and software teams to successfully integrate models into production systems.
  • Research and apply advanced ML techniques to improve model efficiency and reduce power consumption.
  • Enhance inference efficiency through techniques such as quantization, pruning, and knowledge distillation.
  • Provide mentorship and technical leadership to junior engineers within the team.
  • Engage in research dissemination by publishing findings, presenting at conferences, and contributing to open-source projects.

Benefits

  • Flexible work arrangements to support work-life balance.
  • Opportunities for professional growth through mentorship and leadership roles.
  • Exposure to cutting-edge technologies in machine learning and embedded systems.
  • Participation in conferences and potential publication opportunities to enhance professional visibility.

Full Job Description

Responsibilities:
  • Develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing.
  • Implement and optimize ML models on embedded platforms, including FPGA and custom ASIC solutions.
  • Work closely with hardware and software teams to integrate ML models into production systems.
  • Research and implement state-of-the-art ML techniques to improve model efficiency and reduce latency and power consumption for embedded AI applications.
  • Improve inference efficiency through model compression techniques, including quantization, pruning, and knowledge distillation.
  • Collaborate with cross-functional teams to drive innovation and contribute to the overall system architecture.
  • Provide technical leadership and mentorship to junior engineers.
  • Publish research findings, present at conferences, and contribute to open-source projects when applicable.

Requirements:
  • 5+ years of relevant industry experience (or a PhD) in Computer Science, Electrical Engineering, Machine Learning, or related fields.
  • Must have prior experience managing a team, serving in a Team Lead role, or demonstrating strong technical leadership and cross-functional coordination capabilities.
  • Strong hands-on experience in machine learning, with a focus on edge AI, on-device inference, and deploying lightweight models on resource-constrained devices.
  • Expertise in modern ML frameworks such as PyTorch, TensorFlow (including TensorFlow Lite), and JAX.
  • Proficiency in Python and C/C++, with practical experience in ML model optimization and production deployment.
  • Deep experience with model quantization (PTQ/QAT), pruning, knowledge distillation, sparsity, and other compression techniques for efficient edge inference.
  • Hands-on experience developing for or integrating with AI chip SDKs, neural accelerators (NPUs/DSPs), or hardware-specific toolchains (e.g., NVIDIA TensorRT, Qualcomm Neural Processing SDK, ARM Ethos, or similar).
  • Familiarity with edge inference runtimes (ONNX Runtime, ExecuTorch, TVM) and optimizing models for hardware constraints (latency, memory footprint, power consumption).

Experience in one or more of the following areas is considered a strong plus:
  • Understanding of ML compiler and runtime design.
  • Experience working with tools such as Optimum, ONNX, TensorRT, TFLite/LiteRT, ncnn, or Core ML.
  • Familiarity with hardware acceleration techniques.
  • Experience in embedded system development.

Salary Range: $200,000 - $280,000 / year
