About Alembic
Alembic is pioneering a revolution in marketing, proving the true ROI of marketing activities. The Alembic Marketing Intelligence Platform applies sophisticated algorithms and AI models to finally solve this long-standing problem. When you join the Alembic team, you'll help build the tools that provide unprecedented visibility into how marketing drives revenue, helping a growing list of Fortune 500 companies make more confident, data-driven decisions.
About the Role
We're looking for a Machine Learning Applications Engineer with GPU, Python, and C++ expertise to help productionize cutting-edge causal AI models. You'll work closely with ML scientists to turn experimental research code into optimized, scalable, and well-structured software that powers Alembic's real-time analytics and inference systems.
This is a hands-on, performance-focused role where you'll operate at the intersection of applied ML, systems engineering, and high-performance computing.
Key Responsibilities
- Translate early-stage ML research and prototypes into reliable, testable, and performant software components
- Use CUDA, Triton, and Numba to optimize GPU-accelerated workloads for inference and preprocessing
- Contribute to core libraries and performance-critical routines using modern C++ in hybrid Python/C++ environments
- Develop modular, reusable infrastructure that supports deployment of ML workloads at scale
- Collaborate with data scientists and engineers to optimize data structures, memory usage, and execution paths
- Build interfaces and APIs to integrate ML components into Alembic's broader platform
- Implement logging, profiling, and observability tools to track performance and model behavior
Must-Have Qualifications
- 4-7 years of software engineering experience, including substantial time in Python and C++
- Hands-on experience with GPU programming, including CUDA, Triton, Numba, or related frameworks
- Strong familiarity with the Python data stack (Pandas, NumPy, PyArrow) and low-level performance tuning
- Experience writing high-performance, memory-efficient code in C++
- Demonstrated ability to work cross-functionally with researchers, platform engineers, and product teams
- Comfort transforming research-grade ML code into maintainable, production-grade software
Nice-to-Have
- Experience with hybrid Python/C++ or Python/CUDA extension development (e.g., Pybind11, Cython, custom ops)
- Familiarity with ML serving or inference tools (e.g., TorchServe, ONNX Runtime, Triton Inference Server)
- Exposure to structured data modeling, causal inference, or large-scale statistical computation
- Background in distributed systems or parallel processing
What You'll Get
- A pivotal role building GPU-accelerated software at the heart of a real-world AI product
- Collaboration with an elite team of ML scientists, engineers, and product leaders
- The opportunity to shape performance-critical infrastructure powering enterprise decision-making
- A culture rooted in technical rigor, curiosity, and product impact