Staff Platform Engineer - AI Infrastructure

Paytm Labs • $120K — $150K *

Toronto, ON M3C 0E3 • Hybrid

Enterprise Technology

8 - 10 years of experience

1 month ago

Job Overview by Ladders

Qualifications

8+ years in software engineering, with 3+ years focused on building infrastructure platforms or ML/AI infrastructure.
Deep expertise in cloud platforms (AWS, GCP) and Kubernetes management.
Hands-on experience with GPU workloads and model serving using tools like vLLM and TensorRT-LLM.
Solid programming skills in Python, Go, or C++.
Proven proficiency with infrastructure-as-code frameworks (Terraform, Pulumi, CDK).
Experience in designing self-service platforms or internal developer tooling for team usage.
Knowledge of model optimization strategies like quantization and batching.

Responsibilities

Design and manage GPU infrastructure for optimal model hosting and cost-efficiency.
Build systems for scaling model serving with strong guarantees on latency and uptime.
Implement routing for various model types (text, voice, vision) on shared infrastructure.
Oversee the complete model lifecycle from deployment to monitoring and scaling.
Optimize inference processes using techniques like quantization and caching for improved efficiency.
Create self-service infrastructure platforms enabling teams to provision model endpoints via APIs.
Establish observability standards and reliability metrics like SLIs and SLOs for inference systems.

Benefits

Opportunities for professional development and continuous learning.
Flexible working arrangements to support work-life balance.
Access to cutting-edge technology and tools in AI and infrastructure.
Collaboration in a dynamic team environment with diverse expertise.

Full Job Description

About the Role

As a Staff Platform Engineer - AI Infrastructure, you will build and scale the infrastructure behind Paytm's AI inference platform, serving internal teams and enterprise customers and supporting new customer use cases from the ground up. You will own GPU infrastructure, model hosting and serving, and multi-model routing across modalities. This includes running our own coding and domain-specific models (voice, vision, risk, fintech workflows) as well as third-party models on shared GPU and accelerator clusters.

You will also build self-service platforms that let teams provision compute, deploy and customize models, and manage resources through APIs and control planes, so they can use AI without rebuilding infrastructure each time.

Your work will form the AI control plane for Paytm Intelligence (Pi): policy-driven routing, quotas, observability, and usage and cost visibility. It will directly affect how fast we ship agents and AI features, how reliably they run, and how efficiently we use our hardware across payments, risk, fraud, collections, support, and developer experience.

What You'll Do

Design and operate GPU infrastructure for model hosting, including provisioning, scheduling, and cost optimization across cloud and on-premise environments
Build and scale model serving systems using vLLM, TensorRT-LLM, Triton, or equivalent, supporting real-time inference with strong latency and availability guarantees
Implement multi-model routing to serve multiple models across modalities (text, voice, code, vision) on shared infrastructure
Own the model lifecycle end to end: download, deploy, serve, monitor, swap, and scale
Drive inference optimization including quantization strategies (AWQ, GPTQ), batching, caching, and cold start reduction
Build self-service infrastructure platforms where teams provision compute, storage, and model endpoints through APIs and control planes
Implement infrastructure-as-code at scale using Terraform, Pulumi, or CDK
Build observability and reliability for inference systems: SLIs/SLOs, GPU utilization monitoring, latency tracking, automated capacity planning, and alerting
Define platform standards and governance including multi-tenant isolation, cost attribution, and resource quotas
Lead architectural design and influence engineering direction across the AI infrastructure stack

What You'll Bring

8+ years of software engineering experience, including 3+ years building infrastructure platforms or ML/AI infrastructure
Deep experience with cloud infrastructure (AWS, GCP) and Kubernetes
Hands-on experience with GPU workloads and model serving (vLLM, TensorRT-LLM, Triton, or similar)
Strong software engineering fundamentals in Python, Go, or C++
Experience with infrastructure-as-code (Terraform, Pulumi, CDK)
Experience designing self-service platforms or internal developer tooling
Understanding of model optimization: quantization, batching, serving architectures
Proven ability to lead complex cross-team technical initiatives
Strong communication skills and the ability to influence technical direction

Nice to Have

Experience building or operating inference infrastructure at scale
Experience with CUDA, GPU scheduling, or hardware-level optimization
Experience with multi-model serving across different modalities
Experience with edge inference or on-device model deployment
Experience with model fine-tuning infrastructure (LoRA, QLoRA, PEFT)
Background in fintech or regulated industries

About Paytm Labs

Paytm Labs is the Canadian research and development division of Paytm, an Indian mobile payment and financial services company. Paytm Labs is based in Toronto, Ontario, and focuses on developing new technologies related to mobile payments, e-commerce, and financial services. The company was founded in 2014 and has since grown to over 500 employees. Paytm Labs is a subsidiary of One97 Communications, the parent company of Paytm. One97 Communications is headquartered in Noida, India, and was founded in 2000 by Vijay Shekhar Sharma.

Size

500 employees

Industry

Enterprise Technology

Founded

2014

* Ladders Estimates
