Architect - Platform Engineer

Quantiphi • $120K — $160K *

US-Anywhere • Remote in United States

Enterprise Technology

8 - 10 years of experience

Today

Job Overview by Ladders

Qualifications

10+ years of experience in technology roles, with a focus on infrastructure for AI workloads
Hands-on expertise with Slurm and distributed training environments
Deep knowledge of the NVIDIA GPU ecosystem, including CUDA and cuDNN
Strong foundation in Linux systems, performance tuning, and multi-GPU optimization
Experience deploying GenAI workloads, particularly LLM fine-tuning and RAG pipelines
Familiarity with Infrastructure-as-Code tools like Terraform and Ansible
Experience with cloud GPU environments such as GCP, Azure, and AWS

Responsibilities

Design and implement scalable infrastructure for LLM and GenAI workloads in multi-GPU settings
Optimize GPU performance for distributed training tasks
Manage compute-intensive jobs using Slurm-based clusters and OpenShift/Kubernetes environments
Collaborate with teams to deploy and support models in production
Develop reusable infrastructure templates with tools like Terraform and Helm
Drive the adoption of Infrastructure as Code (IaC) practices
Automate development processes through CI/CD pipelines

Benefits

Impact at a fast-growing AI-first digital engineering company
Opportunities to upskill and tackle complex challenges with skilled colleagues
Work with innovative teams in a research-focused environment with over 60 patents filed
Gain exposure to cutting-edge AI, ML, data, and cloud technologies in Fortune 500 settings

Full Job Description

Role: Architect - Platform Engineer

Experience Level: 10+ yrs

Work Location: US East/Canada (Remote)

Role Overview:

We are looking for a highly skilled Architect - Platform Engineer to design, optimize, and scale infrastructure for GenAI and LLM workloads. This role is ideal for someone with deep hands-on experience in GPU profiling, distributed training, and high-performance compute environments. You will work with architects from other specialties, such as data engineering, software engineering, and ML engineering, to create platforms, solutions, and applications that cater to the latest trends.

You'll play a key role in building out GenAI platform foundations, supporting production-grade deployments, and partnering closely with data science, MLOps, and application teams to bring cutting-edge AI solutions to life.

Key Responsibilities:

Design and implement scalable infrastructure for LLM and GenAI workloads across multi-GPU environments
Perform GPU profiling, benchmarking, and performance optimization for distributed training workloads
Manage and schedule compute-intensive jobs using Slurm-based clusters and OpenShift/Kubernetes environments
Enable and optimize the NVIDIA GPU stack (CUDA, cuDNN, NCCL, Triton, RAPIDS, etc.)
Collaborate with cross-functional teams to deploy models in research and production environments
Build and support GenAI pipelines (fine-tuning, RAG, multi-modal inferencing, LLMOps)
Develop reusable infrastructure templates using tools like Terraform and Helm
Contribute to internal innovation (PoCs, workshops) and support client-facing delivery engagements
Develop and deliver automation software for building and improving the functionality, reliability, availability, and manageability of applications and cloud platforms
Champion and drive the adoption of Infrastructure as Code (IaC) practices and mindset
Design, architect, and build self-service, self-healing, synthetic monitoring and alerting platforms and tools
Automate development and testing processes through CI/CD pipelines (Git, Jenkins, SonarQube, Artifactory, Docker containers)
Build a container hosting platform using Kubernetes
Introduce new cloud technologies, tools, and processes to keep innovating in the commerce area and drive greater business value
Lead technical discussions on architecture design and troubleshooting with clients, and proactively provide solutions as required

Basic Qualifications:

Strong experience with Slurm and distributed training environments
Hands-on expertise with Red Hat OpenShift and/or Kubernetes
Deep knowledge of the NVIDIA GPU ecosystem (CUDA, cuDNN, NCCL, Nsight, Triton/TensorRT)
Strong foundation in Linux systems, performance tuning, and multi-GPU optimization
Experience deploying GenAI workloads (LLM fine-tuning, RAG pipelines, multi-modal systems)
Familiarity with Infrastructure-as-Code tools (Terraform, Ansible)
Experience with cloud GPU environments (GCP, Azure, AWS, OCI) and/or on-prem GPU clusters
Serve as a mentor or guide for senior resources and team leads
Lead technical discussions regarding architecture design

Other Qualifications (OQs):

Experience with NVIDIA NIMs, DGX systems, or GPU-accelerated containers
Knowledge of LLMOps frameworks and MLOps integration
Familiarity with vector databases and retrieval systems for RAG architectures
Comfortable working in client-facing environments and collaborating with AI solution teams

Healthcare Domain Experience (Nice to Have):

Experience working with FHIR R4, HL7 v2, or SMART on FHIR
Integration with EHR systems (e.g., Epic)
Understanding of HIPAA compliance and healthcare data privacy
Exposure to clinical workflows, CDS Hooks, or patient-facing applications
Experience building clinical decision support systems or healthcare interoperability solutions

What's in it for YOU at Quantiphi:

Make an impact at one of the world's fastest-growing AI-first digital engineering companies.
Up-skill and discover your potential as you solve complex challenges in cutting-edge areas of technology alongside passionate, talented colleagues.
Work where innovation happens: join disruptive innovators in a research-focused organization with 60+ patents filed across various disciplines.
Stay ahead of the curve: immerse yourself in breakthrough AI, ML, data, and cloud technologies and gain exposure working with Fortune 500 companies.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

About Quantiphi

Quantiphi is an artificial intelligence and machine learning services company that helps businesses transform their operations through the use of AI. The company provides a range of services, including data engineering, machine learning, computer vision, natural language processing, and predictive analytics. Quantiphi was founded in 2013 and is headquartered in King of Prussia, Pennsylvania.

Size: 500 employees

Industry: Information Technology

Founded: 2013

* Ladders Estimates
