Staff / Principal Platform Engineer
Location: New York City (Hybrid)
Department: AI Platform & Infrastructure Team
Reports to: Vangie Shue - Principal Engineering Manager
About the Role
We're looking for a Staff or Principal Platform Engineer to build and operate the foundational platform behind AppGate's AI products. You combine deep DevOps and cloud infrastructure expertise with hands-on experience operationalizing AI/ML systems, and you treat observability as a first-class engineering discipline.
You'll own the platform spanning APIs, cloud services and AI/ML infrastructure, and you'll make it fast, reliable and observable at scale. This is a high-leverage, hands-on role for a senior engineer who sets technical direction and still ships.
Key Responsibilities
- Integrations: third-party connectors, APIs and platform integrations.
- Data Platform: real-time and batch data ingestion pipelines, feature stores and data quality.
- AI/ML Infrastructure: model serving, inference pipelines and experiment tracking.
- MLOps: model deployment, versioning, drift monitoring and lifecycle management.
- Platform Observability: SLOs, operational health metrics and reliability tooling for AI systems.
- Build the platform: design, build and operate the cloud infrastructure, services and pipelines that AppGate's AI products run on.
- Implement observability: instrument APIs, cloud services and AI/ML infrastructure with metrics, logging, tracing and alerting, and define SLOs and operational health metrics that teams trust.
- Operationalize AI/ML: build model serving and inference pipelines, experiment tracking and the MLOps tooling for deployment, versioning, drift monitoring and lifecycle management.
- Engineer for reliability: apply SRE practices to reduce toil, improve resilience and keep latency and uptime within target across the platform.
- Automate everything: drive infrastructure-as-code, CI/CD and self-service tooling so product teams ship safely and quickly.
- Set technical direction: define platform standards, architecture and best practices, and raise the engineering bar through design reviews and mentorship.
- Collaborate cross-functionally: partner with data scientists, product teams and leadership to align platform investment with AppGate's strategic vision.
Required Qualifications
- Experience: extensive platform, infrastructure or SRE engineering experience, with a track record of operating production systems at scale. Staff-level candidates typically bring 8+ years and Principal-level candidates 12+ years, though we hire on demonstrated impact.
- DevOps depth: strong command of cloud platforms (AWS, GCP or Azure), containers and orchestration (Docker, Kubernetes), infrastructure-as-code (Terraform or equivalent) and CI/CD.
- Observability expertise: hands-on experience implementing observability across APIs, cloud services and distributed systems using tools such as Prometheus, Grafana, OpenTelemetry, the ELK stack or comparable tooling, including SLO and error-budget practices.
- AI/ML infrastructure: experience building or operating model serving, inference pipelines and MLOps tooling such as MLflow, Kubeflow, SageMaker or equivalent, including model deployment, versioning and drift monitoring.
- Data platform skills: familiarity with real-time and batch ingestion pipelines, feature stores and data quality at production scale.
- Engineering craft: fluency in a primary backend language (Python, Go or similar) and a strong bias toward automation, testing and reliable, maintainable systems.
- Leadership: a record of setting technical direction, leading complex initiatives across teams and mentoring senior engineers.
- Mindset: pragmatic, rigorous and ownership-driven. You thrive in a small, fast-moving environment and enjoy building foundations others depend on.
Compensation
- Staff: 185k-225k base
- Principal: 215k-270k base
- Performance bonuses and considerable equity are offered in addition to base.