L2 Support Engineer

Blitzy

• $100K — $140K *

Cambridge, MA 02139 (In-Person)

Technical Services

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

5+ years in L2 support or similar role with hands-on experience in distributed systems.
Proficiency in Kubernetes and Docker environments.
Experience with major cloud providers (GCP, AWS, Azure) and their services.
Strong skills in monitoring with experience in APM/observability tools like Datadog.
Ability to troubleshoot across compute, network, and storage layers.

Responsibilities

Deploy and install platforms in customer environments, troubleshooting installation challenges.
Maintain stability of customer environments through ongoing upgrades and support.
Work closely with L1 support to triage and resolve customer-reported issues.
Diagnose failures throughout the entire system stack and maintain logs.
Establish runbooks and dashboards to minimize recurring issues and improve resolutions.
Document clear escalations and post-incident reviews with evidence and insights.
Communicate resolution status and updates to customers clearly and promptly.

Benefits

Flexible hours around US business time zones, with shared on-call responsibilities.
Fair compensation for on-call duties and recovery time after incidents.
Opportunities for growth in a supportive team focused on customer success.

Full Job Description

Location: Cambridge, MA (On-site)

Compensation: $100,000 - $140,000 salary + equity

The Role

The L2 Support Engineer supports our clients and keeps their environments stable across the full lifecycle: installation, ongoing upgrades, and day-to-day operation. You work alongside L1 to triage and resolve issues, and escalate unresolved defects to engineering. The work spans Kubernetes, Docker, and the major cloud providers.

What Success Looks Like

Customers' issues are resolved faster and escalated more cleanly.
Recurring problems turn into runbooks, dashboards, and alerts, not repeat tickets.
Engineering trusts your escalations because they come with proof, not guesses.
Customers trust your communication because it's clear, honest, and on time.

Areas of Ownership

Deploy and install the platform into customer environments, and troubleshoot installation issues.
Support ongoing upgrades and day-to-day operation, keeping customer environments stable.
Work alongside L1 to triage and resolve customer-reported issues, driving them to resolution or escalation.
Diagnose failures across the stack: compute, networking, storage, and the services running on it.
Reproduce issues safely against live (often multi-tenant) environments using read-only diagnostics first.
Build and maintain dashboards, monitors, and runbooks so recurring issues get faster to fix, or stop recurring.
Write up clear, evidence-backed escalations and post-incident notes.
Communicate status and resolution to customers clearly and on time.

Required Experience

Distributed-systems debugging. Reason about a request crossing multiple services, queues, and network hops, and isolate which hop failed. You debug by forming a hypothesis and confirming it with evidence (logs, pod state, queue depth, DB rows), not by guessing.
Kubernetes & Docker.
Major cloud providers: GCP, AWS, and Azure. Hands-on with at least one deeply and able to work across the others: managed Kubernetes (GKE/AKS), cloud logging, IAM/auth basics, and cloud disk/storage behavior.
Strong monitoring & observability practice. Fluent with an APM/observability stack (Datadog or equivalent): log queries, correlating across services by request/trace IDs, reading traces, and building dashboards and alerts. You reach for the data before theorizing.
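
As a rough illustration of isolating which hop failed by correlating on trace IDs, here is a minimal sketch. The hop names, the record shape, and the function `first_missing_hop` are all hypothetical and chosen for the example; they do not describe the actual platform or any Datadog API.

```python
from collections import defaultdict

# Hypothetical hop order for one request path; real service names will differ.
EXPECTED_HOPS = ["gateway", "api", "queue", "worker"]


def first_missing_hop(log_records, expected_hops=EXPECTED_HOPS):
    """Map each trace_id to the first expected hop that never logged it.

    A value of None means every expected hop logged that trace.
    """
    seen = defaultdict(set)
    for record in log_records:
        seen[record["trace_id"]].add(record["service"])

    result = {}
    for trace_id, services in seen.items():
        result[trace_id] = next(
            (hop for hop in expected_hops if hop not in services), None
        )
    return result


# Illustrative records, as if exported from a log query filtered to two requests.
records = [
    {"trace_id": "t1", "service": "gateway"},
    {"trace_id": "t1", "service": "api"},
    {"trace_id": "t1", "service": "queue"},
    {"trace_id": "t1", "service": "worker"},
    {"trace_id": "t2", "service": "gateway"},
    {"trace_id": "t2", "service": "api"},
]
first_missing = first_missing_hop(records)
```

Here request `t2` never shows up at the queue hop, which gives a concrete place to look next. It is a hypothesis to confirm with further evidence, not a root cause.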

Additional Skills & Experience

Python and Redis literacy.
Basic message queueing. Command transport runs over a message queue (Redis/rq). Comfort inspecting queue depth, backlogs, and stuck/failed jobs; concepts transfer from any broker.
Networking & WebSockets. Many of our hardest issues are connection problems: WebSocket/Socket.IO drops, NAT/idle/LB timeouts, half-open sockets, DNS-vs-routing, TLS. Tell a transport fault from an application fault.
SQL / PostgreSQL. Query operational tables to confirm what the system recorded.
Source-control platforms. GitHub (incl. GitHub Enterprise Server), Azure DevOps, and/or GitLab: clone/push/pull, access tokens, app credentials, and their failure modes.
CI/CD, Helm & deploy integrity. Many "sudden regressions" are a bad or partial deploy: check what version is actually running before chasing architecture theories. Helm and container deploy pipelines expected. ArgoCD is a plus.
Secrets management. Comfort handling secrets, credentials, and certificates safely, ideally with Vault (strongly preferred).
Linux and Windows. Workloads run on both; comfort triaging on each OS (process inspection, filesystem, basic networking).
Methodical, evidence-first temperament. Hold several candidate causes at once, run the cheapest disconfirming check first, and never claim a root cause or fix you haven't proven.
Multi-tenant safety mindset. Environments are shared and customer-owned: default to read-only diagnostics and understand blast radius before changing anything.
Incident management & ticketing workflows: Jira or similar (a plus).
Prior customer-facing support or SRE/on-call experience (a plus).
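
The "check what version is actually running" step under deploy integrity can be sketched as a read-only comparison. The pod records below are a hypothetical shape (for example, reduced from a pod listing you have already fetched), and `find_version_drift`, the registry host, and the tags are illustrative assumptions, not part of the platform.

```python
def find_version_drift(pods, expected_tag):
    """Return the names of pods whose image tag differs from the expected release tag."""
    drifted = []
    for pod in pods:
        # Split "registry/name:tag" on the last colon; untagged images count as drift.
        _, sep, tag = pod["image"].rpartition(":")
        if not sep or tag != expected_tag:
            drifted.append(pod["name"])
    return drifted


# Illustrative data only: one pod is still on the previous release.
pods = [
    {"name": "api-0", "image": "registry.example.com/platform/api:1.4.2"},
    {"name": "api-1", "image": "registry.example.com/platform/api:1.4.1"},
    {"name": "worker-0", "image": "registry.example.com/platform/worker:1.4.2"},
]
drifted = find_version_drift(pods, "1.4.2")
```

A non-empty result suggests a partial deploy, which is a cheap thing to rule out before chasing architecture theories.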

Hours & On-Call: please read

This is a customer support role, and the hours can be unconventional. Customers operate primarily in US time zones, so coverage is anchored to US business hours (roughly ET-PT). If you're based outside the US, expect your working day to shift accordingly.

Incidents don't keep office hours. Expect a rotating on-call schedule and occasional evening, early-morning, or weekend escalations outside a standard 9-5. We structure for it: rotations are shared fairly, on-call is compensated or offset with time off in lieu per policy, and we protect recovery time after heavy incidents.

If you're not comfortable with US-aligned hours and periodic off-hours on-call, this likely isn't the right role, and that's completely fine.

* Ladders Estimates
