Production Support - Azure Cloud

Macpower Digital Assets Edge

$80K — $120K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in Site Reliability Engineering or related field.
  • Proven expertise in Microsoft Azure cloud services and management.
  • Strong scripting skills in PowerShell, Python, Terraform, or Ansible.
  • Experience with monitoring and observability tools like Prometheus, Grafana, or Datadog.
  • Knowledge of ITIL and SRE principles for effective incident management.
  • Familiarity with CI/CD pipelines and DevOps practices using Azure DevOps or similar tools.
  • Understanding of container technologies such as Kubernetes and Docker.

Responsibilities

  • Monitor and resolve production incidents and service disruptions promptly.
  • Ensure system reliability and performance using SRE principles.
  • Automate processes and optimize cloud infrastructure on Microsoft Azure.
  • Analyze logs and metrics to proactively prevent incidents.
  • Collaborate with development and DevOps teams for issue resolution.
  • Implement CI/CD pipelines and monitoring strategies effectively.
  • Maintain and optimize Azure resources including VMs and AKS.

Benefits

  • Flexible work hours and remote work options.
  • Opportunities for professional development and certifications.
  • Collaborative work culture with cross-functional teams.
  • Access to the latest technology and cloud resources.
  • Participation in on-call rotations for critical issues.

Full Job Description
Responsibilities:
  • Monitor, troubleshoot, and resolve production incidents and service disruptions.
  • Ensure system reliability, availability, and performance using SRE (Site Reliability Engineering) principles.
  • Automate manual processes and optimize cloud infrastructure on Microsoft Azure.
  • Analyze logs, metrics, and alerts to proactively prevent incidents.
  • Collaborate with development, DevOps, and infrastructure teams for issue resolution.
  • Implement CI/CD pipelines, observability tools, and proactive monitoring strategies.
  • Maintain and optimize Azure resources (VMs, AKS, Storage, Networking, etc.).
  • Participate in on-call rotations for critical production support.

Preferred Qualifications & Skills:
  • Azure Cloud Expertise: Hands-on experience with Azure Monitor, App Insights, Log Analytics.
  • Scripting & Automation: Proficiency in PowerShell, Python, Terraform, or Ansible.
  • Monitoring & Observability: Familiarity with Prometheus, Grafana, Splunk, or Datadog.
  • Incident Management: Knowledge of ITIL, SRE principles, and Root Cause Analysis (RCA).
  • CI/CD & DevOps: Experience with Azure DevOps, GitHub Actions, or Jenkins.
  • Containers & Orchestration: Strong understanding of Kubernetes (AKS) and Docker.
  • Networking & Security: Knowledge of load balancers, firewalls, IAM, and VPNs.
  • Experience with large-scale distributed systems.
  • Familiarity with SQL/NoSQL databases.
  • Understanding of Cloud-Native architecture.
