Lead Site Reliability Engineer

Less than 5 years experience  •  Accounting, Finance & Insurance

Salary depends on experience
Posted on 06/21/18
Chicago, IL

What You’ll Do:

As a Lead Site Reliability Engineer (SRE), you will be responsible for the availability, automation, performance, efficiency, scaling, monitoring, and emergency response of operating systems. You use your deep understanding of platforms, architecture, people, systems, and processes to establish and continuously improve SLIs and SLOs for uptime, performance, deployment, monitoring, and troubleshooting. You are interested in setting direction and leading the day-to-day processes that shape our vision for reliability.

Your Day to Day

  • Maintain and support the product and data systems: proactively monitor events, investigate issues, analyze solutions, and drive problems through to resolution.
  • Define requirements and develop tools and reporting as needed by projects and operations.
  • Work with product teams to define application hardening measures and identify opportunities for chaos engineering.
  • Use operational tools and monitoring platforms to gain in-depth knowledge, understanding, and ongoing monitoring of system availability, performance, and capacity.
  • Work with business partners to establish Service Level Indicators and Service Level Objectives (SLIs and SLOs).
  • Implement alerting strategy that makes alerts actionable and unique.
  • Provide follow-through to ensure issues are resolved to satisfaction.
  • Drive continuous improvement and innovation within the team.
  • Bring a sense of ownership, initiative, and drive.

Basic Qualifications

  • Bachelor's degree or higher with previous experience in a technical support role.
  • You have been working in technology for 3+ years.

Preferred Qualifications

  • Experience with Java or .NET application development.
  • Experience with SQL Server 2005/2008/2012/2016.
  • Experience with browser-related technologies.
  • Experience with Linux and Windows.
  • Knowledge of monitoring tools and strategy.
  • Experience running incident postmortems.
  • Solid understanding of automated deployment processes.

REQ-008764
