Lead Site Reliability Engineer in Beaverton, OR

$100K - $150K (Ladders Estimates)

NIKE, Inc.

Beaverton, OR 97005

Industry: Retail & Consumer Goods

5 - 7 years

Posted 42 days ago

Responsibilities


In Beaverton, OR, will perform deep problem analysis; detect infrastructure or code defects; define, report, and create observability processes for Key Performance Indicators (KPIs); and work with product delivery teams to provide long-term solutions to production issues. Will serve as a full-stack developer and use knowledge of datacenter infrastructure and cloud platforms. Will observe, diagnose, and develop fixes for production issues quickly and efficiently. Will develop and drive real-time monitoring solutions that provide visibility into site health and KPIs. Will report and communicate high-value metrics to leadership. Will engage in IT service management, including incident, problem, change, and knowledge management. Will work across business and technical teams to continuously analyze system performance in production, troubleshoot consumer-reported issues, and proactively identify areas in need of optimization. Will manage and lead application reliability practices for consumer-facing web and mobile experiences. Will directly supervise one to three individuals. Will travel domestically and internationally five percent of the time or less.


Qualifications

Education: Bachelor's degree in computer science, information technology, information systems, or engineering. Will also accept a master's degree in computer science, information technology, information systems, or engineering.

Experience: Bachelor's degree and five years of overall progressive experience in application engineering. Will also accept a master's degree and three years of experience in application engineering.

Skills/Requirements: 2 years of experience with: (1) AWS services, including CloudWatch monitoring and dashboards; (2) Lambda and API Gateway; (3) DynamoDB; (4) Splunk monitoring to build dashboards, schedule reports, and create monitoring alerts; (5) New Relic; (6) creating synthetic scripts using Selenium; (7) querying Cassandra to debug production issues; (8) ServiceNow (SNOW) and Jira; and (9) Python as a scripting language.


Valid Through: 2019-11-1