$100K - $150K (Ladders Estimates)
Nordstrom is hiring in our Site Operations Center as part of our Site Reliability Engineering group. We're obsessed with driving down our production issue count, learning from the issues we do have, and lowering the time to repair any issues that occur.
If you want to be part of a team of engineers that monitors and troubleshoots Nordstrom's infrastructure 24x7, serves as the Eyes-on-Glass first point of contact for all issues, and works closely with our Incident Response and Site Reliability teams, and you thrive in an intense, fast-paced, highly visible environment, then we should talk.
A day in the life...
Providing Tier 1 support for application and infrastructure issues across the enterprise
Monitoring, triaging, and coordinating incident response when service failures, infrastructure issues, or deployment issues occur
Hands-on analysis and troubleshooting of production
Identifying, defining, and building improvements to support tools, processes, and the service itself
Improving the customer experience by delivering new service monitoring, alarming, and scripting
You own this if you have...
Familiarity with site and infrastructure monitoring systems (such as AWS CloudWatch or Datadog)
Expertise in UNIX/Linux SysOps tasks, including administration, monitoring, troubleshooting, performance tuning, preventative maintenance, and capacity planning
Networking knowledge (TCP/IP, routing, network topologies and hardware, SDN, etc.)
Broad understanding of large-scale system architecture, automation, integration, and processes
Ability to debug and optimize code and to automate routine tasks.
Ability to work night/weekend shifts
4+ years of work experience with production Linux systems administration
2+ years of experience with configuration management, source control, and containerization tools
2+ years of work experience managing cloud-based infrastructure and automation
2+ years of experience with at least one scripting language (e.g., Bash, Python, Ruby, Go)
Motivated, critical thinker with proven skills to troubleshoot and solve problems in a production support environment
Ability to successfully manage competing priorities in critical incident situations
Strong desire to learn and understand new technologies
Excellent verbal and written communication skills
Experience working with ITIL and Service Management best practices is a plus.
Bachelor's Degree or equivalent experience.
We've got you covered…
Our employees are our most important asset and that's reflected in our benefits. Nordstrom is proud to offer a variety of benefits to support employees and their families, including:
Valid Through: 2019-10-20