Site Reliability Engineer
Reports to: Site Reliability Manager
Location: Boise, ID
Job Summary: This position is responsible for providing highly available, secure, and compliant networks, systems, and services in a cloud environment. Activities include provisioning, configuring, deploying, maintaining, monitoring, and improving Kount's AWS infrastructure. Additional responsibilities include management of physical security, datacenter locations, and office facilities.
KEY RESPONSIBILITIES INCLUDE, BUT ARE NOT LIMITED TO:
- Troubleshoot and resolve production issues
- Monitor performance and optimize services
- Monitor and review logs for anomalous behavior and security concerns
- Schedule and perform maintenance for infrastructure and production services
- Document, develop, and improve operational practices and procedures
- Implement, deploy, and support internal and customer facing services
- Implement, deploy, and support third party and open source software
- Implement, deploy, and support physical and virtual servers
- Implement, deploy, and support networking infrastructure
- Implement, deploy, and support storage infrastructure
- Allocate resources to meet development and production needs (AWS and VMware)
- Provide operational support to cross-functional product development teams
- Assist with improvements to the continuous integration and deployment processes
- Assist specialists and cross-functional teams with product improvements
- Maintain configuration management and orchestration tools
- Manage access to IT assets and resources
- Participate in the primary on-call rotation
QUALIFICATIONS AND MINIMUM REQUIREMENTS:
- Four-year college degree in Information Systems or Computer Science, or comparable IT experience.
- Minimum of three years' experience in IT network and systems administration.
- Strong background with operating systems, storage devices, virtualization, and networking.
- Thorough knowledge of DNS, Internet protocols, web servers, and load balancing technologies.
- Must have ability to analyze complex issues at a detailed level.
- Must have a desire to work in a team environment yet be self-directed, proactive, and action-oriented.
- Must have excellent diagnostic and troubleshooting skills.
- Must have experience building and maintaining Unix and Linux systems.
- Must have experience with cloud and virtualization technologies such as AWS, Azure, Google Cloud Platform, VMware, or OpenStack.
- Must have a working familiarity with provisioning, management, orchestration, deployment, monitoring, and continuous delivery.
- Must have strong scripting and automation skills using languages such as Bash, Python, Perl, or Ruby.
- Must understand the basic concepts of version control (Git, SVN).
- Must have strong verbal, written, and organizational skills.
- Experience with AWS EC2, ECS, CloudWatch, S3, CloudFormation
- Experience with Puppet, MCollective, Splunk, Nagios, Graphite, Grafana
- Experience with Apache, PHP, Nginx, ActiveMQ, Java, Tomcat, Jenkins, Bamboo, TeamCity
- Experience with DynamoDB, Amazon RDS, Cassandra, Postgres, Elasticsearch
- Experience with Cisco UCS, SAN, NAS, Juniper firewalls, ADCs
- Familiarity with PCI and information technology security standards
The work environment characteristics described here are representative of those an employee encounters while performing the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
- Job performed at a desk in front of a computer
- Requires heavy use of keyboard and mouse
- Requires sitting for long periods of time
- Casual work environment