Job Description:
Embark on a journey of innovation with BAE Systems' Platform Engineering group, where we seek a Site Reliability Engineer (SRE) to deploy and monitor IaaS, PaaS, and SaaS solutions using cutting-edge technologies. In this role, you will implement robust, repeatable, standardized, and advanced technological solutions. You are not just addressing today's needs but setting the foundation for future operations.
As an SRE, you will implement the latest in automation, services, and infrastructure. You will address existing problems with an eye to future technological needs. Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. Site Reliability Engineers work within a coordinated cross-capability team, including Core Services, DevOps, Security, and IT Operations, to achieve project objectives. Software development in SRE focuses on optimizing existing systems, proactively identifying potential problems, and eliminating manual work through automation. At BAE Systems, we don't just use technology; we evolve with it, offering you endless opportunities to stay at the forefront of the technological curve.
Key Responsibilities:
- Deliver Excellence: Work in a team of SREs to ensure seamless, continuous service delivery.
- Innovate and Implement: From concept to execution, bring service methodologies to life.
- Be the Problem Solver: Tackle complex service requests, ensuring swift and efficient resolutions.
- Measurement and Monitoring: Instrument and measure key performance indicators of the production environment to enable continuous improvement.
- Support the Infrastructure: Provide primary operational support and engineering for multiple large-scale distributed software applications.
- Scale systems through automation.
- Communicate and Collaborate: Coordinate with cross-team members to complete tasks.
Because of the need for consistent, in-person collaboration and/or the requirement to perform all work onsite due to the nature of this particular role, this role will be performed full-time on site. This means work will be conducted on location at a BAE Systems facility 100% of the time.
Join us. Be part of a team that's not just building technology but defining the future. At BAE Systems, your work is more than a job - it's a journey in innovation.
Apply now and propel your career into the future.
Required Skills and Education:
- Must be able to obtain a Secret clearance and be eligible for TS/SCI clearance in the near future.
- 5+ years of experience in Site Reliability Engineering.
- A desire to deliver excellence and a focus on meeting user needs.
- An understanding of cloud technologies and service lifecycle management.
- Experience deploying capabilities and services through automation tools such as Ansible or Helm charts.
- Experience with storage technologies like NFS, JDFS, Ceph, Object Storage (S3).
- Familiarity with virtualization technologies such as VMware, OpenStack, or Azure Stack.
- Proactive approach to identifying and resolving performance issues.
- Security+ certification or equivalent, or the ability to obtain Security+ within 90 days.
Preferred Skills and Education:
- 4-year degree in an engineering-related discipline.
- Active Secret or higher clearance.
- Familiarity with Kubernetes and similar technologies.
- Experience working in cross-functional teams.
- Experience with coding beyond simple scripting.