Cloud Reliability Engineer

Marathon TS

$90K — $130K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Associate's degree in Engineering or Computer Technology or advanced military training
  • Minimum 2 years of relevant experience
  • Proficiency with Windows and Linux operating systems
  • Experience with distributed computing and virtualization technologies
  • Knowledge of tools such as Docker, Ansible, and Heat templates
  • Background in supporting software and/or network operations
  • Active TS/SCI security clearance with willingness to undergo polygraph examination

Responsibilities

  • Ensure uptime of multi-tenant cloud infrastructure
  • Collaborate with engineering teams to streamline platforms
  • Configure and leverage advanced monitoring tools
  • Conduct incident response and root cause analysis
  • Provide hands-on system and network support
  • Monitor daily software and network operations
  • Engage with users for fault isolation and system analysis

Benefits

  • Supportive work culture with emphasis on diversity and inclusion
  • Opportunity to work in a high-performing team
  • Engagement with Department of Defense and Intelligence Community
  • 24x7 environment offers unique operational challenges
  • Hands-on experience with cutting-edge cloud and virtualization technology
Full Job Description
Overview
Marathon TS is seeking a Cloud Reliability Engineer in Chantilly, VA to support our Department of Defense / Intelligence Community customer as part of a highly talented, highly motivated, and high-performing team. As part of the Infrastructure Operations and Maintenance Support team, you will be responsible for the availability, performance, and monitoring of the cloud infrastructure we support in a 24x7 environment, along with incident response and related duties.

Responsibilities
  • Ensure the uptime of our multi-tenant infrastructure
  • Work closely with the engineering teams to improve our platforms and eliminate complexity from architecture and processes
  • Configure and use state-of-the-art monitoring tools to gather insights and then act upon the results
  • Conduct incident response and in-depth root cause analysis
  • Provide hands-on, first-level system and network support, including problem identification and resolution
  • Monitor daily software and network operations in a distributed environment
  • Work with users on fault isolation and resolution, and perform system analysis and reporting
  • Work shifts to provide complete 24x7 monitoring of software systems
Qualifications
Required Qualifications:
  • You have at least an associate's degree in Engineering or Computer Technology or Advanced Military Training.
  • You have at least 2 years of relevant experience.
  • You have experience working with Windows and Linux operating systems.
  • You have experience with distributed computing technologies.
  • You have experience with virtualization technologies (e.g. OpenStack, Citrix XenServer, Red Hat Enterprise Virtualization, and/or VMware), Docker containers, Ansible, and Heat templates.
  • You have experience with front-end processing and network gateway appliances and/or software.
  • You have experience working in a customer environment and/or a classified environment.
  • You have a background in supporting software and/or network operations with a clear understanding of networking fundamentals.
  • You have experience with Linux/Unix and Windows operating systems.
  • You hold a current CompTIA Security+, CASP or CISP certification and a Computing Environment certification (e.g. Linux+, RHCSA, RHCE, MCSA).
  • You are able to effectively communicate both with customers and technical staff.
  • You have an active TS/SCI security clearance and are willing to undergo and pass a polygraph examination.
  • You are willing to work in a 24x7 environment.
Desired Qualifications:
  • Have an active TS/SCI clearance with polygraph.
  • Have experience with infrastructure automation technologies including OpenStack, Ansible, Heat, Puppet, etc., and with cloud computing fundamentals.
  • Have a good understanding of KVM Virtualization technologies.
  • Have previous experience with networking equipment.
  • Have experience with Intelligence or DoD programs, either within the military or as a civilian contractor.

