Job Summary
- Seeking a Cloud Systems Administrator to sustain, administer, and optimize enterprise cloud environments built on Java and open-source technologies, including Kubernetes, Hadoop, and Accumulo
- The candidate will directly support high-performance, data-intensive analytics running on secure, managed infrastructure that enables critical Intelligence Community missions
- The Cloud Systems Administrator will ensure the stability, availability, and performance of cloud-based systems by providing Tier 1 through Tier 3 operational support, troubleshooting complex Linux environments, and collaborating with engineering teams to continuously improve platform reliability and operational efficiency
- This position offers exposure to a broad range of cloud, Linux, automation, and big data technologies while supporting mission-critical operations in a dynamic, fast-paced environment
Primary Responsibilities
- Administer, monitor, and maintain cloud-based enterprise platforms supporting mission-critical analytics and operational workloads
- Support cloud infrastructure built on Kubernetes, Hadoop, Accumulo, and other open-source technologies
- Ensure operational stability, availability, security, and performance of mission-critical systems
- Provide Tier 1, Tier 2, and Tier 3 operational support, including participation in a rapid on-call rotation supporting customer operations
- Work a mid-shift schedule, including Fridays and Saturdays, to support operational mission requirements
- Troubleshoot and resolve complex Linux operating system, networking, application, and infrastructure issues
- Install, configure, provision, and maintain Red Hat Enterprise Linux (RHEL) and CentOS systems using Kickstart and automated deployment methods
- Perform Linux/UNIX network troubleshooting utilizing tools such as nmap, tcpdump, firewall diagnostics, and network analysis utilities
- Administer DNS services, host files, DHCP, and related Linux networking components
- Monitor, troubleshoot, and optimize Hadoop and Accumulo clusters to ensure system performance and data integrity
- Develop and maintain automation scripts using Bash and Python to improve operational efficiency and system management
- Utilize configuration management tools such as Puppet to automate system provisioning and configuration management
- Collaborate with software engineers, cloud engineers, infrastructure teams, and cybersecurity personnel to resolve system performance, networking, and data flow issues
- Diagnose operational issues and escalate complex infrastructure, networking, storage, or application problems as appropriate
- Maintain technical documentation, standard operating procedures (SOPs), operational runbooks, and system configuration records
- Participate in continuous improvement initiatives to enhance cloud platform reliability, scalability, and operational effectiveness
Required Qualifications
- Must have active Top Secret/SCI clearance with NSA Full Scope Polygraph
- Bachelor's degree in Systems Engineering, Computer Science, Information Systems, Engineering Science, Engineering Management, or a related technical discipline
- An additional two (2) years of relevant work experience may be substituted for the degree requirement
- Minimum of five (5) years of relevant experience supporting Linux-based enterprise environments
- Minimum of five (5) years administering Red Hat Enterprise Linux (RHEL), CentOS, or similar Linux operating systems, including command-line administration, system provisioning, and Kickstart deployments
- Minimum of five (5) years of Linux/UNIX networking experience, including troubleshooting using nmap, tcpdump, firewall technologies, and network diagnostics
- Minimum of two (2) years administering Red Hat DNS services, including DNS configuration, host file management, and DHCP administration
- Proficiency developing automation scripts using Bash and Python
- Experience utilizing configuration management tools such as Puppet
- Experience monitoring, administering, and troubleshooting cloud-based Linux systems, including large Hadoop and Accumulo clusters
- Demonstrated ability to diagnose, troubleshoot, and appropriately escalate network, system, application, and data flow issues
- DoD 8570 IAT Level I certification or higher is required (Security+ preferred)
- Candidates must possess one of the following certifications prior to employment (certification will be verified during the interview or onboarding process):
  - AWS DevOps Engineer - Professional
  - CDP Administrator - Private Cloud Base
  - Certified Kubernetes Administrator (CKA)
Desired Qualifications
- Experience administering Kubernetes, Docker, Hadoop, and distributed data processing platforms
- Experience supporting enterprise cloud environments utilizing AWS or hybrid cloud architectures
- Experience with monitoring and observability tools such as Prometheus and Grafana
- Experience using Jira for incident management, operational tracking, and workflow management
- Experience supporting HDFS and distributed storage environments
- Familiarity with SaltStack, Ansible, or other enterprise automation platforms
- Experience supporting virtualization technologies and OpenStack environments
- Knowledge of cybersecurity best practices, system hardening, vulnerability remediation, and secure systems administration
- Experience supporting Intelligence Community or Department of Defense operational environments
Exempt hourly position. Benefits include 11 paid holidays, a minimum of 3 weeks PTO, a company-sponsored group medical plan, and company-paid dental, vision, life insurance, and STD/LTD plans. Salary is dependent upon the candidate's experience and qualifications.
The pay range for this role is:
150,000 - 205,000 USD per year (NBP)