Software Integration Engineer (All Levels)
Site Reliability Engineering (SRE) Team
Locations: Maryland and Utah
Clearance Required: Active TS/SCI with Polygraph
Citizenship: U.S. Citizenship Required
Position Overview
We are seeking Software Integration Engineers to support a mission-focused Site Reliability Engineering (SRE) team. These funded positions support critical mission operations infrastructure and customer-facing environments through automation, monitoring, resiliency engineering, and troubleshooting.
This role offers the opportunity to work across modern Linux environments while supporting enterprise-scale monitoring, automation, and operational tooling.
Responsibilities
- Support mission operations infrastructure tooling, including automation, alerting, monitoring, and resiliency initiatives.
- Configure, troubleshoot, and maintain customer programming use cases.
- Perform Linux system administration activities across multiple distributions.
- Develop and maintain automation solutions using Salt and Ansible.
- Create and support monitoring solutions utilizing Nagios, Thruk, Prometheus, Splunk, and Grafana.
- Troubleshoot system performance and operational issues in mission-critical environments.
- Support software integration efforts and infrastructure deployments.
- Manage and maintain code repositories and development workflows.
- Collaborate with engineering and operations teams to improve system reliability and efficiency.
Note: Candidates should initially expect to spend up to 50% of their day performing Linux system administration tasks to gain familiarity with the operational environment and supported systems.
Required Qualifications
- Active TS/SCI with Polygraph security clearance.
- U.S. Citizenship.
- Experience administering Linux operating systems, including:
  - Red Hat Enterprise Linux (RHEL)
  - CentOS
  - Rocky Linux
  - SUSE Linux Enterprise Server (SLES)
  - Ubuntu
- Experience with one or more of the following programming/scripting languages:
- Experience with Linux system administration, troubleshooting, and operational support.
- Ability to work effectively in a fast-paced mission environment.
Desired Qualifications
Experience with any of the following technologies is highly desired:
- Thruk dashboards
- Nagios monitoring and plugin development
- Splunk dashboard integration and management
- Prometheus
- Grafana
- Salt
- Ansible
- GitBucket
- Jira
- Confluence
- Slurm
Benefits
OST has been providing mission-critical support to Government agencies since 1996 and offers a comprehensive benefits package including:
- 3 Weeks Paid Time Off
- 11 Federal Holidays
- Medical and Dental Coverage
- Short-Term Disability (STD)
- Long-Term Disability (LTD)
- Life Insurance
- Accidental Death & Dismemberment (AD&D) Coverage
- 401(k) with up to 4% Company Match