Liquid Robotics is looking for a talented, multidisciplinary DevOps Engineer to support our Cloud & Infrastructure Engineering activities. You will work with the DevOps and IT teams on the architecture, system administration, security, automation and support of our mission-critical, customer-facing systems and networks.
- Supporting and monitoring a 24/7, high-availability, mission-critical service in a fast-paced, startup-like environment
- Participating in the on-call rotation
- Promptly responding to alerts and other issues raised by customers or internal business units and working with internal teams to solve production issues
- Working on day-to-day tasks like managing user access and supporting internal business units
- Troubleshooting and fixing infrastructure issues from the hardware layer to the application layer in a hybrid cloud environment (AWS and data centers)
- Tuning monitoring and alerting systems to avoid false positives, and updating alert-related settings
- Maintain, scale and help automate the management of LRI's hybrid cloud-based Wave Glider Management System
- Deploy software releases; perform routine system updates and patching
- Build resiliency and redundancy to deliver highly available services despite machine and data center failures and partitioned networks
- Collaborate with IT and others on security and compliance
- Support VPN services
- Build and maintain emergency response processes and mechanisms
- A bachelor's degree in a related field of study or equivalent years of experience
- Strong Windows and Linux administration skills
- Experience with virtualization (VMware, Red Hat), containers (Docker) and on-prem / public cloud (AWS) technologies
- Working knowledge of the Linux shell, system internals, networking, Java applications, databases (MySQL, MSSQL, MongoDB) and VPNs
- Experience administering and managing dev tools such as Git, Gerrit, SVN and Jenkins
- Working knowledge of Active Directory / LDAP and Identity Management Systems
- Experience administering web servers (including Nginx and IIS)
- Love for debugging: troubleshooting and debugging monitoring alerts and connectivity issues
- Ability to read and understand server and system logs and produce meaningful issue analyses
- Good analytical and troubleshooting skills, and intuition about probable root causes
- Ability to obtain a security clearance, for which the US government requires US citizenship
- Familiarity with SIEM, Python, Apache, and monitoring/alerting tools such as Nagios and OP5
- At least 5 years of experience