Requisition Number 54219
Assurant Labs is pioneering new architectures and deployment strategies to scale solutions operating on petabytes of data and building fully automated environments.
We are looking for a Site Reliability Engineer to help us maintain and expand our cloud infrastructure, currently serving millions of users. Candidates must have exceptional communication skills, the ability to manage multiple tasks efficiently, excellent judgment, and the ability to be productive and organized in a fast-paced, team-oriented environment.
We have a completely cloud-based infrastructure with services that need to be up 24/7. More than 30 million monthly active users access our APIs, and that number is growing. We leverage Amazon Kinesis to process over 1 trillion mobile diagnostic metrics per day, over 2 petabytes in total. We are constantly evaluating new technology to work smarter.
You will be responsible for:
- Collaborating with software engineering to design robust, scalable server infrastructure
- Monitoring the performance and uptime of applications and underlying systems
- Managing and scaling infrastructure as the company grows and evolves
- Discovering and implementing new ways to improve operational engineering practices and procedures
We need you to have:
- Experience building and managing enterprise-scale applications and infrastructure
- At least one year of hands-on AWS experience including:
  - Creating and managing VPC topologies
  - Managing and scaling cloud-native applications built with EC2, ELB, and RDS
  - Managing S3 buckets, objects, and policies
  - Monitoring with CloudWatch events, logs, and alerts
  - Securing access using IAM policies and roles, STS tokens, and KMS
- Experience running cloud-native applications in production including familiarity with:
  - Service discovery
  - Centralized configuration management
  - Secrets management
  - Cost optimization and right-sizing
- Experience working with specific configuration and infrastructure management tools such as Chef, Ansible, Terraform, etc.
- Experience creating and maintaining continuous integration / continuous deployment pipelines using tools such as Jenkins, TeamCity, TravisCI, GitLab, etc.
- Extensive knowledge of Linux systems administration and architecture
- Experience with scripting languages (e.g., Ruby, Python, Bash)
- Experience with version control systems (e.g., Git, Mercurial)
- Passion for systems automation, reliability, and high scalability
- Exceptional collaborative, written, and verbal communication skills
- Ability to organize, manage, and prioritize many tasks at a time
We’d like you to have:
- Experience deploying and maintaining serverless applications on AWS including:
  - Deploying, scaling, monitoring, maintaining, and optimizing applications built with AWS Lambda, API Gateway, and DynamoDB
  - A basic understanding of application development using Node.js
  - Familiarity with serverless application frameworks (e.g., Serverless, Apex)
- Experience working with containers in production including:
  - Migrating existing applications to containerized workflows
  - Creating reliable, zero-downtime container deployment strategies
  - Automating deployment of containerized applications via CI/CD pipelines
  - Artifact management and access control using DockerHub, ECR, Quay, etc.
  - Container scheduling (e.g., Kubernetes, ECS, Swarm)
  - Overlay networking and load balancing (e.g., Weave, Calico, Flannel)