Role Overview:
This role is for a Data Center/Cloud & Automation Subject Matter Expert (SME) focused on managing and supporting Linux servers, designing and maintaining AWS cloud infrastructure, and implementing robust automation with Infrastructure as Code (IaC) tools such as Terraform and Ansible.
Key Responsibilities:
- Manage and support Linux servers, including installation, patching, user management, performance monitoring, and troubleshooting.
- Design, provision, and maintain AWS infrastructure (EC2, Certificate Manager, Security Groups, VPC, IAM, S3, Load Balancers), ensuring security, availability, and cost efficiency.
- Implement Infrastructure as Code using Terraform to build, update, and manage AWS resources consistently.
- Automate system configuration, deployments, and patching using Ansible playbooks and roles.
- Manage code version control through GitHub.
- Monitor systems and cloud resources, respond to incidents, and perform root cause analysis.
- Apply security best practices, access controls, and compliance requirements across OS and cloud platforms.
- Collaborate with application, network, and security teams to support deployments and changes.
- Maintain documentation and SOPs, and continuously improve automation and operational efficiency.
Required Skills:
- Strong hands-on experience with Linux administration (RHEL / Amazon Linux / Ubuntu), including user management, patching, troubleshooting, shell scripting, and performance monitoring.
- Practical knowledge of core AWS services such as EC2, VPC, Subnets, Certificate Manager, IAM, S3, Load Balancers, Auto Scaling, and CloudWatch, with security and cost optimization awareness.
- Experience in writing and managing Terraform modules, variables, and state files to provision and maintain AWS infrastructure.
- Ability to create and manage Ansible playbooks and roles for OS configuration, automation, and deployment tasks.
- Working knowledge of Git for code versioning and collaboration.
- Understanding of networking fundamentals.
- Proficiency in Bash scripting and automation of repetitive operational tasks.
- Experience in system and cloud monitoring, incident handling, and root cause analysis.
- Understanding of access control, encryption, patch management, and secure configuration practices.
Qualifications:
Preferred Skills:
- Windows Administration, Citrix (StoreFront, Delivery Controllers, NetScaler, PAM, XenApp Administration).
- Experience with Nutanix and VMware.
- Exposure to CI/CD tools such as Jenkins, GitHub Actions, and GitLab CI.
- Basic knowledge of Docker and Kubernetes.
- Familiarity with advanced AWS services such as RDS, Lambda, DynamoDB, Route 53, or EKS.
- Experience with monitoring and logging tools such as Prometheus, Grafana, the ELK stack, or CloudTrail.
- Experience with Ansible AWX / Tower for centralized automation.
- Exposure to advanced Terraform usage, including multi-account setups, workspaces, and complex module design.
- Knowledge of vulnerability scanning, security audits, and compliance requirements.
- ITIL / ITSM awareness.
- Basic knowledge of Python for automation tasks.
- Ability to create SOPs, runbooks, and technical documentation.