3 to 7 years of experience in a production support role
Hands-on experience with OpenShift, Ansible, and automation scripting (Python, Shell, PowerShell)
Proficiency in monitoring tools like Grafana, AppDynamics, ITRS, and Splunk
Strong knowledge of databases, including SQL and query optimization
Experience with high-availability systems in 24x7 production environments
Familiarity with incident and problem management frameworks
Strong documentation and communication skills
Ability to work collaboratively in fast-paced environments
Responsibilities
Provide L2/L3 production support for critical applications, perform root cause analysis, and implement long-term solutions
Monitor application performance and proactively resolve issues using monitoring tools
Develop and enhance automation scripts with Ansible and various scripting languages
Support Business Continuity Planning and ensure system resiliency
Collaborate with teams to implement process improvements
Maintain detailed documentation of systems and incident resolutions
Manage deployments on OpenShift to ensure stability and performance
Work with integration services for seamless system operations
Benefits
Hybrid work arrangement
Opportunities for professional growth and skill enhancement
Collaborative work environment with cross-functional teams
Engagement in high-impact projects that ensure system reliability
Access to advanced monitoring and automation tools
Full Job Description
Overview:
Production Support Engineer
Charlotte, NC or Chandler
Hybrid
Job Description:
Our client is seeking a skilled Production Support Engineer to join its technology operations team.
The ideal candidate will bring hands-on experience in production support, automation, monitoring, and incident management, with a strong focus on ensuring system reliability and operational excellence.
This role requires strong problem-solving abilities, technical depth, and clear documentation skills to support high-availability, business-critical applications.
Key Responsibilities:
Provide L2/L3 production support for critical applications, performing root cause analysis (RCA) and implementing long-term solutions.
Monitor application and infrastructure performance using tools such as Grafana, AppDynamics, Splunk, and ITRS, and proactively resolve issues.
Develop, maintain, and enhance automation scripts and services using Ansible, Python, Shell, PowerShell, and database queries to streamline support workflows.
Support Business Continuity Planning (BCP) and disaster recovery strategies to ensure high availability and resiliency of systems.
Collaborate with development, DevOps, and infrastructure teams to implement process improvements and enhance operational maturity.
Maintain detailed documentation of processes, system designs, and incident resolutions, and communicate technical solutions effectively to stakeholders.
Manage deployments on OpenShift and containerized platforms, ensuring stability, performance, and scalability.
Work with integration services and wires to ensure seamless system operations and connectivity.
Required Skills and Experience:
3 to 7 years of experience in a production support role.
Hands-on experience with OpenShift, Ansible, and automation scripting (Python, Shell, PowerShell, etc.).
Proficiency in monitoring and observability tools such as Grafana, AppDynamics, ITRS, and Splunk.
Strong knowledge of databases (SQL, queries, optimization).
Experience supporting large-scale, high-availability systems in a 24x7 production environment.
Familiarity with incident management, problem management, and BCP/DR frameworks.
Strong documentation, articulation, and communication skills.
Ability to work collaboratively in cross-functional teams in fast-paced environments.
Preferred Qualifications:
Previous work in financial services technology or production engineering roles.
Experience with global automation functions.
Familiarity with payment systems (e.g., Fedwire, CHIPS, and RTP) or liquidity platforms is a plus.