Job Description
The Impact You Will Have in This Role:
As a Senior Application Support Engineer, you will play a critical role in supporting and improving the reliability of DTCC's risk management applications.
This role goes beyond traditional support: you will apply Site Reliability Engineering (SRE) principles to improve system stability, reduce incidents, and drive proactive improvements across a complex environment spanning mainframe, distributed systems, and cloud platforms.
You will support a large-scale portfolio of over 100 applications and work as part of a globally distributed team, partnering across regions to ensure seamless production operations and high system availability.
You will partner closely with Application Development, Infrastructure, and Operations teams to ensure production stability for mission-critical systems supporting financial risk and settlement processes.
Your Primary Responsibilities:
- Act as a Lead Application Support Engineer with SRE responsibilities, partnering with Application Development, Infrastructure and Operations teams to improve system reliability, resilience and observability.
- Lead the resolution of critical production incidents, providing clear impact analysis, root cause identification and preventative actions.
- Drive incident, problem and major incident management, maintaining ownership through resolution and post-incident review.
- Proactively identify reliability risks and implement improvements to prevent recurrence and reduce operational toil.
- Review and maintain runbooks, knowledge articles and operational documentation to ensure production readiness and consistency.
- Execute change, release and deployment activities, including production code releases and vendor application upgrades.
- Perform and support Disaster Recovery activities, including testing, execution, and audit/BCM evidence collection.
- Identify and implement automation and alert rationalization opportunities to improve operational efficiency and service stability.
- Embed reliability, risk and control considerations into day-to-day operations, escalating issues appropriately.
Qualifications:
- Bachelor's degree preferred or equivalent practical experience
- 6-8 years of experience in Application Support, Production Support, or SRE roles
Talent Needed for Success:
- Strong understanding of application support and SRE principles including reliability engineering, observability and incident prevention.
- Experience working in Linux and Windows environments including process inspection, log analysis and extensive troubleshooting.
- Familiarity with monitoring and observability tools such as Splunk and Grafana and the ability to interpret system behavior and alerts.
- Knowledge of application support, including basic programming skills and log reading and analysis.
- Distributed application troubleshooting and support skills. Strong problem-solving skills with the ability to think creatively.
- Familiarity with relational databases (DB2, Oracle, Snowflake).
- Working knowledge of SQL and ability to execute queries for analysis and troubleshooting.
- Experience with ITSM and operational tooling (e.g., ServiceNow) for incident, problem, and change management.
- Familiarity with job scheduling, containerized platforms and modern application environments (e.g., Autosys, OpenShift).
- Understanding of security fundamentals including certificate and password management.
- Exposure to capital markets and financial industry is required.
- Exposure to messaging, networking or mainframe concepts is a plus.
- Experience in handling issues related to AWS and associated services.
- Exposure to artificial intelligence concepts and their usage in production support.
- Demonstrates clear written and verbal communication, ownership and leadership in fast-paced production environments.
- Comfortable operating with urgency and collaborating with global, distributed teams.
- Proactive mindset with a focus on continuous improvement, automation and operational excellence.
Preferred Skills:
- Exposure to mainframe environments (job monitoring, batch processing, JCL)
- Experience with AWS or cloud-based applications
- Basic scripting or programming experience (Python or similar)
- Understanding of messaging systems, networking, or middleware
- Exposure to AI/automation concepts in production support
The salary range is indicative for roles at the same level within DTCC across all US locations. Actual salary is determined based on the role, location, individual experience, skills, and other considerations.