Role Overview:
This role is for an Application Monitoring Engineer on the Mortgage Data Services (MDS) Application Monitoring Team, focusing on APIs, services, and user-facing workflows. The primary responsibility is monitoring and triaging customer- and partner-facing services, including APIs, microservices, portals, and document workflows, particularly when issues affect real-time business operations.
Key Responsibilities:
- Monitor and triage customer- and partner-facing services, including APIs, microservices, portals, and document workflows.
- Rapidly triage customer-visible incidents to minimize business impact.
- Verify service recovery following restarts or rollbacks.
- Coordinate escalation of issues to appropriate application and platform teams.
- Communicate clear status updates during high-severity incidents.
- Perform L1 tasks such as service availability monitoring, API/UI alert triage, and ticket creation/escalation.
- Execute L1.5 tasks including deeper log analysis, error pattern identification, authorized service restarts/retries, and partner impact analysis/status confirmation.
Required Skills:
- Proficiency in application and service monitoring, including REST API monitoring (availability, latency, error rates), web application and UI health monitoring, microservices and service-to-service dependency awareness, and certificate, endpoint, and authentication health checks.
- Strong understanding of Platform & Service concepts, such as event-driven and API-based architectures, document ingestion and processing flows, upstream/downstream service dependencies, and environment awareness (Prod vs non-Prod behavior).
- Expertise in monitoring and tooling, including synthetic and real-user monitoring concepts, log aggregation and application error diagnostics, user-experience-based alerting, and integration with incident and on-call escalation tools.
- Solid Operational & Process Skills for rapid incident triage, service recovery verification, coordinated escalation, and clear status communication.
- Awareness of security and compliance, including API security controls (auth failures, rate limits), monitoring for abnormal traffic or error spikes, understanding of data sensitivity and access restrictions, and adherence to least-privilege access and security runbooks.
- Experience with Agile ways of working.
- Knowledge of Cyber Security - GRC - Data Security principles.
Qualifications:
- 10+ years of experience in application monitoring or a related field.