Job Description
Job Title: Sr. Observability Engineer
Locations: [Waltham, MA - Hybrid]
About the Role
We are seeking a skilled Monitoring & Alerting Engineer to design, build, and enhance tools and applications that power SS&C Intralinks' monitoring and alerting ecosystem. In this role, you will develop clean, secure, and efficient code; create integrations across monitoring platforms; and drive automation that improves system reliability and operational efficiency.
How You Will Make an Impact
- Develop and enhance tools, applications, scripts, plugins, and integrations for monitoring and alerting platforms.
- Configure and optimize monitoring for servers, databases, networks, cloud services, infrastructure, and applications.
- Automate repetitive setup, reporting, and maintenance tasks using automation tools and scripting languages such as Python or Bash.
- Design and implement complex, reusable components and integrations based on broad requirements.
- Research and evaluate new monitoring technologies; present recommendations to management.
- Conduct code reviews and perform code analysis to ensure quality and security.
- Collaborate with product owners, SMEs, and engineering teams to deliver high-quality monitoring solutions.
- Maintain documentation for monitoring architecture, standards, and processes.
- Provide regular reports on system health and capacity planning.
The Monitoring, Analysis & Integration (MAI) team plays a critical role in ensuring the reliability, performance, and resilience of the organization's IT infrastructure.
As a team, we help improve the reliability, performance, and efficiency of IT infrastructure and applications. This role elevates the organization's entire monitoring and incident-response capability by delivering scalable, automated, and intelligent observability solutions.
- Designs and implements advanced monitoring frameworks that reduce blind spots across infrastructure, applications, and services.
- Enables earlier issue detection, reducing mean time to detect (MTTD) and mean time to resolve (MTTR).
- Prevents outages through predictive and trend-based monitoring.
Required Experience
- 5+ years of experience with modern monitoring and observability tools (e.g., Zabbix, CloudWatch, Dynatrace, Datadog, Splunk, ServiceNow).
- 3+ years of experience with a programming language such as Python, Java, C#, or C/C++.
- 1+ years of Linux experience (Bash, cron jobs, scripting, SSH, etc.).
- Experience working with REST APIs/JSON and integrating third-party services.
- Experience with automation tools and methodologies.
- Understanding of multithreading, distributed systems, and concurrency.
- 3+ years of experience with major cloud providers (AWS, Azure, GCP).
- Knowledge of GitHub or other source control platforms.
- Security-minded approach to development and operations.
- Experience in regulated industries.
- Communication with engineering and operations teams.
- Collaboration in incident management.
- Ability to influence architectural design for monitoring.
What Sets You Apart (preferred qualifications)
- Bachelor's or Master's degree in Computer Science, Software Engineering, Mathematics, or a related field.
- ITIL knowledge or certification.
- TOGAF experience or certification.
Join SS&C, where innovation meets global opportunities.
SS&C Technologies offers a comprehensive total rewards package designed to support your wellbeing, growth, and future. Our benefits include medical, dental, and vision coverage; a 401(k) plan with company match; paid time off, holidays, and parental leave; and professional development reimbursement opportunities.
Applications will be accepted on an ongoing basis until the position is filled.