Full Job Description
Must Have Technical/Functional Skills
Education: Bachelor's degree in Computer Science, Information Technology, or a closely related field.
Technical Skills: Proficiency in Linux/Unix, SQL, and programming languages like Java or Python. Experience with monitoring tools (e.g., Dynatrace, Datadog) and ticketing systems (e.g., JIRA, ServiceNow) is highly valued.
Problem-Solving: Strong analytical abilities to diagnose complex issues under pressure in fast-paced production environments.
Communication: Excellent verbal and written communication skills to update stakeholders, business partners, and clients on system status and incident resolutions.
Roles & Responsibilities
Incident Management & Troubleshooting: Act as the first line of defense, monitoring live systems and responding to alerts, bugs, or system crashes.
Root Cause Analysis (RCA): Investigate incidents to identify their root cause and implement preventive measures so the same problems do not recur.
Cross-Functional Collaboration: Partner with software development, QA, and infrastructure teams to deploy code fixes, patches, and major releases during maintenance windows.
System Monitoring & Optimization: Build and manage application performance monitoring solutions to track and maintain system health.
Documentation: Maintain runbooks, incident records, and standard operating procedures (SOPs) for rapid issue resolution.
Location: On-site; must work from the office 5 days a week.
Salary Range: $90,000 to $105,000 per year