Tata Consultancy Services

DevOps & Site Reliability Lead

Tata Consultancy Services, $120K - $160K
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of hands-on experience with Talend, including development and troubleshooting
  • Strong understanding of Big Data technologies, specifically Hadoop and Apache Spark
  • Experience in Major Incident Management (MIM) in a 24x7 on-call environment
  • Proven ability to work directly with customers and cross-functional teams
  • Strong skills in coordinating and managing offshore teams
  • Knowledge of ITIL processes related to incident, problem, and change management
  • Excellent communication and documentation skills

Responsibilities

  • Act as an SRE for Big Data and ETL platforms, ensuring high availability and reliability
  • Provide operational support and incident management, including triage and resolution of production issues
  • Serve as the primary contact for customers, providing updates and operational insights
  • Collaborate with application teams to support ETL jobs and data processing workflows
  • Coordinate with offshore teams for operations and continuous improvement
  • Monitor and troubleshoot the performance of Talend, Hadoop, and Spark ecosystems
  • Implement and support automation to enhance platform stability
  • Participate in problem management and post-incident reviews to improve processes

Benefits

  • Discretionary Annual Incentive
  • Comprehensive Medical, Dental, and Vision Coverage
  • Maternal and Parental Leaves
  • Insurance options including Auto, Home, and Identity Theft Protection
  • Commuter benefits and training reimbursement
  • Paid vacation, sick leave, and holidays
  • Legal and financial assistance including 401K and student loan refinancing

Full Job Description

Must Have Technical/Functional Skills

  • We are seeking a Site Reliability Engineer (SRE) with strong expertise in Talend and Big Data platforms to support and operate large-scale data processing environments.
  • The role requires close collaboration with customers, application teams, and offshore delivery teams to ensure platform reliability, incident management, and operational excellence. Experience with Databricks is a strong plus.


Key Responsibilities

  • Act as an SRE for Big Data and ETL platforms, ensuring high availability, performance, and reliability of data pipelines and applications.
  • Provide operational support and incident management (MIM), including triage, root cause analysis, and resolution of production issues.
  • Serve as a primary point of contact for customers, providing timely updates, issue resolution, and operational insights.
  • Collaborate closely with application teams to support ETL jobs, data processing workflows, and platform enhancements.
  • Coordinate with offshore teams for day-to-day operations, incident resolution, and continuous improvement initiatives.
  • Monitor, troubleshoot, and optimize Talend, Hadoop, Spark, and Big Data ecosystems.
  • Implement and support monitoring, alerting, runbooks, and automation to improve platform stability and reduce manual effort.
  • Participate in problem management, change management, and post-incident reviews to drive preventive measures.
  • Support capacity planning, performance tuning, and reliability improvements across the data landscape.


Required Skills & Qualifications

  • Strong hands-on experience with Talend (development, support, and troubleshooting).
  • Solid understanding of Big Data technologies, including:
      o Hadoop ecosystem
      o Apache Spark
  • Proven experience handling Major Incident Management (MIM) and production support in a 24x7 or on-call environment.
  • Experience working directly with customers, business stakeholders, and cross-functional teams.
  • Strong coordination skills to manage and guide offshore teams.
  • Knowledge of ITIL processes, especially Incident, Problem, and Change Management.
  • Excellent communication, documentation, and stakeholder management skills.


TCS Employee Benefits Summary:

  • Discretionary Annual Incentive.
  • Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.
  • Family Support: Maternal & Parental Leaves.
  • Insurance Options: Auto & Home Insurance, Identity Theft Protection.
  • Convenience & Professional Growth: Commuter Benefits, Certification & Training Reimbursement.
  • Time Off: Vacation, Time Off, Sick Leave & Holidays.
  • Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing.


Salary Range: $120,000-$160,000 a year

About Tata Consultancy Services

Tata Consultancy Services (TCS) is an Indian multinational information technology (IT) services and consulting company, headquartered in Mumbai, Maharashtra, India. It is a subsidiary of Tata Group and operates in 149 locations across 46 countries. TCS is the largest Indian company by market capitalization and is ranked 11th on the Forbes Global 2000 list of the world's biggest public companies. TCS is also the second-largest IT services company in the world by revenue and the largest employer of women in India. The company provides services in areas including IT, consulting, and business solutions.
Size: 469,261 employees