Tata Consultancy Services

AWS Data Engineer

$125K to $140K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Hold at least one AWS certification (e.g., AWS Certified Solutions Architect Associate, AWS Certified Data Analytics Specialty, AWS Certified Developer Associate)
  • Hands-on experience with AWS services for data processing/storage (S3, EC2, EMR, Glue, Athena, Lambda)
  • Strong proficiency in PySpark for big data processing and analytics
  • Practical experience with Apache Iceberg for data lakes
  • In-depth knowledge of Apache Hive for data warehousing

Responsibilities

  • Design, develop, and deploy scalable data solutions on AWS
  • Build and maintain ETL/ELT data pipelines using PySpark
  • Develop and optimize big data processing jobs on AWS EMR or Glue
  • Design and manage data warehousing solutions with schema design and query optimization
  • Implement secure cloud infrastructure components including VPCs and security groups
  • Design and manage containerized data processing applications on EKS
  • Optimize AWS resources and big data applications for performance and efficiency

Benefits

  • Flexible work arrangements
  • Opportunities for professional development and certifications
  • Collaboration with a skilled and diverse team
  • Access to cutting-edge technology and tools
  • Comprehensive health benefits

Full Job Description
Roles & Responsibilities

Job Title: AWS Certified Data Engineer

Job Description:

We are seeking a highly skilled and motivated AWS Certified Data Engineer to design, build, and optimize scalable data solutions within the Amazon Web Services (AWS) ecosystem. The ideal candidate will have strong expertise in big data processing using PySpark and a deep understanding of data warehousing concepts, including Hive and modern table formats like Iceberg. This role involves developing, deploying, and managing robust, efficient, and secure data pipelines and analytics solutions on AWS, leveraging core networking and compute services.

Responsibilities:
• AWS Solution Design & Implementation: Design, develop, and deploy scalable and cost-effective data solutions on AWS, leveraging services such as S3 (for data lakes), EC2, EMR, Glue, Athena, Lambda, Redshift, and Kinesis.
• Data Pipeline Development: Build and maintain robust ETL/ELT data pipelines using PySpark for data ingestion, transformation, and loading into various data stores, including those utilizing open table formats like Iceberg.
• Big Data Processing: Develop and optimize big data processing jobs using PySpark on AWS EMR or AWS Glue, handling large datasets efficiently and integrating with Iceberg table formats.
• Data Warehousing: Design, implement, and manage data warehousing solutions, including schema design, data modeling, and query optimization, with a focus on Hive and modern data lake table formats like Iceberg for historical data and analytical queries.
• Cloud Infrastructure & Networking: Implement secure and robust cloud infrastructure components, including VPCs, subnets, routing, and security groups, to ensure proper connectivity and isolation for data solutions.
• Containerized Workloads: Design, deploy, and manage containerized data processing applications on Amazon Elastic Kubernetes Service (EKS).
• Performance Tuning & Optimization: Optimize AWS resources and big data applications (Spark, Hive, Iceberg) for performance, cost, and efficiency.
• Data Governance & Security: Implement best practices for data security, access control, and compliance within AWS, including IAM policies, S3 bucket policies, and encryption.
• Monitoring & Troubleshooting: Set up monitoring, alerting, and logging for data pipelines and AWS infrastructure; troubleshoot and resolve issues promptly.
• Automation: Develop and maintain automation scripts using Python and shell scripting for infrastructure provisioning, deployment, and operational tasks.
• Collaboration: Work closely with data scientists, analysts, and other engineering teams to understand data requirements and deliver reliable data solutions.

Qualifications:
• AWS Certification: Hold at least one AWS certification (e.g., AWS Certified Solutions Architect Associate, AWS Certified Data Analytics Specialty, AWS Certified Developer Associate).
• AWS Services Expertise: Hands-on experience with key AWS services for data processing and storage, including:
  • Storage & Compute: S3 (for data lakes), EC2
  • Data Processing: EMR, Glue, Athena, Lambda
  • Networking: VPC, Subnets, Routing, Security Groups
  • Containerization: EKS
• Big Data Processing: Strong proficiency in PySpark for developing complex data transformations and analytics.
• Data Lake Table Formats: Practical experience with Apache Iceberg for managing and querying data lakes.
• Data Warehousing: In-depth knowledge and practical experience with Apache Hive for data storage, querying, and schema management.
• Programming Languages:
  • Python: Expert-level proficiency in Python for scripting, data manipulation, and AWS automation (Boto3).
  • Shell Scripting: Proficient in shell scripting for automation and operational tasks.
• Database & SQL: Strong SQL skills for data querying and manipulation.
• Data Concepts: Solid understanding of ETL/ELT processes, data modeling, distributed computing, and data governance.

Good to Have Skills
• Container Orchestration: Experience with Kubernetes for deploying and managing containerized applications.
• CI/CD: Experience with CI/CD tools and practices (e.g., AWS CodePipeline, GitHub Actions, GitLab CI) for automating deployment of data solutions.
• Orchestration: Experience with workflow orchestration tools like Apache Airflow.
• Version Control: Proficient in using Git for source code management.
• Other Big Data Technologies: Exposure to other big data technologies like Apache Kafka, Flink, or Presto.

Certifications
• AWS Certified Solutions Architect Associate/Professional
• AWS Certified Data Analytics Specialty
• AWS Certified Developer Associate

Salary Range: $125,000 to $140,000 per year

About Tata Consultancy Services

Tata Consultancy Services (TCS) is an Indian multinational information technology (IT) services and consulting company headquartered in Mumbai, Maharashtra, India. It is a subsidiary of the Tata Group and operates in 149 locations across 46 countries. TCS is the largest Indian company by market capitalization and is ranked 11th on the Forbes Global 2000 list of the world's biggest public companies. TCS is also the second-largest IT services company in the world by revenue and the largest employer of women in India. The company provides services in areas including IT, consulting, and business solutions.
Size: 469,261 employees
