Full Job Description
We are seeking a Senior Data Engineer to join our team. This individual will own 3-5 data domains end-to-end, operate hundreds of pipelines, and drive architectural improvements that impact how AWS leadership makes decisions. You will partner with service teams across AWS to design data contracts, build ingestion flows, and deliver analytical models that serve the entire organization.
The ideal candidate is a technical leader who thrives in ambiguity, takes a long-term architectural view, and consistently delivers exemplary solutions. You are an expert in SQL, ETL, and data processing, with experience leveraging cloud-based data services such as AWS EMR, Glue, Redshift, and Lambda. The candidate should have hands-on experience with AI/ML technologies, including LLMs, and a strong understanding of designing and building Agentic Frameworks - including autonomous agents, multi-agent orchestration, and tool integration. You are comfortable in a fast-paced environment, able to think big while paying careful attention to detail, and passionate about building data platforms using AI to accelerate the next generation of analytics at AWS scale.
Key job responsibilities
Identify limitations and opportunities in data processing tools, drive improvements and innovation, define data processing guidelines, and ensure best practices in all pipelines designed and reviewed. For example: redesigning ingestion frameworks to handle new AWS service telemetry data, or building reusable transformation patterns adopted across multiple teams.
Define and own data architecture at the team level - ensuring architecture effectively matches business problems and data challenges with security, scalability, and cost effectiveness. Show good judgment making technical trade-offs between short-term technology needs and long-term business needs.
Produce exemplary code - solutions that are easily usable by customers, inventive, secure, easily maintainable, appropriately scalable, and extensible. Build solutions that are easy for others to contribute to. Work to simplify, optimize, and remove bottlenecks.
Define and own infrastructure architecture at the team level. Anticipate data management and access patterns, evolve the technology stack to remove bottlenecks, and deliver systems that are secure, scalable, and long lasting. Define team-level guidelines and best practices for infrastructure management and automation.
Solve complex ambiguous problems - for example, designing cross-domain data models that unify billing, usage, and service telemetry data, or combining multiple datasets to solve problems that couldn't be solved before. Spot areas that might lead to customer confusion, data misinterpretation, or gaps in data contracts.
Effectively split project work into parallel tasks that you and others can perform independently and then reassemble successfully. Drive projects with dependencies on peers or other teams to completion.
Influence related teams' data architecture and software design. Provide technical assessments for promotions. Actively mentor and develop others. Build consensus when confronted with discordant views.
Drive data engineering best practices - Data Discovery, Naming Conventions, Operational Excellence, Data Security. Ensure the team's data is auditable, available, and accessible.
Proactively fix data architecture deficiencies and propose larger projects which may require the work of other teams. Drive improvements through code review, design discussions, team planning, and operational reviews.
Participate in on-call rotation and own operational health of data systems - establish monitoring, alarming, runbooks, and SLA tracking. Drive continuous improvement in reliability and incident response.
BASIC QUALIFICATIONS
- 7+ years of data engineering experience
- Experience with data modeling, warehousing and building ETL pipelines
- Experience with SQL
- Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
- Experience mentoring team members on best practices
- Experience with MPP databases such as Amazon Redshift
- Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
PREFERRED QUALIFICATIONS
- Experience with big data technologies such as: Hadoop, Hive, Spark, EMR
- Experience operating large data warehouses
- Experience providing technical leadership and mentoring other engineers for best practices on data engineering
- Bachelor's degree in computer science, engineering, analytics, mathematics, statistics, IT or equivalent
- Knowledge of distributed systems as it pertains to data storage and computing
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Seattle - 154,600.00 - 209,100.00 USD annually