Role: Data Engineer + Gen AI
Location: NYC (Hybrid)
Duration: 12+ Months
Interview Mode: In-Person Interview
Tech Stack: Data Engineering with PySpark, a data pipeline orchestration tool such as Airflow, Glue, or similar, AI / Agentic AI, FastAPI, Flask, Google Cloud Platform, Python, SQL, Docker and Kubernetes
Responsibilities:
As a Data Engineer, you'll be responsible for designing and building high-performance, scalable data platforms
You will lead a team of enthusiastic, skilled engineers to drive product development and adoption
You will collaborate effectively with product teams from the business group to understand the product roadmap and vision and translate them into engineering artefacts
You will work with a variety of teams and individuals, including platform engineers, use-case owners, and analytical users, to understand their needs and come up with innovative solutions
You will follow the Amex way of building engineering products, which leads to engineering excellence through the adoption of DevOps principles
Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field. A Master's degree would be a plus
Great to have: Google Cloud Platform professional certification (Data Engineer or Cloud Architect preferred)
Experience with Google Cloud Platform services (BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Composer, etc.).
6+ years of software development experience with hands-on expertise in coding in Java/Python/Scala.
3+ years of experience creating low-code/no-code ETL tools for setting up large-scale data transformations on Google Cloud Platform
Strong SQL and RDBMS skills. Expert in writing complex SQL queries for different databases such as Hive, MySQL, and Postgres. Proficiency in working with NoSQL databases as well
Experience working with Spark, Big Data, and Hive
Experience in Git management, including PR reviews and maintaining code hygiene
In-depth understanding of data warehousing concepts, dimensional modelling, and data integration techniques
Experience in optimizing high-volume data processing jobs.
3+ years of experience writing APIs
Knowledge of high availability and disaster recovery (DR) setup.
Hands-on experience with CI/CD pipelines, automated test frameworks, DevOps, and source code management will be a big plus (XLR, Jenkins, Git, Stash, Jira, Confluence, Splunk, etc.)
Experience working in an Agile/SAFe framework for development
Understanding of Generative AI concepts, including LLMs (Large Language Models), prompt engineering, embeddings, and vector databases.
Experience integrating GenAI capabilities into data platforms (e.g., semantic search, AI-driven analytics, automated data insights).
Excellent communication and analytical skills
Excellent team player with the ability to work with a global team