No Relocation Assistance Offered
Job Number #173747 - Piscataway, New Jersey, United States
At Colgate-Palmolive, Data Engineers focus on expanding and optimizing our data, data pipeline architecture, data flow, and collection for multi-functional teams. The Modern Data Engineer will be an experienced pipeline builder and data wrangler who enjoys optimizing data systems from the ground up to power the next generation of predictive and generative AI applications.
You will support our software developers, data analysts, and data scientists on data initiatives, ensuring that an efficient data delivery architecture stays consistent across ongoing projects. You must be self-directed and comfortable supporting the dynamic data needs of multiple teams. You will be excited by the prospect of re-designing our company's data architecture to support our next generation of products, specifically by building robust infrastructure for Large Language Models (LLMs) and preparing high-quality, "AI-ready" datasets. This team is a group of innovators and technologists who love to learn and collaborate!
Work visa sponsorship is not available for this position.
What you'll do
- Pipeline Architecture: Build and maintain optimal data pipeline architecture that supports both traditional analytics and advanced AI/ML workloads.
- AI-Ready Datasets: Assemble large, sophisticated data sets that meet functional and non-functional business requirements, ensuring data is cleaned, structured, and enriched with metadata for optimal use in model training and fine-tuning.
- RAG & Generative AI Infrastructure: Design and engineer data pipelines to support Retrieval-Augmented Generation (RAG) systems, including text chunking, embedding generation, and vector database synchronization.
- ETL/ELT Excellence: Build the infrastructure required for efficient extraction, transformation, and loading of data from a wide variety of data sources.
- Process Optimization: Identify, design, and implement internal process improvements, including automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.
- Cross-Functional Tooling: Build data tools that help analytics and data science team members build and optimize our product into an innovative industry leader.
- Data Harmonization: Successfully harmonize, process, and extract value from large, disconnected datasets to uncover actionable insights.
Required Qualifications
- Education: Bachelor's degree required; Graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field is a plus.
- Experience: 2+ years of experience in a Data Engineer role.
- SQL Proficiency: Strong, hands-on SQL experience, including complex query authoring, performance tuning, and deep familiarity working with a variety of relational databases.
- Data Pipelines: Proven track record of designing, building, and maintaining robust, scalable data pipelines, ETL/ELT workflows, and overarching data architectures.
- Analytical Skills: Strong analytic skills related to working with both structured and unstructured datasets, with a proven ability to process text, image, or document data for AI ingestion.
- Systems Management: Familiarity with building processes supporting data transformation, data structures, metadata, dependency, and workload management.
Preferred Qualifications
- Experience with Cloud services such as GCP or AWS.
- Experience with modern relational (SQL) databases, NoSQL databases, and Cloud Data Warehouses (e.g., Snowflake, PostgreSQL).
- Familiarity with AI/LLM orchestration frameworks (e.g., LangChain, LlamaIndex) and Vector Databases (e.g., Pinecone, Milvus, Weaviate, pgvector).
- Experience with Data Flow, Data Pipeline, and workflow management tools such as Cloud Composer or Airflow.
- Experience with Visualization tools (Sigma, DOMO, etc.).
- Experience supporting and working effectively with multi-functional teams in a dynamic environment.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and drive improvements.
Compensation and Benefits
Salary Range $76,800.00 - $121,000.00 USD
Pay is determined based on experience, qualifications, and location. Salaried employees may also be eligible for discretionary bonuses, profit-sharing, and long-term incentives for Executive-level roles.
Benefits: Salaried employees enjoy a comprehensive benefits package, including medical, dental, vision, basic life insurance, paid parental leave, disability coverage, and participation in the 401(k) retirement plan with company matching contributions subject to eligibility requirements. Additional benefits include a minimum of 15 vacation/PTO days (hourly employees receive a minimum of 120 hours) and 13 paid holidays (vacation days are prorated based on the employee's hire date within the calendar year). Paid sick leave is adjusted based on role and location in accordance with local laws. Detailed information regarding paid sick leave entitlements will be provided to employees upon hiring and may be subject to adjustments based on changes in legislation or company policies.
#LI-Hybrid