SUMMARY: The Data Engineer plays a pivotal role in operationalizing the Bank's most urgent data and analytics initiatives for its digital business by building, managing and optimizing data pipelines and moving them effectively into production for key data and analytics consumers.
ESSENTIAL DUTIES AND RESPONSIBILITIES include the following. Other duties and special projects may be assigned.
- Design, create and maintain data pipelines; this is the primary responsibility of the Data Engineer.
- Drive automation through effective metadata management.
- Assist with renovating the data management infrastructure to drive automation in data integration and management.
- Utilize modern data preparation, integration and AI-enabled metadata management tools and techniques.
- Track data consumption patterns.
- Perform intelligent sampling and caching.
- Monitor schema changes.
- Recommend and automate integration flows.
SENIOR LEVEL RESPONSIBILITIES:
- Work with data science teams and with business (data) analysts to refine their data requirements for various data and analytics initiatives.
- Propose appropriate (and innovative) data ingestion, preparation, integration and operationalization techniques.
- Train counterparts such as data scientists, data analysts, LOB users or any data consumers in data pipelining and preparation techniques.
- Ensure that data users and consumers use the data provisioned to them responsibly through data governance and compliance initiatives. Participate in vetting and promoting content created in the business and by data scientists to the curated data catalog for governed reuse.
- Become a data and analytics evangelist by promoting the available data and analytics capabilities and expertise to business unit leaders and educating them in leveraging these capabilities in achieving their business goals.
To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed below are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
EDUCATION and/or EXPERIENCE:
A bachelor's degree in computer science, statistics, applied mathematics, data management, information systems, information science or a related quantitative field is required. An advanced degree in computer science (MS), statistics, applied mathematics (Ph.D.), information science (MIS), data management, information systems, information science (post-graduation diploma or related) or a related quantitative field or equivalent work experience is preferred.
The ideal candidate will have a combination of IT skills, data governance skills, analytics skills and banking domain knowledge with a technical or computer science degree.
At least six years of work experience in data management disciplines including data integration, modeling, optimization and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks.
At least three years of experience working in cross-functional teams and collaborating with business stakeholders in the banking business domain, in support of a departmental and/or multi-departmental data management and analytics initiative.
Must also possess the following:
- Strong experience with advanced analytics tools for object-oriented/object function scripting using languages such as R, Python, Java, and Scala.
- Strong ability to design, build and manage data pipelines for data structures encompassing data transformation, data models, schemas, metadata and workload management. The ability to work with both IT and business in integrating analytics and data science output into business processes and workflows.
- Strong experience with popular database programming languages including SQL and PL/SQL for relational databases, and certifications in upcoming NoSQL/Hadoop-oriented databases such as MongoDB and Cassandra for nonrelational databases.
- Strong experience in working with large, heterogeneous datasets in building and optimizing data pipelines, pipeline architectures and integrated datasets using traditional data integration technologies. These should include ETL/ELT, data replication/CDC, message-oriented data movement, API design and access, and upcoming data ingestion and integration technologies such as stream data integration, CEP and data virtualization.
- Strong experience in working with SQL-on-Hadoop tools and technologies including Hive, Impala, Presto and others from an open source perspective, and Hortonworks DataFlow (HDF), Dremio, Informatica and Talend, among others, from a commercial vendor perspective.
- Strong experience in working with and optimizing existing ETL processes and data integration and data preparation flows, and helping to move them into production.
- Strong experience in working with both open-source and commercial message queuing technologies (Kafka, JMS, Azure Service Bus, Amazon Simple Queue Service), stream data integration technologies such as Apache NiFi, Apache Beam, Apache Kafka Streams, Amazon Kinesis and others, and stream analytics technologies (Apache Kafka, KSQL, Apache Spark).
- Basic experience working with popular data discovery, analytics and BI software tools such as Tableau and OBI for semantic-layer-based data discovery.
- Strong experience in working with data science teams in refining and optimizing data science and machine learning models and algorithms.
- Basic experience in working with data governance teams, and specifically business data stewards and the CISO, in moving data pipelines into production with appropriate data quality, governance and security standards and certification.
- Demonstrated ability to work across multiple deployment environments including cloud, on-premises and hybrid, multiple operating systems and through containerization techniques such as Docker, Kubernetes, AWS Elastic Container Service and others.
- Proficiency in agile methodologies and the capability of applying DevOps and increasingly DataOps principles to data pipelines to improve the communication, integration, reuse and automation of data flows between data managers and consumers across an organization.
- Deep domain knowledge or previous experience working in the banking business would be a plus.
COMMUNICATION/LANGUAGE SKILLS: Ability to speak, read and write in the English language; Spanish (or other languages) helpful. Ability to read and comprehend simple instructions, short correspondence and memos; write simple correspondence; and effectively present information in one-on-one and small group situations to customers, clients, and other employees of the organization.
MATHEMATICAL SKILLS: Ability to add, subtract, multiply, and divide in all units of measure, using whole numbers, common fractions, and decimals; and compute rate, ratio, and percent; and to draw and interpret bar graphs. An understanding of higher mathematical concepts may be required for specific departments.
REASONING AND ANALYSIS ABILITY: Ability to apply common sense understanding to carry out instructions furnished in written, oral, or diagram form; and deal with problems involving several concrete variables in standardized situations.
- Strong experience supporting and working with cross-functional teams in a dynamic business environment.
- Required to be highly creative and collaborative. An ideal candidate would be expected to collaborate with both the business and IT teams to define the business problem, refine the requirements, and design and develop data deliverables accordingly. The successful candidate will also be required to have regular discussions with data consumers on optimally refining the data pipelines developed in nonproduction environments and deploying them in production.
- Required to have the accessibility and ability to interface with, and gain the respect of, stakeholders at all levels and roles within the company.
- Is a confident, energetic self-starter, with strong interpersonal skills.
- Has good judgment, a sense of urgency and has demonstrated commitment to high standards of ethics, regulatory compliance, customer service and business integrity.