Lead and participate in the design and development of our data pipeline, a sophisticated backend responsible for the ingestion and delivery of billions to trillions of records.
Duties & Responsibilities
- Design, develop, and maintain optimal data pipeline architecture
- Assemble large, complex data sets that meet functional and non-functional business requirements
- Develop and implement data auditing strategies and processes to ensure data quality, accuracy, and integrity
- Develop, deploy, and support real-time automated data streams from numerous sources into the data platform
- Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs
- Work with Data Scientists and analytics experts to strive for greater functionality in our data systems and increase the value of our data
- This position has no official supervisory responsibilities, but may be called upon to lead small technical teams and to serve as lead systems architect on individual projects
Required Qualifications
- Bachelor's degree and a minimum of six years of professional software development experience
- Minimum two years' experience building and optimizing big data pipelines, architectures, and data sets
- Minimum two years' experience with one or more of: Java, Scala, or Python
- Minimum two years' experience with Apache Spark and NoSQL databases (HBase, Cassandra, etc.)
- Minimum two years' experience with stream-processing systems such as Kafka or Spark Streaming
- Minimum two years' experience with Azure or AWS cloud services
- Minimum two years' experience with Linux/Unix, scripting, and administration
- Minimum two years' experience with Docker
- Minimum two years' experience with SQL and experience working with RDBMSs (Postgres/PostGIS, SQL Server, etc.)
- Minimum two years' experience with source control systems (Git), branching & merging, CI, etc.
Preferred Qualifications
- Master's degree in Computer Science or a related field
- Minimum three years' experience in a Data Engineer role
- Knowledge of the maritime industry a big plus
- Ability to work efficiently with large spatio-temporal data sets strongly desired
- GIS experience a plus
- Experience with Kubernetes
- Experience with event-driven development
- Experience with log management systems such as Splunk or Logstash
- Experience with CI/CD pipelines (e.g., Jenkins)
- Experience with Machine Learning is a plus
- Experience working within Scrum methodology
- Strong written and oral communication skills (English)
- A team player: has opinions and listens to others
- Passionate about software development; willing to learn and to help others