Data Engineer III
As a data engineer at Kapitus, you will work with other engineers to design well-crafted data systems. If you believe in a strong engineering culture, agile practices, and quality code, Kapitus is the kind of environment you are looking for. Your contributions will empower Kapitus to continue helping small business owners find lending options so they can make payroll during tough times, scale to meet market demands, and make large purchases to grow their operations.
What You’ll Do:
- Create and maintain an optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for ETL/ELT from a wide variety of data sources using SQL, NoSQL and AWS ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
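As a toy sketch of the extract-transform-load work described above (using only Python's standard-library sqlite3; all table and column names here are invented for illustration, not part of any real Kapitus pipeline):

```python
# Minimal ETL sketch: extract rows from a "source" SQLite database,
# transform them, and load them into a "warehouse" table.
# Table and column names are hypothetical, chosen for illustration only.
import sqlite3


def run_etl(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Copy cleaned loan-application rows from source to warehouse.

    Returns the number of rows loaded.
    """
    # Extract: pull raw application records from the source system.
    rows = source.execute(
        "SELECT business_name, amount_cents FROM applications"
    ).fetchall()

    # Transform: normalize business names and convert cents to dollars,
    # dropping records with non-positive amounts.
    cleaned = [
        (name.strip().title(), cents / 100.0)
        for name, cents in rows
        if cents > 0
    ]

    # Load: create the warehouse fact table if needed and insert the rows.
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_applications "
        "(business_name TEXT, amount_usd REAL)"
    )
    warehouse.executemany(
        "INSERT INTO fact_applications VALUES (?, ?)", cleaned
    )
    warehouse.commit()
    return len(cleaned)
```

In practice each stage would be a separate task in a workflow manager such as Airflow, but the extract/transform/load separation above is the core pattern.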
What You’ll Need:
- BS or MS degree in Computer Science, Information Systems, or another quantitative field, with a minimum of 3 years of work experience (5+ desirable).
- Advanced SQL knowledge, including query authoring and experience working with relational databases, as well as working familiarity with a variety of databases, including NoSQL.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Strong analytic skills related to working with unstructured datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores. Experience with multiple data warehousing services, e.g., Redshift, Snowflake.
- Experience with data mart design and implementation.
- Experience with Model and Data Governance.
- Strong organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment; willingness to assume leadership within a remote team.
- Technology experience:
  - Excellent proficiency in Python, including object-oriented design and automated tests.
  - Excellent proficiency in another programming language.
  - Experience using or supporting modern machine learning libraries and services.
  - Big data tools: Spark, Kafka, Hive, etc.
  - Relational SQL and NoSQL databases, including Postgres and Cassandra.
  - Data pipeline and workflow management tools: Airflow, etc.
  - AWS cloud services (and/or GCP/Azure): EC2, S3, RDS, Athena, Glue, Redshift, Lambda, Kinesis.
  - Stream-processing systems: Storm, Spark Streaming, etc.
  - Container technologies: Docker, Kubernetes.
- Experience using Bitbucket, GitHub, or other version control software.