Own TextNow’s data warehouse, pipeline, and integration points between various business systems.
Develop tools to monitor, debug, and analyze data pipelines to ensure data quality and reliability. Troubleshoot data issues and build customized reports to investigate key business questions.
Explore available technologies and develop solutions to continuously improve our data quality, workflow reliability, and scalability. Perform capacity planning and cost estimates of proposed solutions.
Design, develop, and support new and existing batch and real-time data pipelines, and recommend improvements and modifications.
Be a champion of TextNow’s data ecosystem by working with engineering and infrastructure to implement a data strategy for governance, security, privacy, quality, and retention that satisfies business policies and requirements.
Communicate strategies and processes around data modeling and architecture to multi-functional groups and senior level management. Identify, design, and implement internal process improvements.
Implement best practices and standards for data definitions.
Manage data infrastructure to grow and support the Data Science team as it builds performant ML-based data products.
Who You Are:
Have 5-7 years of experience working with data warehouse/data lake and ETL architectures and cloud data warehouses (Redshift, Snowflake, RDS), along with experience in Python and SQL, preferably at companies with fast-growing and evolving data needs
Someone who takes action and ownership, with hands-on experience with AWS and services like Redshift, Kinesis, EMR, RDS, EC2, etc., and familiarity with schemas, metadata catalogs, etc.
Have 2+ years of experience with open-source technologies such as Airflow and Spark (including PySpark)
Hands-on experience designing real-time data streaming pipelines using Spark Structured Streaming, Kafka, and/or Kinesis is a plus.
Respectfully candid, with the ability to initiate and drive projects to completion with the support of the team. Highly organized and dependable, with a structured approach to work. Able to communicate and collaborate on data engineering tasks with internal partners.
A bold risk-taker and self-starter who loves improving the performance of queries and data jobs and scaling systems for exponential growth in data