$80K — $100K *
Join our fast-growing, dynamic team at Socially Determined and help us deploy our data-driven approach to improving the health of communities and enhancing the performance of our clients nationwide. Our purpose-built analytics platform allows our mission-driven team to understand and effectively address Social Determinants of Health (SDOH) through innovative analytics and evidence-based interventions. We are a fast-paced start-up company led by an experienced executive team with tremendous opportunities for growth. Socially Determined is headquartered in Washington, DC.
We seek an experienced Data Architect to help design, implement, and operate a scalable, automated, high-performance data collection, ingestion, and management service in our HITRUST-certified environment. Using AWS services and integrated tools, the ideal candidate will apply best-practice techniques for designing a data lake structure and ETL processes optimized for acquiring, organizing, and preparing raw healthcare data to feed our secure analytic processes.
The Data Architect will be fluent in data lake concepts, including structuring storage constructs for different domains of raw data, standardizing and cleaning data, and transforming text data into Apache Parquet format. The Data Architect will be proficient in Apache Spark distributed computing and will help design and develop secure Spark processes for analyzing large Parquet datasets. This candidate will establish best practices for examining and querying Apache Parquet data using Redshift Spectrum and AWS Athena for rapid analysis of large datasets.
The Data Architect must have expert-level SQL and ETL skills, with experience designing complex data type transformations to establish consistent, automated, and secure analytic processes, especially using Python and Scala for data manipulation. The Data Architect will specifically help create an automated ingestion process in which raw data is acquired, organized, secured, cataloged, and transformed to Parquet to enable Spark machine learning and other data analytic processing. The Data Architect will be proficient at designing relational database schemas, tables, and relationships in AWS Aurora PostgreSQL and other RDBMSs, and will lead the design for loading subsets of data lake data (text or Parquet) into PostgreSQL for data warehousing and reporting functions. The Data Architect will understand MongoDB JSON databases and how to organize data effectively in JSON documents to achieve high scalability and performance for certain analytics processes. The ideal candidate will have geospatial experience, specifically with PostGIS and TIGER, and will be proficient at standardizing geographic components across our data architecture.
Valid through: 12/16/2020