5-7 years of hands-on experience with Databricks and Unity Catalog
Proven track record in setting up distributed Databricks workspaces
Solid background in SQL and Python for scripting and automation
Experience with Azure for infrastructure management and provisioning
Understanding of data governance frameworks and compliance standards
Familiarity with Privacera for data security and access control
Knowledge of data requirements for machine learning and AI applications
Responsibilities
Implement data provisioning patterns to meet business needs while adhering to standards and policies
Manage and configure Databricks workspaces, including policies and cluster sizing
Develop and troubleshoot Python notebooks and UDFs, and implement data masking
Ensure compliance with data security regulations using Databricks and Privacera
Navigate complex approval processes across architectural and governance teams
Design and implement data architecture using Databricks and Unity Catalog
Create and optimize data pipelines for ETL processes focused on data integrity
Benefits
Flexible work hours
Opportunities for professional development and training
Access to cutting-edge technologies
Collaborative work environment across multiple teams
Health and wellness programs
Full Job Description
Your Role
Implement data provisioning patterns based on business requirements, following predefined processes, policies, standards, and metadata management rules
Create and manage distributed workspaces in Databricks, set up workspace policies, provision Databricks clusters, and manage data infrastructure sizing and capacity
Create Python notebooks, implement data masking processes, create UDFs (SQL/Python), and troubleshoot data pipelines
Ensure data security and regulatory compliance using Databricks and Privacera features
Navigate multi-step enterprise approval processes across architecture, security, and governance teams
Design and implement data architecture leveraging technologies such as Databricks, Unity Catalog, and Privacera
Develop, optimize, and manage data pipelines for ETL processes using Databricks, with a focus on data integrity and quality
Design and maintain data models and schemas, incorporating Unity Catalog and Collibra data governance practices
Establish a robust data governance strategy, defining standards, metadata management, lineage, and quality practices
Operationalize machine learning models in batch and real-time data pipelines, leveraging relevant governance setups
Collaborate with cross-functional teams including data scientists, engineers, and analysts to translate business requirements into scalable solutions
Foster collaboration and clarity in complex, ambiguous environments
Job Description - Grade Specific
Your skills and experience:
Extensive hands-on platform engineering experience with Databricks and Unity Catalog
Proven success in implementing distributed Databricks workspaces, with SQL and Python scripting and automation
Experience with Azure infrastructure provisioning
Familiarity with data governance frameworks and compliance standards
Familiarity with Privacera data security and access control management
Familiarity with data requirements of common ML/AI use cases
Awareness of data governance frameworks, enterprise data compliance requirements, metadata modeling, data architecture, and enterprise-scale data discovery solutions