Position Overview
The AI Data Engineer designs, builds, and operates enterprise-grade data and AI platforms using GitOps principles. This role combines data engineering, AI enablement, platform engineering and IT Operations, with a strong emphasis on stability and repeatability.
This role directly supports and enables Netwrix products and internal platforms, ensuring that AI and data capabilities align with Netwrix's security-first, governance-driven mission.
The AI Data Engineer will work with data generated by or integrated into Netwrix solutions such as:
- Netwrix Data Security Platform components, including data access governance, data classification, auditing and identity-centric security telemetry.
- Platform Governance products (Drata, Salesforce and NetSuite), which generate configuration, change, and audit data requiring structured ingestion and analysis.
- Identity, endpoint, and infrastructure security products (e.g., Active Directory security, endpoint protection, privileged access, configuration management).
- Internal AI Agents and Experience Platforms where data must be securely scoped, versioned and observable across multiple domains/tenants.
Key Responsibilities
GitOps-Driven Platform & Pipeline Engineering (GitHub, Azure DevOps, Terraform)
- Design, build and operate data and AI platforms as code, using Git-based workflows as the source of truth.
- Implement pull-request-driven change control, automated testing and CI/CD pipelines.
- Define and maintain Infrastructure-as-Code for data and AI systems to ensure consistency, traceability, and rollback capability.
AI & ML Data Pipeline Engineering (Azure ML Feature Store, Databricks Feature Store)
- Design and maintain scalable ETL/ELT pipelines that support:
  - AI/ML model training and retraining
  - Feature engineering and feature stores
  - Batch and near-real-time inference workflows
- Design pipelines backwards from business requirements while accounting for data freshness, latency and reliability.
GenAI & RAG Enablement (Azure OpenAI and AI Search, internal Netwrix data sources; internal AI agents, secured APIs)
- Support Retrieval-Augmented Generation (RAG) and internal AI agents by curating, indexing and refreshing select data sources.
- Build and operate pipelines for:
  - Embedding generation and lifecycle management
  - Vector database ingestion and maintenance
  - Context retrieval and prompt-adjacent data flows
Data Quality, Governance & Observability (Azure Monitor, ML monitoring, Application Insights)
- Implement proactive monitoring for:
  - Data quality and schema integrity
  - Pipeline performance and failure modes
  - Distribution shifts and data drift impacting AI systems
- Integrate security, privacy and compliance controls into pipelines by design, in partnership with both the Product and Corporate Security teams.
- Maintain clear, auditable data lineage, ownership and documentation.
MLOps & Production Readiness (Azure ML Model Registry, Runbooks, operational handoff documentation)
- Partner with Product and Engineering teams to operationalize models by:
  - Integrating data pipelines into MLOps workflows
  - Supporting model versioning, retraining and rollback strategies
  - Enabling observability across data and model performance
- Ensure AI systems and integrations are supportable by IT Operations and Solutions team members, and provide training or guidance on a regular cadence.
Cloud & Platform Engineering (Azure Storage, Azure Kubernetes Service, Azure Container Registry)
- Build and operate Azure-based data and AI platforms, including storage, compute, orchestration and containerized services.
- Optimize platforms for cost efficiency, performance, reliability and scale in a global, mostly remote work environment.
- Support hybrid or restricted environments where AI systems must meet enterprise or regulatory constraints.
Cross-Functional Collaboration
- Work closely with:
  - IT Operations & Platform Engineering (Intune, Entra ID)
  - Security & Governance teams (Netwrix, Drata)
  - Data Science and AI Engineering (Azure ML, Azure OpenAI)
  - Product and business stakeholders (Salesforce, NetSuite)
- Translate AI and business requirements into durable, enterprise-ready architectures.
- Produce clear architecture diagrams, runbooks, and operational documentation.
Required Qualifications
- Bachelor's Degree in Computer Science, Data Engineering, Engineering, or equivalent practical experience.
- 5 to 7 years of experience in data engineering, platform engineering, or infrastructure roles.
- Strong proficiency in Python and SQL, with working fluency in JSON, YAML, and shell scripting.
- Experience using Git-based workflows, Infrastructure as Code, and CI/CD pipelines to build and operate data and AI platforms in production environments.
- Experience operating workloads in Azure and AWS.
- Hands-on, operational experience applying large language models (LLMs) and GenAI platforms (OpenAI, Anthropic Claude, Google Gemini) within enterprise-controlled environments.