Full Job Description
Join our DevOps Engineering team as a Senior Database Engineer to design, build, and engineer cloud-native database platforms across a modern, multi-engine data stack. This is an engineering role, not a DBA role, focused on building scalable systems, writing infrastructure-as-code, and embedding databases into software delivery pipelines.
You'll work closely with DevOps and Product Engineering to build high-performing data infrastructure for critical applications and analytics. You will own and evolve a diverse ecosystem spanning AWS RDS, Aurora, DynamoDB, Redshift, Azure SQL, PostgreSQL, Snowflake, and other NoSQL engines, integrating AI-driven automation and MLOps-ready data foundations to support machine learning workflows.
Key Responsibilities
Multi-Engine Cloud Data Architecture & Platform Engineering
- Design, build, and engineer hybrid data solutions spanning relational (PostgreSQL, Aurora, RDS, Azure SQL), columnar (Redshift, Snowflake), and NoSQL (DynamoDB, DocumentDB, OpenSearch) engines - selecting the right engine per workload.
- Architect cloud-native data lakehouse platforms on AWS using S3, Lake Formation, Glue, and open formats (Apache Iceberg, Delta Lake, Parquet), with Azure Data Lake as a secondary target.
- Implement and manage Medallion Architecture (Bronze / Silver / Gold) patterns to support raw ingestion, curated analytics, and business-ready datasets.
- Build and optimize hybrid data platforms spanning operational databases (PostgreSQL / RDS / Aurora / DynamoDB) and analytical systems (Snowflake / Redshift).
- Develop and maintain semantic layers and analytics models to enable consistent, reusable metrics across BI, analytics, and AI use cases.
- Engineer efficient data models and ETL/ELT pipelines, and tune query performance for analytical and transactional workloads.
- Engineer replication topologies, partitioning strategies, and data lifecycle automation as code - not manual DBA operations.
- Build automated schema migration pipelines (Flyway/Liquibase) and data versioning workflows integrated into CI/CD, replacing manual schema change management.
- Design and implement API-first data access patterns, enabling engineering teams to interact with databases through well-defined, versioned interfaces rather than direct connection strings.
Advanced Data Pipelines, Streaming & Orchestration
- Engineer ELT/ETL pipelines using AWS-native services (Glue, Kinesis, MSK, Step Functions, EventBridge) and modern tooling (dbt, Airflow) for batch, micro-batch, and near-real-time workloads.
- Build streaming data pipelines using AWS Kinesis Data Streams, Kinesis Firehose, and MSK (Managed Kafka) for event-driven, low-latency ingestion across multiple database targets.
- Implement data quality checks, schema enforcement, lineage, and observability across pipelines.
- Optimize performance, cost, and scalability across ingestion, transformation, and consumption layers.
- Implement change data capture (CDC) using AWS DMS, Debezium, or native engine features to synchronize data across SQL, NoSQL, and analytical systems.
NoSQL & Document Store Engineering
- Design and optimize DynamoDB schemas using single-table design patterns, GSIs, LSIs, and DynamoDB Streams for event-driven architectures.
- Architect DocumentDB (MongoDB-compatible) clusters for document workloads requiring flexible schema and hierarchical data models.
- Build and manage OpenSearch / Elasticsearch clusters for full-text search, log analytics, and observability use cases.
- Evaluate and recommend the right NoSQL engine (DynamoDB vs DocumentDB vs OpenSearch vs ElastiCache) based on access patterns, latency, and cost profile.
- Implement TTL policies, DynamoDB Accelerator (DAX), and ElastiCache (Redis/Memcached) for high-throughput caching layers.
AI-Enabled Data Engineering & MLOps Foundations
- Apply AI and ML techniques to data architecture and operations, including intelligent data quality validation, anomaly detection, schema drift detection, and query workload pattern analysis - using AWS SageMaker and Amazon Bedrock.
- Design and build ML-ready data foundations: SageMaker Feature Store, training dataset pipelines, experiment tracking, and inference data pipelines using AWS-native MLOps services.
- Integrate LLM capabilities via Amazon Bedrock for AI-assisted data documentation, query generation, lineage summarization, and automated data cataloging.
- Implement vector database solutions (pgvector on Aurora/RDS, OpenSearch k-NN) to support AI similarity search and retrieval-augmented generation (RAG) use cases.
- Build AI-powered observability using ML-driven anomaly detection on pipeline metrics, query performance trends, and data quality SLAs.
Software Engineering, DevOps & Infrastructure as Code
- Build and manage all data infrastructure as code using Terraform and AWS CDK - covering RDS, Aurora, DynamoDB, Redshift, Glue, MSK, Kinesis, Snowflake, and supporting IAM/networking components.
- Integrate database changes into CI/CD pipelines (GitHub Actions, AWS CodePipeline) with automated schema testing, data contract validation, deployment, and rollback.
- Develop internal platform tooling using Python, SQL, and AWS SDK (boto3) - building self-service capabilities that allow engineers to provision governed database environments on demand.
- Implement database-as-code practices: automated schema migrations, snapshot/restore testing pipelines, and environment clone automation - eliminating manual DBA provisioning tasks.
- Build and publish internal data platform APIs and SDKs that abstract database complexity from application teams.
Security, Governance & Compliance Engineering
- Engineer enterprise-grade data governance across all engines: RBAC, column/row-level security, field-level encryption, dynamic data masking, and comprehensive audit logging, implemented as code, not manual configuration.
- Define and enforce data contracts and ownership using AWS Lake Formation, Glue Data Catalog, and Snowflake governance - versioned and managed in source control.
- Partner with Security and Compliance teams to ensure audit readiness and regulatory alignment (SOC 2, HIPAA, GDPR where applicable).
- Manage AWS IAM policies, KMS encryption, VPC security groups, and private endpoints (PrivateLink, VPC Endpoints) for least-privilege access and network isolation.
- Implement secrets management using AWS Secrets Manager and Parameter Store with automated credential rotation for all database engines.
Qualifications
Experience
- 7+ years of experience in database platform engineering, data engineering, or cloud infrastructure engineering in production environments.
- Proven experience as a lead or senior engineer on multi-engine database platforms spanning both SQL and NoSQL workloads - with a software engineering, not administration, mindset.
- Strong track record of designing and operating data platforms at scale in AWS environments, with databases managed as code from day one.
AWS & Cloud Databases
- Deep hands-on expertise with AWS RDS (PostgreSQL, MySQL, Oracle), Aurora (Serverless v2, Global Database), and RDS Proxy.
- Production experience with DynamoDB: single-table design, GSI/LSI strategy, Streams, DAX, and capacity planning.
- Working knowledge of AWS Redshift, Glue, Lake Formation, Kinesis, MSK, and EventBridge for pipeline and lakehouse architectures.
- Familiarity with Azure SQL, Azure Data Factory, or Azure Synapse is a plus.
Snowflake
- Strong hands-on Snowflake experience: performance tuning (clustering, materialized views, query profiling), cost optimization (warehouse sizing, auto-suspend, credits), security (RBAC, dynamic masking, network policies), and data sharing.
SQL, NoSQL & Data Modeling
- Deep SQL expertise across multiple engines (PostgreSQL, T-SQL, Snowflake SQL, DynamoDB PartiQL).
- Strong understanding of Medallion Architecture, semantic layers, and analytics engineering best practices.
- Proven NoSQL data modeling: DynamoDB single-table design, document store schema design, and search index architecture.
Pipelines & Orchestration
- Experience building and operating advanced ELT/ETL pipelines using dbt, AWS Glue, Airflow, or similar orchestration frameworks.
- Hands-on experience with streaming ingestion using Kinesis, MSK (Kafka), or equivalent event-driven technologies.
- Familiarity with CDC patterns and tools (DMS, Debezium) for cross-engine data synchronization.
AI & ML Data Foundations
- Understanding of ML pipeline requirements: feature engineering, training dataset preparation, model versioning, and inference data patterns.
- Exposure to AWS SageMaker, Bedrock, or equivalent ML platforms from a data infrastructure perspective.
- Awareness of vector databases and embedding-based retrieval (pgvector, OpenSearch k-NN) is a strong plus.
Infrastructure & Automation
- Proficiency with Terraform for database and cloud infrastructure as code; AWS CDK experience is a plus.
- Proficiency with Python (boto3, SQLAlchemy, pandas) and SQL for data transformation, automation, and tooling.
- Experience integrating database workflows into CI/CD pipelines using GitHub Actions, CodePipeline, or similar.
What Success Looks Like
Within the First 90 Days
- Fully onboarded and delivering enhancements across Snowflake, RDS, Aurora, and DynamoDB environments.
- Conducted a comprehensive audit of existing database architectures and delivered a prioritized improvement roadmap.
- Delivering optimized queries, schemas, and automation for key systems.
- Established IaC coverage for at least one previously manually provisioned database environment.
Ongoing Outcomes
- Measurable improvements in query performance, pipeline reliability, and data platform scalability across all database engines.
- Zero manual database provisioning - all environments managed through infrastructure as code and CI/CD pipelines.
- Continuous collaboration across teams to enhance data availability and governance.
- AI-powered automation reducing manual operational overhead in database monitoring, anomaly detection, and data quality management.
- ML-ready data foundations enabling Data Science teams to ship faster with governed, reproducible datasets.
Bonus Experience (Nice to Have)
- AWS certifications: AWS Database Specialty, AWS Solutions Architect, AWS Data Engineer Associate.
- Snowflake SnowPro Core or Advanced Data Engineer certification.
- Experience with Apache Iceberg, Delta Lake, or Hudi for open table format lakehouse architectures.
- Hands-on experience with SageMaker Feature Store, Model Registry, or MLflow for MLOps workflows.
- Familiarity with data observability platforms (Monte Carlo, Bigeye) or custom observability with Great Expectations / dbt tests.
- Experience with graph databases (Neptune) or time-series databases (Timestream) in AWS.
- Exposure to Databricks on AWS or Azure for unified data and AI workloads.
Additional Information
What does IntegriChain have to offer?
- Mission driven: Work with the purpose of helping to improve patients' lives!
- Excellent and affordable medical benefits + non-medical perks including Student Loan Reimbursement, Flexible Paid Time Off and Paid Parental Leave
- 401(k) Plan with a Company Match to prepare for your future
- Robust Learning & Development opportunities including over 700 development courses free to all employees
Videos To Watch
https://youtu.be/2Q_ODlJxDcQ?feature=shared