Data Engineer

Hotspex Media Inc

$80K — $110K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 2+ years of experience in data engineering or analytics engineering
  • Proficiency in advanced SQL, including complex joins and optimization
  • Hands-on experience with production stored procedures in BigQuery, PL/pgSQL, or T-SQL is required
  • Working knowledge of dbt or strong SQL/Git fundamentals
  • Experience in a scripting language such as Python, Java, or Scala
  • Familiarity with pipeline orchestration tools like Airflow or Dagster
  • Understanding of dimensional modeling principles, including facts and dimensions

Responsibilities

  • Connect and optimize data across BigQuery, Postgres, and Airtable for end-user consumption
  • Refactor ad-hoc SQL into well-documented and version-controlled routines
  • Optimize data warehouse performance and costs through effective partitioning and clustering
  • Develop and maintain production stored procedures for various data processes
  • Design and document dbt models for transforming marketing data into coherent schemas
  • Schedule and monitor data pipelines using Airflow or similar tools
  • Implement and maintain data quality checks and incident response protocols

Benefits

  • Hybrid work model with remote options for candidates outside the Greater Toronto Area
  • Access to a small, high-autonomy team with direct leadership support
  • Opportunity to own significant data transformation projects
  • Collaborative environment with cross-functional teams
  • Focus on continuous improvement and the use of innovative AI tools for productivity
Full Job Description
  • Reports to: Director of AI & ML
  • Location: Hybrid with the option for Remote if Outside Greater Toronto Area (must be legally authorized to work in and based in Canada)
  • Team: Small, high-autonomy team with direct access to leadership.
  • Impact: Owns design, build, operation of Hotspex's data transformation and storage layer.


About the Role & Mission
  • Connect data across BigQuery, Postgres, and Airtable; expose clean datasets to AI, Workflow, Analytics consumers
  • Build and maintain dbt models transforming marketing platform data into conformed dimensional schemas (Kimball facts/dimensions)
  • Own SQL surface: queries, stored procedures, views, materialized views, scheduled routines
  • Optimize warehouse performance and cost: query tuning, partitioning, clustering, incremental models
  • Orchestrate pipelines with Airflow or similar


Core Competencies
  • SQL Engineering: Writes, tunes, maintains complex SQL across BigQuery and Postgres
  • Stored Procedures & Routines: Designs and owns stored procedures, scripted procedures, UDFs, scheduled jobs
  • dbt / Transformation Modeling: Builds and maintains dbt models with tests, docs, incremental patterns
  • Pipeline Orchestration: Schedules and monitors pipelines via Airflow or similar
  • Cross-Functional Partnership: Delivers consumable data products for AI, Workflow, Analytics
Job Specific Competencies
  • Advanced SQL: Complex joins, window functions, CTEs, query optimization, execution plans on BigQuery and Postgres
  • Stored Procedures & Routines: Production stored procedures, scripted procedures (BigQuery scripting / PL/pgSQL), UDFs, scheduled queries with error handling, idempotency, observability
  • dbt Modeling: Sources, staging, intermediate, marts; tests; documentation; incremental strategies; macros
  • Pipeline Orchestration: Airflow, Dagster, Prefect, or equivalent
  • Data Modeling: Kimball facts/dimensions, slowly changing dimensions, conformed schemas
  • Warehouse Optimization: Partitioning, clustering, materialized views, cost tuning on BigQuery
  • Airtable Integration: Schema mapping, sync patterns, base-as-source


Job Responsibilities
Connect & Optimize Data
  • Own connectivity between BigQuery, Postgres, and Airtable; ensure consumers (AI, Workflow, Analytics) get the schema they need
  • Refactor ad-hoc SQL into versioned, tested, documented routines
  • Optimize cost and performance: partitioning, clustering, materialization
  • Detect and fix performance regressions before downstream impact
SQL & Stored Procedure Ownership
  • Own every production stored procedure, scripted procedure, scheduled query across BigQuery and Postgres
  • Author new stored procedures for batch transforms, reporting routines, AI/ML feature prep
  • Maintain stored-procedure inventory with ownership, dependencies, runbooks
dbt Model Build & Maintenance
  • Design schemas and write dbt models transforming marketing platform data (Google Ads, Meta, LinkedIn, etc.) into conformed dimensional schemas
  • Implement dbt tests (uniqueness, not-null, referential integrity, custom rules) on every production model
  • Maintain incremental models for high-volume tables; tune for cost and freshness
  • Own dbt documentation and lineage
Pipeline Orchestration
  • Schedule, monitor, and version pipelines in Airflow or similar
  • Alert routing, retry policy, backfill patterns
  • Coordinate with Workflow Eng on hand-off points between n8n and orchestrated data pipelines
Data Quality, Monitoring & Reliability
  • Implement automated tests (dbt tests, freshness checks, row-count anomaly detection)
  • Detect and acknowledge data quality incidents within 1 business hour (SLA)
  • Author runbooks for common failure modes
  • Track and reduce incident frequency; report trends quarterly
Cross-Functional Partnership
  • Partner with Workflow Automation Engineer on ingestion contracts: landing schemas, refresh patterns
  • Partner with Junior AI Engineer on data needs for RAG, embeddings, AI services: feature tables, serving views
  • Translate PM/CS and Product requirements into dimensional models
  • Owns: SQL design, stored procedure logic, transformation modeling, performance choices
  • Does not own: automation logic (Workflow Eng), AI service code (Jr AI Eng), client-facing strategy
Documentation & Knowledge
  • Use Claude Code for stored procedure docs, model READMEs, schema references
  • Version-controlled repos, clean Markdown, proper Git hygiene
  • Document data contracts: ingestion → transformation → consumption
Continuous Improvement
  • Use AI tooling (Claude Code, Cursor) to accelerate SQL authoring, refactoring, documentation
  • Track and report query cost reduction and model freshness improvement quarterly
  • Resolve categories of technical debt: consolidating duplicated SQL, retiring shadow tables
Explicitly Out of Scope
  • n8n automation design and ownership (Workflow Automation Engineer)
  • Rust service development, RAG pipelines, embedding models (Junior AI Engineer)
  • Looker dashboard authoring and LookML feature development
  • Strategic analytics presentations to leadership
  • ML model engineering, training, prompt engineering as a discipline


Required Qualifications
  • 2+ years data engineering, analytics engineering, or database development
  • Strong SQL - complex joins, window functions, CTEs, query optimization (must demonstrate)
  • Hands-on stored procedure experience - production stored procedures (BigQuery scripted procedures, PL/pgSQL, T-SQL, PL/SQL, or equivalent). Non-negotiable.
  • Working knowledge of dbt (or strong SQL/Git fundamentals to ramp quickly)
  • Python or other scripting language for data tasks (Java, Scala, TypeScript also acceptable)
  • Airflow or similar pipeline orchestration experience (Dagster, Prefect, dbt Cloud schedules, Cloud Composer)
  • Dimensional modeling fundamentals - facts, dimensions, grain, conformed schemas
  • Git fundamentals - branches, PRs, code review participation
  • Documentation discipline - version-controlled Markdown


Strongly Preferred
  • BigQuery production experience (partitioning, clustering, scripted procedures, scheduled queries)
  • Postgres production experience (PL/pgSQL, indexes, query plans)
  • Airtable production experience (schema design, sync patterns, API integration)
  • Production dbt experience (Cloud or Core)
  • Marketing/advertising data sources (Google Ads, Meta, LinkedIn)
  • AI tooling (Claude Code, Cursor, ChatGPT) as daily accelerator


Nice to Have
  • Looker / LookML exposure (consumer-side; not required to own)
  • n8n or other workflow orchestrators
  • RAG / vector search data prep
  • Agency, media, or analytics domain


Technology Stack
  • Languages: SQL (advanced), Python (or equivalent), optionally JavaScript for dbt/BigQuery UDFs
  • Data: BigQuery, Postgres, Airtable, dbt, Redis (cache awareness)
  • Orchestration: Airflow (or Dagster, Prefect, Cloud Composer), dbt Cloud Run
  • Integration consumer-side: n8n
  • Cloud: GCP
  • Observability: Cloud Monitoring, Looker (consumer-side)
  • Tools: Linear, GitHub, Claude Code, Cursor


What this Role is NOT
  • Not a data analyst - no ad-hoc analysis, dashboarding, stakeholder reporting
  • Not analytics engineering / dashboards - LookML and Looker dashboards not owned
  • Not workflow automation - n8n belongs to Workflow Automation Engineer
  • Not ML / AI engineering - model development belongs to AI team
  • Not a DBA - no infrastructure provisioning or cluster management

Similar Jobs

More Jobs at Hotspex Media Inc

  • Data Engineer
    $80K — $110K *
    Toronto, ON M3C 0E3
    Information Technology
    In-Person

More Information Technology Jobs

Find similar Data Engineer jobs: