Site Reliability Engineer (SRE) Apache Flink & Kubernetes

Purple Drive Technologies

$120K — $150K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Strong hands-on experience with Apache Flink in production environments.
  • Expertise in Kubernetes, including Helm, Operators, and Custom Resource Definitions (CRDs).
  • Proficiency in scripting languages such as Python, Bash, and Go.
  • Experience with monitoring and observability tools like Prometheus, Grafana, and the ELK stack.
  • Solid understanding of cloud platforms including AWS, GCP, and Azure.
  • Strong knowledge of networking, security, and container orchestration.
  • Familiarity with CI/CD pipelines and DevOps practices.
  • Excellent problem-solving, debugging, and communication skills.

Responsibilities

  • Design, implement, and maintain scalable Apache Flink deployments on Kubernetes.
  • Develop automation tools and scripts to streamline deployment, monitoring, and maintenance of Flink jobs and infrastructure.
  • Ensure high availability, scalability, and reliability of production systems.
  • Collaborate with development and infrastructure teams to optimize application performance.
  • Build and manage monitoring/alerting systems using Prometheus, Grafana, ELK stack, or similar tools.
  • Work with cloud platforms to design and manage infrastructure.
  • Apply best practices for networking, security, and container orchestration.
  • Troubleshoot complex production issues and drive root cause analysis.
  • Contribute to CI/CD pipelines for deployment automation.
  • Participate in on-call rotations to ensure uptime and reliability.

Benefits

  • Local candidates are preferred.
  • Opportunities for professional growth and development.
  • Collaboration with cross-functional teams in a dynamic work environment.
  • Access to cutting-edge technologies and tools to enhance job performance.

Full Job Description

LOCAL PREFERRED

We are seeking a highly skilled Site Reliability Engineer (SRE) with strong expertise in Apache Flink, Kubernetes, and automation. The ideal candidate will be responsible for designing, deploying, and maintaining scalable, resilient systems, while ensuring high availability and performance in production environments. This role requires a solid background in distributed systems, container orchestration, and DevOps practices.

Key Responsibilities

  • Design, implement, and maintain scalable Apache Flink deployments on Kubernetes.
  • Develop automation tools and scripts to streamline deployment, monitoring, and maintenance of Flink jobs and infrastructure.
  • Ensure high availability, scalability, and reliability of production systems.
  • Collaborate with development and infrastructure teams to optimize application performance.
  • Build and manage monitoring/alerting systems using Prometheus, Grafana, ELK stack, or similar tools.
  • Work with cloud platforms (AWS, GCP, Azure) to design and manage infrastructure.
  • Apply best practices for networking, security, and container orchestration.
  • Troubleshoot complex production issues and drive root cause analysis.
  • Contribute to CI/CD pipelines for deployment automation.
  • Participate in on-call rotations to ensure uptime and reliability.

Required Skills & Qualifications

  • Strong hands-on experience with Apache Flink in production environments.
  • Expertise in Kubernetes (Helm, Operators, CRDs).
  • Proficiency in scripting languages (Python, Bash, Go).
  • Experience with monitoring & observability tools (Prometheus, Grafana, ELK, etc.).
  • Solid understanding of cloud platforms (AWS, GCP, Azure).
  • Strong knowledge of networking, security, and container orchestration.
  • Familiarity with CI/CD pipelines and DevOps practices.
  • Excellent problem-solving, debugging, and communication skills.
