IT Site Reliability Consultant

Cesna Group

$100K - $130K
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's degree in computer science or IT-related field.
  • 10+ years in IT Infrastructure or similar roles, preferably in the automotive sector.
  • 5+ years in site reliability engineering or infrastructure architecture roles.
  • Proven expertise in on-prem environments including VMware and Hyper-V.
  • Strong knowledge of network architecture, storage systems, and disaster recovery.
  • Familiarity with automation tools like Ansible and Terraform.

Responsibilities

  • Design secure and scalable on-prem infrastructure solutions.
  • Evaluate and enhance existing systems for reliability.
  • Develop localized reliability standards and procedures.
  • Lead automation efforts for infrastructure management and incident response.
  • Serve as technical lead during major infrastructure incidents.
  • Ensure compliance with regional regulations and global standards.
  • Mentor engineers and promote a culture of reliability.

Benefits

  • Vehicle Support Allowance for Hyundai, Kia, or Genesis vehicles.
  • 13 vacation days, adjusted based on hire month.
  • 15 paid holidays throughout the year.
  • Competitive health care plans starting the month after employment.
  • 401(k) plan with matching contributions up to 5% after 90 days.
Full Job Description
Job Summary
  • We are seeking an experienced Site Reliability Architect to lead the design, implementation, and continuous improvement of our on-premises IT infrastructure, supporting operations across the United States, Canada, Mexico, and Brazil.
  • This role will focus on enhancing system reliability, performance, and scalability across our regional data centers and enterprise environments. You will translate global reliability standards into region-specific strategies, optimize incident response, automate operational workflows, and ensure high availability of mission-critical systems, all while fostering cross-functional collaboration and compliance with local regulations.
Job Description
  • Essential Functions (To perform within this position successfully, the incumbent must be able to perform each essential duty satisfactorily. Other duties may be assigned.)
  • 1. Reliability Architecture & Design:
  • - Design resilient, scalable, and secure on-prem infrastructure solutions aligned with global SRE principles
  • - Evaluate existing systems for reliability gaps and propose architectural improvements
  • - Develop region-specific reliability standards based on global guidelines and local operational needs
  • - Integrate observability tools and telemetry systems to monitor infrastructure health and performance
  • 2. Automation & Operational Efficiency:
  • - Lead automation initiatives for infrastructure provisioning, configuration management, and incident response
  • - Collaborate with Infrastructure and Security teams to streamline operational workflows and reduce manual effort
  • - Define and implement service-level objectives (SLOs), indicators (SLIs), and error budgets for key systems
  • - Drive continuous improvement through post-incident reviews and reliability-focused retrospectives
  • 3. Regional IT Operation Support:
  • - Serve as a technical escalation point for major infrastructure incidents across the region
  • - Conduct thorough root cause analyses and implement corrective actions to prevent recurrence
  • - Maintain and update runbooks, incident playbooks, and recovery procedures
  • - Participate in the regional change control board and ensure reliability considerations are embedded in all changes
  • 4. Infrastructure Governance & Compliance:
  • - Ensure infrastructure reliability practices comply with regional regulations and global standards
  • - Maintain accurate documentation of system architecture, configurations, and operational procedures
  • - Support audits and compliance reviews by providing technical insights and documentation
  • - Champion reliability-focused governance across infrastructure projects and operational processes
  • 5. Partner Relationship Management:
  • - Work closely with Hyundai AutoEver and other IT partners to align reliability goals and service delivery
  • - Provide technical leadership in regional infrastructure projects, ensuring reliability is prioritized
  • - Mentor infrastructure engineers and promote a culture of reliability and operational excellence
  • - Evaluate and onboard new tools and vendors that enhance regional reliability capabilities
  • Supervisory Responsibilities: Yes
Qualifications
  • Basic Qualifications (The requirements listed below are representative of the knowledge, skills, and/or ability required and preferred for this position.)
  • Required Education & Experience:
  • - Bachelor's degree in computer science, Information Technology, or a related field.
  • - 10+ years of experience in IT Infrastructure Engineering or a similar role in a corporate environment, preferably in the automotive industry.
  • - 5+ years of experience in site reliability engineering or infrastructure architecture roles
  • Required Knowledge, Skills, & Abilities:
  • - Excellent verbal and written communication skills in English
  • - Strong expertise in on-prem environments including virtualization (VMware vSphere, Microsoft Hyper-V)
  • - Deep understanding of network architecture, protocols, and security (routing, switching, firewalls)
  • - Experience with storage systems (SAN, NAS), backup/recovery strategies, and disaster recovery planning
  • - Proficiency in infrastructure automation tools (Ansible, Terraform, Puppet, etc.)
  • - Familiarity with observability platforms (Prometheus, Grafana, ELK stack, etc.)
  • - Solid grasp of Windows Server and Linux (Red Hat, Rocky, Ubuntu, CentOS)
  • - Proven incident management and root cause analysis capabilities
  • - Knowledge of regulatory frameworks (GDPR, SOX) and IT governance practices
  • Preferred Education & Experience:
  • - Master's degree in a relevant technical or business discipline
  • - Advanced certifications such as VMware VCAP, Cisco CCNP/CCIE, or Microsoft Certified: Azure Solutions Architect Expert
  • - ITIL Foundation certification and experience with ITIL-based operations
  • - Multiregional project leadership and cross-cultural stakeholder engagement experience
  • - Bilingual ability (English and Korean) is a plus.
What's On Offer
  • Perks & Benefits
  • Vehicle Support Allowance Program: Car allowance toward a new or leased Hyundai, Kia, or Genesis vehicle, plus a monthly gas allowance
  • Vacation: 13 vacation days, prorated based on the month of hire
  • Holidays: 15 paid holidays
  • Health Care: Competitive health care plan effective the first of the month following the start date of full-time employment
  • 401(k): Eligible after 90 days of employment; matches 100% of the first 5% of contributions


Apply online or feel free to contact us directly for more information about the opportunity. Due to the high volume of applicants, we regret that only shortlisted candidates will be notified. Thank you for your understanding.
