Lead IBM Workload Automation Architect

Prophecy Technologies

$100K — $130K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10+ years in enterprise workload automation, with 7+ years of IBM TWS/IWS/IWA experience in distributed environments.
  • Proven leadership in architecting and defining technology standards and designs.
  • Strong Linux/UNIX engineering experience, particularly in production environments.
  • Advanced scripting skills in shell, Python, and/or Perl.
  • Familiarity with ITSM processes and governance, especially in regulated environments.

Responsibilities

  • Own the architecture and standards for TWS/IWS across various environments.
  • Provide oversight for third-party job scheduling platforms and ensure integration standards.
  • Lead enterprise-wide installations and define strategies for migrations.
  • Implement reliability engineering practices for workload automation.
  • Establish and enforce security and compliance measures on the TWS/IWS platform.

Benefits

  • Opportunities for professional development and technical training.
  • Flexible work hours accommodating global teams across EMEA and US.
  • Supportive team environment encouraging knowledge sharing and mentorship.
  • Exposure to innovative technologies and enterprise-scale projects.

Full Job Description

Role Overview:

We are seeking a Lead Workload Automation Engineer / Architect to define and drive the enterprise architecture, strategy, and operational model for IBM Tivoli Workload Scheduler / IBM Workload Scheduler (TWS/IWS) across distributed environments (on-prem and cloud). This role sets platform standards and reference designs, leads modernization and major upgrades/migrations, governs reliability and security practices, and serves as the senior technical partner for application, database, and infrastructure organizations to deliver resilient, scalable scheduling services for mission-critical workloads. The role also involves assisting and supervising two job scheduling teams.

Key Responsibilities:
  • Own the end-to-end architecture for the TWS/IWS platform, including standards, patterns, and reference implementations.
  • Provide technical oversight for additional (3rd-party) job scheduling platforms, establishing operating standards and integration patterns.
  • Lead enterprise-scale installations, upgrades, and migrations, defining cutover/rollback strategies.
  • Lead assessments of legacy scheduler instances and batch frameworks for retirement, consolidation, or migration.
  • Define reliability engineering practices for workload automation: availability targets, capacity planning, performance tuning, monitoring/alerting.
  • Design and validate high-availability and disaster recovery solutions, including DB2 HADR.
  • Establish governance for workload onboarding and job design.
  • Architect and productionize automation for platform operations and self-service using shell/Python/Perl.
  • Own security and compliance posture: access model, least-privilege controls, audit evidence, vulnerability remediation.
  • Manage and develop two teams (platform engineering and operations), setting priorities and overseeing delivery.
  • Be available for major outages and critical events related to job scheduling, including quarter-end (QEND) activities.
  • Participate in an on-call rotation and provide after-hours/weekend support.
  • Support a global operating model by working flexibly across EMEA and US business hours.
  • Serve as escalation point for complex incidents; lead root-cause analysis and drive problem management.
  • Mentor and guide engineers; lead technical design reviews and knowledge sharing.
  • Develop in-depth knowledge of the other job scheduling platforms (such as Automate, AS400, and Robot) and assist in supervising the teams that run them in IT Operations.

Required Skills:
  • Hands-on experience with end-to-end architecture for the TWS/IWS platform (components, topology, environments, integrations), including standards, patterns, APIs (REST/SOAP), event-based scheduling, and real-time/on-demand workload patterns.
  • Experience with Tivoli Dynamic Workload Console (TDWC), Tivoli Dynamic Workload Broker (TDWB), and critical path monitoring.
  • Experience integrating file transfer solutions (e.g., SFTP/PGP/GPG, managed file transfer platforms) into batch workflows.
  • Experience with SAP and other enterprise application integrations via TWS extended agents.
  • Experience building dashboards/metrics and integrating with observability platforms (e.g., Grafana/Graphite).
  • Familiarity with databases: DB2 (HADR), Oracle/Postgres.
  • Experience defining platform standards, leading upgrades/migrations, and coordinating cross-team delivery.
  • Strong Linux/UNIX engineering and production troubleshooting experience.
  • Advanced automation/scripting skills (shell plus Python and/or Perl).
  • Demonstrated ability to lead complex incident response and root-cause analysis.
  • Strong change leadership in regulated production environments aligned with ITIL processes.
  • Excellent stakeholder communication and ability to influence across teams.
  • Workload Automation: IBM TWS/IWS/IWA, TDWC/TDWB, dynamic scheduling, JSDL.
  • Operating Systems: Linux, UNIX (AIX/SunOS), Windows (agent support).
  • Scripting: Shell, Python, Perl.
  • ITSM/Monitoring: ITIL processes; integrations with tools such as ServiceNow, AppDynamics, OBM, Grafana/Graphite.
  • Security: LDAP/SSO concepts, role-based access, audit/patch compliance.

Qualifications:
  • High School Diploma or equivalent.
  • 10+ years of experience in enterprise workload automation, including 7+ years of hands-on IBM TWS/IWS/IWA administration in distributed environments.
  • Bachelor's degree or 10+ years of equivalent IT industry service experience.
  • For senior/lead equivalent roles, 8+ years of relevant ITSM/major incident operations experience may be required.
  • IT Technology Certification is a plus.
  • Proven experience in a lead/architect capacity: defining platform standards/reference designs, guiding cross-team implementations, and making architecture decisions.

Preferred Skills:
  • DB2 administration experience, including High Availability Disaster Recovery (HADR); familiarity with Oracle/Postgres and SQL.
  • Experience with TWS/IWS integrations and APIs (REST/SOAP), event-based scheduling, and real-time/on-demand workload patterns.
  • Familiarity with cloud patterns and automation (e.g., infrastructure-as-code concepts, container/VM scheduling considerations).
  • Hands-on experience across ITSM processes (Incident, Problem, Change, Knowledge) in an enterprise environment.
  • ServiceNow experience, including incident lifecycle management, documentation standards, and reporting.
  • Working knowledge of ITIL concepts and IT service management best practices.
  • Experience working with AI applications, including an understanding of how to communicate with them and use them appropriately.
  • Strong analytical and problem-solving skills.
  • Ability to manage multiple tasks in a high-volume, high-urgency operations environment.
  • Strong written and verbal communication skills, including confident facilitation.
  • Able to write and review technical documentation and knowledge articles.
