The Technology Development Group (TDG) infrastructure team is looking for a Site Engineer to help plan, implement and maintain a multi-site storage system and compute cluster.
You will have the opportunity to work in a team that supports the growing, heterogeneous Technology Development Group. The role focuses on enabling the group's individual objectives by provisioning storage and compute systems.

Responsibilities
Design, plan and implement a state-of-the-art multi-year storage and cluster system
Design and implement disaster recovery (DR) and high availability (HA) methods
Deploy COTS software onto the compute cluster
Set up monitoring for the compute and storage systems
Integrate storage and compute systems with COTS tools
Schedule and automatically update software deployments
Identify and resolve bottlenecks
BS/MS in Computer Science or equivalent
Experience with configuration managers such as Ansible, Chef or Puppet is a plus
Experience with monitoring tools such as Zabbix, Nagios or Graphite is a plus