Federal Reserve Bank

INFRASTRUCTURE & HPC SYSTEMS ENGINEER

Federal Reserve Bank
$90K - $120K *
Information Technology
5 - 7 years of experience

Job Overview by Ladders

Qualifications

  • Bachelor's degree in Computer Science, Engineering, Mathematics or related field, or equivalent experience
  • Minimum of 5 years of experience in HPC administration and systems engineering
  • Extensive experience with Linux operating systems in HPC settings
  • Proficiency in scripting languages such as Python and Bash
  • Knowledge of job scheduling systems and resource management techniques

Responsibilities

  • Respond to problems and maintain Windows and Linux server environments in research settings
  • Design, deploy, and manage HPC clusters
  • Monitor system performance and ensure efficient operation
  • Implement security protocols and conduct regular maintenance
  • Develop automation tools and scripts for system management
  • Provide technical support and troubleshoot user endpoints
  • Collaborate with research partners to address computational needs

Benefits

  • Medical, prescription, dental, and vision insurance with no waiting period
  • 401k/Thrift Plan with employer match and pension plan
  • Generous paid vacation, sick leave, and holidays
  • Monthly commuter allowance and flexible spending accounts
  • Tuition reimbursement available
  • Onsite fitness center and cafeteria services

Full Job Description

You will ensure the integrity, reliability, and availability of agile research computing environments by managing Windows/Linux server infrastructure, high-performance computing (HPC) clusters, and cloud/colocation/on-premises services. You will provide advanced, specialized technical support to end users while developing automation tools and optimizing computational workflows to meet evolving, rigorous research needs. You will foster trust, open communication, shared goals, and collaboration with stakeholders across the Federal Reserve System and externally.

The salary grade for this position is 16. Final salary and offer will be determined by the applicant's background, experience, and skills, as well as internal equity and alignment with market data.

Job Description:

Infrastructure & Operations

  • Respond to problems and maintain Windows and Linux server environments in research settings
  • Design, deploy, configure, and administer HPC clusters and associated systems
  • Monitor system health, performance metrics, and resource utilization to ensure optimal, efficient operation
  • Implement robust security protocols and perform regular maintenance, including upgrades and patching
  • Manage job scheduling and workload optimization using tools like Slurm
  • Support and troubleshoot user endpoints, servers, and services in various environments (i.e. cloud, on-premises, colocation)
  • Participate in planning, budgeting, and monitoring of various environments

Development & Automation

  • Develop tools and scripts to automate management and creation of systems and services in various environments
  • Create and maintain automation scripts to streamline system administration tasks
  • Optimize scientific applications and computational workflows for performance
  • Implement container technologies (Docker) for reproducible research
  • Support GPU computing and accelerator technologies for specialized workloads
  • Design and implement innovative HPC solutions to address evolving research requirements
  • Define and track performance metrics to ensure efficient current and future use of resources

End User Support & Technical Assistance

  • Respond to research end user requests to diagnose problems and provide specialized technical support
  • Troubleshoot highly complex hardware and software issues in multi-user research environments
  • Resolve problems quickly and accurately, with thorough follow-up to ensure complete resolution
  • Assist staff with IT-related problem resolution and use of facilities

Partnership & Collaboration

  • Partner closely with researchers to understand computational needs and translate them into technical solutions
  • Collaborate with network, security, and data teams to ensure integrated operations
  • Build and maintain relationships with vendors and technology partners
  • Serve as a technical advisor on infrastructure planning and technology roadmaps
  • Participate in product and technology evaluations, testing, and pilot activities to provide sound recommendations
  • Engage in Federal Reserve System, academic, and other HPC communities to stay current with emerging technologies and effective practices

Documentation & Training

  • Develop comprehensive documentation for systems, policies, and procedures
  • Create user guides and training materials for researchers utilizing HPC resources
  • Conduct workshops and training sessions on effective use of HPC resources and research computing tools

Education and Experience:

  • Bachelor's degree in computer science, engineering, mathematics, or related field, or equivalent combination of education and experience
  • Minimum of 5 years of relevant experience in HPC administration and systems engineering

Knowledge and Skills:

  • Extensive experience with Linux operating systems (Red Hat/CentOS) in an HPC environment
  • Command line skills and proficiency in scripting languages (Python, Bash)
  • Experience with job scheduling systems (Slurm) and resource management
  • Knowledge of parallel file systems and storage technologies (e.g. Ceph, GPFS, Lustre, BeeGFS)
  • Familiarity with parallel programming models (MPI, OpenMP) and scientific computing frameworks
  • Experience with configuration management and automation tools (Terraform)
  • Demonstrated specialized problem-solving abilities and analytical thinking
  • Solid appreciation for research, sound judgment, and healthy professional skepticism; understands sensitivities and considers the big picture in addition to tactical details
  • Ability to communicate effectively with PhD economists as well as with various levels of personnel and different types of specialists; strong interpersonal and listening skills; approachable
  • Agile and comfortable working in evolving, rigorous research environments
  • Research support-oriented, responsive to time-sensitive matters and custom needs

We offer a great benefits package that features:

  • Medical (4 options), Prescription, Dental (3 options), and Vision Insurance with no waiting period
  • 401k/Thrift Plan with generous employer match
  • Employer-funded Pension Plan
  • Paid Vacation/Sick Time & Holidays
  • Monthly $200 Commuter Allowance
  • Flexible Spending Accounts and Healthcare Spending Accounts
  • Flexible Work Schedule available in most departments
  • Life Insurance and Long Term Disability Insurance
  • Tuition Reimbursement (undergraduate and graduate)
  • Parental Leave
  • Free onsite 24/7 Fitness Center including training classes, Peloton bikes and locker room / shower facilities
  • Onsite Cafeteria & Coffee Shop
  • Additional Convenience Benefits, Discounts and More...

Additional Information:

The above statements are intended to describe the general nature, level of work and the requirements of this position. They are not intended to be an exhaustive list of all responsibilities associated with this position or the personnel so classified. While this job description is intended to be an accurate reflection of this position, management reserves the right to revise this or any job description at its discretion at any time.

By applying to this position, you agree you will be available to work on-site in a full-time capacity.

Full Time / Part Time: Full time
Regular / Temporary: Regular
Job Exempt (Yes / No): Yes
Job Category: Information Technology Family Group
Work Shift: First (United States of America)

About Federal Reserve Bank

Founded: 1913
