We want your ideas and leadership applied from the hardware of the cluster to the libraries and tools that Algo uses to generate strategies, and everything in between. You’ll work closely with storage, ML, and strategy experts to set direction and implement it. A large part of your responsibility will be to anticipate the needs of HRT’s Algo Developers and ensure that we have plans in place to meet those needs over the months and years ahead!
We are open to leadership at multiple levels, depending on your capabilities: IC, small team, or group.
Responsibilities:
- Ensure HRT’s research cluster is the best among our competitors, with a focus on supporting Algo research efforts.
- Help set technical direction for the cluster and software infrastructure that supports Algo workflows, communicating frequently with Algo stakeholders.
- Maintain and improve user and cluster efficiency as we scale up compute resources.
- Keep abreast of changes in the HPC landscape and bring new technologies into HRT as appropriate.
- Contribute to all of the above, both individually and by leading and coordinating others.
The Skills:
- Skilled in software design, testing, deployment, and monitoring in a large distributed compute cluster.
- Proven ability to lead significant technical projects and/or teams.
- Excellent debugging and problem-solving skills, particularly in complicated distributed compute environments and with large data sets.
- Great communication capabilities.
- C++ experience is required; Python experience is helpful.
- Knowledge of UNIX operating systems (we use Debian Linux), system/processor performance, and network communication.
The Profile:
- You possess a bachelor's degree in Computer Science, Engineering or a related field.
- You have 3-5+ years of experience designing and/or managing large compute clusters.
- You are comfortable implementing your own ideas, especially as proofs of concept (POCs).
- You can look at code, figure out how it works, and see how to make it better.
- You can describe software designs at a high level (the abstract interface), a low level (the step-by-step algorithm), or anywhere in between.
- You like to work with people who challenge you and make you better at what you do.
- In your spare time you code, tinker, read, explore, and break things, and you have an insatiable curiosity for all things computer-related. You'll find like-minded people here.