NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables outstanding creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for phenomenal people like you to help us accelerate the next wave of artificial intelligence.
What you'll be doing:
- Manage the availability, scalability, and performance of the storage and data protection infrastructure.
- Drive consistent standards for all hardware, configurations, and processes.
- Implement tools and processes for efficient and effective operational management of the environment: change management, monitoring, alerting, incident handling, customer request handling, etc.
- Implement and utilize asset inventory, configuration management database, capacity management, performance management, resource optimization, and security (access control, authorization) for all technologies in scope.
- Provide 12x7 on-call support.
- Develop and maintain Python or Perl scripts to automate repetitive tasks in the storage domain.
What we need to see:
- BS in Computer Science with 8+ years of relevant experience, or equivalent work experience.
- An MS in Computer Science with 4+ years of proven experience, or a Ph.D. with 1 year of experience, or equivalent work experience, would be a plus.
- Understanding of file systems (ZFS, XFS, NTFS) along with enterprise storage (NAS/SAN), data protection, and backup technologies.
- Experience with parallel distributed file systems (GPFS, Lustre).
- Ability to craft, implement, and support object storage solutions such as IBM COS and SwiftStack.
- Solid experience in backup and restore technologies (Veritas, Cohesity, Commvault, etc.).
- Solid grasp of standard networking protocols and components such as HTTP, DNS, TCP/IP, SMTP, the OSI model, subnetting, and load balancing.
- Extensive knowledge of core enterprise Linux (Red Hat/CentOS), with a focus on building, maintaining, securing, and performance tuning systems.
- Scripting or programming in an administrative language (Shell, Perl, Python, PowerShell, etc.) is a must.
- Completed projects related to distributed compute, storage, and software-defined storage.
- Good understanding of infrastructure security.
- Strong collaborative and interpersonal skills, specifically a demonstrated ability to effectively guide and influence within a multifaceted matrix environment.
- Able to schedule, prioritize, and accomplish R&D-related activities and communicate actions and results as needed.
- Solid attention to detail and excellent written and verbal communication skills are required.
- Meticulous organizer with an ever-positive, can-do attitude.
- Demonstrated use of out-of-the-box thinking to find creative solutions to highly sticky problems.
- You'll be a fun and enthusiastic teammate who enjoys a challenge and celebrates success.
Ways to stand out from the crowd:
- Demonstrable understanding of configuration management tools (Salt, Ansible).
- Experience with at least one job scheduler or orchestrator such as LSF, SLURM, Mesos/Marathon, Kubernetes, or Docker Swarm.
- Some experience in large-scale data centers (1000+ nodes).
NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.