AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the trn* and inf* servers that use them. This position is for a Software Engineer who will lead the development of services that aid in the optimization, analysis, and release of machine learning workloads and artifacts. The candidate must have experience leading distributed systems and machine learning projects, preferably from architecture through several generations of delivery to customers. Deep knowledge of optimization, resource management, and scheduling is required. The ideal candidate will have experience working on AWS services such as EC2, EKS, and Lambda, or on similar services from other cloud providers.
Key job responsibilities
* Lead the design and implementation of new tools, pipelines, and automation. Work with developers, system architects, hardware engineers, and users both within and external to Amazon to ensure compatibility of this new toolset with existing and next-generation AI accelerators.
* Design, implement, and maintain CI/CD pipelines to automate the software release process.
* Collaborate with development teams to integrate new software releases.
* Infrastructure Management: Manage and automate infrastructure provisioning. Ensure high availability and scalability of systems through effective infrastructure management.
* Monitoring and Optimization: Implement monitoring solutions to track system performance. Identify bottlenecks and optimize system performance.
* Security and Compliance: Implement security best practices in the DevOps pipeline. Conduct regular vulnerability assessments and risk management.
A day in the life
As you design and code solutions to help our team drive efficiencies in software architecture, you'll create metrics, implement automation and other improvements, and resolve the root cause of software defects. You'll also:
* Build high-impact solutions to deliver to our large customer base.
* Participate in design discussions and code reviews, and communicate with internal and external stakeholders.
* Work cross-functionally to help drive business decisions with your technical input.
* Work in a startup-like development environment, where you're always working on the most important stuff.
BASIC QUALIFICATIONS
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship experience in design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience programming with at least one software programming language
- Knowledge of system performance, memory management, and parallel computing principles
- Experience debugging, profiling, and implementing software engineering best practices in large-scale systems
- Experience with AWS or cloud technologies
PREFERRED QUALIFICATIONS
- 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
- Knowledge of fundamentals of networking, security, databases (relational or NoSQL), operating systems (Unix, Linux, and/or Windows)
- Knowledge of the fundamentals of machine learning and LLMs, including their architecture, along with work experience with specific LLMs
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, CA, Cupertino - 165,200.00 - 223,600.00 USD annually