System Development Engineer

Amazon

Herndon, VA

Industry: e-Commerce


5 - 7 years

Posted 331 days ago

This job is no longer available.

Be great fun to work with. Our company credo is “Work hard. Have fun. Make history.” The right candidate will love what they do and instinctively know how to make work fun.

Be passionate about customers. We start with the customer and work backwards. The right candidate will be passionate about delighting customers at all levels.

Dive deep to solve problems. Our services are highly available, scalable, durable, and secure. The right candidate will be ready to roll up their sleeves to dig out root causes and solve complex problems before they impact customers.

Know how to automate. We work fast and we work at scale. The right candidate will drive our culture of tooling and automation to scale our capabilities as our customers and services scale.

Have Intelligence Community (IC) experience. Customers in the IC have unique needs. The right candidate will fundamentally understand those needs and be able to craft solutions from AWS products and services.

You should have or be most of the following:
· Experience running and maintaining a 24x7 Internet-oriented, Linux based, production environment, preferably across multiple data centers, involving (preferably) hundreds of machines
· Demonstrable expertise around specifying, designing, and/or implementing system health, performance monitoring tools, and software management tools for 24x7 environments
· A solid grasp of networking fundamentals, preferably including hands-on experience with load balancers, switches, routers, etc.
· Familiarity with the challenges surrounding efficient operations and failure mode analysis in large, complex distributed systems
You will be expected to deliver on these kinds of things in the first six to twelve months on the job:
· Participate in all phases of the development of a large distributed system, providing hardware, manageability, operability, and performance perspectives on all aspects of the system
· Define and/or refine hardware requirements and selected designs from the data center infrastructure up, balancing raw up-front dollar cost with operability and total cost of ownership (TCO)
· Specify and participate in the development and delivery of operability-related features such as system health monitoring, diagnostics, repair, and other self-healing automation
· Develop or further existing application and system management tools and processes that reduce manual efforts and increase overall efficiency
· Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic
· Participate in the design and execution of production acceptance tests and new hardware evaluations
· Maintain fleet inventory management, including producing, maintaining, and evolving capacity plans for various components
· Monitor the health of the fleet, automating system health, maintenance tasks, and reporting systems as needed
· Perform various system maintenance tasks, including configuration of new machines
· Manage directly assigned tasks and on-call duties gracefully


Basic qualifications:
· Demonstrated hands-on proficiency in Linux and related debugging.
· Ability to take loosely defined requirements and turn them into a functioning product with minimal supervision.
· Minimum of four years of support engineering or system administration experience.
· Experience running services on Linux/Unix with troubleshooting skills.
· Good working knowledge of, and experience with, highly distributed virtual environments, networking, and software build and deployment processes.
· Bachelor’s degree in Information Science / Information Technology, Computer Science, Engineering, Mathematics, Physics, or a related field.
· This position requires that the selected applicant be a U.S. citizen and MUST possess and maintain a TS/SCI US Government clearance with polygraph. TS/SCI eligibility is not required to start; however, the selected applicant will be subject to a Single-Scope Background Investigation (SSBI) and must meet eligibility requirements for access to classified national security information. Applicants with a current SSBI, SBPR, or PPR may be eligible for crossover in accordance with ICPG 704.4.


Preferred qualifications:
· Exposure to virtualization (VMware, Xen, or other hypervisors)
· Exposure to cloud computing
· Exposure to security concepts / best practices
· Expertise with IPsec, VPN, Load Balancing, Iperf, MTR, Routing Protocols, SSH, Network Monitoring / Troubleshooting tools
· Experience managing full application stacks from the OS up through custom applications
· Some programming / scripting experience (Java, Perl, Ruby, C#, and/or PHP)
· Strong ownership, urgency, and drive to launch services.

Job ID: 624688