Staff Site Reliability Engineer

FireEye • Austin, TX

Industry: Information Technology • 5 - 7 years

Posted 51 days ago

Job Description

Working as part of a global SRE organization, you will be responsible for the cloud-based infrastructure that supports FireEye's advanced security services. You will proactively analyze cloud services for performance improvements and automation opportunities. In addition, you will work on customer provisioning, service management, networking, and monitoring.

Responsibilities:

· Design, write, and deliver software to improve the availability, scalability, latency, and efficiency of FireEye's cloud services.

· Solve problems relating to mission-critical services, with a focus on using automation to prevent recurrence. The goal is to automate the response to all non-exceptional service conditions and to build better technologies rather than rely on manual resolution.

· Influence and create new designs, architectures, standards, and methods for large-scale distributed systems.

· Engage in service capacity planning and demand forecasting, software performance analysis, and system tuning.

· Participate in an on-call rotation, responding with urgency to incidents as they arise.

· Work as part of a team serving multiple stakeholders, balancing priorities to deliver on time while communicating status to internal customers.

· Take initiative to identify and address opportunities for improvement within the organization.

Qualifications

· 5+ years of relevant experience.

· Systematic problem-solving approach coupled with a strong sense of ownership and drive.

· Experience in one or more of: C, C++, Java, Python, Go, Ruby, Scala, NodeJS.

· Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.

· Experience designing, building, operating, and troubleshooting Kubernetes clusters.

· Familiarity with running web services at scale; understanding of cloud-native technologies and networking.

· Experience developing tools and APIs to reduce manual interaction with systems and applications using a variety of coding and scripting standards.

· Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way.

· Networking: knowledge of network theory and practice, including protocols such as TCP/IP, UDP, and DNS, as well as routing, the OSI layers, and load balancing.

· Experience with Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) technology stacks such as Amazon Web Services (AWS).

· Strong written and verbal communication skills.

· US Citizenship required due to product compliance requirements.

Additional Information

· BS degree in Computer Science or a related technical field, or equivalent practical experience, is desirable.

· Familiarity with Configuration Management solutions (Ansible, Chef, Terraform, etc.).

· Experience with algorithms, data structures, complexity analysis, and software design.