As an infrastructure engineer in the Networking team, you will build creative engineering solutions to operational problems. You will help operate one of the largest Envoy-based service meshes in the industry, which gives you and the team the opportunity to regularly talk about our work in the community and at conferences, including EnvoyCon, KubeCon, Google Next and re:Invent. Our team regularly gives back to the service mesh community by developing new components and providing patches to the upstream Envoy project.
- Build and deploy open-source Envoy with Lyft private extensions to the entire fleet, and create systems to make that process faster, iterative, and reliable
- Investigate how network configurations are being tuned and figure out how to set them automatically or abstract them away (e.g., adaptive concurrency and admission control)
- Proactively identify potential outages and build systems to triage and fix them
- Further automate and reduce the operational burden on our Kubernetes-based service mesh
- Contribute to designing and building configuration, testing and deployment automation frameworks
- Drive incident responses to a long-term conclusion by mentoring the team on operational best practices and potential long-term systemic fixes
- Build and foster partnerships throughout the organization with a devotion to exceptional customer experience
- Never settle for the status quo; deliver operational excellence for Networking, Service Mesh and Edge
- Help establish roadmap and architecture based on technology and our needs
- Write well-crafted, well-tested, readable, maintainable code
- Participate in code reviews to ensure code quality and distribute knowledge
- Share your knowledge by giving brown bags, tech talks, and promoting appropriate tech and engineering best practices
- Help lead large projects from idea to successful execution
- Unblock, support and communicate with internal partners to achieve results
- 5+ years of software engineering industry experience
- Experience working with and/or operating Envoy/HAProxy/Nginx or any other networking proxy
- Experience with monitoring and logging management products such as ELK, Wavefront, SignalFx, CloudWatch, StackDriver, etc.
- Familiarity with networking disciplines, such as load balancers, API gateways, DNS management, HTTP/2, gRPC, etc.
- Familiarity with container technology such as Docker and Kubernetes
- Experience debugging complex problems that span multiple systems and every layer of the stack, and expertise in incident response methodologies, planning, testing, and execution
- Proficiency in high-level programming languages and scripting languages such as Golang and Python. Proficiency in low-level C++ is a bonus, but not required
- Expert-level knowledge of an enterprise cloud provider (AWS, GCP, Azure)
- Hands-on experience implementing and maintaining configuration controls through infrastructure-as-code
- Take pride in reducing technical debt; your attention to small details and keeping code/configuration clean and maintainable is something you value
- Value root causing operational issues and implementing systemic solutions and automation to make sure they no longer happen
- Great medical, dental, and vision insurance options
- Mental health benefits
- In addition to 12 observed holidays, salaried team members have unlimited paid time off and hourly team members have 15 days of paid time off
- 401(k) plan to help save for your future
- 18 weeks of paid parental leave. Biological, adoptive, and foster parents are all eligible
- Pre-tax commuter benefits
- Lyft Pink - Lyft team members get an exclusive opportunity to test new benefits of our Ridership Program