Tesla is currently looking for a seasoned Site Reliability Engineer (SRE) to join the Platform Engineering team to build the platform and tools that help keep Tesla agile, flexible and stable while accelerating the advent of sustainable transport.
Tesla’s Platform Engineering team is building the platform and tools that keep Tesla moving. As a member of the Platform Engineering team, you will be expected to have intimate knowledge of Docker and/or Kubernetes, infrastructure as code using tools like Terraform, and public cloud technologies such as AWS or GCP. We write applications primarily in Go, so experience with Go is a plus, though experience with Python and other scripting languages is needed as well. Be prepared to join a team that sets the standards for other teams in the organisation, a team of incredible out-of-the-box thinking engineers who not only solve the hardest problems in the industry but also push their engineering teams at large to be better.
- Author technical documentation for workflows, processes and best practices
- Manage our on-prem and off-prem Kubernetes clusters to support our growing workloads
- Take part in a 24x7 on-call rotation
- Influence architectural decisions with a focus on security, scalability and high performance
- Set up and maintain monitoring, metrics and reporting systems for fine-grained observability and actionable alerting
- Set the technical direction for our engineering teams
- 5+ years of managing services in a distributed, internet-scale *nix environment.
- Ability to prioritize tasks and work independently
- Advanced or expert-level Linux administration.
- Track record of practical problem solving under pressure.
- Excellent communication and documentation skills.
- BS or MS degree in Computer Science or Engineering, or equivalent experience.
- Advanced experience with configuration management systems such as Ansible, Puppet or Terraform.
- Demonstrable knowledge of TCP/IP, Linux operating system internals, filesystems, disk/storage technologies and storage protocols.
- Experience with AWS or other cloud infrastructure providers.
- Experience managing container-based workloads, using Kubernetes or other orchestration software
- Proficiency in a high-level language like Python, Go, Ruby and/or Java
- Excellent communication skills to collaborate with teams globally
- Ability to manage competing priorities and work well under pressure
- Self-driven, with an analytical mind and a bias for action
- Knowledge of big data platforms such as Hadoop is a plus