SAP started in 1972 as a team of five colleagues with a desire to do something new. Together, they changed enterprise software and reinvented how business was done. Today, as a market leader in enterprise application software, we remain true to our roots. That's why we engineer solutions to fuel innovation, foster equality and spread opportunity for our employees and customers across borders and cultures.
SAP values the entrepreneurial spirit, fostering creativity and building lasting relationships with our employees. We know that a diverse and inclusive workforce keeps us competitive and provides opportunities for all. We believe that together we can transform industries, grow economies, lift up societies and sustain our environment. Because it's the best-run businesses that make the world run better and improve people's lives.
About the Role:
As a Reliability Engineer at SAP/Concur, you'll help us create the architecture and processes that can handle the rapid growth and evolution of our product lines and enable us to deliver at high scale with safety and reliability. We are still a small, executive-sponsored team that is growing quickly and focused on building an extraordinary team and company culture.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. Our Reliability Engineers are responsible for the big picture of how our systems relate to each other, and we deploy a breadth of tools and approaches to solve a broad spectrum of problems. We follow principles such as limiting time spent on operational work, iterative improvement, blameless postmortems and proactive identification of potential outages as keys to product quality and fulfilling work.
What you'll get to do:
- Engage in and improve the whole lifecycle of services to meet the needs of the business efficiently, from inception and design through deployment, operation, refinement, and retirement.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, and launch reviews.
- Scale systems sustainably through automation and evolve systems by partnering on and/or implementing changes that improve reliability and velocity.
- Practice sustainable incident response and blameless postmortems.
- Evolve our monitoring and analytics infrastructure.
- Evolve our CI/CD pipeline to support an ever-growing number of engineers and safely enable an increasing rate of change.
- Enhance the use of configuration management tools to operationalize deployments.
- Improve the reliability, efficiency, and fault-tolerance of our distributed systems.
What you bring:
- Experience building and operating large-scale production systems
- A track record of working collaboratively in a rapidly moving engineering team
- A bias toward repeatability and eliminating human effort through software automation
- A problem-solving mindset, with a willingness to tackle difficult problems and work independently when necessary
- The ability to identify problems, propose solutions, gain consensus and see those solutions into production
- Strong testing background: experience building unit, integration, performance, and load tests
- Exposure to real-time event logging, stats collection, and analysis
- Experience operating a large system on AWS
- Experience with algorithms, data structures, complexity analysis and software development
- Ability to diagnose technical problems, debug unfamiliar code, and automate routine tasks
- Analytical approach coupled with solid communication skills and a sense of ownership
- The ability to share your experience with others and help make us better systems thinkers and engineers
- Knowledge of microservices architecture, containers, and container orchestration frameworks such as Docker, Kubernetes, or Mesos
- Experience with configuration management tools such as Ansible, Puppet, or Chef
- In-depth knowledge of UNIX/Linux, including command-line utilities and familiarity with system administration tools and concepts
- Linux networking expertise, such as investigating network traffic, port forwarding, tuning OS parameters
- Experience with centralized logging systems, metrics, and tooling frameworks such as ELK, Prometheus, and Grafana
- Experience with administering CI servers such as Jenkins
- Practical experience with multiple languages such as Go, Python, and Java
SAP'S DIVERSITY COMMITMENT
To harness the power of innovation, SAP invests in the development of its diverse employees. We aspire to leverage the qualities and appreciate the unique competencies that each person brings to the company.