At Anaplan, we are looking for a self-motivated SITE RELIABILITY ENGINEER to join our growing team at Anaplan HQ in the city by the bay, SAN FRANCISCO. You will be a member of the Global Technical Operations group, which is located across San Francisco, US; London, UK; and York, UK. You will be embedded in the SF Technical Operations organization and will also work closely with Engineering Development teams, Product Architecture, and Program Management. Our UK SRE team and our SF SRE team partner closely.
As a Site Reliability Engineer, you are passionate about seamless uptime. You delight in building tools to automate routine tasks and constantly seek new ways to improve system performance. If you also want to join a fast-growing startup, work with friendly people, and play with some cool tech, you've found the right place!
In the heart of the eclectic SOMA district, you can feel the excitement and energy of what we do at Anaplan when you step into our San Francisco office. There are open floor plans, fully stocked kitchens, and great collaboration spaces, and we focus on making sure that you have everything you need to work well, from the right lighting to the latest technology. When it’s time for the team to play together, we visit with our neighbors at a Giants game or a local pub’s happy hour. Come see for yourself what a collaborative and exciting place to work looks like. You will join a team of individuals who embrace and respect diverse perspectives, aren’t afraid to push boundaries and try new ideas, and are passionate about helping our customers and each other succeed. We work hard, but we also don’t wait for an excuse to have fun, and we inspire each other!
This role is an immediate full-time position. If you’re ready to roll up your sleeves and tackle unique problems that no one is solving in the tech space yet, keep reading.
What you’ll be doing:
- Our core technology is Java on Linux using open source technology throughout the stack. The Java engine runs and stores all data in RAM for super high performance while staying safe with transaction logs and auto recovery. The office is Macs with a few Windows holdouts. You decide which works best for you.
- Utilize your skills in automation, replication and scaling to manage the production cloud in our worldwide data centers
- Write scripts in Ruby, Python, Perl, etc. to build custom tools for automation, replication and scaling
- Build tools to monitor and provide metrics on our systems
- Perform Linux system administration (DNS, NFS, RPM, Apache, RAID, etc.)
- Extend the existing automation we have in place and make things even easier to use.
- Support Product Development Teams
- Lead release deployments and participate in revising software design to scale and protect against failures
- Ensure compliance with various best practices.
- Adhere to compliance standards in the development and operations spaces as guided by security.
- Participate in on-call rotation
- Work in a customer facing production environment
More about you:
- B.S. in Computer Science and 3+ years relevant experience OR 10+ years equivalent experience supporting production platforms using the following skills:
- Automation using tools such as Chef and Rundeck
- Programming in any of the following: Ruby, Python, Perl
- Multi-data-center management, replication, and scaling.
- Middleware software such as HAProxy, Consul, Terraform or equivalent architectures
- Java applications including JVM performance and tuning
- Metrics and monitoring – writing custom tools and familiarity with open source options.
- Linux administration – DNS, NFS, RPM, Apache, RAID, etc.
Technologies you’ll work with:
- Red Hat
- Dell – both virtual and bare metal
- HP – both virtual and bare metal
- New Relic
- MySQL – replication, backups, some light querying
- Networking – Switches, routers, firewalls, VPN
- Amazon EC2, EFS and related AWS technologies
- Taking a bare metal server/hardware to a fully functional app server