As a Senior Production Engineer at Instructure, Inc., you will play a critical role in ensuring the reliability, performance, and scalability of our educational technology platforms. You will be a key contributor to our SRE team, focusing on proactive system health, incident response, automation, and continuous improvement of our infrastructure and applications.
What you will do
- Design, implement, and maintain highly available and scalable infrastructure for Instructure's core products.
- Develop and deploy automation tools and scripts to streamline operational tasks, improve efficiency, and reduce manual intervention.
- Monitor system performance, identify bottlenecks, and implement solutions to optimize resource utilization and user experience.
- Participate in on-call rotations to provide 24/7 support for production systems, troubleshooting and resolving complex incidents quickly and effectively.
- Collaborate with development teams to ensure new features and services are designed for reliability, performance, and operability.
- Conduct root cause analysis for production incidents, implement preventative measures, and document findings.
- Contribute to the continuous improvement of our SRE practices, tooling, and processes.
- Mentor junior engineers and share expertise within the team and across the organization.
- Evaluate and recommend new technologies and solutions to enhance our infrastructure and operational capabilities.
What you will need to know/have
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 5+ years of experience in a Production Engineering, Site Reliability Engineering (SRE), or DevOps role.
- Strong proficiency in at least one scripting language (e.g., Python, Go, Ruby, Bash).
- Extensive experience with cloud platforms (e.g., AWS, Azure, GCP), including deep knowledge of services such as EC2, S3, RDS, Kubernetes, and Lambda.
- Solid understanding of infrastructure-as-code principles and tools (e.g., Terraform, CloudFormation, Ansible).
- Demonstrated experience with containerization technologies (e.g., Docker, Kubernetes).
- Proficiency in monitoring and alerting systems (e.g., Prometheus, Grafana, New Relic, Datadog).
- Strong understanding of networking concepts, distributed systems, and database technologies.
- Excellent problem-solving skills and the ability to troubleshoot complex issues under pressure.
- Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams.
- Experience with continuous integration and continuous deployment (CI/CD) pipelines.
Get in on all the awesome at Instructure!
We offer competitive, meaningful benefits in every country where we operate. While they vary by location, here's a general idea of what you can expect:
- Competitive compensation, plus all full-time employees participate in our ownership program - because everyone should have a stake in our success.
- Flexible work culture. Our remote, hybrid and in-office collaboration spaces vary by role, team and location.
- Generous time off, including local holidays and our annual "Dim the Lights" period in late December, when teams are encouraged to step back and recharge based on departmental needs.
- Comprehensive wellness programs and mental health support.
- Learning and development resources, including professional development tools and tuition reimbursement, to support your growth.
- The technology and tools you need to do your best work.
- Motivosity employee recognition program.
- A culture rooted in inclusivity, support, and meaningful connection.
We believe in hiring great people and treating them right. The more diverse we are, the better our ideas and outcomes.