As an embedded SRE, you are a force multiplier. You won't be shipping product features or doing chores for other teams; you will be building the self-service infrastructure and guardrails that allow Redfin's foundational engineering teams to own their own speed and reliability. You act as a technical guide and advocate, meeting teams where they are to help them level up their operational game and teaching them how to independently navigate the complexities of high-performance frontend systems.
About the Role
- Enable Self-Service Performance: Build the RUM (Real User Monitoring) frameworks and automated deployment gates that empower frontend teams to maintain their own performance budgets.
- Operationalize AI: Build the "plumbing" for our AI gateway (cost tracking, usage monitoring, and reliability guardrails) so teams can leverage AI independently and safely.
- Intelligent Observability: Integrate AI-driven insights into our observability stack to automate incident diagnosis and help teams reduce MTTR.
- Strategic Advisory: Partner with core teams to provide "resilient-by-design" blueprints, helping them architect high-traffic paths that stay stable under load.
- Modernize Pipelines: Optimize delivery pipelines to be more observable and reliable, ensuring every deployment to our high-traffic environments is a non-event.
About You
- Operational Lead: You aren't afraid to "sweep the floors" alongside your teams. You dive into the manual toil and the messy bugs to understand them first, then you build the tools and share the knowledge so those problems never come back.
- Performance Architect: You understand the systemic impact of browser rendering, hydration, and CDN logic on the customer journey.
- Technically Versatile: You can navigate a Java/Spring backend and a TypeScript/React frontend to identify architectural risks.
- Pragmatic Educator: You enjoy teaching teams how to own their operational health through better tooling and standards.
- 5+ years in SRE, Platform, or Systems Performance.
- Proficiency in TypeScript & Java (enough to navigate complex codebases and identify bottlenecks).
- Observability Foundations: Experience with Core Web Vitals, APM instrumentation, or RUM.
- Automation Mindset: Experience building CLI tools, automated workflows, or CI/CD pipelines.
- Bonus Points: Experience with Datadog, AWS/Kubernetes resource optimization, or managing third-party AI service limits.
What you'll get
Our team members fuel our strategy, innovation and growth, so we ensure the health and well-being of not just you, but your family, too! We go above and beyond to give you the support you need on an individual level and offer all sorts of ways to help you live your best life. We are proud to offer eligible team members perks and health benefits that will help you have peace of mind. Simply put: We've got your back. Check out our full list of Benefits and Perks.
On-Call Expectations
This role may include participation in an on-call rotation to support production systems and ensure service reliability. On-call responsibilities may include coverage during nights and weekends. If applicable, frequency and scheduling will be determined by team needs and communicated accordingly.