Getting the Most Out of SRE, SLOs, and Error Budgets with Joseph Bironas at Collective Health

Christina Tan

7.13.2018

What nuances in execution and mentality separate successful SRE implementations from the failed ones? How can you get the most out of your SLOs and error budgets?

Joseph Bironas shares often-overlooked but critical insights that answer these questions. Joseph has 14 years of experience in SRE, 12 of them at Google. His insider's perspective is incisive, multidisciplinary, and empathetic, linking the significance of SRE to both business and engineering.

Joseph currently leads the SRE team at Collective Health, a company that is transforming the employer-driven healthcare economy, redefining the way health benefits work.

Podcast Summary

This “CliffsNotes” summary curates the key points that were discussed by Joseph Bironas in the 50-minute interview. It is not a standalone article and is most valuable when contextualized by the podcast.

The significance of reliability:

To engineering: product quality is just as important as product functionality.
To business: a reliable product is key to a company's brand & its customers' trust.

Fundamentally,

SLIs are user experience-centric.
SLOs are an organizational guardrail for managing risk.
SRE teams set a perimeter of defense, then slowly expand.
Operate without blame.

Counterintuitive Mentality Shifts

For successful SRE implementations

Consider reliability as a core feature.
People implicitly fixate on 100% reliability, but we should never aim for 100% reliability.
It's not enough to set a boundary for risk with SLOs; you also want to proactively control that risk by running experiments that test and address key system vulnerabilities.
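
To make the point about 100% reliability concrete, here is a minimal sketch of the standard error budget arithmetic. It is an illustration only, not something taken from the podcast: the function names, the 30-day window, and the request counts are all assumptions chosen for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime (in minutes) an availability SLO allows over a window.

    slo_target is a fraction, e.g. 0.999 for "three nines".
    The 30-day default window is an assumption for illustration.
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent, based on event counts.

    Returns 1.0 when nothing has been spent, 0.0 when the budget is
    exactly exhausted, and a negative value when the SLO is breached.
    """
    if not 0.0 < slo_target < 1.0:
        # A 100% target leaves no budget at all, so the ratio is undefined.
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0 or not 0 <= good_events <= total_events:
        raise ValueError("need 0 <= good_events <= total_events and total_events > 0")
    allowed_failures = (1.0 - slo_target) * total_events
    actual_failures = total_events - good_events
    return 1.0 - actual_failures / allowed_failures
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime, while a 100% target allows none. That leaves no room for the experiments mentioned above, or for any other change that carries risk.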
