
Resources

Browse through videos, guides, and other educational resources that cover incident management, reliability, team culture, and more.
Blog
7.1.2021

Elephant in the Blameless War Room: Accountability

Podcasts
6.23.2021

Resilience in Action E8: Vanessa Yiu on Crafting Enterprise Architecture

Kurt chats with Vanessa Yiu, Head of Enterprise Architecture at Goldman Sachs. Vanessa shares her perspective on enterprise architecture, experience in operating enterprise-scale platforms, chairing the first global SRECon, advocating for women in STEM, and how enterprises can embark on the journey of making reliability more important.
Blog
6.11.2021

Complete Guide to Service Level Objectives (SLOs) That Work

A "Service Level Objective" (SLO) is an internal target for how well a service should perform. Here's how SLOs relate to SLAs, SLIs, and error budgets.
Videos
6.2.2021

LISA21 - Groove with Ambiguity: The Robust, the Reliable, and the Resilient

The networked software systems we build are increasing in complexity every moment. Today the most successful builders and operators are embracing complexity through CI/CD, Chaos Engineering, and innovation in Incident Response. They realize that the adaptive world around us is advancing at such breakneck speed that it is leaving our capacity to understand it in the dust, and that humans and technology must run a gauntlet of automation surprises and collaboration challenges as a team, learning and improving along the way. This session showcases methods of deploying, running, and navigating complexity. It offers a practical view of how software systems can scale and remain robust to failure (through fallbacks or high availability), achieve highly reliable socio-technical operations (via runbooks and game days), and adapt to surprise through techniques of resilience engineering (graceful extensibility and building for adaptation).
Blog
6.2.2021

Error Budgets Explained (And How to Make One for Your Team)

Wondering what error budgets (EBs) are and how they are useful? We explain what they are, how they are defined, and how they can help your team.
Blog
5.31.2021

The 7 SRE Principles [And How to Put Them Into Practice]

Whether you're just adopting SRE or optimizing your current processes, we can help. We’ll explain the 7 key principles of SRE and how to put them into practice.
Blog
5.25.2021

Building an SRE Team? Roles and Responsibilities Explained

Are you considering adopting SRE? We will explain the roles and responsibilities of an SRE team within your organization, and how to start building one.
Blog
5.24.2021

SRE Culture [How to Build a Better Team]

If you're just adopting SRE or improving your current environment, we’ll help explain SRE culture and how to create a blameless development process. So what is SRE Culture? Let's talk about it.
Podcasts
5.19.2021

Resilience in Action E7: Killing Ops with Tony Hansmann

In our seventh episode, Kurt chats with Tony Hansmann, Former Global CTO at Pivotal Software, Inc., about the joys and pains of being a consultant, how teams view digital transformation, how Tony is working towards killing ops, and more.
Blog
5.10.2021

SRE vs. DevOps [Understanding Differences & Similarities]

Site Reliability Engineering (SRE) and DevOps share a goal of building a bridge between development and operations. We'll explore and compare both approaches.

Customer Success Stories

Agero

Agero’s Incident Management Is “Invincible” with the Help of Blameless Automation

Eventbrite

Eventbrite Mitigates Risk by Improving MTTA by 10X

Citrix, Greenlight, and Incognia

Top Reliability and Scaling Practices from Experts at Citrix, Greenlight Financial Technology, and Incognia

Machinify

Machinify gets "tremendous value" from Blameless, responds to incidents confidently with universal insight on service reliability
Incident Impact Calculator

Find out how much you could save

Incidents can do real damage to companies that aren't sufficiently prepared for them. Use our calculator to estimate the full cost of incidents for your team.