Resources

Browse through videos, guides, and other educational resources that cover incident management, reliability, team culture, and more.

Blog

7.1.2021

Elephant in the Blameless War Room: Accountability

Podcasts

6.23.2021

Resilience in Action E8: Vanessa Yiu on Crafting Enterprise Architecture

Kurt chats with Vanessa Yiu, Head of Enterprise Architecture at Goldman Sachs. Vanessa shares her perspective on enterprise architecture, experience in operating enterprise-scale platforms, chairing the first global SRECon, advocating for women in STEM, and how enterprises can embark on the journey of making reliability more important.

Blog

6.11.2021

Complete Guide to Service Level Objectives (SLOs) That Work

A "Service Level Objective" (SLO) is an internal target that measures how well a service is performing. Here's how they relate to SLAs, SLIs, and error budgets.

Blog

6.2.2021

Error Budgets Explained (And How to Make One for Your Team)

Wondering what error budgets (EBs) are and how they are useful? We explain what they are, how they are defined, and how they can help your team.
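
As a rough illustration of the idea (not taken from the linked article), an error budget is commonly treated as the share of events that an SLO allows to fail within a window. The sketch below assumes a request-based SLI; the function names and the example numbers are hypothetical.

```python
def error_budget(slo_target: float, total_requests: int) -> int:
    """Return how many requests may fail in the window without breaching the SLO.

    slo_target is a fraction such as 0.999 (assumed request-based SLI).
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    # round() absorbs floating-point noise in (1 - slo_target)
    return round((1.0 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        # A 100% target leaves no budget: any failure overspends it.
        return 1.0 if failed_requests == 0 else float("-inf")
    return 1.0 - failed_requests / budget


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests against a 99.9% target.
    print(error_budget(0.999, 1_000_000))
    print(budget_remaining(0.999, 1_000_000, 250))
```

A team might compare the remaining fraction against a threshold to decide whether to slow down releases; that policy is a choice for each team and is not prescribed here.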

Videos

6.2.2021

LISA21 - Groove with Ambiguity: The Robust, the Reliable, and the Resilient

The networked software systems we build are increasing in complexity every moment. Today the most successful builders and operators are embracing complexity through CI/CD, Chaos Engineering, and innovation in Incident Response. They realize that the adaptive world around us is advancing at such breakneck speed that it is leaving our capacity to understand it in the dust, and that humans and technology must run a gauntlet of automation surprises and collaboration challenges as a team, learning and improving along the way. This session showcases methods of deploying, running, and navigating complexity. It offers a practical view of how software systems can scale and remain robust to failure (through fallbacks or high availability), achieve highly reliable socio-technical operations (via runbooks and game days), and adapt to surprise through techniques of resilience engineering (graceful extensibility and building for adaptation).

Blog

5.31.2021

The 7 SRE Principles [And How to Put Them Into Practice]

Whether you're just adopting SRE or optimizing your current processes, we can help. We’ll explain the 7 key principles of SRE and how to put them into practice.

Blog

5.25.2021

Building an SRE Team? Roles and Responsibilities Explained

Are you considering adopting SRE? We will explain the roles and responsibilities of an SRE team within your organization, and how to start building one.

Blog

5.24.2021

SRE Culture [How to Build a Better Team]

If you're just adopting SRE or improving your current environment, we’ll help explain SRE culture and how to create a blameless development process. So what is SRE Culture? Let's talk about it.

Podcasts

5.19.2021

Resilience in Action E7: Killing Ops with Tony Hansmann

In our seventh episode, Kurt chats with Tony Hansmann, Former Global CTO at Pivotal Software, Inc., about the joys and pains of being a consultant, how teams view digital transformation, how Tony is working towards killing ops, and more.

Blog

5.10.2021

SRE vs. DevOps [Understanding Differences & Similarities]

Site Reliability Engineering (SRE) and DevOps share a goal of building a bridge between development and operations. We'll explore and compare both approaches.

Customer Success Stories

Agero

Agero’s Incident Management Is “Invincible” with the Help of Blameless Automation

Eventbrite

Eventbrite Mitigates Risk by Improving MTTA by 10X

Citrix, Greenlight, and Incognia

Top Reliability and Scaling Practices from Experts at Citrix, Greenlight Financial Technology, and Incognia

Machinify

Machinify gets "tremendous value" from Blameless, responds to incidents confidently with universal insight on service reliability

Incident Impact Calculator

Find out how much you could save

Incidents can do real damage to companies that aren't sufficiently prepared for them. Use our calculator to estimate the full cost of incidents for your team.

Use the calculator
