Effective PagerDuty Incident Response in Slack
SlackOps for PagerDuty
Slack has been taking the business communications industry by storm, and the same is true for incident collaboration and coordination. Every day more and more enterprises are discussing, diagnosing, and solving incidents through Slack. While Slack is an amazing platform for collaboration, it is not an incident response platform, nor is it well suited to driving an effective process. To deliver an effective PagerDuty incident response in Slack you need help, and RigD is the solution!
In this series we highlight the key elements of an effective incident response process based on industry best practices, including PagerDuty’s incident response documentation. Each part in the series discusses the purpose and value of one of these elements, shows how to implement it in Slack using RigD, and gives you a sense of the time and cost impact. Let’s take a quick look at the topics covered.
Opening an incident in Slack can save a good amount of time, but more importantly it can significantly reduce the complexity and training effort compared to opening an incident through the PagerDuty web UI. If you are opening a lot of manual incidents at your organization, this is a must-read.
Getting to the right person fast saves a tremendous amount of wasted time in incident response. Identifying who is on call and connecting with them is at the core of what PagerDuty does, and looking that information up directly in Slack can save even more time.
You wouldn’t be reading this blog if you were not already using Slack to collaborate on incidents. However, there is a substantial benefit to using a dedicated Slack channel per incident, or at the least a Slack thread. This part will show you how to make that easy.
Part 4 – PagerDuty Incident Status in Slack
Incident updates are a key part of keeping everyone aligned, both internally and externally. A timely external update can be the difference between keeping and losing a customer. Never miss an update by driving them through Slack.
Technical incident response has been modeled after military and emergency services incident response. Identifying who is in command and executing an established triage process are crucial for both. They can be done with ease using RigD in Slack.
The final and most critical element of incident response is the postmortem. Use Slack and RigD to ensure you never miss a postmortem and to drive a better process.
The Impact of an Effective PagerDuty Incident Response
In each part we take a look at the time and cost savings that can be realized from running an effective PagerDuty incident response process in Slack. To demonstrate the value, we use two data sources: the Rand Group report, which puts the cost of downtime for an enterprise at $5,600 per minute, and the PagerDuty ROI study, which computed an average of 20,483 incidents and 14 outages across the enterprises included in the study. Each of the elements discussed provides meaningful savings, but in aggregate the result is truly impressive. At 2,162 hours of savings, using RigD is like adding an extra team member. Reducing the related outage costs by 80% is sure to excite everyone who watches the bottom line. Most importantly, though, your customers will appreciate that they can get on with their business faster. Even if your operational workload is only a fraction of these enterprise benchmarks, the benefits still demand that you take a look at how RigD can transform your operations.
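To make the benchmark figures a little more concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above; the function names, the 10-minute outage, and the assumption that the 2,162 saved hours are spread evenly across all 20,483 incidents are ours for illustration and are not taken from either report.

```python
# Figures quoted in this post.
COST_PER_MINUTE_USD = 5_600   # Rand Group: enterprise cost per minute of downtime
INCIDENTS = 20_483            # PagerDuty ROI study: average incident count
HOURS_SAVED = 2_162           # aggregate time savings cited for RigD


def downtime_cost(minutes: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Cost of an outage lasting `minutes`, at a flat per-minute rate."""
    return minutes * cost_per_minute


def minutes_saved_per_incident(hours_saved: float, incidents: int) -> float:
    """Average minutes saved per incident.

    Assumption (ours): the savings are spread evenly across every incident.
    """
    return hours_saved * 60 / incidents


if __name__ == "__main__":
    # Hypothetical 10-minute outage, chosen only as an example duration.
    print(f"10-minute outage: ${downtime_cost(10):,.0f}")
    avg = minutes_saved_per_incident(HOURS_SAVED, INCIDENTS)
    print(f"Average saving per incident: {avg:.2f} minutes")
```

Swap in your own incident count and outage durations to see how the same arithmetic plays out at a smaller scale.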