ChatOps – SRE Managers: Automated Response to Service Issues
RigD Platform for Automated Response Part 2 – Core Capabilities for ChatOps
ChatOps has been a topic of interest in the DevOps and IT Operations area for a couple of years. It is a model in which DevOps teams work in a collaboration platform and an in-channel bot listens for key commands. The chatbot assists the team in working an issue. The bot may do a variety of tasks, such as feeding information to the channel or helping the staff run simple commands. We at RigD work in Slack, and we call this general concept SlackOps. RigD takes this concept to a whole new level with platform features that are not just simple API calls but actually assist the staff in performing activities, both machine-centric ones and those that help the people managing the crisis do their work to the best of their abilities and in line with team and organizational best practices.
A typical DevOps team can have dozens of tools to work with. Keeping up with all these tools, their constructs, their APIs and their assumptions can be daunting. This is one of the key areas where ChatOps or SlackOps can really help the team. RigD’s model for tool integration is one where the platform drives the integration, making it easy to set up a variety of tools.
Setting up a tool is straightforward, and the next steps to achieve value are presented clearly. You do not need to be a DevOps engineer with 20 years of deep experience; everyone can get started quickly.
Feeds and Filtering
Once your tool is connected, you will inevitably want all that delicious tool information, such as incidents, logs, configuration or state changes, to be filtered into the right channels through feeds. That sounds straightforward, but you want only a relevant subset of your information in each channel. The last thing you want (and we hear this happens more often than not) is for the team to dump all this information into a feed that fills up over time with a long list of relevant and not so relevant information. People mute the channel, then eventually go off to another channel and try again. This is no way to get the right information where you need it. RigD has powerful feed controls.
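To make the idea of feed filtering concrete, here is a minimal sketch in Python. It is not RigD's implementation or API; the names Event, FeedRule, matches and route, the severity levels and the channel names are all assumptions made up for illustration. It only shows the general technique of routing each tool event to the channels whose rules it satisfies.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical severity scale, ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass(frozen=True)
class Event:
    """A single item coming from a connected tool (illustrative shape only)."""
    source: str    # e.g. "pagerduty"
    kind: str      # e.g. "incident", "log", "config_change"
    severity: str  # one of SEVERITY_ORDER
    text: str


@dataclass(frozen=True)
class FeedRule:
    """Describes which events a channel wants to receive (hypothetical)."""
    channel: str
    sources: FrozenSet[str]
    kinds: FrozenSet[str]
    min_severity: str


def matches(rule: FeedRule, event: Event) -> bool:
    """Return True when the event passes every filter in the rule."""
    severe_enough = (
        SEVERITY_ORDER.index(event.severity)
        >= SEVERITY_ORDER.index(rule.min_severity)
    )
    return (
        event.source in rule.sources
        and event.kind in rule.kinds
        and severe_enough
    )


def route(event: Event, rules: List[FeedRule]) -> List[str]:
    """Return the channels that should receive this event."""
    return [rule.channel for rule in rules if matches(rule, event)]


rules = [
    FeedRule("#payments-oncall", frozenset({"pagerduty"}),
             frozenset({"incident"}), "high"),
    FeedRule("#ops-firehose", frozenset({"pagerduty"}),
             frozenset({"incident", "log"}), "low"),
]
noisy = Event("pagerduty", "log", "low", "disk usage at 61%")
urgent = Event("pagerduty", "incident", "critical", "checkout API down")
print(route(noisy, rules))
print(route(urgent, rules))
```

With rules like these, the low-severity log line only reaches the catch-all channel, while the critical incident reaches both, so the on-call channel stays quiet enough that nobody needs to mute it.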
Response or Work an Incident with ChatOps
Once an external tool identifies an incident, RigD can help you respond to it through a set of API activities. RigD can even remind you to make updates. Think of this as your co-pilot as you and your team race around the incident crisis track. A particularly good blog to look at is here. RigD has a concept called work an incident, where we create an activity “container” around all the work that occurs in a Slack channel to respond, troubleshoot, engage, question, triage, etc.
Our AI and Machine Intelligence keeps track of all the people and activities (machine or human) involved while the team is working an incident. This data is critical to monitoring and evaluating how successfully the team is responding to and resolving an incident. When the team resolves an incident, the work is stopped and an archive of all the activities in the incident is logged and available for analysis. Over time and through many incidents, this dataset will be the beginning of insights that RigD’s machine intelligence will use to suggest improvements for the team. Additionally, the data is critical for postmortems.
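One way to picture the activity "container" is as a record that opens when work on an incident starts, collects every human or machine action, and is archived on resolution. The Python sketch below is only an illustration under that assumption; IncidentWork, Activity, record and resolve are hypothetical names, not RigD's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


def _now() -> datetime:
    """Current time in UTC, so timestamps are comparable across the team."""
    return datetime.now(timezone.utc)


@dataclass
class Activity:
    actor: str       # person or bot that acted
    actor_type: str  # "human" or "machine"
    action: str      # free-text description of what was done
    at: datetime


@dataclass
class IncidentWork:
    """Hypothetical container for all work done on one incident in one channel."""
    incident_id: str
    channel: str
    started_at: datetime = field(default_factory=_now)
    resolved_at: Optional[datetime] = None
    activities: List[Activity] = field(default_factory=list)

    def record(self, actor: str, actor_type: str, action: str) -> None:
        """Log one activity; refuse new entries once the incident is resolved."""
        if self.resolved_at is not None:
            raise RuntimeError("incident already resolved")
        self.activities.append(Activity(actor, actor_type, action, _now()))

    def resolve(self) -> Dict[str, object]:
        """Stop the work and return an archive summary for later analysis."""
        self.resolved_at = _now()
        return {
            "incident_id": self.incident_id,
            "channel": self.channel,
            "participants": sorted({a.actor for a in self.activities}),
            "activity_count": len(self.activities),
            "duration_seconds": (
                self.resolved_at - self.started_at
            ).total_seconds(),
        }


work = IncidentWork("INC-1", "#payments-oncall")
work.record("dana", "human", "acknowledged the page")
work.record("bot", "machine", "posted recent deploys")
archive = work.resolve()
print(archive["participants"], archive["activity_count"])
```

An archive of this shape, collected across many incidents, is the kind of dataset a postmortem or a trend analysis could start from.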
While you can do a significant amount of work with individual activities, speeding up simple tasks by 50 to 60%, the key to really banking time savings is the use of flows. Oftentimes you need to perform multiple tasks within a tool’s UI, which involves bouncing around to different screens. A flow simplifies this. Performance-critical updates between tools are another area where flows are crucial. For example, after doing all your activities within PagerDuty, you may need to update your Atlassian Statuspage and then log a bug or a ticket in JIRA. Automating this cross-tool workflow reduces work and eliminates fat-finger errors. Besides all the CRUD of managing a flow, you can also set up triggers and aliases (making flows easy for the team to remember and type) and control access.
We provide hundreds of activities out of the box for the tools we have built integrations for. You can always create your own activities to integrate your favorite tools or your own custom endpoints.
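The PagerDuty, Statuspage and JIRA sequence can be sketched as a flow: an ordered list of steps that share a context, reachable through a short alias. This Python sketch is an assumption-laden illustration, not RigD's flow engine. The step functions are stubs that only update a dictionary and make no real API calls, and the names run_flow, FLOWS, ALIASES and "close-out" are hypothetical.

```python
from typing import Callable, Dict, List

Context = Dict[str, str]
Step = Callable[[Context], Context]


# Stub steps: each one stands in for a real tool call and only
# records what it would have done in the shared context.
def resolve_pagerduty_incident(ctx: Context) -> Context:
    return {**ctx, "pagerduty_status": "resolved"}


def update_statuspage(ctx: Context) -> Context:
    return {**ctx, "statuspage_status": "operational"}


def file_jira_ticket(ctx: Context) -> Context:
    summary = f"Follow-up for incident {ctx['incident_id']}"
    return {**ctx, "jira_summary": summary}


def run_flow(steps: List[Step], ctx: Context) -> Context:
    """Run each step in order, passing the updated context along."""
    for step in steps:
        ctx = step(ctx)
    return ctx


# Hypothetical flow registry and alias table.
FLOWS: Dict[str, List[Step]] = {
    "close-out": [
        resolve_pagerduty_incident,
        update_statuspage,
        file_jira_ticket,
    ],
}
ALIASES: Dict[str, str] = {"co": "close-out"}


def run_named_flow(name: str, ctx: Context) -> Context:
    """Look up a flow by name or alias and run it."""
    flow_name = ALIASES.get(name, name)
    return run_flow(FLOWS[flow_name], ctx)


result = run_named_flow("co", {"incident_id": "INC-1"})
print(result)
```

Because the incident identifier is typed once and carried through every step, there is no retyping between tools, which is where fat-finger errors usually creep in.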
Value Reporting for ChatOps
RigD keeps track of all the work that you and your team do and, based upon benchmarks, can help you understand how much time you are saving. It is a great way for your team to show how much time and money you are saving through automation and ChatOps.
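The arithmetic behind a benchmark-based savings estimate can be sketched as follows. This is an assumed formula, not RigD's reporting logic: for each activity type, multiply the number of runs by the difference between a manual benchmark and the automated time. The function name and every number below are placeholders for illustration, not measured results.

```python
from typing import Dict


def time_saved_minutes(
    activity_counts: Dict[str, int],
    manual_minutes: Dict[str, float],
    automated_minutes: Dict[str, float],
) -> float:
    """Sum count * (manual benchmark - automated time) over activity types."""
    total = 0.0
    for name, count in activity_counts.items():
        total += count * (manual_minutes[name] - automated_minutes[name])
    return total


# Placeholder figures only; substitute your own benchmarks.
counts = {"acknowledge_incident": 10, "update_status_page": 4}
manual = {"acknowledge_incident": 2.0, "update_status_page": 5.0}
automated = {"acknowledge_incident": 1.0, "update_status_page": 2.0}

saved = time_saved_minutes(counts, manual, automated)
print(saved)  # 10 * 1.0 + 4 * 3.0 = 22.0
```

Multiplying the result by a loaded hourly rate would turn the time figure into a money figure, if your team reports savings that way.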
Save time and increase your automated response with PagerDuty and Statuspage.
Yes, you can build a bot for a few tasks, but will it have all the capabilities of an enterprise-grade platform? Try RigD free with a 30-day trial, no credit card required. Click here.