
AIOps, observability and a side of Whisky

Buzzwords are omnipresent in technology and, of course, the APM space is no different. Over the last few years, AIOps and observability have come charging into this domain. At a recent community event, Dynatrace took some time to discuss what these terms mean and how they differ, all while sampling whisky.

Alois Reitbauer, VP Chief Technology Strategist at Dynatrace, drew many comparisons between the process of crafting whisky and building out an effective observability model. Quite the feat indeed!

Key Performance Indicators (KPIs) and monitoring

We kicked off the tasting and the discussion by talking through the processes of monitoring software and making whisky, and the similarities between them. Both are long processes, whether from code to production or from mash to whisky. Each process has expectations that must be met, and KPIs gathered by monitoring it are used to measure its status and success and to understand performance. In monitoring, KPIs include failure rate, response time, and errors, among others; in whisky creation, as our whisky taster Greg walked us through, the KPIs are flavor, color, and smell.

In software, we use these metrics to compare services with one another. Once we have the metrics, we must make sense of them, which brings us to observability.
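To make the monitoring KPIs concrete, here is a minimal Python sketch that derives failure rate and mean response time per service from a handful of request records. The `Request` type, the `service_kpis` function, and the sample data are all hypothetical and exist only for illustration; they are not part of any Dynatrace API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    """One monitored request (hypothetical record shape)."""
    service: str
    duration_ms: float
    failed: bool


def service_kpis(requests):
    """Group requests by service and compute simple KPIs for each one."""
    by_service = {}
    for r in requests:
        by_service.setdefault(r.service, []).append(r)
    return {
        name: {
            "requests": len(rs),
            # Share of requests that failed, between 0.0 and 1.0.
            "failure_rate": sum(r.failed for r in rs) / len(rs),
            "mean_response_ms": mean(r.duration_ms for r in rs),
        }
        for name, rs in by_service.items()
    }


# Made-up sample data for two services.
sample = [
    Request("checkout", 120.0, False),
    Request("checkout", 480.0, True),
    Request("search", 35.0, False),
    Request("search", 45.0, False),
]
kpis = service_kpis(sample)
```

With the same KPIs computed for every service, comparing `kpis["checkout"]` against `kpis["search"]` is a plain dictionary lookup. The numbers on their own still say nothing about why one service behaves worse than another.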

Observability

Observability gives us the means to obtain knowledge from these metrics so that we truly understand the application: what's happening inside the system and how it is performing. Whisky is similar: a lighter color, for example, reveals key information about how it was produced. In applications, observability gives us a model which, once fed the right data, allows us to ask the right questions: Where is our bottleneck? What isn't currently normal? Where are we failing most? It moves beyond raw metrics by applying meaning, context, and a model to the monitoring data: metrics, logs, and traces.

But observability can be taken one step further by automating the questions your IT teams are asking. To do so, many IT teams take advantage of Artificial Intelligence for IT Operations, more commonly known as AIOps. AIOps replaces multiple separate, manual IT operations tools with a single, intelligent, and automated IT operations platform, enabling a quicker and more proactive response from IT Ops with far less effort.

For this use case, Alois walked us through how Dynatrace takes advantage of our built-in AI engine, Davis, which continuously looks for problems and provides the precise root cause in real time. As a result, resolutions can happen in minutes, before outcomes are impacted. Dynatrace OneAgent gathers the monitoring data (metrics, logs, and traces); observability comes from overlaying our model on this data; and Davis then continually asks the right questions of the resulting data model. Our guest speaker walked us through some of the advantages of this later in the event.

Whisky's “observability” differs in cadence: a whisky is typically tasted every 2-4 weeks, whereas in technology we are talking about data points being checked every millisecond. Observability provides us with the model and tells us what to expect, so when we receive data that differs from what we expect, we have an anomaly.

Anomaly detection and AIOps

Anomaly detection leads us into the next evolution in monitoring, and more specifically AIOps, which reaches its zenith with automated root cause analysis. It's important to understand that detecting an anomaly is not enough. For example, we may see that the failure rate of a method has increased and that this is clearly a degradation from the expected value at this hour, on this day, but simply spotting an anomaly isn't always actionable. This was part of the issue with the second generation of monitoring. Automated root cause analysis is the next evolution beyond anomaly detection: it goes past the failure rate increase and states why the anomaly appeared, for example because a wrong data type was passed to the method. That completed the walkthrough from monitoring through observability and into AIOps and the future, embodying the growth from the 2nd to the 3rd generation of APM, something our guest speaker Michael Akers also discussed later in the session.
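The idea of an "expected value at this hour, on this day" can be sketched in a few lines of Python. This is a deliberately simple illustration that keeps a mean and standard deviation per weekday and hour and flags values more than a chosen number of standard deviations away. It is an assumption made for the example, not a description of how Davis works, and the function names, threshold, and sample values are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def build_baseline(history):
    """Map (weekday, hour) -> (mean, stdev) of past failure rates.

    `history` is an iterable of (timestamp, failure_rate) pairs.
    Slots with fewer than two samples are skipped because a
    standard deviation cannot be computed for them.
    """
    buckets = defaultdict(list)
    for ts, rate in history:
        buckets[(ts.weekday(), ts.hour)].append(rate)
    return {
        key: (mean(vals), stdev(vals))
        for key, vals in buckets.items()
        if len(vals) >= 2
    }


def is_anomaly(baseline, ts, rate, threshold=3.0):
    """Return True if `rate` deviates strongly from the slot's baseline."""
    key = (ts.weekday(), ts.hour)
    if key not in baseline:
        return False  # No expectation for this slot, so nothing to compare.
    expected, spread = baseline[key]
    if spread == 0:
        return rate != expected
    return abs(rate - expected) / spread > threshold


# Made-up failure rates for four consecutive Mondays at 09:00.
history = [
    (datetime(2024, 1, 1, 9), 0.010),
    (datetime(2024, 1, 8, 9), 0.012),
    (datetime(2024, 1, 15, 9), 0.011),
    (datetime(2024, 1, 22, 9), 0.009),
]
baseline = build_baseline(history)
spike = is_anomaly(baseline, datetime(2024, 1, 29, 9), 0.08)
normal = is_anomaly(baseline, datetime(2024, 1, 29, 9), 0.011)
```

Here `spike` is flagged and `normal` is not. Note that the check only reports that the failure rate is unusual for a Monday at 09:00; it says nothing about the cause, which is exactly the gap that automated root cause analysis is meant to close.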

Applying this in the real world – A sit-down interview

We then shifted gears for another round of whisky tasting, which led into a conversation with Michael Akers, Monitoring Services Manager at Vitality, and Nina Harris, an Account Director at Dynatrace, about what observability means to Michael, how it's helping him daily, and how it moves from theory into everyday use. Michael looks after monitoring tools covering logs, infrastructure, APIs and applications, and end-user devices.

Michael walked us through the history at Vitality. The company formerly used Dynatrace's 2nd Gen product, AppMon, which suffered from the same problems as other 2nd Gen tooling: a lack of integration and automation, and a higher total cost of ownership. The team at Vitality knew it had to change, as the tooling was impeding application development and, by extension, slowing the delivery of new features. That was when the decision was made to move away from simply “monitoring” and adopt a full-stack observability model with AIOps built in: the Dynatrace Software Intelligence Platform.

Michael went on to highlight some of the key value gains Vitality has seen with Dynatrace, such as:

  • Out-of-the-box integration with ITSM tooling
  • ChatOps sent directly to the right team – No time is lost in support, which results in a huge MTTR benefit, and fewer analysts are needed, so teams can focus these resources on proactive and innovative work
  • AIOps has eliminated firefighting – Teams can focus on innovation work; one example is shifting observability left to create an unbreakable pipeline
  • Full-stack observability in a single tool has enabled teams to accelerate to meet business demands
  • Dynatrace Business Insights – Enabled collaboration between business and technical teams and provided insight into business metrics, allowing teams to identify key end users, view their performance, and proactively reach out to or notify them about issues

Several drams of whisky later, it's safe to say we gained clarity on the meanings of, and differences between, monitoring, observability, and AIOps, and of course, whisky! From learning about different flavor profiles to how to taste whisky and the process behind it, it was a fantastic session from both Greg and Alois. And of course, a special thanks to our guest speaker Michael, who walked us through what the benefits of these buzzwords look like in the real world.

Read eBook!

It's also worth highlighting that there is a free eBook that discusses observability specifically, along with the pillars that form its foundation: logs, metrics, and traces. You can also read more about Vitality's story with Dynatrace here.