Events – a flood or mountain creek

It’s hard to live with them, but even harder without them. Events – indicators of a healthy environment or signs of disease.

According to ITIL, an event can be defined as “any change of state that has significance for the management of a configuration item (CI) or IT service.”

Why events and event management?

Imagine this as a patient lying in a hospital bed and being connected to all those fancy (let’s be serious – very useful) devices. Properly configured, they can give a lot of needed (sometimes life-saving) information so doctors always have a clear picture about the state of the patient. But, if they produced useless reports and created information overflow, information that was needed to save the patient’s life could get lost. It’s the same in an IT environment.

Service operation is responsible for “keeping the lights on,” i.e. taking care of live services. To do this responsibly and efficiently, service operation staff need to know the status of the infrastructure and services under their responsibility, and be able to detect any deviation from normal or expected operation. Events are the instrument for that. They are usually recognized through notifications created by an IT service, a CI (configuration item) or a monitoring tool.

And that’s where the party begins. Someone has to define which monitoring data will be used, what they mean and what to do with them. That is the purpose of the Event Management process. Or, to put it officially, the purpose of event management is to manage events throughout their lifecycle. The lifecycle covers the activities needed to detect events, make sense of them and determine the appropriate action.

Monitoring tools

Let’s see what should be considered while designing the event management process. Event management is the basis for operational monitoring and control. There are two types of monitoring tools:

  • Active monitoring tools – the tool polls CIs to determine their status. Exceptions generate alerts that need to be communicated further (to the appropriate tool or person/team).
  • Passive monitoring tools – the tool detects and correlates alerts generated by CIs and performs predefined actions.
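
The difference between the two types can be sketched in a few lines of Python. This is only an illustration, not how any particular monitoring product works: the `Alert` class, the `active_poll` function, the `PassiveListener` class and the CI names are all hypothetical. In the active case the tool asks each CI for its status; in the passive case the CI pushes an alert and the tool runs a predefined action.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alert:
    ci: str        # name of the configuration item (hypothetical)
    message: str


def active_poll(status_checks: Dict[str, Callable[[], bool]]) -> List[Alert]:
    """Active monitoring: the tool asks every CI for its status."""
    alerts = []
    for ci, is_healthy in status_checks.items():
        if not is_healthy():
            alerts.append(Alert(ci, "status check failed"))
    return alerts


class PassiveListener:
    """Passive monitoring: CIs send alerts, the tool reacts to them."""

    def __init__(self, actions: Dict[str, Callable[[Alert], None]]):
        self.actions = actions          # predefined action per CI
        self.received: List[Alert] = []

    def on_alert(self, alert: Alert) -> None:
        self.received.append(alert)
        action = self.actions.get(alert.ci)
        if action is not None:
            action(alert)


# Active: poll two made-up CIs, one of which reports a problem.
checks = {"web-01": lambda: True, "db-01": lambda: False}
polled = active_poll(checks)

# Passive: the CI pushes an alert and the predefined action is executed.
notified: List[Alert] = []
listener = PassiveListener({"db-01": notified.append})
listener.on_alert(Alert("db-01", "disk almost full"))
```

In practice the status checks would be SNMP queries, pings or API calls, and the predefined action would forward the alert to a tool or a team.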

We have to draw the line here between monitoring tools and Event Management tools. Monitoring tools are usually specialized tools for a certain technology. They monitor and create events. Event Management tools are usually part of the IT Service Management tool and integrate the Event Management process with other ITIL processes (e.g. Incident Management). They take over events from monitoring tools and introduce them to the Event Management process (or, in other words, workflow).

Monitoring tools can generate a lot of events (or a flood, as in the title). Some are less useful, while others could be very useful. Therefore, Event Management uses categorization, filtering and correlation.

Three types of events

What I usually find are three types (i.e. categories) of events:

  • Informational – such events are for informational purposes and don’t require any type of action. What usually happens with such events is that they are stored in log files. Purpose – e.g. statistics. Example: a user has logged in to an application.
  • Warning – the event indicates a situation that must be checked and followed by a certain action. Example: a server’s RAM utilization is above 75%.
  • Exceptional – the event indicates that a CI or service is operating abnormally. Usually, the Service Level Agreement (SLA) or business process is breached.

Of course, based on real situations, categorization can vary. That depends also on tools that are used and other IT Service Management processes.
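
As a rough sketch, the three categories can be expressed as a small classification function. The 75% RAM threshold comes from the example above; the event fields (`ram_percent`, `sla_breached`, `service_down`) are assumptions made for this illustration and will look different in a real tool.

```python
from enum import Enum


class Category(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTIONAL = "exceptional"


RAM_WARNING_THRESHOLD = 75.0  # percent, taken from the example in the text


def categorize(event: dict) -> Category:
    """Assign one of the three categories to an event (hypothetical fields)."""
    # An SLA breach or a service that is down means abnormal operation.
    if event.get("sla_breached") or event.get("service_down"):
        return Category.EXCEPTIONAL
    # High RAM utilization must be checked, but nothing is broken yet.
    if event.get("ram_percent", 0.0) > RAM_WARNING_THRESHOLD:
        return Category.WARNING
    # Everything else (e.g. a user login) is only logged.
    return Category.INFORMATIONAL


login = categorize({"type": "user_login", "user": "alice"})
high_ram = categorize({"type": "ram_usage", "ram_percent": 82.5})
outage = categorize({"type": "service_check", "service_down": True})
```

The most severe condition is checked first, so an event that matches both a warning rule and an exception rule ends up as exceptional.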

Figure: Incident generation, filtering and correlation

Filtering and correlation can take place either in the monitoring tools or in the Event Management tool. In my experience, it is more useful if the monitoring tool filters and correlates, but that is not often the case (usually, it means further investment). Filtering means that not all events should be communicated to the Event Management tool. Some events don’t bring any valuable information (they are generated only because the notification cannot be turned off), and it is better to keep them either on the CIs that generate them or inside the monitoring tool. Once events have been generated, it must be decided what to do and how to proceed with them. This is the job of correlation. Correlation separates events into categories (remember: informational, warning, exceptional) and adds some logic. For example, only the first in a series of events related to the same CI will be communicated to the Event Management tool.
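
A minimal sketch of both steps, under stated assumptions: events are plain dictionaries with `ci`, `type` and `time` (seconds) fields, `heartbeat` stands in for an event type that brings no valuable information, and a "series" is taken to mean events of the same type on the same CI that follow each other within five minutes. None of these names or values come from ITIL or from a specific tool.

```python
from typing import Dict, List, Tuple

NOISE_TYPES = {"heartbeat"}   # hypothetical event types with no value
WINDOW_SECONDS = 300          # hypothetical gap that ends a series


def filter_events(events: List[dict]) -> List[dict]:
    """Filtering: drop events that should never reach Event Management."""
    return [e for e in events if e["type"] not in NOISE_TYPES]


def correlate(events: List[dict]) -> List[dict]:
    """Correlation: forward only the first event of each series per CI."""
    last_seen: Dict[Tuple[str, str], float] = {}
    forwarded = []
    for event in sorted(events, key=lambda e: e["time"]):
        key = (event["ci"], event["type"])
        previous = last_seen.get(key)
        last_seen[key] = event["time"]
        # A new series starts if the key is new or the gap is large enough.
        if previous is None or event["time"] - previous > WINDOW_SECONDS:
            forwarded.append(event)
    return forwarded


raw_events = [
    {"ci": "web-01", "type": "heartbeat", "time": 0},
    {"ci": "db-01", "type": "disk_full", "time": 10},
    {"ci": "db-01", "type": "disk_full", "time": 60},
    {"ci": "web-01", "type": "cpu_high", "time": 70},
    {"ci": "db-01", "type": "disk_full", "time": 120},
    {"ci": "db-01", "type": "disk_full", "time": 1000},
]
to_event_management = correlate(filter_events(raw_events))
```

Out of six raw events, only three reach the Event Management tool: the first `disk_full` on `db-01`, the `cpu_high` on `web-01`, and the `disk_full` that starts a new series much later.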

How do we implement all this?

How do we implement filtering and correlation? The ideal situation is when there is a managing monitoring tool in place. Such a tool picks up all notifications (i.e. events) from other tools, applies filtering and correlation logic and communicates with the Event Management tool. But in real life, the situation is usually the other way around: the Event Management tool is used for filtering and correlation. To be able to do that, it has to be able to communicate with the monitoring tools. E-mail is one of the common technologies. An e-mail sent to the Event Management tool contains a keyword that the tool recognizes; after receipt, the workflow inside the Event Management tool decides what to do with the event (i.e. performs filtering). This is how (remember the title) a mountain creek is created.
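
The e-mail approach could look roughly like the sketch below. The keyword format (`[EVENT:...]` in the subject line), the action names and the addresses are invented for this illustration; a real Event Management tool defines its own keywords and workflow steps. Only Python's standard `email` and `re` modules are used.

```python
import re
from email import message_from_string

# Hypothetical keyword that the monitoring tool puts into the subject line.
KEYWORD_PATTERN = re.compile(r"\[EVENT:(INFO|WARNING|EXCEPTION)\]")

# Hypothetical workflow decisions per keyword.
ACTIONS = {
    "INFO": "log",
    "WARNING": "notify_team",
    "EXCEPTION": "open_incident",
}


def route_email(raw_email: str) -> str:
    """Decide what the workflow does with an incoming e-mail."""
    message = message_from_string(raw_email)
    match = KEYWORD_PATTERN.search(message.get("Subject", ""))
    if match is None:
        # No recognized keyword: the e-mail is filtered out.
        return "discard"
    return ACTIONS[match.group(1)]


exception_mail = (
    "From: monitor@example.com\n"
    "To: events@example.com\n"
    "Subject: [EVENT:EXCEPTION] db-01 unreachable\n"
    "\n"
    "Ping to db-01 failed three times.\n"
)
unrelated_mail = (
    "From: newsletter@example.com\n"
    "Subject: Weekly digest\n"
    "\n"
    "Nothing to see here.\n"
)
decision = route_email(exception_mail)
ignored = route_email(unrelated_mail)
```

Here the exceptional event would become an incident, while the e-mail without a keyword never enters the workflow.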

Although Event Management sounds simple, it is not. Not because of the Event Management process, but because of what comes before the process begins. A flood does not happen when a river enters the ocean, but much earlier. A lot of water gates must be in place to avoid it.

Although it depends on the event management tool in place, you can also check out a free preview of our Event Management process template to see how the process could be implemented.

Branimir Valentic
Branimir is an expert in IT service management (consultancy, training and tools), IT governance (training and consulting), project management and consultancy in IT and telecommunication. He holds the following certificates: ITIL Expert, ISO 20000, ISMS Lead Auditor and PRINCE2.