Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Event correlation and filtering - How to, where-to start?

Got an asynchronous stream of events, where each event has information like -

  • Agency (one of many Agencies possible to be served by my solution)
  • Agent (one of many Agents in an Agency)
  • Served-Entity (a person/organization served by 1 or more agencies)
  • Date+Time
  • Class-Data (tags from a fixed but large set of tags)

What I need to do is to --

  1. Correlate an event based on Served-Entity, Date+Time and Class-Data, and create a consolidated new Event. Example:

    Event #0021: { Agency='XYZ', Agent='ABC', Served-Entity='MMN', Date+Time='12-03-2011/11:03:37', Class-Date='missed-delivery,no-repeat,untracable,orphan' }

    Event #0193: { Agency='KLM', Agent='DAY', Served-Entity='MMN', Date+Time='12-03-2011/12:32:21', Class-Date='missed-delivery,orphan,lost' }

    Event #1217: { Agency='KLM', Agent='CARE', Served-Entity='MMN', Date+Time='12-03-2011/18:50:45', Class-Date='escalated' }

    Here I find 3 events which are spaced out in time (more than 7hr separation), which are for the same Served-Entity (MMN), occur within a certain time window (say 24-hours), have matching or related Class-Data.

  2. Finally create a consolidated (new) event which could represent an inference drawn.

  3. Be able to create reports on a per Agency, per Agency, per Served-Entity basis, based on things like specific Class-Data tags (e.g. missed-delivery) over a certain period of time. This could be done using the original/input events, or the synthesized (inference) events.

  4. While this is not a requirement today, but quite likely to appear in future, that the "tags" that appear in Class-Data could grow, without any human intervention. So not sure if this should then be treated as unstructured data.

  5. Also not an immediate requirement, but in future there may be a need to identify trends / patterns of event occurrences (i.e. Event1 led to Event2 led to Event3).

The event arrival rate could be quite high... possibly thousands of events per minute. Maybe more. And, I need to archive the original/synthesized events for a period of time (a month or so).

My solution needs to be based on FOSS components (preferably). Some research done so far, points in the direction of CEP (Complex Event Processing), Bayesian-Networks/Classification, Predictive-Analytics.

Looking for some suggestions regarding approach to take. I'd prefer to take the path which meets most of my goals, with minimum difficulty/time, or to put another way, "learning AI" or "formal statistical methods" isn't my short-term goal :-)

like image 267
mike.dinnone Avatar asked Mar 31 '11 17:03

mike.dinnone


People also ask

What is event correlation give an example?

One example of event correlation can occur with intrusion detection. Perhaps there is an employee account that hasn't been accessed for years, and suddenly a large number of login attempts are noticed. That account may start executing suspicious commands.

What is event correlation in Siem?

SIEM event correlation is an essential part of any SIEM solution. It aggregates and analyzes log data from across your network applications, systems, and devices, making it possible to discover security threats and malicious patterns of behaviors that otherwise go unnoticed and can lead to compromise or data loss.


2 Answers

Mike,

did you consider something like Esper/Nesper to see if they might meet your requirements ? while I've looked at something similar myself -- especially on Erlang (check my post here), and you'd find some useful answers there.

IC

like image 86
bdutta74 Avatar answered Oct 13 '22 00:10

bdutta74


Your problem is a tactical problem as opposed to a procedural problem. Both types have their own set of tools, and you will be in a world of pain if you try to solve a tactical problem with procedural tools.

Just to clarify terms, when i say procedural, I am talking about use cases where you can say do X, then Y, then Z. With Tactical problems, X, Y, and Z can occur at anytime, and you must be able to handle the event.

You are on the right track with CEP. You might also look into using a rules engine. You didn't mention what your dev environment is, but if its Java, you might take a look at Jess. If you really want a nice and robust rules engine, take a look at Tibco Business Events. It is very powerful and fault tolerant, but definitely not free.

like image 45
clarson Avatar answered Oct 13 '22 00:10

clarson