Appropriate design pattern for an event log parser?

Tags: java, logging

I'm working on a project that parses a log of events and then updates a model based on properties of those events. I've been pretty lazy about "getting it done" and more concerned with upfront optimization, lean code, and proper design patterns; it's mostly a self-teaching experiment. I'm interested in which patterns more experienced designers think are relevant, or what kind of object architecture (pseudocode is fine) would be the best and easiest to maintain.

There can be 500,000 events in a single log, and there are about 60 types of events. All of them share about 7 base properties and then have 0 to 15 additional properties depending on the event type. The event type is the 2nd property on each line of the log file.

So far I've tried a really ugly imperative parser that walks through the log line by line and processes events as it goes. Then I tried a lexical specification that uses a "nextEvent" pattern, which is called in a loop and processed. Then I tried a plain old "parse" method that never returns and just fires events to registered listener callbacks. For the callbacks, I've tried both a single callback regardless of event type and a callback method specific to each event type.
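For illustration, the listener variant might look roughly like the sketch below. The comma delimiter and the names `LogDispatcher`, `on`, and `onAny` are assumptions made for the example, not details from the actual project; only the position of the event type (2nd field) comes from the description above. Unlike the "never returns" version, this `parse` simply stops at the end of its input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical dispatcher: supports one catch-all callback list
// and one callback list per event type.
final class LogDispatcher {
    private final Map<String, List<Consumer<String[]>>> byType = new HashMap<>();
    private final List<Consumer<String[]>> catchAll = new ArrayList<>();

    // Register a listener for a single event type.
    void on(String type, Consumer<String[]> listener) {
        byType.computeIfAbsent(type, k -> new ArrayList<>()).add(listener);
    }

    // Register a listener that sees every event regardless of type.
    void onAny(Consumer<String[]> listener) {
        catchAll.add(listener);
    }

    // Walk the lines once and fire callbacks; no event objects are allocated.
    void parse(Iterable<String> lines) {
        for (String line : lines) {
            String[] fields = line.split(","); // assumed delimiter
            if (fields.length < 2) {
                continue; // malformed line: no type field
            }
            String type = fields[1].trim(); // type is the 2nd property
            for (Consumer<String[]> l : catchAll) {
                l.accept(fields);
            }
            List<Consumer<String[]>> typed =
                    byType.getOrDefault(type, Collections.emptyList());
            for (Consumer<String[]> l : typed) {
                l.accept(fields);
            }
        }
    }
}

final class LogDispatcherDemo {
    public static void main(String[] args) {
        LogDispatcher dispatcher = new LogDispatcher();
        dispatcher.onAny(f -> System.out.println("saw " + f[1]));
        dispatcher.on("LOGIN", f -> System.out.println("login by " + f[2]));
        dispatcher.parse(Arrays.asList("t1,LOGIN,josh", "t2,LOGOUT,josh"));
    }
}
```

The two registration methods correspond to the two callback styles mentioned above: `onAny` is the single callback regardless of type, `on` is the per-type callback.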

I've tried a base "event" class with a union of all possible properties. I've tried to avoid the "new Event" call (since there can be a huge number of events and the event objects are generally short-lived) by having per-type callback methods that take primitive property arguments. I've tried having a subclass for each of the 60 event types, with an abstract Event parent holding the 7 common base properties.

I recently tried taking that further and using a Command pattern to put the event handling code in each event type. I am not sure I like this, and it's really similar to the callbacks-per-type approach; the only difference is that the code lives inside an execute function in the type subclasses instead of in the per-type callback methods.

The problem is that a lot of the model updating logic is shared, and a lot of it is specific to the subclass, and I am just starting to get confused about the whole thing. I am hoping someone can at least point me in a direction to consider!

Josh asked Sep 18 '08 16:09


1 Answer

Well... for one thing, rather than a single event class with a union of all the properties, or 61 event classes (1 base, 60 subs), in a scenario with that much variation I'd be tempted to have a single event class that uses a property bag (dictionary, hashtable, whatever floats your boat) to store event information. The type of the event is just one more property value that gets put into the bag. The main reason I'd lean that way is that I'd be loath to maintain 60 derived classes of anything.
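A minimal sketch of what that property-bag event could look like in Java. The class names, the comma-separated layout, and the `key=value` form of the extra properties are all assumptions for the sake of the example; the only thing taken from the question is that the type is the 2nd field on each line.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single event class: every property, including the
// event type, is just an entry in the bag.
final class BagEvent {
    private final Map<String, String> props = new HashMap<>();

    BagEvent put(String key, String value) {
        props.put(key, value);
        return this; // allow chaining
    }

    String get(String key) {
        return props.get(key); // null if the property is absent
    }

    boolean has(String key) {
        return props.containsKey(key);
    }

    String type() {
        return props.get("type");
    }
}

final class PropertyBagDemo {
    // Assumed line layout: timestamp,type,key=value,key=value,...
    static BagEvent parse(String line) {
        String[] fields = line.split(",");
        if (fields.length < 2) {
            throw new IllegalArgumentException("no type field in: " + line);
        }
        BagEvent event = new BagEvent()
                .put("timestamp", fields[0].trim())
                .put("type", fields[1].trim());
        for (int i = 2; i < fields.length; i++) {
            String[] kv = fields[i].split("=", 2);
            if (kv.length == 2) {
                event.put(kv[0].trim(), kv[1].trim());
            }
        }
        return event;
    }

    public static void main(String[] args) {
        BagEvent e = parse("2008-09-18T16:09:00,LOGIN,user=josh,ip=10.0.0.1");
        System.out.println(e.type() + " " + e.get("user"));
    }
}
```

The trade-off is the usual one: you give up compile-time checking of property names and types, and allocating a map per event is heavier than a plain object, but adding a 61st event type means no new class at all.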

The big question is... what do you have to do with the events as you process them? Do you format them into a report, organize them into a database table, wake people up if certain events occur... what?

Is this meant to be an after-the-fact parser, or a real-time event handler? I mean, are you monitoring the log as events come in, or just parsing log files the next day?

David Hill answered Sep 29 '22 03:09