Working on a project that parses a log of events, and then updates a model based on properties of those events. I've been pretty lazy about "getting it done" and more concerned about upfront optimization, lean code, and proper design patterns. Mostly a self-teaching experiment. I am interested in what patterns more experienced designers think are relevant, or what type of pseudocoded object architecture would be the best, easiest to maintain and so on.
There can be 500,000 events in a single log, and there are about 60 types of events, all of which share about 7 base properties and then have 0 to 15 additional properties depending on the event type. The type of event is the 2nd property in the log file in each line.
So for I've tried a really ugly imperative parser that walks through the log line by line and then processes events line by line. Then I tried a lexical specification that uses a "nextEvent" pattern, which is called in a loop and processed. Then I tried a plain old "parse" method that never returns and just fires events to registered listener callbacks. I've tried both a single callback regardless of event type, and a callback method specific to each event type.
I've tried a base "event" class with a union of all possible properties. I've tried to avoid the "new Event" call (since there can be a huge number of events and the event objects are generally short lived) and having the callback methods per type with primitive property arguments. I've tried having a subclass for each of the 60 event types with an abstract Event parent with the 7 common base properties.
I recently tried taking that further and using a Command pattern to put event handling code per event type. I am not sure I like this and its really similar to the callbacks per type approach, just code is inside an execute function in the type subclasses versus the callback methods per type.
The problem is that alot of the model updating logic is shared, and alot of it is specific to the subclass, and I am just starting to get confused about the whole thing. I am hoping someone can at least point me in a direction to consider!
Well... for one thing rather than a single event class with a union of all the properties, or 61 event classes (1 base, 60 subs), in a scenario with that much variation, I'd be tempted to have a single event class that uses a property bag (dictionary, hashtable, w/e floats your boat) to store event information. The type of the event is just one more property value that gets put into the bag. The main reason I'd lean that way is just because I'd be loathe to maintain 60 derived classes of anything.
The big question is... what do you have to do with the events as you process them. Do you format them into a report, organize them into a database table, wake people up if certain events occur... what?
Is this meant to be an after-the-fact parser, or a real-time event handler? I mean, are you monitoring the log as events come in, or just parsing log files the next day?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With