
Different ways of observing data changes

In my application I have many classes. Most of these classes store a fair amount of data, and it is important that other modules in my application are also 'updated' if the content of one of the data classes changes.

The typical way to do this is like this:

void MyDataClass::setMember(double d)
{
    m_member = d;
    notifyAllObservers();
}

This is a good method if the member is not changed often and the 'observing classes' need to be up to date as fast as possible.
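For illustration, here is a minimal sketch of what this push style could look like end to end. The `addObserver` method, the `std::function` callback type and the `countPushNotifications` helper are assumptions made for the example, not part of my real code. It mainly shows that every write costs one call per registered observer.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical push-style subject: every setter call notifies all observers.
class MyDataClass {
public:
    using Observer = std::function<void(double)>;  // assumed callback signature

    void addObserver(Observer o) { m_observers.push_back(std::move(o)); }

    void setMember(double d) {
        m_member = d;
        notifyAllObservers();
    }

    double member() const { return m_member; }

private:
    // One call per observer, on every single write.
    void notifyAllObservers() const {
        for (const auto& o : m_observers) o(m_member);
    }

    double m_member = 0.0;
    std::vector<Observer> m_observers;
};

// Counts how many notifications one observer receives for n writes.
int countPushNotifications(int n) {
    MyDataClass data;
    int calls = 0;
    data.addObserver([&calls](double) { ++calls; });
    for (int i = 0; i < n; ++i) data.setMember(i * 0.5);
    assert(n == 0 || data.member() == (n - 1) * 0.5);
    return calls;
}
```

With n writes the observer is called n times, which is exactly the cost that hurts when a member changes inside a tight loop.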

Another way of observing the changes is this:

void MyDataClass::setMember(double d)
{
    setDirty();
    m_member = d;
}

This is a good method if the member is changed many times, and the 'observing classes' poll all 'dirty' instances at regular intervals.
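A minimal sketch of this dirty-flag style, assuming a plain `bool` flag and a hypothetical `pollDirty` pass that an observing module would run at its own pace (both names are invented for the example):

```cpp
#include <cassert>
#include <vector>

// Hypothetical pull-style data class: a write only raises a flag.
class MyDataClass {
public:
    void setMember(double d) {
        m_dirty = true;   // cheap: no observer is called here
        m_member = d;
    }
    double member() const { return m_member; }
    bool isDirty() const { return m_dirty; }
    void clearDirty() { m_dirty = false; }

private:
    double m_member = 0.0;
    bool m_dirty = false;
};

// Polling pass: visits every instance, handles the dirty ones and
// returns how many were handled.
int pollDirty(std::vector<MyDataClass>& instances) {
    int refreshed = 0;
    for (auto& inst : instances) {
        if (inst.isDirty()) {
            // An observing class would read inst.member() here.
            inst.clearDirty();
            ++refreshed;
        }
    }
    return refreshed;
}

// 1000 writes to one instance result in a single refresh at the next poll.
int demoPull() {
    std::vector<MyDataClass> instances(3);
    for (int i = 0; i < 1000; ++i) instances[1].setMember(i);
    const int first = pollDirty(instances);
    const int second = pollDirty(instances);  // nothing changed in between
    assert(second == 0);
    return first;
}
```

The writes are cheap, but note that each poll still has to walk over all instances to find the dirty ones.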

Unfortunately, I have a mix of both kinds of data members in my classes. Some are not changed that often (and I can live with normal observers); others are changed very many times (within complex mathematical algorithms), and calling the observers every time the value changes will kill the performance of my application.

Are there any other tricks of observing data changes, or patterns in which you can easily combine several different methods of observing data changes?

Although this is a rather language-independent question (and I can try to understand examples in other languages), the final solution should work in C++.

Patrick asked Jul 01 '10 20:07


2 Answers

The two methods you've described cover (conceptually) both aspects; however, I don't think you have sufficiently explained their pros and cons.

There is one thing that you should be aware of: the population factor.

  • The push method is great when there are many notifiers and few observers
  • The pull method is great when there are few notifiers and many observers

If you have many notifiers and your observer is supposed to iterate over every one of them to discover the 2 or 3 that are dirty... it won't work. On the other hand, if you have many observers and at each update you need to notify all of them, then you're probably doomed, because simply iterating through all of them is going to kill your performance.

There is one possibility that you have not talked about, however: combining the two approaches with another level of indirection.

  • Push every change to a GlobalObserver
  • Have each observer check the GlobalObserver when required

It's not that easy though, because each observer needs to remember the last time it checked, so that it is notified only of the changes it has not observed yet. The usual trick is to use epochs.

Epoch 0       Epoch 1      Epoch 2
event1        event2       ...
...           ...

Each observer remembers the next epoch it needs to read (when an observer subscribes, it is given the current epoch in return) and reads from this epoch up to the current one to learn of all the events. Generally the current epoch cannot be accessed by a notifier; you can, for example, decide to switch epoch each time a read request arrives (if the current epoch is not empty).
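A rough sketch of the epoch idea, under a few assumptions: events are plain strings, epochs are never discarded, and everything runs on one thread. The method names (`push`, `subscribe`, `poll`) and the `demoEpochs` helper are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class GlobalObserver {
public:
    using Event = std::string;  // placeholder event type

    GlobalObserver() : m_epochs(1) {}  // start with one empty epoch

    // Notifiers append to the current (last) epoch.
    void push(Event e) { m_epochs.back().push_back(std::move(e)); }

    // A new observer is handed the current epoch number.
    std::size_t subscribe() const { return m_epochs.size() - 1; }

    // Switches epoch if the current one is not empty, then returns every
    // event from epoch `next` up to (not including) the new current epoch
    // and advances `next` to it.
    std::vector<Event> poll(std::size_t& next) {
        if (!m_epochs.back().empty()) m_epochs.emplace_back();
        const std::size_t current = m_epochs.size() - 1;
        std::vector<Event> out;
        for (std::size_t e = next; e < current; ++e)
            out.insert(out.end(), m_epochs[e].begin(), m_epochs[e].end());
        next = current;
        return out;
    }

private:
    std::vector<std::vector<Event>> m_epochs;
};

// Returns how many events each poll delivered.
std::vector<std::size_t> demoEpochs() {
    GlobalObserver hub;
    std::size_t early = hub.subscribe();                   // epoch 0
    hub.push("event1");
    const std::size_t first = hub.poll(early).size();      // closes epoch 0
    std::size_t late = hub.subscribe();                    // epoch 1
    hub.push("event2");
    hub.push("event3");
    const std::size_t seenEarly = hub.poll(early).size();  // closes epoch 1
    const std::size_t seenLate = hub.poll(late).size();    // reads epoch 1 too
    assert(early == late);  // both now wait on the same epoch
    return {first, seenEarly, seenLate};
}
```

Each notifier pays one cheap `push`, and each observer reads only the epochs it has not seen yet instead of scanning every notifier.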

The difficulty here is to know when to discard epochs (when they are no longer needed). This requires reference counting of some sort. Remember that the GlobalObserver is the one returning the current epochs to objects. So we introduce a counter for each epoch, which simply counts how many observers have not observed this epoch (and the subsequent ones) yet.

  • On subscribing, we return the current epoch number and increment the counter of this epoch
  • On polling, we decrement the counter of the epoch polled, then return the current epoch number and increment its counter
  • On unsubscribing, we decrement the counter of the epoch (make sure that the destructor unsubscribes!)
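The three rules can be sketched roughly as follows. This is a single-threaded illustration with string events; the `waiting` counter, the `liveEpochs` accessor and the RAII `Subscription` handle are names invented for the example. Epochs are discarded from the front only, since an old epoch with a non-zero counter keeps all later ones alive.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

class GlobalObserver {
public:
    using Event = std::string;  // placeholder event type
    GlobalObserver() : m_epochs(1) {}

    void push(Event e) { m_epochs.back().events.push_back(std::move(e)); }

    // Subscribing: return the current epoch and increment its counter.
    std::size_t subscribe() {
        ++m_epochs.back().waiting;
        return current();
    }

    // Polling: read the pending epochs, decrement the counter of the epoch
    // polled, then increment the counter of the current epoch.
    std::vector<Event> poll(std::size_t& next) {
        if (!m_epochs.back().events.empty()) m_epochs.emplace_back();
        std::vector<Event> out;
        for (std::size_t e = next; e < current(); ++e) {
            const auto& ev = at(e).events;
            out.insert(out.end(), ev.begin(), ev.end());
        }
        --at(next).waiting;
        ++m_epochs.back().waiting;
        next = current();
        discardUnneeded();
        return out;
    }

    // Unsubscribing: decrement the counter of the epoch being waited on.
    void unsubscribe(std::size_t next) {
        --at(next).waiting;
        discardUnneeded();
    }

    std::size_t liveEpochs() const { return m_epochs.size(); }

private:
    struct Epoch {
        std::vector<Event> events;
        std::size_t waiting = 0;  // observers whose next epoch is this one
    };
    std::size_t current() const { return m_first + m_epochs.size() - 1; }
    Epoch& at(std::size_t id) { return m_epochs[id - m_first]; }
    // Drop old epochs that nobody still has to read; keep the current one.
    void discardUnneeded() {
        while (m_epochs.size() > 1 && m_epochs.front().waiting == 0) {
            m_epochs.pop_front();
            ++m_first;
        }
    }
    std::deque<Epoch> m_epochs;
    std::size_t m_first = 0;  // epoch number of m_epochs.front()
};

// RAII handle, so that unsubscribing cannot be forgotten.
class Subscription {
public:
    explicit Subscription(GlobalObserver& hub)
        : m_hub(hub), m_next(hub.subscribe()) {}
    ~Subscription() { m_hub.unsubscribe(m_next); }
    Subscription(const Subscription&) = delete;
    Subscription& operator=(const Subscription&) = delete;
    std::vector<GlobalObserver::Event> poll() { return m_hub.poll(m_next); }
private:
    GlobalObserver& m_hub;
    std::size_t m_next;
};

// Records how many epochs are alive while a slow observer lags behind.
std::vector<std::size_t> demoRefCounting() {
    GlobalObserver hub;
    std::vector<std::size_t> live;
    Subscription fast(hub);
    {
        Subscription slow(hub);            // both wait on epoch 0
        hub.push("event1");
        fast.poll();                       // slow still needs epoch 0
        live.push_back(hub.liveEpochs());
        hub.push("event2");
        fast.poll();
        live.push_back(hub.liveEpochs());
    }                                      // slow unsubscribes here
    live.push_back(hub.liveEpochs());
    return live;
}
```

While the slow observer lags, the old epochs pile up (2, then 3 alive); as soon as it unsubscribes, everything except the current epoch is reclaimed.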

It's also possible to combine this with a timeout: record the last time we modified the epoch (i.e. when the next one was created) and decide that after a certain amount of time we can discard it (in which case we reclaim its counter and add it to the next epoch).

Note that the scheme scales to multithreading, since only one epoch is open for writing (a push operation on a stack) while the others are read-only (except for an atomic counter). It's possible to use lock-free operations to push onto a stack, provided that no memory needs to be allocated. It's perfectly sane to decide to switch epoch when the stack is full.
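As a sketch of the allocation-free push, assume a fixed-capacity buffer per epoch in which a writer claims a slot with a single atomic increment. `EpochBuffer`, `tryPush` and the capacity used below are hypothetical names and values. The sketch also assumes that readers only look at an epoch after it has been closed and all of its writers have finished; it does not handle a reader racing with a half-written slot.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

template <typename T, std::size_t Capacity>
class EpochBuffer {
public:
    // Claims a slot with one atomic increment: no lock, no allocation.
    // Returns false when the buffer is full, which is the cue to switch epoch.
    bool tryPush(const T& value) {
        const std::size_t slot =
            m_claimed.fetch_add(1, std::memory_order_relaxed);
        if (slot >= Capacity) return false;
        m_slots[slot] = value;
        m_written.fetch_add(1, std::memory_order_release);
        return true;
    }

    // Number of completed writes. Slots may complete out of order, so this
    // is only a safe read bound once all writers are done with the epoch.
    std::size_t size() const {
        return m_written.load(std::memory_order_acquire);
    }

    const T& operator[](std::size_t i) const { return m_slots[i]; }

private:
    std::array<T, Capacity> m_slots{};
    std::atomic<std::size_t> m_claimed{0};
    std::atomic<std::size_t> m_written{0};
};

// Single-threaded walk-through: 6 pushes into 4 slots, 2 are rejected.
std::size_t demoFixedCapacityPush() {
    EpochBuffer<int, 4> epoch;
    std::size_t rejected = 0;
    for (int i = 0; i < 6; ++i)
        if (!epoch.tryPush(i * 10)) ++rejected;
    assert(rejected == 2);
    assert(epoch[3] == 30);
    return epoch.size();
}
```

Because each call claims a distinct slot, several notifier threads can call `tryPush` at the same time without a lock; a rejected push tells the caller to open the next epoch and retry there.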

Matthieu M. answered Sep 30 '22 15:09


other tricks of observing data changes

Not really. You have "push" and "pull" design patterns. There aren't any other choices.

A notifyAllObservers is a push, ordinary attribute access is a pull.

I'd recommend consistency. Clearly, you have a situation where an object has many changes, but not all changes percolate through to other objects.

Don't be confused by this.

An observer does not need to do an expensive computation merely because it was notified of a change.

I think you should have some classes like this to handle the "frequent changes but slow requests" case.

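// C++ sketch; the double payload and recompute() are placeholders.
class PeriodicObserver {
public:
    // Called on every change: save the value, do nothing more. Speed matters.
    void notification(double changed) {
        m_input = changed;
        m_dirty = true;
    }
    // Called rarely: recompute only if something changed since the last call.
    double getMyValue() {
        if (m_dirty) {
            m_value = recompute(m_input);  // recompute now
            m_dirty = false;
        }
        return m_value;
    }
private:
    static double recompute(double x) { return x * x; }  // stand-in for the expensive work
    double m_input = 0.0;
    double m_value = 0.0;
    bool m_dirty = false;
};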

S.Lott answered Sep 30 '22 15:09