Simple machine learning question. Probably numerous ways to solve this:
There is an infinite stream of 4 possible events:
'event_1', 'event_2', 'event_4', 'event_4'
The events do not come in in completely random order. We will assume that there are some complex patterns to the order that most events come in, and the rest of the events are just random. We do not know the patterns ahead of time though.
After each event is received, I want to predict what the next event will be based on the order that events have come in in the past. So my question is: What machine learning algorithm should I use for this predictor?
The predictor will then be told what the next event actually was:
Predictor=new_predictor() prev_event=False while True: event=get_event() if prev_event is not False: Predictor.last_event_was(prev_event) predicted_event=Predictor.predict_next_event(event)
The question arises of how long of a history that the predictor should maintain, since maintaining infinite history will not be possible. I'll leave this up to you to answer. The answer can't be infinte though for practicality.
So I believe that the predictions will have to be done with some kind of rolling history. Adding a new event and expiring an old event should therefore be rather efficient, and not require rebuilding the entire predictor model, for example.
Specific code, instead of research papers, would add for me immense value to your responses. Python or C libraries are nice, but anything will do.
Update: And what if more than one event can happen simultaneously on each round. Does that change the solution?
Naive Bayes is a simple but surprisingly powerful algorithm for predictive modeling. The model consists of two types of probabilities that can be calculated directly from your training data: 1) The probability of each class; and 2) The conditional probability for each class given each x value.
Sequence prediction is a popular machine learning task, which consists of predicting the next symbol(s) based on the previously observed sequence of symbols. These symbols could be a number, an alphabet, a word, an event, or an object like a webpage or product.
Reinforcement learning is the training of machine learning models to make a sequence of decisions. The agent learns to achieve a goal in an uncertain, potentially complex environment.
First, find the common difference for the sequence. Subtract the first term from the second term. Subtract the second term from the third term. To find the next value, add to the last given number.
This is essentially a sequence prediction problem, so you want Recurrent neural networks or hidden Markov models.
If you only have a fixed time to look back, time window approaches might suffice. You take the sequence data and split it into overlapping windows of length n. (eg. you split a sequence ABCDEFG into ABC, BCD, CDE, DEF, EFG). Then you train a function approximator (e.g. neural network or linear regression) to map the first n-1 parts of that window onto the nth part.
Your predictor will not be able to look back in time longer than the size of your window. RNNs and HMMs can do so in theory, but are hard to tune or sometimes just don't work.
(State of the art RNN implementations can be found in PyBrain http://pybrain.org)
Update: Here is the pybrain code for your problem. (I haven't tested it, there might be some typos and stuff, but the overall structure should work.)
from pybrain.datasets import SequentialDataSet from pybrain.supervised.trainers import BackpropTrainer from pybrain.tools.shortcuts import buildNetwork from pybrain.structure import SigmoidLayer INPUTS = 4 HIDDEN = 10 OUTPUTS = 4 net = buildNetwork(INPUTS, HIDDEN, OUTPUTS, hiddenclass=LSTMLayer, outclass=SigmoidLayer, recurrent=True) ds = SequentialDataSet(INPUTS, OUTPUTS) # your_sequences is a list of lists of tuples which each are a bitmask # indicating the event (so 1.0 at position i if event i happens, 0.0 otherwise) for sequence in your_sequences: for (inpt, target) in zip(sequence, sequence[1:]): ds.newSequence() ds.appendLinked(inpt, target) net.randomize() trainer = BackpropTrainer(net, ds, learningrate=0.05, momentum=0.99) for _ in range(1000): print trainer.train()
This will train the recurrent network for 1000 epochs and print out the error after every epochs. Afterwards you can check for correct predictions like this:
net.reset() for i in sequence: next_item = net.activate(i) > 0.5 print next_item
This will print an array of booleans for every event.
Rather than keeping a full history, one can keep aggregated information about the past (along with a relatively short sliding history, to be used as input to the Predictor logic).
A tentative implementation could go like this:
In a nutshell: Managing a set of Markov chains of increasing order, and grading and averaging their predictions
There can be several variations on the general logic described above. In particular in the choice of the particular metric used to "grade" the quality of prediction of the individual N-Gram lengths.
Other considerations should be put with regards to detecting and adapting to possible shifts in the events distribution (the above assumes a generally ergodic event source). One possible approach is to use two sets of tables (combining the probabilities accordingly), and periodically dropping the contents of all tables of one of the sets. Choosing the right period for these resets is a tricky business, essentially balancing the need for statistically significant volumes of history and the need for short enough period lest me miss on the shorter modulations...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With