Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any algorithm to mine continuous closed sequences from a sequence database?

I am working on text compression and I want to use the knowledge of mining closed frequent sequences. The existing algorithms like GSP, CloSpan, ClaSP, Bide mine all frequent sequences both continuous and non continuous. Can you help me in finding such algorithm?

For example if the sequence database is 
SID        Sequence
1            CAABC
2            ABCB
3            CABC
4            ABBCA
and minimum support is 2

the existing algorithms consider the subsequence 'CB' of sequence with id 1 but I don't want that.

like image 745
Mark Avatar asked Mar 05 '26 06:03

Mark


1 Answers

Modern sequential pattern mining algorithms try to prune the search space to reduce running time. The search space is exponentially larger as a "non-continuous" sub-sequence can be any combination from the input sequences. In your case, the search space is far smaller given that the sequences are continuous i.e. we already know the combinations. So, you can probably make an algorithm for this on your own if you like, and it would still be reasonably faster.

Here's how a crude recursive example could look like:

def f(str, database, minSupp):
    freq = 0

    if len(patt) == 0:
        return ""

    #count frequency
    for trans in db:
        if patt in trans:
            freq += 1

    if freq >= minSupp:
        return patt
    else: #break it down
        r = []
        r.append(f(patt[1:], db, minSupp)) #All but the first element
        r.append(f(patt[:-1], db, minSupp)) #All but the last element
        return r

This just demonstrates one way of doing it. Of course, it crappy.

To do this much faster, you can use probably write some conditions so that a recursive call is not made in case the pattern is known.

An even faster way would be that you maintain an inverted-index of all patterns and then incrementally update them to create super-patterns using the Apriori-condition. For this, you can refer to some slide-show explaining how the Apriori algorithm works (using your candidate generation method; one example of which is in the algorithm above).

like image 53
Muhammad Ali Avatar answered Mar 06 '26 19:03

Muhammad Ali



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!