Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find a period of eventually periodic sequence

Short explanation.

I have a sequence of numbers [0, 1, 4, 0, 0, 1, 1, 2, 3, 7, 0, 0, 1, 1, 2, 3, 7, 0, 0, 1, 1, 2, 3, 7, 0, 0, 1, 1, 2, 3, 7]. As you see, from the 3-rd value the sequence is periodic with a period [0, 0, 1, 1, 2, 3, 7].

I am trying to automatically extract this period from this sequence. The problem is that neither I know the length of the period, nor do I know from which position the sequence becomes periodic.

Full explanation (might require some math)

I am learning combinatorial game theory and a cornerstone of this theory requires one to calculate Grundy values of a game graph. This produces infinite sequence, which in many cases becomes eventually periodic.

I found a way to efficiently calculate grundy values (it returns me a sequence). I would like to automatically extract offset and period of this sequence. I am aware that seeing a part of the sequence [1, 2, 3, 1, 2, 3] you can't be sure that [1, 2, 3] is a period (who knows may be the next number is 4, which breaks the assumption), but I am not interested in such intricacies (I assume that the sequence is enough to find the real period). Also the problem is the sequence can stop in the middle of the period: [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, ...] (the period is still 1, 2, 3).

I also need to find the smallest offset and period. For example for original sequence, the offset can be [0, 1, 4, 0, 0] and the period [1, 1, 2, 3, 7, 0, 0], but the smallest is [0, 1, 4] and [0, 0, 1, 1, 2, 3, 7].


My inefficient approach is to try every possible offset and every possible period. Construct the sequence using this data and check whether it is the same as original. I have not done any normal analysis, but it looks like it is at least quadratic in terms of time complexity.

Here is my quick python code (have not tested it properly):

def getPeriod(arr):
    min_offset, min_period, n = len(arr), len(arr), len(arr)
    best_offset, best_period = [], []
    for offset in xrange(n):
        start = arr[:offset]
        for period_len in xrange(1, (n - offset) / 2):
            period = arr[offset: offset+period_len]
            attempt = (start + period * (n / period_len + 1))[:n]

            if attempt == arr:
                if period_len < min_period:
                    best_offset, best_period = start[::], period[::]
                    min_offset, min_period = len(start), period_len
                elif period_len == min_period and len(start) < min_offset:
                    best_offset, best_period = start[::], period[::]
                    min_offset, min_period = len(start), period_len

    return best_offset, best_period

Which returns me what I want for my original sequence:

offset [0, 1, 4]
period [0, 0, 1, 1, 2, 3, 7]

Is there anything more efficient?

like image 238
Salvador Dali Avatar asked May 23 '16 05:05

Salvador Dali


1 Answers

Remark: If there is a period P1 with length L, then there is also a period P2, with the same length, L, such that the input sequence ends exactly with P2 (i.e. we do not have a partial period involved at the end).

Indeed, a different period of the same length can always be obtained by changing the offset. The new period will be a rotation of the initial period.

For example the following sequence has a period of length 4 and offset 3:

0 0 0 (1 2 3 4) (1 2 3 4) (1 2 3 4) (1 2 3 4) (1 2 3 4) (1 2

but it also has a period with the same length 4 and offset 5, without a partial period at the end:

0 0 0 1 2 (3 4 1 2) (3 4 1 2) (3 4 1 2) (3 4 1 2) (3 4 1 2)


The implication is that we can find the minimum length of a period by processing the sequence in reverse order, and searching the minimum period using zero offset from the end. One possible approach is to simply use your current algorithm on the reversed list, without the need of the loop over offsets.

Now that we know the length of the desired period, we can also find its minimum offset. One possible approach is to try all various offsets (with the advantage of not needing the loop over lengths, since the length is known), however, further optimizations are possible if necessary, e.g. by advancing as much as possible when processing the list from the end, allowing the final repetition of the period (i.e. the one closest to the start of the un-reversed sequence) to be partial.

like image 63
qwertyman Avatar answered Oct 09 '22 21:10

qwertyman