Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data structure for dynamically changing n-length sequence with longest subsequence length query

I need to design a data structure for holding n-length sequences, with the following methods:

  • increasing() - returns length of the longest increasing sub-sequence
  • change(i, x) - adds x to i-th element of the sequence

Intuitively, this sounds like something solvable with some kind of interval tree. But I have no idea how to think of that.

I'm wondering how to use the fact, that we completely don't need to know how this sub-sequence looks like, we only need its length...

Maybe this is something that can be used, but I'm pretty much stuck at this point.

like image 726
qiubit Avatar asked Jan 09 '16 22:01

qiubit


People also ask

How do you find the length of a longest increasing subsequence?

To find the LIS for a given array, we need to return max(L(i)) where 0 < i < n. Formally, the length of the longest increasing subsequence ending at index i, will be 1 greater than the maximum of lengths of all longest increasing subsequences ending at indices before i, where arr[j] < arr[i] (j < i).

What is a largest number of increasing Subsequences an array of size N can have?

The Longest Increasing Subsequence (LIS) problem is to find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order. For example, the length of LIS for {10, 22, 9, 33, 21, 50, 41, 60, 80} is 6 and LIS is {10, 22, 33, 50, 60, 80}.

How do you find the length of the longest non decreasing subsequence?

Time: O(K * N/K * log(N/K)) = O(N * log(N/K)) , where N <= 10^5 is length of arr , K <= N . We have total K new arr, each array have up to N/K elements. We need O(M * logM) to find the longest non-decreasing subsequence of an array length M .


2 Answers

This solves the problem only for contiguous intervals. It doesn't solve arbitrary subsequences. :-(

It is possible to implement this with time O(1) for interval and O(log(n)) for change.

First of all we'll need a heap for all of the current intervals, with the largest on top. Finding the longest interval is just a question of looking on the top of the heap.

Next we need a bunch of information for each of our n slots.

value: Current value in this slot
interval_start: Where the interval containing this point starts
interval_end: Where the interval containing this point ends
heap_index: Where to find this interval in the heap NOTE: Heap operations MUST maintain this!

And now the clever trick! We always store the value for each slot. But we only store the interval information for an interval at the point in the interval whose index is divisible by the highest power of 2. There is always only one such point for any interval, so storing/modifying this is very little work.

Then to figure out what interval a given position in the array currently falls in, we have to look at all of the neighbors that are increasing powers of 2 until we find the last one with our value. So, for instance, position 13's information might be found in any of the positions 0, 8, 12, 13, 14, 16, 32, 64, .... (And we'll take the first interval we find it in in the list 0, ..., 64, 32, 16, 8, 12, 14, 13.) This is a search of a O(log(n)) list so is O(log(n)) work.

Now how do we implement change?

  1. Update value.
  2. Figure out what interval we were in, and whether we were at an interval boundary.
  3. If intervals got changed, remove the old ones from the heap. (We may remove 0, 1 or 2)
  4. If intervals got change, insert the new ones into the heap. (We may insert 0, 1, or 2)

That update is very complex, but it is a fixed number of O(log(n)) operations and so should be O(log(n)).

like image 183
btilly Avatar answered Sep 23 '22 09:09

btilly


I try to explain my idea. It can be a bit simpler than implementing interval tree, and should give desirable complexity - O(1) for increasing(), and O(logS) for change(), where S is sequences count (can be reduced to N in worst cases of course).

At first you need original array. It need to check borders of intervals (I will use word interval as synonym to sequence) after change(). Let it be A

At the second you need bidirectional list of intervals. Element of this list should store left and right borders. Every increasing sequence should be presented as separate element of this list and this intervals should go one after another as they presented in A. Let this list be L. We need to operate pointers on elements, so, I don't know is it possible to do it on iterators with standard container.

At third you need priority queue that stores lengths of all intervals in you array. So, increasing() function can be done with O(1) time. But you need also storing of pointer to node from L to lookup intervals. Let this priority queue be PQ. More formally you priority queue contains pairs (length of interval, pointer to list node) with comparison only by length.

At forth you need tree, that can retrieve interval borders (or range) for particular element. It can be simply implemented with std::map where key is left border of tree, so with help of map::lower_bound you can find this interval. Value should store pointer to interval in L. Let this map be MP

And next important thing - List nodes should stores indecies of corresponding element in priority queue. And you shouldn't work with priority queue without connection with link to node from L (every swap operation on PQ you should update corresponding indecies on L).

change(i, x) operation can be looks like this:

  1. Find interval, where i located with map. -> you find pointer to corresponding node in L. So, you know borders and length of interval
  2. Try to understand what actions need to do: nothing, split interval, glue intervals.
  3. Do this action on list and map with connection with PQ. If you need split interval, remove it from PQ (this is not remove-max operation) and then add 2 new elements to PQ. Similar if you need to glue intervals, you can remove one from PQ and do increase-key to second.

One difficulty is that PQ should support removing arbitrary element (by index), so you can't use std::priority_queue, but it is not difficult to implement as I think.

like image 40
valdem Avatar answered Sep 19 '22 09:09

valdem