Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast algorithm mapping int to monotonically increasing int subset

Tags:

c

algorithm

I have encountered variations of this problem multiple times, and most recently it became a bottleneck in my arithmetic coder implementation. Given N (<= 256) segments of known non-negative size Si laid out in order starting from the origin, and for a given x, I want to find n such that

S0 + S1 + ... + Sn-1 <= x < S0 + S1 + ... + Sn

The catch is that lookups and updates are done at about the same frequency, and almost every update is in the form of increasing the size of a segment by 1. Also, the bigger a segment, the higher the probability it will be looked up or updated again.

Obviously some sort of tree seems like the obvious approach, but I have been unable to come up with any tree implementation that satisfactorily takes advantage of the known domain specific details.

Given the relatively small size of N, I also tried linear approaches, but they turned out to be considerably slower than a naive binary tree (even after some optimization, like starting from the back of the list for numbers above half the total)

Similarly, I tested introducing an intermediate step that remaps values in such a way as to keep segments ordered by size, to make access faster for the most frequently used, but the added overhead exceeded gains.

Sorry for the unclear title -- despite it being a fairly basic problem, I am not aware of any specific names for it.

like image 261
tohoho Avatar asked Jun 01 '15 06:06

tohoho


2 Answers

I suppose some BST would do... You may try to add a new numeric member (int or long) to each node to keep a sum of values of all left descendants. Then you'll seek for each item in approximately logarithmic time, and once an item is added, removed or modified you'll have to update just its ancestors on the returning path from the recursion. You may apply some self-organizing tree structure, for example AVL to keep the worst-case search optimal or a splay tree to optimize searches for those most often used items. Take care to update the left-subtree-sums during rebalancing or splaying.

like image 184
CiaPan Avatar answered Oct 31 '22 19:10

CiaPan


You could use a binary tree where each node n contains two integers A_n and U_n, where initially A_n = S_0 + .. S_n and U_n = 0.

Let, at any fixed subsequent time, T_n = S_0 + .. + S_n.

When looking for the place of a query x, you would go along the tree, knowing that for each node m the current corresponding value of T_m is A_m + U_m + sum_{p : ancestors of m, we visited the right child of p to attain m} U_p. This solves look up in O(log(N)).

For update of the n-th interval (increasing its size by y), you just look for it in the tree, increasing the value of U_m og y for each node m that you visit along the way. This also solves update in O(log(N)).

like image 1
vib Avatar answered Oct 31 '22 19:10

vib