I have gone through a few tutorials on range update / range query with a binary indexed tree, but I'm unable to understand any of them. I don't understand the need to build another tree. Could someone explain it to me in plain English with an example?

Need a clear explanation of Range updates and range queries Binary indexed tree

2 Answers

I'll try to explain this in a more intuitive way (the way I came to understand it), in four steps.

Assume the update adds V to every position from A to B, and the query is a prefix query over all indices <= X.

The first range update/point query tree (T1)

The first is a simple range update/point query tree. When you update A to B with V, in practice you add V to position A, so any prefix query X>=A is affected by it. Then you remove V from B+1, so any query X >= B+1 doesn't see the V added to A. No surprises here.
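
The idea behind T1 can be seen without any tree at all. Here is a minimal sketch, assuming 1-based positions and a plain array of differences in place of the BIT (the names `DiffArray`, `rangeAdd` and `pointQuery` are made up for this illustration). A real BIT would compute the same prefix sum in O(log n) instead of O(n):

```cpp
#include <vector>

// Illustrative stand-in for T1: a difference array with 1-based positions.
// It has n + 2 slots so that position b + 1 is always valid.
struct DiffArray {
    std::vector<long long> d;
    explicit DiffArray(int n) : d(n + 2, 0) {}

    // Add v to every element in [a, b].
    void rangeAdd(int a, int b, long long v) {
        d[a] += v;      // every prefix sum with x >= a now sees v
        d[b + 1] -= v;  // every prefix sum with x >= b + 1 sees it cancelled
    }

    // Value of element x, computed as d[1] + ... + d[x].
    // This loop is O(n); a BIT answers the same prefix sum in O(log n).
    long long pointQuery(int x) const {
        long long s = 0;
        for (int i = 1; i <= x; ++i) s += d[i];
        return s;
    }
};
```

After `rangeAdd(3, 5, 2)` on eight zeros, `pointQuery` returns 2 for positions 3 to 5 and 0 everywhere else.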

Prefix query to the range update/point query tree

T1.sum(X) is a point query to this first tree at X. We then optimistically assume that every element up to X is equal to the value at X. That's why we compute T1.sum(X)*X. Obviously this isn't quite right, which is why we:

Use a modified range update/point query tree to fix the result (T2)

When updating the range, we also update a second tree that records how much the first T1.sum(X)*X estimate has to be corrected. This update removes (A-1)*V from any query X>=A, because the A-1 positions before A never received V. Then we add back B*V for X>=B+1. We do the latter because queries to the first tree won't return V for X>=B+1 (because of the T1.add(B+1, -V)), so we need to somehow record that there is a rectangle of area (B-A+1)*V for any query X>=B+1. Since (A-1)*V is already removed from position A onward, we only need to add back B*V from position B+1 onward: B*V - (A-1)*V = (B-A+1)*V.

Wrapping it all together

```
update(A, B, V):
    T1.add(A, V)         # add V to any X>=A
    T1.add(B+1, -V)      # cancel previously added V from any X>=B+1

    T2.add(A, (A-1)*V)   # correction: the A-1 positions before A never received V
    T2.add(B+1, -B*V)    # remove the correction, and add (B-A+1)*V to any query
                         # X>=B+1. This is really -(A-1)*V -(B-A+1)*V, but it
                         # simplifies to -B*V

sum(X):
    return T1.sum(X)*X - T2.sum(X)
```
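
Since the question asked for an example, here is how the two trees could look in real code. This is only a sketch under my own assumptions: positions are 1-based, and the names `Fenwick` and `RangeFenwick` are made up here, not taken from any library.

```cpp
#include <vector>

// Classic 1-based Fenwick tree: point add, prefix sum.
struct Fenwick {
    std::vector<long long> t;
    explicit Fenwick(int n) : t(n + 1, 0) {}

    // Add v at position i.
    void add(int i, long long v) {
        for (; i < (int)t.size(); i += i & -i) t[i] += v;
    }

    // Sum of positions 1..i.
    long long sum(int i) const {
        long long s = 0;
        for (; i > 0; i -= i & -i) s += t[i];
        return s;
    }
};

// Range add / prefix sum built from the two trees T1 and T2.
struct RangeFenwick {
    Fenwick t1, t2;
    // One extra slot so that position b + 1 is valid even when b == n.
    explicit RangeFenwick(int n) : t1(n + 1), t2(n + 1) {}

    // Add v to every position in [a, b].
    void update(int a, int b, long long v) {
        t1.add(a, v);
        t1.add(b + 1, -v);
        t2.add(a, (a - 1) * v);
        t2.add(b + 1, -b * v);
    }

    // Sum of positions 1..x.
    long long sum(int x) const { return t1.sum(x) * x - t2.sum(x); }
};
```

Tracing `update(3, 5, 2)` on eight zeros by hand: `sum(4)` is 2*4 - 4 = 4 (two elements holding 2), and `sum(8)` is 0*8 - (4 - 10) = 6 (all three elements holding 2).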


answered Sep 30 '22 01:09

Juan Lopes

Let me try to explain it.

Why do we need a second tree? I cannot answer this question. Strictly speaking, I cannot prove that it is impossible to solve this problem using only one binary indexed tree (and I have never seen such a proof anywhere).
How can one come up with this method? Again, I don't know. I'm not the inventor of this algorithm, so I cannot tell why it looks exactly like this. The only thing I will try to explain is why and how this method works.
To understand this algorithm better, the first thing we should do is forget about how the binary indexed tree itself works. Let's treat it as just a black box that supports two operations: update one element and perform a range sum query in O(log n) time. We just want to use one or more such "black boxes" to build a data structure that can perform range updates and queries efficiently.
We will maintain two binary indexed trees: T1 and T2. I will use the following notation: T.add(pos, delta) for a point update that adds delta at position pos, and T.sum(pos) for the sum over [0 ... pos]. I claim that if an update function looks like this:
```
void update(left, right, delta)
    T1.add(left, delta)
    T1.add(right + 1, -delta)
    T2.add(left, delta * (left - 1))
    T2.add(right + 1, -delta * right)
```
and a query for the prefix [0 ... pos] is answered this way:
```
int getSum(pos)
    return T1.sum(pos) * pos - T2.sum(pos)
```
then the result is always correct.
To prove its correctness, I will prove the following statement: each update changes the answer appropriately (this gives a proof by induction over all operations, because initially everything is filled with zeros and the correctness is obvious). Let's assume that we have just performed an update (left, right, DELTA) and are now answering a query for pos (that is, the sum over 0 ... pos). Let's consider 3 cases:
i) pos < left. The update does not affect this query. The answer is correct (by the induction hypothesis).
ii) left <= pos <= right. This update adds DELTA * pos - DELTA * (left - 1) to the answer. It means that DELTA is added pos - left + 1 times. That's exactly how it should be. Thus, this case is also handled correctly.
iii) pos > right. The two T1 updates cancel each other, so this update adds 0 * pos + DELTA * right - DELTA * (left - 1). That is, DELTA is added exactly right - left + 1 times. It is correct, too.

We have just shown the correctness of the induction step. Thus, this algorithm is correct.
I have only shown how to answer [0, pos] sum queries. But answering a [left, right] query is easy now: it is just getSum(right) - getSum(left - 1).
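
The three cases of the proof can also be collected into one formula. Here Δ stands for DELTA, and c(pos) is a name I introduce only for this illustration: the amount that a single update(left, right, Δ) adds to getSum(pos).

```latex
c(pos) =
\begin{cases}
0, & pos < left \\
\Delta \cdot pos - \Delta (left - 1) = \Delta (pos - left + 1), & left \le pos \le right \\
0 \cdot pos - \bigl(\Delta (left - 1) - \Delta \cdot right\bigr) = \Delta (right - left + 1), & pos > right
\end{cases}
```

In each case c(pos) equals Δ times the number of updated positions that lie inside [0 ... pos], which is what a prefix sum should gain from the update.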

That's it. I have shown that this algorithm is correct. Now let's try to code it and see if it works (it is just a sketch, so the code quality might not be great):

```
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

using namespace std;

// Binary indexed tree (0-based): point update, prefix-sum query.
struct BIT {
  vector<int> f;

  BIT(int n = 0) {
    f.assign(n, 0);
  }

  // Returns the sum of the elements in [0, at] (0 if at < 0).
  int get(int at) {
    int res = 0;
    for (; at >= 0; at = (at & (at + 1)) - 1)
      res += f[at];
    return res;
  }

  // Adds delta to the element at position at.
  void upd(int at, int delta) {
    for (; at < (int)f.size(); at = (at | (at + 1)))
      f[at] += delta;
  }
};

// A tree for range updates and queries.
struct Tree {
  BIT f1;
  BIT f2;

  // One extra slot so that position high + 1 is always valid.
  Tree(int n = 0): f1(n + 1), f2(n + 1) {}

  // Adds delta to every element in [low, high].
  void upd(int low, int high, int delta) {
    f1.upd(low, delta);
    f1.upd(high + 1, -delta);
    f2.upd(low, delta * (low - 1));
    f2.upd(high + 1, -delta * high);
  }

  // Returns the sum of the elements in [0, pos].
  int get(int pos) {
    return f1.get(pos) * pos - f2.get(pos);
  }

  // Returns the sum of the elements in [low, high].
  int get(int low, int high) {
    return get(high) - (low == 0 ? 0 : get(low - 1));
  }
};

// A naive implementation used as a reference.
struct DummyTree {
  vector<int> a;

  DummyTree(int n = 0): a(n) {}

  void upd(int low, int high, int delta) {
    for (int i = low; i <= high; i++)
      a[i] += delta;
  }

  int get(int low, int high) {
    int res = 0;
    for (int i = low; i <= high; i++)
      res += a[i];
    return res;
  }
};

int main() {
  int n = 100;
  Tree t1(n);
  DummyTree t2(n);
  // Apply random updates and compare every range sum against the naive version.
  for (int i = 0; i < 10000; i++) {
    int l = rand() % n;
    int r = rand() % n;
    int v = rand() % 10;
    if (l > r)
      swap(l, r);
    t1.upd(l, r, v);
    t2.upd(l, r, v);
    for (int low = 0; low < n; low++)
      for (int high = low; high < n; high++)
        assert(t1.get(low, high) == t2.get(low, high));
  }
  return 0;
}
```

Oh, yeah. I forgot about time complexity analysis. But it is trivial here: each update or query makes a constant number of calls to the binary indexed trees, so each operation takes O(log n) time.

answered Sep 29 '22 23:09

kraskevich


Tags:

algorithm

data-structures

fenwick-tree

user3739818
