Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Queries add a number, remove a number, replace all A[i] in array with A[i] xor X, find sum of K smallest numbers

Tags:

The problem is:

Initially, the sequence is empty. There are n queries and 4 types of queries:

  • Add(x): add x to the sequence, if there is already x in the sequence, still add x.
  • Remove(x): remove x from the sequence 1 times.
  • Xor(x): replace all N of the sequence with N xor x.
  • Sum(K): find sum of the k smallest elements in the sequence.
  • 0 <= x, n, K <= 10^5

For each query sum(x), output the sum of the x smallest elements in the sequence.

Input:

7
Add(4)      // A[] = {4}
Remove(3)   // A[] = {4}
Add(2)      // A[] = {4, 2}
Sum(2)      // A[] = {4, 2} => Output: 6
Xor(2)      // A[] = {4^2, 2^2} = {6, 0}
Sum(1)      // A[] = {6, 0} => Output: 0
Sum(2)      // A[] = {6, 0} => Output: 6

I solved the problem with the following way:

Use a vector A to hold the sequence of numbers, and an array Count[] where Count[x] is the number of occurrences of x in A. Initially A is empty, and every Count[x] = 0.

  • For each Add(x) query, I add x to A, and Count[x] = Count[x]+1
  • For each Remove(x) query, if Count[x] = 0 then skip, otherwise, remove x from A and Count[x] = Count[x]-1
  • For each Xor(x) query, replace every A[i] with A[i]^x
  • For each Sum(x) query, sort A in ascending value, take the sum of the first x numbers

It seems that my way has a complexity of O(n^2), so for n <= 100000 the above algorithm cannot work. Is there a better way to solve this problem? Thanks a lot.

My code can run well in n <= 5000. Here is it:

int Count[100001];
vector<int> A;
void Add(int x) {
     A.push_back(x);
     Count[x] = Count[x]+1;
}
void Remove(int x) {
     if (Count[x] == 0) return;
     Count[x] = Count[x]-1;
     auto Find = find(A.begin(), A.end(), x);
     A.erase(Find);
}
void Xor(int x) {
     for (int& i : A)
         i = i^x;
}
int Sum(int x) {
     int Num = 0, S = 0;
     for (int i : A) {
        if (Num + 1 > x) return S;
        S = S + i; Num = Num + 1;
     }
     return S;
}
like image 568
little.dinosaurs Avatar asked Oct 03 '21 05:10

little.dinosaurs


1 Answers

I'll describe a data structure that supports Add(x)/Remove(x)/Count()/SumXorWith(x) (returns the sum of all elements xor x; doesn't modify the sequence) and then sketch how to extend it to a full solution where each operation is O(log^2 n) (taking n to be both the number of operations and the upper bound on the values).

First observe that Count and SumXorWith can be used to count, for each bit position, how many numbers have that position set (e.g., for the low order bit, it's (Count() + SumXorWith(0) - SumXorWith(1)) / 2). Conversely, it's enough to maintain these counts. In pseudocode:

*** Variables, initially zero:

count : int
bit_count : int[17]

*** Operations:

Add(x):
increment count
for j from 0 to 16, add the j'th bit of x to bit_count[j]

Remove(x):
decrement count
for j from 0 to 16, subtract the j'th bit of x from bit_count[j]

Count():
return count

SumXorWith(x):
return the sum for j from 0 to 16 of
2**j * (if j'th bit of x = 0 then bit_count[j] else count - bit_count[j])

To extend this data structure to handle Xor(x)/Sum(), we could just take count - bit_count for each bit set in x, but for efficiency (that we'll need later), there's a trick. The idea is that we store the sequence xor cum_xor. More pseudocode:

*** Additional variable, initially zero

cum_xor : int

*** Operations:

Add(x): super.Add(x xor cum_xor)
Remove(x): super.Remove(x xor cum_xor)
Xor(x): cum_xor <- cum_xor xor x
Count(): return super.Count()
Sum(): return super.SumXorWith(cum_xor)

Finally, we need to handle Sum(x), with selection. This is, frankly, the tedious part. We set up a height-17 (ceiling of log2(100000)) trie on big-endian bit patterns, with one of the data structures above at each node of the trie. To Add/Remove, we descend the trie, doing Add/Remove at each node. Xor we handle as before, by updating cum_xor. Sum(x) is the trickiest, of course. Starting at the root of the trie, we examine the current node. If it has at most x elements, just sum it. Otherwise, its "favored" child is the one that agrees with cum_xor, and its "disfavored" child is the one that disagrees. If the favored child has at least x elements, then we can operate recursively on it and ignore the disfavored child. Otherwise, we sum the whole favored child and operate recursively on the disfavored child, decreasing x by the number of elements in the favored child.

(For maximum practical efficiency, we'd want something with higher fan-out than the trie and likely the naive implementation near the leaves, but this is as simple as I can make it and likely fast enough.)

like image 141
David Eisenstat Avatar answered Oct 12 '22 19:10

David Eisenstat