How to zero out an array in O(1)?

Is there a way to zero out an array with time complexity O(1)? It's obvious that this can be done with a for-loop or memset, but their time complexity is not O(1).

Peiyun asked May 29 '12

People also ask

How do you initialize an array in O(1)?

In order to initialize an array, you need to pass over all its cells and store a zero in each, as the default initial value. That process takes O(n) time, where n is the array size.

Is array.length O(1)?

array.length is O(1), and the loop is O(n) overall (assuming the "something" is constant-time). Is it different for C? C is different in that, depending on how the array is allocated, you can either find out its size in O(1) time or not at all.

Why is array access O(1)?

Why is the complexity of fetching a value O(1)? Because arrays are allocated contiguously in memory, fetching a value via an index is a single arithmetic operation (base address plus index times element size). All such arithmetic operations are done in constant time, i.e., O(1).
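For illustration, a minimal C++ snippet showing that indexing is just pointer arithmetic on the base address, independent of the array's length:

#include <cassert>
#include <cstddef>

int main() {
    int a[8] = {};
    std::size_t i = 3;
    // a[i] is computed as *(a + i): one multiply-add on the base
    // address, regardless of how large the array is.
    assert(&a[i] == a + i);
}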


2 Answers

Yes.

However, not with any array: it takes an array that has been crafted for this to work.

#include <cstddef>
#include <stdexcept>
#include <utility>

template <typename T, std::size_t N>
class Array {
public:
    Array(): generation(0) {}

    // O(1) clear: move to a new epoch; entries stamped with older
    // epochs become invisible.
    void clear() {
        // FIXME: deal with overflow (see below)
        ++generation;
    }

    T get(std::size_t i) const {
        if (i >= N) { throw std::runtime_error("out of range"); }

        // An entry is only visible if it was written in the current epoch.
        TimedT const& t = data[i];
        return t.second == generation ? t.first : T{};
    }

    void set(std::size_t i, T t) {
        if (i >= N) { throw std::runtime_error("out of range"); }

        // Stamp the entry with the current epoch.
        data[i] = std::make_pair(t, generation);
    }

private:
    typedef std::pair<T, unsigned> TimedT;

    TimedT data[N];
    unsigned generation;
};
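A quick usage sketch of the class above (my addition, assuming the definition is in scope):

#include <cassert>

int main() {
    Array<int, 100> a;

    a.set(3, 42);
    assert(a.get(3) == 42);

    a.clear();              // O(1): just bumps the epoch
    assert(a.get(3) == 0);  // the stale entry now reads as T{}
}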

The principle is simple:

  • we define an epoch using the generation attribute
  • when an item is set, the epoch in which it has been set is recorded
  • only items of the current epoch can be seen
  • clearing is thus equivalent to incrementing the epoch counter

The method has two issues:

  • storage increase: for each item we store an epoch
  • generation counter overflow: there is a maximum number of epochs

The latter can be thwarted by using a really big integer (a uint64_t, at the cost of more storage); a sketch of one way to handle the wrap-around anyway follows.
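For illustration, one hypothetical way to resolve the FIXME above (my sketch, not part of the original answer): when the counter is about to wrap, fall back to a one-off O(n) physical wipe, so clear() stays O(1) for all but one in 2^32 calls:

#include <limits>

// Hypothetical replacement for Array::clear above, assuming the same
// members (data, generation) and template parameter N.
void clear() {
    if (generation == std::numeric_limits<unsigned>::max()) {
        // Rare wrap-around: physically invalidate every entry once.
        for (std::size_t i = 0; i != N; ++i) {
            data[i].second = 0;
        }
        generation = 0;
    }
    ++generation;  // entries stamped with older epochs become invisible
}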

The former is a natural consequence of the scheme. One possible mitigation is to use buckets to downplay the issue: associate, for example, up to 64 items with a single counter, plus a bitmask identifying which of them are valid within that counter (see the EDIT below).


EDIT: just wanted to come back to the buckets idea.

The original solution has an overhead of 8 bytes (64 bits) per element (if the element type is already 8-byte aligned). Depending on the elements stored, it might or might not be a big deal.

If it is a big deal, the idea is to use buckets; of course, like all trade-offs, it slows down access even more.

#include <cassert>
#include <cstddef>
#include <cstdint>

template <typename T>
class BucketArray {
public:
    BucketArray(): generation(0), mask(0) {}

    T get(std::size_t index, std::uint64_t gen) const {
        assert(index < 64);

        // Visible only if the bucket is up to date and the bit is set.
        // Note: the 1 must be widened to 64 bits before shifting.
        return gen == generation and (mask & (std::uint64_t(1) << index)) ?
               data[index] : T{};
    }

    void set(std::size_t index, T t, std::uint64_t gen) {
        assert(index < 64);

        // Lazily catch up with the current epoch: forget stale entries.
        if (generation < gen) { mask = 0; generation = gen; }

        mask |= (std::uint64_t(1) << index);
        data[index] = t;
    }

private:
    std::uint64_t generation;
    std::uint64_t mask;
    T data[64];
};

Note that this small array of a fixed number of elements (we could actually template on the bucket size and statically check that it is at most 64, as sketched below) has only 16 bytes of overhead. This means an overhead of 2 bits per element.
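A minimal sketch of that static check, with a hypothetical bucket-size parameter B (not in the original code):

template <typename T, std::size_t B = 64>
class BucketArray {
    // The mask is a single 64-bit word, so one bucket cannot track
    // more than 64 elements.
    static_assert(B <= 64, "bucket size must not exceed 64");

    std::uint64_t generation;
    std::uint64_t mask;
    T data[B];
    // ... rest as above, with 64 replaced by B
};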

#include <cstddef>
#include <cstdint>
#include <stdexcept>

template <typename T, std::size_t N>
class Array {
    typedef BucketArray<T> Bucket;
public:
    Array(): generation(0) {}

    void clear() { ++generation; }   // still O(1)

    T get(std::size_t i) const {
        if (i >= N) { throw std::runtime_error("out of range"); }

        Bucket const& bucket = data[i / 64];
        return bucket.get(i % 64, generation);
    }

    void set(std::size_t i, T t) {
        if (i >= N) { throw std::runtime_error("out of range"); }

        Bucket& bucket = data[i / 64];
        bucket.set(i % 64, t, generation);
    }

private:
    std::uint64_t generation;
    Bucket data[(N + 63) / 64];      // one bucket per 64 elements, rounded up
};

We got the space overhead down by a factor of... 32. Now the array can even be used to store char, for example, whereas before it would have been prohibitive. The cost is that access got slower, as we pay for a division and a modulo (when will we get a standardized operation that returns both results in one shot?).
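One mitigating note (my addition): since the bucket size is a power of two, compilers reduce the division and modulo to a shift and a mask automatically, so the extra cost is small:

#include <cstddef>

// Equivalent index computations; / and % by a power of two are
// strength-reduced to a shift and a mask by any decent compiler.
inline std::size_t bucket_of(std::size_t i) { return i / 64; }  // i >> 6
inline std::size_t offset_in(std::size_t i) { return i % 64; }  // i & 63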

Matthieu M. answered Sep 17 '22


You can't modify n locations in memory in less than O(n) time (even if your hardware, for sufficiently small n, may allow a constant-time operation to zero certain nicely-aligned blocks of memory, as flash memory does, for example).

However, if the object of the exercise is a bit of lateral thinking, then you can write a class representing a "sparse" array. The general idea of a sparse array is that you keep a collection (perhaps a map, although depending on usage that might not be all there is to it), and when you look up an index, if it's not in the underlying collection then you return 0.

If you can clear the underlying collection in O(1), then you can zero out your sparse array in O(1). Clearing a std::map isn't usually constant-time in the size of the map, because all those nodes need to be freed. But you could design a collection that can be cleared in O(1) by moving the whole tree over from "the contents of my map" to "a tree of nodes that I have reserved for future use". The disadvantage would just be that this "reserved" space is still allocated, a bit like what happens when a vector gets smaller.
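A minimal sketch of the sparse-array lookup side (my addition, using std::unordered_map; note that its clear() is O(size), so the O(1)-clear collection with a reserved node pool described above would have to replace it):

#include <cstddef>
#include <unordered_map>

template <typename T>
class SparseArray {
public:
    // Indices absent from the underlying collection read as zero
    // (a value-initialized T).
    T get(std::size_t i) const {
        auto it = data.find(i);
        return it != data.end() ? it->second : T{};
    }

    void set(std::size_t i, T t) { data[i] = t; }

    // O(size) here; the answer's point is that a custom collection
    // recycling its nodes into a reserved pool could make this O(1).
    void clear() { data.clear(); }

private:
    std::unordered_map<std::size_t, T> data;
};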

Steve Jessop answered Sep 17 '22