What is the best C++ data structure that could be used for storing and managing a collection of integers?

Tags: c++, collections, integer

This is my first Stack Overflow question, so please let me know if I didn't follow the community guidelines and whether I should delete it.

I got my first-ever interview question and was rejected because of my implementation.

The question is:

Design and implement a C++ class that stores a collection of integers. On construction, the collection should be empty. The same number may be stored more than once.

Implement the following methods:

Insert(int x). Insert an entry for the value “x”.
Erase(int x). Remove one entry with the value “x” (if one exists) from the collection.
Erase(int from, int to). Remove all the entries with a value in the range [from, to).
Count(int from, int to). Count how many entries have a value in the range [from, to).

I thought a good implementation would be to use a linked list, since it uses non-contiguous memory and removing entries would not require shuffling a lot of data (as in vectors or arrays). However, the company's feedback said my implementation had O(n^2) time complexity and was very inefficient, so I was rejected. I don't want to repeat the same mistake if a similar question comes up in another interview, so I'd like to know the best way to approach this question (a friend suggested using maps, but he is also unsure).

My code is:

void IntegerCollector::insert(int x)
{
    entries.push_back(x);
}

void IntegerCollector::erase(int x)
{
    list<int>::iterator position = find(entries.begin(), entries.end(), x);
    if (position != entries.end())
        entries.erase(position);
}

void IntegerCollector::erase(int from, int to)
{
    list<int>::iterator position = entries.begin();

    while (position != entries.end())
    {
        if (*position >= from && *position <= to)
            position = entries.erase(position);
        else
            position++;
    }
}

int IntegerCollector::count(int from, int to)
{
    list<int>::iterator position = entries.begin();
    int count = 0;

    while (position != entries.end())
    {
        if (*position >= from && *position <= to)
            count++;

        position++;
    }

    return count;
}

The feedback mentioned that they would only hire candidates who can implement solutions with O(n log n) complexity.

765

asked Dec 17 '18 14:12

Muhamad Gafar

2 Answers

The key consideration here is that integers of the same value are indistinguishable. Thus, all you need to do is store a count of each distinct value in the container.

Then, you can just use a std::map<int, size_t> as backing structure that maps each integer (key) to the number of times it exists in your data structure (value = count).

Inserting and erasing single elements then just increments or decrements the count for the given key, removing the key in the latter case when its count drops to zero. Both are O(log(distinct_values_in_container)), which is the cost of finding the key.
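As a rough illustration of the single-element operations, here is a minimal sketch. The alias Counts and the function names insert_value and erase_one are made up for this example and are not from the interview question.

```cpp
#include <cstddef>
#include <map>

// Hypothetical alias: maps each distinct value to its number of occurrences.
using Counts = std::map<int, std::size_t>;

// Insert one entry for x. operator[] creates a missing key with count 0,
// so a plain increment covers both the new-key and existing-key cases.
void insert_value(Counts& counts, int x) {
    ++counts[x];
}

// Remove one entry for x if any exists. The key is dropped once its count
// reaches zero so that the map only holds values that are actually present.
void erase_one(Counts& counts, int x) {
    auto it = counts.find(x);
    if (it == counts.end()) return;  // nothing to erase
    if (--it->second == 0) counts.erase(it);
}
```

Both functions do one tree lookup, so their cost depends on the number of distinct keys rather than on the total number of entries.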

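Since std::map is ordered, you can use lower_bound (and upper_bound, if an end of the range is inclusive) to binary-search for the boundaries, so finding the keys in [from, to) is very efficient: locating the range is also O(log(distinct_values_in_container)). Erasing those keys or summing their counts is then easy, although the runtime is harder to state because it also grows linearly with the number of distinct keys inside the range.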
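The range operations could look roughly like the sketch below. The names Counts, count_range and erase_range are hypothetical. Because the interval is half-open, both boundaries are found with lower_bound here, and an empty or reversed range is rejected up front so the iterator pair is always valid.

```cpp
#include <cstddef>
#include <map>

// Hypothetical alias: maps each distinct value to its number of occurrences.
using Counts = std::map<int, std::size_t>;

// Count the entries whose value lies in [from, to).
std::size_t count_range(const Counts& counts, int from, int to) {
    if (from >= to) return 0;  // empty or reversed range
    std::size_t total = 0;
    auto last = counts.lower_bound(to);  // first key >= to, excluded
    for (auto it = counts.lower_bound(from); it != last; ++it) {
        total += it->second;  // add the multiplicity of each distinct key
    }
    return total;
}

// Remove all entries whose value lies in [from, to).
void erase_range(Counts& counts, int from, int to) {
    if (from >= to) return;  // empty or reversed range
    counts.erase(counts.lower_bound(from), counts.lower_bound(to));
}
```

Each call performs two logarithmic lookups plus work proportional to the number of distinct keys in the range, not to the number of stored entries.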

If you want to gain extra credit, it will pay to understand the limitations of asymptotic runtimes. Consider these points:

What these asymptotic runtimes mean in practice depends a lot on the usage pattern. If no duplicates are ever inserted, the number of distinct values is O(n), where n is the number of insertions. With lots of identical elements, however, you can get arbitrarily good times in terms of n: for example, if the number of entries per key grows exponentially with the number of distinct keys, then distinct_values_in_container is O(log(n)). In the extreme case that all involved integers are the same, all operations are O(1).

As an interviewee, I would also talk about whether these asymptotic runtimes are meaningful in practice. It may be that the map's tree structure (which is toxic for the cache and branch predictor) loses to a simple std::vector<std::pair<int, size_t>> (if erasure is always in bulk) or even a std::vector<size_t> (if the keys are "dense") for the given application.
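To make the "dense keys" alternative concrete, here is a minimal sketch built on std::vector<size_t>. It assumes every value is known to lie in [0, limit); the class name, the limit parameter, and the choice to ignore out-of-range values are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Sketch only: assumes all values lie in [0, limit). The index into the
// vector is the value itself, and the element is its occurrence count.
class DenseIntegerCollection {
public:
    explicit DenseIntegerCollection(int limit)
        : counts_(static_cast<std::size_t>(std::max(limit, 0))) {}

    // Values outside [0, limit) are silently ignored in this sketch.
    void Insert(int x) {
        if (InRange(x)) ++counts_[static_cast<std::size_t>(x)];
    }

    // Remove one entry with value x, if one exists.
    void Erase(int x) {
        if (InRange(x) && counts_[static_cast<std::size_t>(x)] > 0)
            --counts_[static_cast<std::size_t>(x)];
    }

    // Remove all entries with a value in [from, to).
    void Erase(int from, int to) {
        if (Clamp(from, to))
            std::fill(counts_.begin() + from, counts_.begin() + to, std::size_t{0});
    }

    // Count the entries with a value in [from, to).
    std::size_t Count(int from, int to) const {
        if (!Clamp(from, to)) return 0;
        return std::accumulate(counts_.begin() + from, counts_.begin() + to,
                               std::size_t{0});
    }

private:
    bool InRange(int x) const {
        return x >= 0 && static_cast<std::size_t>(x) < counts_.size();
    }

    // Clamp [from, to) to the valid indices; returns false if nothing is left.
    bool Clamp(int& from, int& to) const {
        from = std::max(from, 0);
        to = std::min(to, static_cast<int>(counts_.size()));
        return from < to;
    }

    std::vector<std::size_t> counts_;
};
```

Insert and single Erase are O(1) here, and the range operations walk contiguous memory, which is the cache-friendliness argument made above. The price is memory proportional to limit, so it only makes sense when the keys really are dense.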

I think your main mistake (and why you were rejected) is not realizing that there is no need to store each inserted integer separately. You unfortunately also seem to have missed the possibility of keeping the list sorted, but I don't see where the O(n^2) comes from.

198

answered Oct 18 '22 18:10

Max Langhof

If you were being hired for a role that didn't require any previous programming experience then I would not have rejected you on that code sample alone.

Using a std::list was an interesting choice and showed you had thought about this. The fact that you used a C++ standard library container rather than trying to build this from a lower level is a yes-hire flag for me. With your approach, Insert (method 1 in the question's list) is fast, but both Erase overloads and Count (methods 2, 3, and 4) will be slow. In the absence of any other information, you ought to arrange things so that reading (including querying) data is faster than writing. Your approach has this the other way round. Sometimes, though, that is what you want: for example, when taking measurements in real time you'd want the data dump stage to be as fast as possible at the expense of anything else. For that application your solution would be difficult to beat!

Reservations, but by no means red lines:

An integer does not mean an int. In the absence of being able to clarify, build your class on

template<typename Y> std::map<Y, std::size_t>

where Y is an integral type. Note the use of std::size_t for the counter. It counts the number of times a particular Y is present.

Include some program comments next time.

Don't use using namespace std;. Although tutorials do for clarity, professional programmers don't.
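Returning to the first reservation, a class templated on an integral type Y and backed by std::map<Y, std::size_t> might be sketched as follows. The class name IntegerCollection and the static_assert are additions for this example, and the method names follow the interview question.

```cpp
#include <cstddef>
#include <map>
#include <type_traits>

// Sketch of a collection templated on the integer type Y.
template <typename Y>
class IntegerCollection {
    static_assert(std::is_integral<Y>::value, "Y must be an integral type");

public:
    // Insert one entry for the value x.
    void Insert(Y x) { ++counts_[x]; }

    // Remove one entry with the value x, if one exists.
    void Erase(Y x) {
        auto it = counts_.find(x);
        if (it != counts_.end() && --it->second == 0) counts_.erase(it);
    }

    // Remove all entries with a value in [from, to).
    void Erase(Y from, Y to) {
        if (from < to)
            counts_.erase(counts_.lower_bound(from), counts_.lower_bound(to));
    }

    // Count the entries with a value in [from, to).
    std::size_t Count(Y from, Y to) const {
        std::size_t total = 0;
        if (from < to) {
            auto last = counts_.lower_bound(to);
            for (auto it = counts_.lower_bound(from); it != last; ++it)
                total += it->second;
        }
        return total;
    }

private:
    std::map<Y, std::size_t> counts_;  // value -> number of occurrences
};
```

With this shape, IntegerCollection<long long> can hold values that do not fit in an int, while IntegerCollection<double> fails to compile with a readable message.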

answered Oct 18 '22 18:10

Bathsheba
