Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best STL data structure to find unordered elements

I'm currently trying to implement a hash table in C++ for a homework...

I've chosen to use internal linking as a solution for collisions in the table...

and I'm looking for a good STL container that will find a specific entry in an unordered set of data.

I can't use an stl container that is based on trees (set, map, trees, etc...)

Right now I'm using a vector, is it a good choice? The search time will be linear, right? Can it be better?

like image 289
Pacane Avatar asked Nov 05 '22 06:11

Pacane


1 Answers

As you're saying I assume the buckets can get big..., it's better to use std::list. Searching is linear in both cases, but adding elements is constant in std::list.

I guess they're all the same, since data isn't ordered - No, they are not. If they were, there would be just one container. Each container has it's own advantages and disadvantages, different containers are used for different situations.

A little information about vector:

  • std::vector has capacity, that's why it has capacity() and size() methods. They're both different. So, suppose the capacity is 4 and you have 2 elements, then size will be 2. So, adding another element will increment the size (will be 3) and it's all very fast.

  • But what happens when you have to add 5+ elements and the capacity is 4? Completely new memory is allocated, all old elements are copied in the new memory, all old elements are destroyed (their destructors are called, if user-defined types). Then the old memory has to be freed. These are expensive operations if you think that adding/removing elements will be more often.
    You can avoid this, using std::vector::reserve method to reserve some memory in advance and not reallocate new memory all the time and copy everything over and over again. But this is useful when you know the approximate size of these vectors. I suppose you don't in your situation( reserving much memory is't a good solution, too - you should not waste memory just like that ) So, again, I'd prefer std::list.

Or double hash.

Anyway, this allocating of new memory and copying of objects will not happen that often, as std::vector is "clever" and when allocate new space, it doesn't increase the capacity with only 1 element or something. I think it doubles it, but I'm not that sure about that. Argh, I don't know how exactly this is called in English.. Probably something like "accumulative time/memory" or "accumulative complexity" :? Don't know :/

NOTE: Whatever you choose, I'd suggest you to pay your attention at the hash-function. It's the most important here. A hash container should NOT have too many elements with the same hash. So, my advice is to search for a good hash-function and then this will not matter that much.

Hope that helped (:


EDIT: I'd recommend you this article - comparing std::vector and std::deque - it's perfect - compares memory usage (allocating, deallocating, growing), CPU usage, etc. I'd recommend the whole site for such articles - there aren't many, but are really well written.

like image 122
Kiril Kirov Avatar answered Nov 09 '22 05:11

Kiril Kirov