Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is vector faster than unordered_map?

I am solving a problem on LeetCode, but nobody has yet been able to explain my issue.

The problem is as such:

Given an arbitrary ransom note string and another string containing letters from all the magazines, write a function that will return true if the ransom note can be constructed from the magazines ; otherwise, it will return false.

Each letter in the magazine string can only be used once in your ransom note.

Note: You may assume that both strings contain only lowercase letters.

canConstruct("a", "b") -> false
canConstruct("aa", "ab") -> false
canConstruct("aa", "aab") -> true

My code (which takes 32ms):

class Solution {
public:
    bool canConstruct(string ransomNote, string magazine) {
        if(ransomNote.size() > magazine.size()) return false;
        unordered_map<char, int> m;
        
        for(int i = 0; i < magazine.size(); i++)
            m[magazine[i]]++;
            
        for(int i = 0; i < ransomNote.size(); i++)
        {
            if(m[ransomNote[i]] <= 0) return false;
            m[ransomNote[i]]--;
        }
        return true;
    }
};

The code (which I dont know why is faster - takes 19ms):

bool canConstruct(string ransomNote, string magazine) {
        int lettersLeft = ransomNote.size(); // Remaining # of letters to be found in magazine
        int arr[26] = {0};
        for (int j = 0; j < ransomNote.size(); j++) {
            arr[ransomNote[j] - 'a']++; // letter - 'a' gives a value of 0 - 25 for each lower case letter a-z
        }
        
        int i = 0;
        while (i < magazine.size() && lettersLeft > 0) {
            if (arr[magazine[i] - 'a'] > 0) {
                arr[magazine[i] - 'a']--;
                lettersLeft--;
            }
            i++;
        }
        if (lettersLeft == 0) {
            return true;
        } else {
            return false;
        }
    }

Both of these have the same complexity and use the same structure to solve the problem, but I don't understand why one takes almost twice as much time than the other. The time to query a vector is O(1), but its the same for an unordered_map. Same story with adding an entry/key to either of them.

Please, could someone explain why the run time varies so much?

like image 901
Omkar Avatar asked Apr 01 '19 09:04

Omkar


2 Answers

First thing to note is, although the average time to query an unordered_map is constant, the worst case is not O(1). As you can see here it actually rises to the order of O(N), N denoting the size of the container.

Secondly, as vector allocates sequential portions of memory, accessing to that memory is highly efficient and actually is constant, even in the worst-case. (i.e. simple pointer arithmetic, as opposed to computing the result of a more complex hash function) There is also the possibility of various levels of caching of sequential memory that may be involved (i.e. depending on the platform your code is running on) which may make the execution of a code using vector even faster, compared to one that is using unordered_map.

In essence, in terms of complexity, the worst-case performance of a vector is more efficient than that of unordered_map. On top of that, most hardware systems offer features such as caching which give usage of vector an even bigger edge. (i.e. lesser constant factors in O(1) operations)

like image 182
ilim Avatar answered Sep 20 '22 15:09

ilim


Your second approach uses plain C array where accessing an element is a simple pointer dereference. But that is not the case with unordered_map. There are two points to note:

  1. First, accessing an element is not a simple pointer dereference. It has to do other works to maintain it's internal structure. An unordered_map is actually a hash table under the hood and C++ standard indirectly mandates it to be implemented using open addressing which is a far more complex algorithm than simple array access.
  2. Second, O(1) access is on average but not on worst case.

For these reasons no wonder that array version will work better than unordered_map even though they have same run time complexity. This is another example where two codes with same run time complexity performs differently.

You will see the benefit of unordered_map only when you have a large number of keys (oppose to fixed 26 here).

like image 25
taskinoor Avatar answered Sep 19 '22 15:09

taskinoor