 

Why is using sorting (O(n log n) complexity) to find the majority element faster than using a HashMap (O(n) complexity)?

Majority element question:

Given an array of size n, find the majority element. The majority element is the element that appears more than ⌊ n/2 ⌋ times. You may assume that the array is non-empty and the majority element always exists in the array.

// Solution1 - Sorting ----------------------------------------------------------------
import java.util.Arrays;

class Solution {
    public int majorityElement(int[] nums) {
        // After sorting, the element at the middle index must be the majority element
        Arrays.sort(nums);
        return nums[nums.length/2];
    }
}

// Solution2 - HashMap ---------------------------------------------------------------
import java.util.HashMap;
import java.util.Map;

class Solution {
    public int majorityElement(int[] nums) {
        // int[] arr1 = new int[nums.length];
        HashMap<Integer, Integer> map = new HashMap<>(100);  
        Integer k = new Integer(-1);
        try{
            for(int i : nums){
                if(map.containsKey(i)){
                    map.put(i, map.get(i)+1);
                }
                else{
                    map.put(i, 1);
                }
            }
            for(Map.Entry<Integer, Integer> entry : map.entrySet()){
                if(entry.getValue()>(nums.length/2)){
                    k = entry.getKey();
                    break;
                }
            }
        }catch(Exception e){
            throw new IllegalArgumentException("Error");
        }
        return k;    
    }
}

For primitive arrays, the Arrays.sort() function is implemented in Java using dual-pivot Quicksort and has O(n log n) time complexity.

On the other hand, using HashMap to find the majority element has only O(n) time complexity.

Hence, solution 1 (sorting) should take longer than solution 2 (HashMap), but when I was doing the question on LeetCode, the average time taken by solution 2 was much longer (almost 8 times longer) than solution 1.

Why is that the case? I'm really confused.....

Is the size of the test case the reason? Will solution 2 become more efficient when the number of elements in the test case increases dramatically?

asked Jun 08 '20 by Y.Wang


People also ask

What is the time complexity of the majority element program if the data structure used is a balanced BST?

Time complexity = O(n log n). Space complexity = O(log n) for the recursion call stack.
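
As an illustration (my own sketch, not part of the answer above): counting occurrences in a TreeMap, which is Java's red-black tree and hence a balanced BST, gives the O(n log n) time described. The class name MajorityViaBst is made up.

import java.util.Map;
import java.util.TreeMap;

class MajorityViaBst {
    // Each insertion/lookup in the red-black tree costs O(log n),
    // so counting all n elements is O(n log n).
    static int majorityElement(int[] nums) {
        TreeMap<Integer, Integer> counts = new TreeMap<>();
        for (int x : nums) {
            counts.merge(x, 1, Integer::sum);
        }
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() > nums.length / 2) {
                return e.getKey();
            }
        }
        throw new IllegalArgumentException("no majority element");
    }
}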

How do you find the majority element in divide and conquer?

Divide and conquer (linearithmic time): rather than counting occurrences for all the values, just count occurrences for the majority elements in each half of the list. And as a bonus: if each half has the same majority element, then that's our majority element for the whole list.
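
A minimal sketch of that divide-and-conquer idea (my own illustration; the class name MajorityDivideConquer is made up, and it assumes a majority element exists):

// Divide-and-conquer sketch: T(n) = 2T(n/2) + O(n) = O(n log n)
class MajorityDivideConquer {
    static int majorityElement(int[] nums) {
        return majority(nums, 0, nums.length - 1);
    }

    private static int majority(int[] nums, int lo, int hi) {
        if (lo == hi) {
            return nums[lo];            // a single element is its own majority
        }
        int mid = lo + (hi - lo) / 2;
        int left = majority(nums, lo, mid);
        int right = majority(nums, mid + 1, hi);
        if (left == right) {
            return left;                // both halves agree on the candidate
        }
        // Halves disagree: count each candidate over the whole range
        return count(nums, left, lo, hi) > count(nums, right, lo, hi) ? left : right;
    }

    private static int count(int[] nums, int value, int lo, int hi) {
        int c = 0;
        for (int i = lo; i <= hi; i++) {
            if (nums[i] == value) c++;
        }
        return c;
    }
}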

How do you solve the majority element problem?

The basic solution is to use two nested loops and keep track of the maximum count over all distinct elements. If the maximum count becomes greater than n/2, break out of the loops and return the element with the maximum count. If the maximum count never exceeds n/2, the majority element doesn't exist.
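
A sketch of that basic two-loop approach (my own illustration, O(n^2) time; the class name MajorityBruteForce is made up):

// Brute-force sketch: for each element, count how often it occurs
class MajorityBruteForce {
    static int majorityElement(int[] nums) {
        int n = nums.length;
        for (int i = 0; i < n; i++) {
            int count = 0;
            for (int j = 0; j < n; j++) {
                if (nums[j] == nums[i]) count++;
            }
            if (count > n / 2) {
                return nums[i];          // occurs more than n/2 times
            }
        }
        throw new IllegalArgumentException("no majority element");
    }
}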

How do you find a dominant number in an array?

A dominant number in an array is an integer that occurs more than N/3 times in the array, where N is the array's length.
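
A hedged sketch of one way to find such a dominant number, simply counting occurrences in a map and checking the N/3 threshold (the class name DominantNumber is made up):

import java.util.HashMap;
import java.util.Map;

class DominantNumber {
    // Returns a value occurring more than N/3 times, or null if none exists
    static Integer findDominant(int[] arr) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int x : arr) {
            counts.merge(x, 1, Integer::sum);
        }
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() > arr.length / 3) {
                return e.getKey();
            }
        }
        return null;   // no dominant number
    }
}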


2 Answers

Big O isn't a measure of actual performance. It only gives you an idea of how your running time will evolve as n grows.

In practice, an algorithm in O(n log n) will eventually be slower than one in O(n) beyond some n. But that n might be 1, 10, 10^6 or even 10^600 - at which point it's probably irrelevant because you'll never run into such a data set, or you won't have enough hardware for it.

Software engineers have to consider both actual performance and performance at the practical limit. For example, hash map lookup is in theory faster than unsorted array lookup... but then most arrays are small (10-100 elements), negating any big-O advantage due to the extra code complexity.

You could certainly optimize your code a bit, but in this case you're unlikely to change the outcome for small n unless you introduce another factor (e.g. artificially slow down the time per cycle with a constant).

(I wanted to find a good metaphor to illustrate, but it's harder than expected...)
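
To make the constant-factor point concrete, here is a rough timing sketch of my own (not part of the answer), comparing simplified versions of the two approaches. The array sizes, the majority value 7 and the class name RoughBenchmark are arbitrary illustrations, and a System.nanoTime micro-benchmark like this is only indicative:

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Naive timing sketch (not a proper JMH benchmark)
class RoughBenchmark {
    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int n : new int[]{1_000, 100_000, 10_000_000}) {
            int[] nums = new int[n];
            for (int i = 0; i < n; i++) {
                // value 7 is the majority element; the rest is random filler
                nums[i] = (i % 2 == 0 || rnd.nextBoolean()) ? 7 : rnd.nextInt(1000);
            }
            long t0 = System.nanoTime();
            int bySort = majorityBySort(nums.clone());
            long t1 = System.nanoTime();
            int byMap = majorityByMap(nums);
            long t2 = System.nanoTime();
            System.out.printf("n=%d sort=%dms (%d) map=%dms (%d)%n",
                    n, (t1 - t0) / 1_000_000, bySort, (t2 - t1) / 1_000_000, byMap);
        }
    }

    static int majorityBySort(int[] nums) {
        Arrays.sort(nums);
        return nums[nums.length / 2];
    }

    static int majorityByMap(int[] nums) {
        HashMap<Integer, Integer> map = new HashMap<>();
        for (int x : nums) map.merge(x, 1, Integer::sum);
        for (Map.Entry<Integer, Integer> e : map.entrySet()) {
            if (e.getValue() > nums.length / 2) return e.getKey();
        }
        throw new IllegalStateException("no majority");
    }
}

On small arrays the sorting version typically wins on constant factors (primitive sort, no boxing); only on much larger inputs does the asymptotic advantage of the map tend to show.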

answered Oct 09 '22 by ptyx


It depends on the test cases: for some of them the HashMap will be faster, and for others it won't be.

Why is that? Solution 1 guarantees O(N log N) in the worst case, but the HashMap solution costs roughly O(N · (M + R)), where M is the cost of collisions and R is the cost of resizing the internal array.

HashMap internally uses an array of nodes called table, and it resizes that array as the number of entries grows. You assigned an initial capacity of only 100, so larger inputs force several resizes.

So let's see what happens: Java uses separate chaining to resolve collisions, and some test cases may produce lots of collisions, which makes every query and update of the hashmap more expensive.

Conclusion: the performance of the hashmap implementation is affected by two factors:

1. how often the table array has to be resized for the given input size
2. how many collisions appear in the input
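
As an illustration of the first factor (my own sketch, not from the answer): sizing the map for the input up front avoids the intermediate resizes that an initial capacity of 100 forces on large test cases. The capacity formula below assumes HashMap's default load factor of 0.75, and the class name MajorityPreSized is made up.

import java.util.HashMap;
import java.util.Map;

class MajorityPreSized {
    public int majorityElement(int[] nums) {
        // Capacity chosen so that up to nums.length distinct keys fit
        // without triggering a resize (default load factor is 0.75)
        int capacity = (int) (nums.length / 0.75f) + 1;
        HashMap<Integer, Integer> map = new HashMap<>(capacity);
        for (int x : nums) {
            map.merge(x, 1, Integer::sum);
        }
        for (Map.Entry<Integer, Integer> e : map.entrySet()) {
            if (e.getValue() > nums.length / 2) {
                return e.getKey();
            }
        }
        throw new IllegalArgumentException("no majority element");
    }
}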

answered Oct 09 '22 by heaprc