I have written code for two approaches to find out the first unique character in a string on LeetCode.
Problem Statement: Given a string, find the first non-repeating character in it and return it's index. If it doesn't exist, return -1.
Sample Test Cases:
s = "leetcode" return 0.
s = "loveleetcode", return 2.
Approach 1 (O(n)) (correct me if I am wrong):
class Solution { public int firstUniqChar(String s) { HashMap<Character,Integer> charHash = new HashMap<>(); int res = -1; for (int i = 0; i < s.length(); i++) { Integer count = charHash.get(s.charAt(i)); if (count == null){ charHash.put(s.charAt(i),1); } else { charHash.put(s.charAt(i),count + 1); } } for (int i = 0; i < s.length(); i++) { if (charHash.get(s.charAt(i)) == 1) { res = i; break; } } return res; } }
Approach 2 (O(n^2)):
class Solution { public int firstUniqChar(String s) { char[] a = s.toCharArray(); int res = -1; for(int i=0; i<a.length;i++){ if(s.indexOf(a[i])==s.lastIndexOf(a[i])) { res = i; break; } } return res; } }
In approach 2, I think the complexity should be O(n^2) since indexOf executes in O(n*1) here.
But when I execute both the solutions on LeetCode I get runtime 19 ms for approach 2 and 92 ms for approach 1. I am confused; why does that happen?
I suppose LeetCode tests for both small and large input values for best, worst and average cases.
Update:
I am aware of the fact that O(n^2 algorithms) can perform better for certain n < n1. In this question I wanted to understand why that is happening in this case. i.e. which part of Approach 1 makes it slower.
LeetCode link to the question
Absolutely not. For large enough , may be faster than . The key here is for large enough . The big-Oh notation describes asymptotic run time complexity.
C++ C++ is one of the most efficient and fastest languages. It is widely used by competitive programmers for its execution speed and Standard Template Libraries(STL).
So, O(N*log(N)) is far better than O(N^2) . It is much closer to O(N) than to O(N^2) . But your O(N^2) algorithm is faster for N < 100 in real life. There are a lot of reasons why it can be faster.
O(2n) denotes an algorithm whose growth doubles with each addition to the input data set. The growth curve of an O(2n) function is exponential - starting off very shallow, then rising meteorically.
Consider:
Clearly f1 is O(n2) and f2 is O(n). For a small input (say, n=5), we have f1(n) = 25 but f2(n) > 1000.
Just because one function (or time complexity) is O(n) and another is O(n2) doesn't mean that the former is smaller for all values of n, just that there is some n beyond which this will be the case.
For very short strings e.g. single character the cost of creating HashMap
, re-sizing it, looking up entries while boxing and unboxing of char
into Character
might overshadow the cost of String.indexOf()
, which is probably considered hot and inlined by JVM either way.
Another reason might be the cost of RAM access. With additional HashMap
, Character
and Integer
objects involved in a lookup there might be need for additional access to and from RAM. Single access is ~100 ns and this can add up.
Take a look at Bjarne Stroustrup: Why you should avoid Linked Lists. This lecture illustrates that performance is not the same as complexity and memory access can be a killer for an algorithm.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With