 

Algorithm with O(n log n) time and O(1) space complexity vs O(n) time and O(n) space complexity

I am curious to know which algorithm is better:

  • Algorithm with O(n log n) time and O(1) space complexity
  • Algorithm with O(n) time and O(n) space complexity

Most of the algorithms that can be solved in O(n log n) time and constant space can also be solved in O(n) time by paying a penalty in terms of space. Which algorithm is better? How do I decide between these two parameters?

Example: Array Pair Sum

  1. Can be solved in O(n log n) time by sorting
  2. Can be solved using hash maps in O(n) time but with O(n) space (see the sketch below)
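
For concreteness, here is a minimal sketch of that hash-based approach in C++ (the function name and the use of 64-bit integers are just illustrative choices):

    #include <cstdint>
    #include <unordered_set>
    #include <vector>

    // Illustrative helper: returns true if some pair of elements in `nums`
    // sums to `target`. Expected O(n) time, O(n) extra space for the set.
    bool hasPairWithSum(const std::vector<int64_t>& nums, int64_t target) {
        std::unordered_set<int64_t> seen;             // grows to hold up to n values
        for (int64_t x : nums) {
            if (seen.count(target - x)) return true;  // the needed complement was seen earlier
            seen.insert(x);
        }
        return false;
    }
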
Asked Mar 22 '15 by Anil Kumar K K

1 Answer

Without actually testing anything (a risky move!), I'm going to claim that the O(n log n)-time, O(1)-space algorithm is probably faster than the O(n)-time, O(n)-space algorithm, but is still probably not the optimal algorithm.

First, let's talk about this from a high-level perspective that ignores the particular details of the algorithms you're describing. One detail to keep in mind is that although O(n)-time algorithms are asymptotically faster than O(n log n)-time algorithms, they're only faster by a logarithmic factor. Keeping in mind that the number of atoms in the universe is about 10^80 (thanks, physics!), the base-2 log of the number of atoms in the universe is about 266. From a practical perspective, this means that you can think of that extra O(log n) factor as just a constant. Consequently, to determine whether an O(n log n) algorithm will be faster or slower than an O(n) algorithm on a particular input, you'd need to know more about what constants are hidden by the big-O notation. An algorithm that runs in time 600n will be slower than an algorithm that runs in time 2n log n for any n that fits in the universe, for example (600n exceeds 2n log2 n whenever log2 n is below 300, that is, for every n up to about 2^300). Therefore, in terms of wall-clock performance, you'd probably need to do a bit of profiling on the two algorithms to see which one is faster.

Then there's the effects of caching and locality of reference. Computer memory has a huge number of caches in it that are optimized for the case where reads and writes are located next to one another. The cost of a cache miss can be huge - hundreds or thousands of times slower than a hit - so you want to try to minimize this. If an algorithm uses O(n) memory, then as n gets larger, you need to start worrying about how closely-packed your memory accesses will be. If they're spread out, then the cost of the cache misses might start to add up pretty quickly, significantly driving up the coefficient hidden in the big-O notation of the time complexity. If they're more sequential, then you probably don't need to worry too much about this.

You also need to be careful about total memory available. If you have 8GB of RAM on your system and get an array with one billion 32-bit integers, then if you need O(n) auxiliary space with even a reasonable constant, you're not going to be able to fit your auxiliary memory into main memory and it will start getting paged out by the OS, really killing your runtime.

Finally, there's the issue of randomness. Algorithms based on hashing have expected fast runtimes, but if you get a bad hash function, there's a chance that the algorithm will slow down. Generating good random bits is hard, so most hash tables just go for "reasonably good" hash functions, risking worst-case inputs that will make the algorithm's performance degenerate.

So how do these concerns actually play out in practice? Well, let's look at the algorithms. The O(n)-time, O(n)-space algorithm works by building a hash table of all the elements in the array so that you can easily check whether a given element is present in the array, then scanning over the array and seeing whether there is a pair that sums up to the total. Let's think about how this algorithm works given the factors above.

  • The memory usage is O(n) and, due to how hashing works, the accesses to the hash table are not likely to be sequential (an ideal hash table would have pretty much random access patterns). This means that you're going to have a lot of cache misses.

  • The high memory usage means that for large inputs, you have to worry about memory getting paged in and out, exacerbating the above problem.

  • As a result of the above two factors, the constant term hidden in the O(n) runtime is likely much higher than it looks.

  • Hashing is not worst-case efficient, so there may be inputs that cause performance to significantly degrade.

Now, think about the O(n log n)-time, O(1)-space algorithm, which works by doing an in-place array sort (say, heapsort), then walking inwards from the left and right and seeing if you can find a pair that sums to the target. The second step in this process has excellent locality of reference - virtually all array accesses are adjacent - and pretty much all of the cache misses you're going to get are going to be in the sorting step. This will increase the constant factor hidden in the big-O notation. However, the algorithm has no degenerate inputs and its low memory footprint probably means that the locality of reference will be better than the hash table approach. Therefore, if I had to guess, I'd put my money on this algorithm.
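
For reference, a minimal sketch of that sort-then-two-pointers approach might look like this (again, the function name is just illustrative; note that std::sort in mainstream C++ standard libraries is typically implemented as introsort, which is relevant to the next point):

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Illustrative helper: sorts the array in place, then walks two indices
    // inward from both ends. O(n log n) time; O(1) auxiliary space beyond
    // whatever the sort itself uses.
    bool hasPairWithSumSorted(std::vector<int64_t>& nums, int64_t target) {
        if (nums.size() < 2) return false;
        std::sort(nums.begin(), nums.end());  // typically introsort in practice
        std::size_t lo = 0, hi = nums.size() - 1;
        while (lo < hi) {
            int64_t sum = nums[lo] + nums[hi];
            if (sum == target) return true;   // found a matching pair
            if (sum < target) ++lo;           // need a larger sum: move the left index up
            else --hi;                        // need a smaller sum: move the right index down
        }
        return false;
    }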

... Well, actually, I'd put my money on a third algorithm: an O(n log n)-time, O(log n)-space algorithm that's basically the above algorithm, but using introsort instead of heapsort. Introsort is an O(n log n)-time, O(log n)-space algorithm that uses randomized quicksort to mostly sort the array, switching to heapsort if the quicksort looks like it's about to degenerate, and doing a final insertion sort pass to clean everything up. Quicksort has amazing locality of reference - this is why it's so fast - and insertion sort is faster on small inputs, so this is an excellent compromise. Plus, O(log n) extra memory is basically nothing - remember, in practice, log n is at most around 266. This algorithm has about the best locality of reference that you can get, giving a very low constant factor hidden by the O(n log n) term, so it would probably outperform the other algorithms in practice.

Of course, I've got to qualify that answer as well. The analysis I did above assumes that we're talking about pretty large inputs to the algorithm. If you're only ever looking at small inputs, then this whole analysis goes out the window because the effects I was taking into account won't start to show up. In that case, the best option would just be to profile the approaches and see what works best. From there, you might be able to build a "hybrid" approach where you use one algorithm for inputs in one size range and a different algorithm for inputs in a different size range. Chances are that this would give an approach that beats any single one of the approaches.

That said, to paraphrase Don Knuth, "beware of the above analysis - I have merely proved it correct, not actually tried it." The best option really would be to profile everything and see how it works. The reason I didn't do that here is that the goal was to walk through the factors worth keeping an eye on and to highlight the weakness of a pure big-O comparison of the two algorithms. I hope the practice bears this out! If not, I'd love to see where I got it wrong. :-)
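
If you want a starting point for that profiling, here is a rough benchmark sketch (it assumes the two helper functions from the sketches above live in the same file; the input size, value range, and deliberately absent target are arbitrary choices):

    #include <chrono>
    #include <cstdint>
    #include <iostream>
    #include <random>
    #include <vector>

    // Rough benchmark sketch, not a rigorous methodology: time each approach
    // once on the same random input. Assumes hasPairWithSum and
    // hasPairWithSumSorted from the sketches above are defined in this file.
    int main() {
        std::mt19937_64 rng(42);
        std::uniform_int_distribution<int64_t> dist(0, 1000000000);

        std::vector<int64_t> nums(10000000);
        for (auto& x : nums) x = dist(rng);
        const int64_t target = -1;  // absent on purpose: both approaches scan everything

        auto timeIt = [](const char* label, auto&& run) {
            auto start = std::chrono::steady_clock::now();
            bool found = run();
            auto stop = std::chrono::steady_clock::now();
            std::cout << label << ": found=" << found << ", "
                      << std::chrono::duration<double>(stop - start).count() << " s\n";
        };

        std::vector<int64_t> copy1 = nums;  // give each approach its own copy
        timeIt("hash set   ", [&] { return hasPairWithSum(copy1, target); });

        std::vector<int64_t> copy2 = nums;
        timeIt("sort + scan", [&] { return hasPairWithSumSorted(copy2, target); });
    }

The interesting part is how the gap moves around as you change the input size, which is exactly where the hybrid approach mentioned above would come from.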

Answered Oct 17 '22 by templatetypedef