(Dis)Proving that one algorithm works faster than another due to language internals

For a project at university, we had to implement a few different algorithms to calculate the equivalence classes of a set of elements, given a collection of relations between those elements.

We were instructed to implement, among others, the Union-Find algorithm and its optimizations (union by depth, union by size). By accident (doing something I thought was necessary for the correctness of the algorithm) I discovered another way to optimize it.

It isn't as fast as Union By Depth, but close. I couldn't for the life of me figure out why it was as fast as it was, so I consulted one of the teaching assistants who couldn't figure it out either.

The project was in Java, and the data structures I used were based on simple arrays of Integers (the object, not the primitive int). Later, at the project's evaluation, I was told that it probably had something to do with 'Java caching', but I can't find anything online about how caching would affect this.
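
To make the setup concrete, here is a minimal sketch of union-find with union by size, backed by Integer[] arrays as described above. It is only an illustration (the class name and layout are my own, and the accidental optimization is not shown), not the actual project code:

    // Illustrative union-find with union by size, stored in boxed Integer arrays
    // as the question describes. Not the actual project code.
    public class UnionFind {
        private final Integer[] parent;  // parent[i] == i means i is a root
        private final Integer[] size;    // number of elements in the tree rooted at i

        public UnionFind(int n) {
            parent = new Integer[n];
            size = new Integer[n];
            for (int i = 0; i < n; i++) {
                parent[i] = i;  // autoboxed; only values -128..127 come from the Integer cache
                size[i] = 1;
            }
        }

        public int find(int x) {
            while (parent[x] != x) {  // the Integer is unboxed for this comparison
                x = parent[x];
            }
            return x;
        }

        public void union(int a, int b) {
            int rootA = find(a);
            int rootB = find(b);
            if (rootA == rootB) {
                return;
            }
            // Union by size: hang the smaller tree under the larger root
            if (size[rootA] < size[rootB]) {
                int tmp = rootA;
                rootA = rootB;
                rootB = tmp;
            }
            parent[rootB] = rootA;
            size[rootA] = size[rootA] + size[rootB];
        }
    }

(As a side note: an Integer[] stores references to boxed objects, whereas an int[] stores the values inline, so the two already differ in memory layout.)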

What would be the best way, without calculating the complexity of the algorithm, to prove or disprove that my optimization is only this fast because of Java's way of doing things? Should I implement it in another (lower-level?) language? But who's to say that language won't do the same thing?

I hope I made myself clear,

thanks

asked Dec 20 '10 by thepandaatemyface


3 Answers

The only way is to prove the worst-case (or average-case, etc.) complexity of the algorithm.

Because if you don't, the observed speed might just be a consequence of a combination of:

  • The particular data
  • The size of the data
  • Some aspect of the hardware
  • Some aspect of the language implementation
  • etc.
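
Short of a proof, one rough empirical check is to re-run both variants over several input sizes and several random relation sets, so that a difference tied to one particular data set or size at least becomes visible. A minimal sketch, assuming a UnionFind class like the one sketched in the question above (System.nanoTime timings are crude evidence, not proof):

    import java.util.Random;

    // Crude timing harness: vary the input size and the random seed so that an
    // effect tied to one particular data set or size stands out.
    public class RoughBenchmark {
        public static void main(String[] args) {
            int[] sizes = {1_000, 10_000, 100_000, 1_000_000};
            for (int n : sizes) {
                for (long seed = 0; seed < 5; seed++) {
                    Random rnd = new Random(seed);
                    UnionFind uf = new UnionFind(n);  // hypothetical class from the sketch above
                    long start = System.nanoTime();
                    for (int i = 0; i < n; i++) {
                        uf.union(rnd.nextInt(n), rnd.nextInt(n));
                    }
                    long micros = (System.nanoTime() - start) / 1_000;
                    System.out.println("n=" + n + " seed=" + seed + " -> " + micros + " us");
                }
            }
        }
    }
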
answered by Mitch Wheat


It is generally very difficult to perform such a task given modern VMs! As you hint, they do all sorts of things behind your back: method calls get inlined, objects are reused, and so on. A prime example is how trivial loops get compiled away entirely if they obviously do nothing other than counting, or how a function call in functional programming is inlined or tail-call optimized.
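
As an illustration of that first point, here is a sketch of the kind of loop the JIT is allowed to throw away (class and numbers are made up): because the loop's result is never used, a warmed-up JIT may remove it entirely, so timing it tells you nothing about the cost of the increments.

    public class DeadLoopDemo {
        public static void main(String[] args) {
            long start = System.nanoTime();
            long counter = 0;
            for (int i = 0; i < 100_000_000; i++) {
                counter++;  // the result is never read afterwards
            }
            long elapsedNs = System.nanoTime() - start;
            // Since 'counter' is dead, the JIT may eliminate the whole loop once
            // the method is compiled, which is exactly why naive micro-benchmarks
            // of "trivial" code are so misleading.
            System.out.println("elapsed: " + elapsedNs + " ns");
        }
    }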

Further, there is the difficulty of proving your point in general, on any data set. An O(n^2) algorithm can easily be much faster than a seemingly faster, say O(n log n), algorithm. Two examples:

  1. Bubble sort is faster than quicksort at sorting a nearly-sorted collection, because with an early-exit check it finishes in close to a single pass.
  2. Quicksort, of course, is faster in the general case.

Generally, big-O notation purposely ignores constants, which in a practical situation can mean life or death to your implementation, and those constants may be what hit you. So in practice 0.00001 * n^2 (say, the running time of your algorithm) is faster than 1000000 * n log n for any input size you are likely to see.
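
For concreteness, a throwaway sketch using those hypothetical constants, which finds roughly where the quadratic model finally loses (it turns out to be around n ≈ 4 * 10^12, far beyond most practical inputs):

    public class CrossoverDemo {
        public static void main(String[] args) {
            // Compare 0.00001 * n^2 against 1000000 * n * log2(n), doubling n
            // until the quadratic cost model finally becomes the larger one.
            for (long n = 2; n > 0; n *= 2) {
                double quadratic = 0.00001 * n * (double) n;
                double linearithmic = 1_000_000.0 * n * (Math.log(n) / Math.log(2.0));
                if (quadratic > linearithmic) {
                    System.out.println("quadratic model overtakes at roughly n = " + n);
                    break;
                }
            }
        }
    }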

So reasoning is hard given the limited information you provide.

answered by Carlo V. Dango


It is likely that either the compiler or the JVM found an optimisation for your code. You could try reading the bytecode that is output by the javac compiler, and disabling runtime JIT compilation with the -Djava.compiler=NONE option.
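
For example (the class name UnionFind is just a placeholder), you could disassemble the bytecode with javap, which ships with the JDK, and run with the JIT disabled to see how much of the effect survives:

    # Compile, then disassemble the bytecode that javac emitted
    javac UnionFind.java
    javap -c -p UnionFind

    # Run with runtime JIT compilation disabled
    java -Djava.compiler=NONE UnionFind
    # On recent JVMs the interpreter-only flag is:
    java -Xint UnionFind

If the gap between the two optimizations shrinks or disappears without the JIT, that points at a JIT effect rather than the algorithm itself.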

answered by Neil Bartlett