I have a gnarly piece of code whose time efficiency I would like to measure. Since estimating its complexity from the code itself is hard, I want to run it in a loop over a range of input sizes and time the results. Once enough data points (size -> time) are gathered, I can see which curve fits best.
Repeating operations a number of times with random input data of a given size can smooth out fluctuations due to the OS deciding to multitask at bad moments, yielding more precise times. Increasing the size of the problem provides more points, ideally well spaced.
My test code works fine (an initial, untimed warm-up loop to absorb load times; then, starting from a size of 10 and scaling up to 1000000 by a factor of 10% per step, repeating runs until 5 s have elapsed or 5 full runs have finished). However, I arrived at these numbers by guesswork.
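For reference, a minimal sketch of the kind of harness described above. The names run_once and make_input are placeholders for the measured code and its random-input generator, and the numbers (10% growth per step, 5 s / 5 runs cut-offs) are simply the guessed values from the question:

```python
import time

def measure(run_once, make_input,
            start=10, stop=1_000_000, growth=1.10,
            max_seconds=5.0, max_runs=5):
    """Collect (size, time) points for run_once over geometrically growing sizes."""
    run_once(make_input(start))                   # untimed warm-up run
    points = []
    size = start
    while size <= stop:
        times = []
        deadline = time.perf_counter() + max_seconds
        while len(times) < max_runs and time.perf_counter() < deadline:
            data = make_input(size)               # fresh random input each run
            t0 = time.perf_counter()
            run_once(data)
            times.append(time.perf_counter() - t0)
        points.append((size, min(times)))         # min is one common choice; median also works
        size = max(size + 1, int(size * growth))  # grow the problem size by ~10%
    return points
```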
Is there an accepted, "scientific" way to scale repetitions and problem size to achieve faster, more accurate time-vs-size plots? Is there code out there (or libraries) that can scaffold all the boring bits, and which I should have been aware of before rolling my own? In particular, I can imagine that when bumps in the timings are found, more measurements could be warranted, while relatively smooth readings could simply be considered "good enough".
Edit
I am aware of the classical method of calculating big-O complexity. It works fine for self-contained algorithms with a nice representative operation (say, "comparisons" or "swaps"). It does not work as advertised when those conditions are not met (example: the compile-time C++ template instantiation costs of LLVM, which is a large and complex codebase where I do not know what the relevant representative operation would be). That is why I am treating it as a black box, and trying to measure times from the outside instead of by code inspection.
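For the black-box approach, the "measure from the outside" part might look roughly like this. This is a hypothetical sketch, not the asker's actual test case: it assumes clang++ is on the PATH and uses a toy recursive template purely as a stand-in workload whose instantiation cost grows with depth:

```python
import os
import subprocess
import tempfile
import textwrap
import time

def time_template_instantiation(depth):
    """Time one black-box compiler run whose cost grows with template depth."""
    source = textwrap.dedent(f"""
        template <int N> struct Fib {{
            static const long value = Fib<N - 1>::value + Fib<N - 2>::value;
        }};
        template <> struct Fib<0> {{ static const long value = 0; }};
        template <> struct Fib<1> {{ static const long value = 1; }};
        long x = Fib<{depth}>::value;
    """)
    with tempfile.NamedTemporaryFile("w", suffix=".cpp", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        start = time.perf_counter()
        # Parse and instantiate only; no object file is produced.
        subprocess.run(["clang++", "-fsyntax-only", path], check=True)
        return time.perf_counter() - start
    finally:
        os.remove(path)
```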
To estimate the running time by inspection, find the maximum depth of nested loops that each go through a significant portion of the input. Some algorithms use nested loops where the outer loop iterates over an input of size n while the inner loop iterates over a different input of size m. The time complexity in such cases is O(nm).
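For concreteness, a small illustrative Python function (not related to the question's code) that exhibits this pattern:

```python
def count_pairs_with_sum(xs, ys, target):
    """Counts pairs (x, y) with x + y == target; illustrates the O(n*m) pattern."""
    count = 0
    for x in xs:                 # outer loop: n iterations
        for y in ys:             # inner loop: m iterations per outer iteration
            if x + y == target:  # constant-time body: O(1)
                count += 1
    return count                 # total work: n * m * O(1) = O(n*m)
```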
Let's use T(n) for the total running time as a function of the input size n, and t(statement) for the time taken by a statement or group of statements. Then T(n) = t(statement 1) + t(statement 2) + ... + t(statement N). If a statement executes only a basic operation, we can say it takes constant time, O(1).
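As a toy worked example of that decomposition (again illustrative only):

```python
def total(values):
    s = 0             # statement 1: O(1)
    for v in values:  # loop body runs n times
        s += v        # statement 2: executed n times, n * O(1) = O(n)
    return s          # statement 3: O(1)
# T(n) = O(1) + O(n) + O(1) = O(n)
```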
Time complexity is defined as the amount of time taken by an algorithm to run, expressed as a function of the length of the input. It characterizes how the cost of executing the algorithm's statements grows with input size; it does not tell you the absolute execution time on a particular machine.
Measuring the time complexity empirically can be very difficult (if it is possible at all), and I have never seen it done in algorithm papers. If you cannot calculate the time complexity from the (pseudo-)code or the algorithm description, then maybe you can use a heuristic to simplify the analysis.
Maybe you can also calculate the complexity of some parts of the algorithm and ignore other parts if they obviously have a much smaller complexity.
If nothing else helps, the normal way would be to show how the algorithm scales on a machine, just as you wrote, but keep in mind that many things besides the algorithm itself affect such measurements.
All in all, I think you can only get an idea of how your algorithm scales; you cannot get an exact upper bound on the complexity by measuring the run time. Maybe this works for really small examples, but for bigger ones you will not get reliable results.
The best you can do is to document your setup and your measurements carefully. This way you can see whether changes have improved the algorithm, and others can verify your results.
I'm not aware of any software for this, or previous work done on it. And, fundamentally, I don't think you can get answers of the form "O(whatever)" that are trustworthy. Your measurements are noisy, you might be trying to distinguish n log(n) operations from n sqrt(n) operations, and unlike a nice clean mathematical analysis, all of the dropped constants are still floating around messing with you.
That said, if I wanted to come up with a best estimate, I would measure over a wide range of sizes and then check which of the plausible candidate curves fits the data best.
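One possible way to do that curve-comparison step (a minimal sketch, assuming numpy is available; sizes and times are whatever (size -> time) points the measurement harness produced): fit each candidate model t ≈ a·f(n) + b by least squares and rank the candidates by residual.

```python
import numpy as np

def rank_models(sizes, times):
    """Least-squares fit of t ~ a*f(n) + b for several candidate f, smallest residual first."""
    n = np.asarray(sizes, dtype=float)
    t = np.asarray(times, dtype=float)
    candidates = {
        "n":       n,
        "n log n": n * np.log(n),
        "n^1.5":   n ** 1.5,
        "n^2":     n ** 2,
    }
    results = []
    for name, f in candidates.items():
        A = np.column_stack([f, np.ones_like(f)])  # model: t = a*f(n) + b
        coeffs, residual, _, _ = np.linalg.lstsq(A, t, rcond=None)
        rss = residual[0] if residual.size else float(np.sum((A @ coeffs - t) ** 2))
        results.append((rss, name, coeffs))
    results.sort(key=lambda r: r[0])               # smallest residual first
    return results
```

Note that this only ranks the candidate curves you supply, and with noisy data the ranking between close models (say n log(n) versus n sqrt(n)) can easily flip, which is exactly the caveat above.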