I'm trying to figure out the best programming language for an analytical model I'm building. Primary consideration is speed at which it will run FOR loops.
Some detail:
So with all that said, what's the fastest programming language specifically for FOR loops? From a search of SO and Google, Fortran and C bubble to the top, but looking for some more advice before diving in to one or the other.
Thanks!
C++ is one of the most efficient and fastest languages.
The for loop is the fastest but poorly readable. The foreach is fast, and iteration is controllable. The for…of takes time, but it's sweeter. The for…in takes time, hence less convenient.
Several benchmarks show Lua as the fastest language in the realm of interpreted scripting languages. Lua is fast not only in fine-tuned benchmark programs, but in real life too. Substantial fractions of large applications have been written in Lua.
Fortran semantics say that function arguments never alias and there is an array type, where in C arrays are pointers. This is why Fortran is often faster than C. This is why numerical libraries are still written in Fortran. However, it comes at the cost of pointer arithmetic.
This for loop looks no more complex than this when it hits the CPU:
for(int i = 0; i != 1024; i++)
translates to
mov r0, 0 ;;start the counter
top:
;;some processing
add r0, r0, 1 ;;increment the counter by 1
jne top: r0, 1024 ;;jump to the loop top if we havn't hit the top of the for loop (1024 elements)
;;continue on
As you can tell, this is sufficiently simple you can't really optimize it very well[1]... Refocus towards the algorithm level.
The first cut at the problem is to look at cache locality. Look up the classic example of matrix multiplication and swapping the i
and j
indexes.
edit: As a second cut, I would suggest evaluating the algorithm for data-dependencies between iterations and data dependency between localities in your 'matrix' of data. It may be a good candidate for parallelization.
[1] There are some micro-optimizations possible, but those will not produce the speedsups you're looking for.
~300k * ~150 * ~30 * ~12 = ~16G
iterations, right?
This number of primitive operations should complete in a matter of minutes (if not seconds) in any compiled language on any decent CPU.
Fortran, C/C++ should do it almost equally well. Even managed languages, such as Java and C#, would only fall behind by a small margin (if at all).
If you have a problem of ~16G iterations running 55 hours, this means that they are very far from being primitive (80k per second? this is ridiculous), so maybe we should know more. (as was already suggested, is disk access limiting performance? is it network access?)
As @Rotsor said, 16G operations / 55 hours is about 80,000 operations per second, or one operation every 12.5 microseconds. That's a lot of time per operation.
That means your loops are not the cause of poor performance, it's what's in the innermost loop that's taking the time. And Octave is an interpreted language. That alone means an order of magnitude slowdown.
If you want speed, you first need to be in a compiled language. Then you need to do performance tuning (aka profiling) or, just single step it in a debugger at the instruction level. That will tell you what it is actually doing in its heart of hearts. Once you've got it to where it's not wasting cycles, fancier hardware, cores, CUDA, etc. will give you a parallelism speedup. But it's silly to do that if your code is taking unnecessarily many cycles. (Here's an example of performance tuning - a 43x speedup just by trimming the fat.)
I can't believe the number of responders talking about matlab, APL, and other vectorized languages. Those are interpreters. They give you concise source code, which is not at all the same thing as fast execution. When it comes down to the bare metal, they are stuck with the same hardware as every other language.
Added: to show you what I mean, I just ran this C++ code, which does 16G operations, on this moldy old laptop, and it took 94 seconds, or about 6ns per iteration. (I can't believe you baby-sat that thing for 2 whole days.)
void doit(){
double sum = 0;
for (int i = 0; i < 1000; i++){
for (int j = 0; j < 16000000; j++){
sum += j * 3.1415926;
}
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With