Following this post, I decided to benchmark Julia against GNU Octave, and the results were inconsistent with the speed-ups illustrated on julialang.org.
I compiled both Julia and GNU Octave with CXXFLAGS='-std=c++11 -O3'. Here are the results I got:
In Octave:

a=0.9999;
tic; y=a.^(1:10000); toc
Elapsed time is 0.000159025 seconds.
tic; y=a.^(1:10000); toc
Elapsed time is 0.000162125 seconds.
tic; y=a.^(1:10000); toc
Elapsed time is 0.000159979 seconds.

tic; y=cumprod(ones(1,10000)*a); toc
Elapsed time is 0.000280142 seconds.
tic; y=cumprod(ones(1,10000)*a); toc
Elapsed time is 0.000280142 seconds.
tic; y=cumprod(ones(1,10000)*a); toc
Elapsed time is 0.000277996 seconds.
In Julia:

tic(); y=a.^(1:10000); toc()
elapsed time: 0.003486508 seconds
tic(); y=a.^(1:10000); toc()
elapsed time: 0.003909662 seconds
tic(); y=a.^(1:10000); toc()
elapsed time: 0.003465313 seconds

tic(); y=cumprod(ones(1,10000)*a); toc()
elapsed time: 0.001692931 seconds
tic(); y=cumprod(ones(1,10000)*a); toc()
elapsed time: 0.001690245 seconds
tic(); y=cumprod(ones(1,10000)*a); toc()
elapsed time: 0.001689241 seconds
Could someone explain why Julia is slower than GNU Octave with these basic operations? Once warmed up, it should call LAPACK/BLAS without overhead, right?
EDIT:
As explained in the comments and answers, the code above is neither a good benchmark nor an illustration of the benefits of using the language in a real application. I used to think of Julia as a faster "Octave/MATLAB", but it is much more than that: it is a huge step towards productive, high-performance scientific computing. By using Julia, I was able to 1) outperform software in my research field written in Fortran and C++, and 2) provide users with a much nicer API.
Pure Julia erfinv(x) [= erf⁻¹(x)] is 3–4× faster than Matlab's and 2–3× faster than SciPy's (Fortran Cephes). Julia code can actually be faster than typical "optimized" C/Fortran code, by using techniques [metaprogramming/code generation] that are hard in a low-level language.
Many people believe Julia is fast because it is Just-In-Time (JIT) compiled (i.e. every statement is run using compiled functions which are either compiled right before they are used, or retrieved from a cache of earlier compilations).
"CSV.jl is 1.5 to 5 times faster than Python's pandas library even when limited to a single core; with multithreading enabled it can be over 20 times faster." Julia makes excellent use of its ability for multi-threaded processing, but even using a single thread, Julia is consistently faster at reading CSVs.
Julia, especially when written well, can be as fast as, and sometimes even faster than, C. Julia uses a Just-In-Time (JIT) compiler and compiles incredibly fast, though it compiles more like an interpreted language than a traditional ahead-of-time compiled language like C or Fortran.
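As a rough illustration of that warm-up cost, here is a minimal sketch (the function f is made up for this example) that times the same call twice; the first @time includes one-time compilation, the second reuses the cached native code:

f(n) = sum(x^2 for x in 1:n)

@time f(10^6)   # first call: the timing includes JIT compilation
@time f(10^6)   # second call: the cached compiled code is reused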
Vectorized operations like .^ are exactly the kind of thing that Octave is good at, because they're actually entirely implemented in specialized C code. Somewhere in the code that is compiled when Octave is built, there is a C function that computes .^ for a double and an array of doubles – that's what you're really timing here, and it's fast because it's written in C. Julia's .^ operator, on the other hand, is written in Julia:
julia> a = 0.9999;

julia> @which a.^(1:10000)
.^(x::Number,r::Ranges{T}) at range.jl:327
That definition consists of this:
.^(x::Number, r::Ranges) = [ x^y for y=r ]
It uses a one-dimensional array comprehension to raise x to each value y in the range r, returning the result as a vector.
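(The Ranges type above is from an older Julia version; as a minimal sketch of the same idea in current Julia, using a hypothetical helper name pow_range:)

pow_range(x::Number, r::AbstractRange) = [x^y for y in r]

y = pow_range(0.9999, 1:10000)   # same values as 0.9999 .^ (1:10000)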
Edward Garson is quite right that one shouldn't use globals for optimal performance in Julia. The reason is that the compiler can't reason very well about the types of globals, because they can change at any point where execution leaves the current scope. Leaving the current scope doesn't sound like it happens that often, but in Julia, even basic things like indexing into an array or adding two integers are actually method calls and thus leave the current scope. In the code in this question, however, all the time is spent inside the .^ function, so the fact that a is a global doesn't actually matter:
julia> @elapsed a.^(1:10000)
0.000809698

julia> let a = 0.9999; @elapsed a.^(1:10000) end
0.000804208
Ultimately, if all you're ever doing is calling vectorized operations on floating point arrays, Octave is just fine. However, this is often not actually where most of the time is spent, even in high-level dynamic languages. If you ever find yourself wanting to iterate over an array with a for loop, operating on each element with scalar arithmetic, you'll find that Octave is quite slow at that sort of thing – often thousands of times slower than C or Julia code doing the same thing. Writing for loops in Julia, on the other hand, is a perfectly reasonable thing to do – in fact, all our sorting code is written in Julia and is comparable to C in performance (a sketch of such a loop follows at the end of this answer).

There are also many other reasons to use Julia that don't have to do with performance. As a Matlab clone, Octave inherits many of Matlab's design problems and doesn't fare very well as a general-purpose programming language. You wouldn't, for example, want to write a web service in Octave or Matlab, but it's quite easy to do so in Julia.
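To make the loop point concrete, here is a minimal sketch (the cumprod_loop name is hypothetical) of the explicit scalar-loop style that is slow in Octave but perfectly reasonable in Julia:

function cumprod_loop(a, n)
    y = Vector{typeof(a)}(undef, n)
    acc = one(a)
    for i in 1:n
        acc *= a       # plain scalar arithmetic inside a loop
        y[i] = acc
    end
    return y
end

y = cumprod_loop(0.9999, 10000)   # same values as cumprod(fill(0.9999, 10000))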
You're using global variables, which is a performance gotcha in Julia.

The issue is that globals can potentially change type whenever your code calls another function. As a result, the compiler has to generate extremely slow code that cannot make any assumptions about the types of the global variables being used.
Simple modifications of your code in line with https://docs.julialang.org/en/stable/manual/performance-tips/ should yield more satisfactory results.
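As a minimal sketch of that tip (the bench name is hypothetical): pass the value as a function argument, or declare the global const, so the compiler knows its type:

const a = 0.9999                    # a const global has a known, fixed type

bench(a) = @elapsed a .^ (1:10000)  # a is now a typed function argument

bench(a)   # first call compiles the function
bench(a)   # subsequent calls give representative timings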