Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Julia file input reading speed

I'm giving Julia a go while solving Code Jam problems, in this case Rope Intranet from round 1C 2010 (https://code.google.com/codejam/contest/619102/dashboard)

Solution is basically:

for tc = 1:int(readline())
    n = int(readline())
    a = [map(int, split(readline())) for _ = 1:n]
    ans = 0

    for (i, x) in enumerate(a)
        for y in a[i + 1:end]
            ans += (x[1] - y[1]) * (x[2] - y[2]) < 0
        end
    end

    println("Case #", tc, ": ", ans)
end

However, time results for the large input are not very impressive comparing to solutions in c++ and python:

julia
real    0m6.196s
user    0m6.028s
sys     0m0.373s

c++
real    0m0.392s
user    0m0.338s
sys     0m0.053s


pypy
real    0m0.529s
user    0m0.507s
sys     0m0.016s

Situation changes, when I replace file input with random numbers (still slower than c++ though):

julia
real    0m3.065s
user    0m2.868s
sys     0m0.338s

c++
real    0m1.413s
user    0m1.348s
sys     0m0.055s

pypy
real    0m22.491s
user    0m22.257s
sys     0m0.160s

Is there any way I can optimize file reading times in Julia (I'm using v0.3.7)?

like image 525
frost Avatar asked May 09 '15 15:05

frost


1 Answers

So my baseline time on the large input (time cat A-large-practice.in | julia original.jl) was

real    0m2.730s
user    0m2.683s
sys     0m0.351s

On my system it takes Julia about 0.2 seconds to start running the file. Putting the code in a function is possibly the most important first step - this means everything isn't a global, which can hurt performance due to the type inference struggling. This got me down to real 2.044s.

The second problem is actually a flaw of Julia's type inference for a different reason. Basically, list comprehensions + map = some type instability. This propagates through to make the arithmetic later on slow as it has to check types. The fix is simply a = Vector{Int}[map(int, split(readline())) for _ = 1:n], which then gives me real 0m0.474s.

Most of the time is thus spent on Julia startup, and the map line, and actually printing the Case #. If I really expand that map line out to for loops I can grind it down to real 0m0.411s but that doesn't seem worth it.

like image 78
IainDunning Avatar answered Nov 15 '22 09:11

IainDunning