Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Handling large numbers in Julia

Tags:

julia

In Python, I can do the following to get the sum of all digits in N, where N=99999 ** 99999. The sum can be obtained with sum(map(int,str(N))).

How can i find the sum of all digits in N with Julia?

like image 299
Lijo Joseph Avatar asked Apr 25 '18 13:04

Lijo Joseph


2 Answers

You are hitting integer overflow. Try using BigInts.

julia> N=digits(big(99999)^99999)
499995-element Array{Int64,1}:
 9
 9
 9
 9
 9
 8
 9
 9
 9
 9
 ⋮
 5
 0
 8
 2
 1
 8
 8
 7
 6
 3

Notice that

julia> typemax(Int64)
9223372036854775807

which is far too small, but BigInts are arbitrary sized. big(i) turns i into a big (BigFloat if it's a float, and BigInt if it's an integer). Julia does not default to using bigs/arbitrary sized numbers since they are quite slow, but if you invoke them then the type stability of most dispatches will propagate the big type, so big(i)^j will end up big.

like image 91
Chris Rackauckas Avatar answered Oct 28 '22 19:10

Chris Rackauckas


This question is a little old, but I found an interesting performance improvement that can be useful for new users (or that more advanced users can explain), at least those using Julia 1.5 or earlier.

By default, Julia casts "99999" to Int64. However, 99999⁹⁹⁹⁹⁹ is a big number that doesn't fit into an Int64 variable (that supports up to 2⁶³). You have, then, to use big(99999) or equivalently BigInt(99999) (what Python does automatically).

One could write

julia> sum(digits(big(99999)^99999))

and it definitely works.

EDIT: As rafak pointed out, Julia 1.6 optimized digits(n::BigInt) and the problem is gone. Here are the new benchmarks:

  • Python 3.8.5:
>>> %timeit sum(map(int,str(99999**99999)))
4.36 s ± 51.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Julia 1.6.1:
julia> @benchmark sum(digits(big(99999)^99999))
BenchmarkTools.Trial: 
  memory estimate:  13.56 MiB
  allocs estimate:  990
  --------------
  minimum time:     40.200 ms (0.84% GC)
  median time:      41.894 ms (0.00% GC)
  mean time:        42.006 ms (0.27% GC)
  maximum time:     44.410 ms (1.59% GC)
  --------------
  samples:          119
  evals/sample:     1

I'll keep the old benchmarks for historical reasons.


In Julia 1.5, benchmarking a little, I found:

  • Python:
>>> %timeit sum(map(int,str(99999**99999)))
4.9 s ± 5.43 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
  • Julia 1.5:
julia> @benchmark sum(digits(big(99999)^99999))
BenchmarkTools.Trial: 
  memory estimate:  142.75 GiB
  allocs estimate:  5343475
  --------------
  minimum time:     79.833 s (3.24% GC)
  median time:      79.833 s (3.24% GC)
  mean time:        79.833 s (3.24% GC)
  maximum time:     79.833 s (3.24% GC)
  --------------
  samples:          1
  evals/sample:     1

I know. Julia is the "above the light speed", "faster than C and Fortran" language. Why would it be (considerably) slower than Python, the "turtle language"? And how can we deal with this huge amount of memory allocated? Let's try the "pythonic" algorithm, rewritten in Julia 1.5:

julia> sum(map(x -> parse(Int32, x), collect(string(big(99999)^99999))))

I know that it sounds very strange and counter-intuitive, but numbers say everything. Here my results.

  • Julia ("Pythonic" way):
julia> @benchmark sum(map(x -> parse(Int32, x), collect(string(big(99999)^99999))))
BenchmarkTools.Trial: 
  memory estimate:  13.56 MiB
  allocs estimate:  991
  --------------
  minimum time:     62.130 ms (0.00% GC)
  median time:      63.111 ms (0.00% GC)
  mean time:        63.183 ms (0.24% GC)
  maximum time:     66.055 ms (0.68% GC)
  --------------
  samples:          80
  evals/sample:     1

Remarks.

  • %timeit is an IPython magic. @benchmark is a macro implemented by BenchmarkTools.jl (need install).
  • In the last Julia benchmark, memory allocation can be improved using Int8 to map the string, but the time is fairly the same.
  • New and "historic" benchmarks should not be compared, as they were done in different machines.

In summary, it can be done in Julia both with a simple but slow code and with a verbose but fast one. I can't explain exactly why though.

like image 2
Héliton Martins Avatar answered Oct 28 '22 21:10

Héliton Martins