I've been playing with some Perl programs to calculate excellent numbers. Although the runtimes for my solutions were acceptable, I thought a different language, especially one designed for numeric stuff, might be faster. A friend suggested Julia, but the performance I'm seeing is so bad I must be doing something wrong. I've looked through the Performance Tips and don't see what I should improve:
digits = int( ARGS[1] )
const k = div( digits, 2 )
for a = ( 10 ^ (k - 1) ) : ( 10 ^ (k) - 1 )
front = a * (10 ^ k + a)
root = floor( front ^ 0.5 )
for b = ( root - 1 ): ( root + 1 )
back = b * (b - 1);
if back > front
break
end
if log(10,b) > k
continue
end
if front == back
@printf "%d%d\n" a b
end
end
end
I have an equivalent C program that's an order of magnitude faster instead of the factor of 2 noted on the Julia page (although most of the Stackoverflow questions about Julia's speed seem to point out flawed benchmarks from that page):
And the non-optimized pure Perl I wrote takes half as long:
use v5.20;
my $digits = $ARGV[0] // 2;
die "Number of digits must be even and non-zero! You said [$digits]\n"
unless( $digits > 0 and $digits % 2 == 0 and int($digits) eq $digits );
my $k = ( $digits / 2 );
foreach my $n ( 10**($k-1) .. 10**($k) - 1 ) {
my $front = $n*(10**$k + $n);
my $root = int( sqrt( $front ) );
foreach my $try ( $root - 2 .. $root + 2 ) {
my $back = $try * ($try - 1);
last if length($try) > $k;
last if $back > $front;
# say "\tn: $n back: $back try: $try front: $front";
if( $back == $front ) {
say "$n$try";
last;
}
}
}
I'm using the pre-compiled Julia for Mac OS X since I couldn't get the source to compile (but I didn't try beyond it blowing up the first time). I figure that's part of it.
Also, I see about a 0.7 second start up time for any Julia program (see Slow Julia Startup Time), which means the equivalent compiled C program can run about 200 times before Julia finishes once. As the runtime increases (bigger values of digits
) and the startup time means less, my Julia program is still really slow.
I haven't gotten to the part for very large numbers (20+ digit excellent numbers) which I didn't realize Julia doesn't handle those any better than most other languages.
Here's my C code, which is a little different from when I started this. My rusty, inelegant C skills are essentially the same thing as my Perl.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char *argv[] ) {
long
k, digits,
start, end,
a, b,
front, back,
root
;
digits = atoi( argv[1] );
k = digits / 2;
start = (long) pow(10, k - 1);
end = (long) pow(10, k);
for( a = start; a < end; a++ ) {
front = (long) a * ( pow(10,k) + a );
root = (long) floor( sqrt( front ) );
for( b = root - 1; b <= root + 1; b++ ) {
back = (long) b * ( b - 1 );
if( back > front ) { break; }
if( log10(b) > k ) { continue; }
if( front == back ) {
printf( "%ld%ld\n", a, b );
}
}
}
return 0;
}
I've benchmarked your code (brian.jl
) against the following code, which attempts to make minimal changes to your code and follows Julian style:
function excellent(digits)
k = div(digits, 2)
l = 10 ^ (k - 1)
u = (10 ^ k) - 1
for a in l:u
front = a * (10 ^ k + a)
root = isqrt(front)
for b = (root - 1):(root + 1)
back = b * (b - 1)
back > front && break
log10(b) > k && continue
front == back && println(a,b)
end
end
end
excellent(int(ARGS[1]))
Separating out u
and l
was a personal preference for readability. As a baseline, the Julia startup time on my machine is:
$ time julia -e ''
real 0m0.248s
user 0m0.306s
sys 0m0.091s
So if the computation you are running per execution of Julia from a cold start is on the order of 0.3 seconds, then Julia might not be a great choice for you at this stage. I passed in 16
to the scripts, and got:
$ time julia brian.jl 16
1045751633986928
1140820035650625
3333333466666668
real 0m15.973s
user 0m15.691s
sys 0m0.586s
and
$ time julia iain.jl 16
1045751633986928
1140820035650625
3333333466666668
real 0m9.691s
user 0m9.839s
sys 0m0.155s
A limitation of this code as written is that if digits>=20
we will exceed the storage of a Int64
. Julia, for performance reasons, doesn't auto-promote integer types to arbitrary precision integers. We can use our knowledge of the problem to address this by changing the last line to:
digits = int(ARGS[1])
excellent(digits >= 20 ? BigInt(digits) : digits)
We get the BigInt
version of excellent for free, which is nice. Ignoring that for now, on profiling my version I found that ~74% of the time is spent calculating log10
, following by ~19% on isqrt
. I did this profiling by replacing the last line with
excellent(4) # Warm up to avoid effects of JIT
@profile excellent(int(ARGS[1]))
Profile.print()
Now if we wanted to dabble in minor algorithmic changes, given what we know now from the profiler, we can replace the log10
line (which is just checking the number of digits in effect) with ndigits(b) > k && continue
, which gives us
$ time julia iain.jl 16
1045751633986928
1140820035650625
3333333466666668
real 0m3.634s
user 0m3.785s
sys 0m0.153s
This changes the balance to about ~56% from the isqrt
and ~28% from ndigits
. Digging further into that 56%, about half is spent executing this line which seems like a pretty sensible algorithm, so any improvement would probably change the spirit of the comparison as it'd really be a different approach altogether. Investigating the machine code with @code_native
tends to suggest there is nothing else too strange going on, although I didn't dig deep into this.
If I allow myself to engage in some more minor algorithmic improvements, I can start from root+1
and only doing the ndigits
check once, i.e.
for a in l:u
front = a * (10^k + a)
root = isqrt(front)
b = root + 1
ndigits(b) > k && continue
front == b*(b-1) && println(a,b)
b = root
front == b*(b-1) && println(a,b)
b = root - 1
front == b*(b-1) && println(a,b)
end
which gets me down to
real 0m2.901s
user 0m3.050s
sys 0m0.154s
(I'm not convinced the second two equality checks are needed, but I'm trying to minimize differences!). Finally I thought I'd eke out some extra speed by precomputing 10^k
, i.e. k10 = 10^k
, which seems to be computed freshly each iteration. With that, I get to
real 0m2.518s
user 0m2.670s
sys 0m0.153s
Which is a fairly nice 20x improvement over the original code.
I was curious about how Perl was getting such good performance from this code, so I felt I ought to do a comparison. Since there are some seemingly needless differences in both control flow and operations between the Perl and Julia versions of the code in the question, I ported each version to the other language and compared all four. I also wrote a fifth Julia version using more idiomatic numerical functions but with the same control flow structure as the question's Perl version.
The first variant is essentially the Perl code from the question, but wrapped up in a function:
sub perl1 {
my $k = $_[0];
foreach my $n (10**($k-1) .. 10**($k)-1) {
my $front = $n * (10**$k + $n);
my $root = int(sqrt($front));
foreach my $t ($root-2 .. $root+2) {
my $back = $t * ($t - 1);
last if length($t) > $k;
last if $back > $front;
if ($back == $front) {
print STDERR "$n$t\n";
last;
}
}
}
}
Next, I translated this to Julia, keeping the same control flow and using the same operations – it takes the integer floor of the square root of front
in the outer loop and takes the length of the "stringification" of t
in the inner loop:
function julia1(k)
for n = 10^(k-1):10^k-1
front = n*(10^k + n)
root = floor(Int,sqrt(front))
for t = root-2:root+2
back = t * (t - 1)
length(string(t)) > k && break
back > front && break
if back == front
println(STDERR,n,t)
break
end
end
end
end
Here's the question's Julia code with some minor formatting tweaks, wrapped up in a function:
function julia2(k)
for a = 10^(k-1):10^k-1
front = a * (10^k + a)
root = floor(front^0.5)
for b = root-1:root+1
back = b * (b - 1);
back > front && break
log(10,b) > k && continue
if front == back
@printf STDERR "%d%d\n" a b
# missing break?
end
end
end
end
I translated this back to Perl, keeping the same control flow structure and using the same operations as the Perl code – taking floor of root
raised to the 0.5 power in the outer loop, and taking the logarithm base 10 in the inner loop:
sub perl2 {
my $k = $_[0];
foreach my $a (10**($k-1) .. 10**($k)-1) {
my $front = $a * (10**$k + $a);
my $root = int($front**0.5);
foreach my $b ($root-1 .. $root+1) {
my $back = $b * ($b - 1);
last if $back > $front;
next if log($b)/log(10) > $k;
if ($front == $back) {
print STDERR "$a$b\n"
}
}
}
}
Finally, I wrote a Julia version that has the same control flow structure as the question's Perl version but uses more idiomatic numerical operations – the isqrt
and ndigits
functions:
function julia3(k)
for n = 10^(k-1):10^k-1
front = n*(10^k + n)
root = isqrt(front)
for t = root-2:root+2
back = t * (t - 1)
ndigits(t) > k && break
back > front && break
if back == front
println(STDERR,n,t)
break
end
end
end
end
As far as I know (I used to do a lot of Perl programming, but it's been a while), there aren't Perl versions of either of these operations, so there's no corresponding perl3
variant.
I ran all five variations with Perl 5.18.2 and Julia 0.3.9, respectively, ten times each for 2, 4, 6, 8, 10, 12 and 14 digits. Here are the timing results:
The x axis is the number of digits requested. The y axis is median time in seconds required to compute each function. The y axis is plotted on a log scale (there's some rendering bug in the Cairo backend of Gadfly, so the superscripts don't appear very raised). We can see that except for the very smallest number of digits (2), all three Julia variants are faster than both Perl variants – and julia3
is substantially faster than everything else. How much faster? Here's a comparison of the other four variants relative to julia3
(not log scale):
The x axis is the number of digits requested again, while the y axis is how many times slower each variant was than julia3
. As you can see here, I was unable to reproduce the Perl performance claimed in the question – the Perl code was not 2x faster than Julia – it was 7 to 40 times slower than julia3
and at least 2x slower than the slowest Julia variant for any non-trivial number of digits. I didn't test with Perl 5.20 – perhaps someone could follow up by running these benchmarks with a newer Perl and see if that explains the different results? Code to run the benchmarks can be found here: excellent.pl, excellent.jl. I ran the them like this:
cat /dev/null >excellent.csv
for d in 2 4 6 8 10 12 14; do
perl excellent.pl $d >>excellent.csv
julia excellent.jl $d >>excellent.csv
done
I analyzed the resulting excellent.csv
file using this Julia script.
Finally, as has been mentioned in the comments, using BigInt
or Int128
is an option for exploring larger excellent numbers in Julia. However, this requires a little care for writing the algorithm generically. Here's a fourth variation that works generically:
function julia4(k)
ten = oftype(k,10)
for n = ten^(k-1):ten^k-1
front = n*(ten^k + n)
root = isqrt(front)
for t = root-2:root+2
back = t * (t - 1)
ndigits(t) > k && break
back > front && break
if back == front
println(STDERR,n,t)
break
end
end
end
end
This is the same as julia3
but works for generic integer types by converting 10 to the type of its argument. Since the algorithm scales exponentially, however, it still takes a very long time to compute for any number of digits much larger than 14:
julia> @time julia4(int128(10)) # digits = 20
21733880705143685100
22847252005297850625
23037747345324014028
23921499005444619376
24981063345587629068
26396551105776186476
31698125906461101900
33333333346666666668
34683468346834683468
35020266906876369525
36160444847016852753
36412684107047802476
46399675808241903600
46401324208242096401
48179452108449381525
elapsed time: 2260.27479767 seconds (5144 bytes allocated)
It works, but 37 minutes is kind of a long time to wait. Using a faster programming language only gets you a constant factor speedup – 40x in this case – but that only buys a couple of extra digits. To really explore larger excellent numbers, you'll need to look into better algorithms.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With