The following piece of code gives different output when run with different versions of Perl:
#!/usr/bin/env perl
my $number1 = 2.198696207;
my $number2 = 2.134326286;
my $diff = $number1 - $number2;
print STDOUT "\n 2.198696207 - 2.134326286: $diff\n";
$number1 = 0.449262271;
$number2 = 0.401361096;
$diff = $number1 - $number2;
print STDOUT "\n 2.198696207 - 2.134326286: $diff\n";
Perl 5.16.3:
perl -v
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux
file `which perl`
/sv/app/perx/third-party/bin/perl: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
2.198696207 - 2.134326286: 0.0643699210000004
0.449262271 - 0.401361096: 0.047901175
Perl 5.8.7:
perl -v
This is perl, v5.8.7 built for i686-linux-thread-multi-64int
file `which perl`
/sv/app/perx/third-party/bin/perl: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), for GNU/Linux 2.2.5, not stripped
2.198696207 - 2.134326286: 0.0643699209999999
0.449262271 - 0.401361096: 0.047901175
I have not been able to find any documentation that describes a change in floating-point precision or rounding between these two versions.
EDIT: thanks to Mark Dickinson for pointing out irregularities in my initial answer; his detective work changed the conclusion. Many thanks also to ikegami for his doubts about the initial analysis.
In summary: it's caused by small differences in the string-to-double conversion, and these differences appear to come from the same code behaving differently on 32-bit and 64-bit builds.
Details
This is perl, v5.8.7 built for i686-linux-thread-multi-64int
This is a Perl built for a 32-bit architecture.
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux
And this one is built for a 64-bit architecture.
This means the two Perls are built for different CPU architectures and maybe with different compile-time options. That can result in different precision for floating-point operations, but it might also be related to the string-to-double conversion, as ikegami pointed out in the comments.
For the differences between the architectures see Problem with floating-point precision when moving from i386 to x86_64 or x87 FPU vs. SSE2 on Wikipedia.
I've done the following tests on the same computer with identical versions of Ubuntu (15.10) inside LXC containers, one 32-bit and the other 64-bit.
# on 32 bit
$ perl -v
This is perl 5, version 20, subversion 2 (v5.20.2) built for i686-linux-gnu-thread-multi-64int
$ perl -V:nvsize
nvsize='8';
$ perl -E 'say 2.198696207-2.134326286'
0.0643699209999999
# on 64 bit
$ perl -v
This is perl 5, version 20, subversion 2 (v5.20.2) built for x86_64-linux-gnu-thread-multi
$ perl -V:nvsize
nvsize='8';
$ perl -E 'say 2.198696207-2.134326286'
0.0643699210000004
This shows that the difference is related neither to the Perl version nor to the size of the floating-point type used. To get more detail we look at the internal representation of the numbers using unpack('H*', pack('d', $number)); since pack('d', ...) uses native byte order, the hex string appears byte-reversed on these little-endian machines.
For 2.134326286 the representation is the same, i.e. 0xb7e7eaa819130140. But for 2.198696207 we get a different representation:
32 bit: 2.198696207 -> 0xe53b7709ee960140
64 bit: 2.198696207 -> 0xe63b7709ee960140
The two differ in the low-order byte of the mantissa (0xe5 vs. 0xe6), i.e. by one unit in the last place.
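For reference, here is a small self-contained sketch of that check (my own formatting; it assumes an x86-style little-endian machine, which is why the hex strings above appear byte-reversed):

#!/usr/bin/env perl
use strict;
use warnings;

# Dump the raw IEEE 754 bytes that this perl stores for each literal.
# pack('d', ...) uses native byte order, so on x86/x86_64 the hex string
# is byte-reversed compared to the usual big-endian notation.
for my $n (2.134326286, 2.198696207) {
    printf "%s -> 0x%s\n", $n, unpack('H*', pack('d', $n));
}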
This means that the internal representation of the number differs between 64 bit and 32 bit. This can be due to different functions being used because of optimizations for different platforms, or because the same function behaves slightly differently on 32 bit and 64 bit. Checking with the libc function atof shows that it returns 0xe53b7709ee960140 on 64 bit too, so it looks like Perl is using a different function for the conversion.
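The check above called atof from C, but a similar comparison can be made from within Perl via the core POSIX module's strtod (which atof is defined in terms of). A sketch, assuming that converting a string at run time goes through the same Perl string-to-number code path as a source literal:

#!/usr/bin/env perl
use strict;
use warnings;
use POSIX qw(strtod);

my $str = '2.198696207';

# Conversion done by Perl's own string-to-number code.
my $perl_nv = 0 + $str;

# Conversion done by the C library's strtod().
my ($libc_nv, $unparsed) = strtod($str);
warn "unparsed characters left in '$str'\n" if $unparsed;

printf "perl: 0x%s\n", unpack('H*', pack('d', $perl_nv));
printf "libc: 0x%s\n", unpack('H*', pack('d', $libc_nv));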
Digging deeper shows that the Perl I used on both platforms was built with USE_PERL_ATOF, which indicates that Perl is using its own implementation of the atof function. The source code of a current implementation of this function can be found in Perl's numeric.c.
Looking at this code it is hard to see how it could behave differently on 32 bit and 64 bit. But there is one important platform-dependent value, which determines how much data the atof implementation accumulates inside an unsigned integer (a UV) before adding it to the internal floating-point result:
#define MAX_ACCUMULATE ( (UV) ((UV_MAX - 9)/10))
Obviously UV_MAX differs between 32 bit and 64 bit, so on 32 bit the accumulation happens in smaller chunks, leading to a different sequence of floating-point additions with potentially different rounding. My guess is that this explains the tiny difference in behavior between 32 bit and 64 bit.
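To check which UV width a given perl was built with, and what MAX_ACCUMULATE works out to for it, here is a small sketch using the core Config and Math::BigInt modules (BigInt is used so the 64-bit value is not rounded through a double; this merely re-evaluates the macro for illustration):

#!/usr/bin/env perl
use strict;
use warnings;
use Config;
use Math::BigInt;

# ~0 is a UV with all bits set, i.e. UV_MAX for this perl build.
my $uv_max = Math::BigInt->new(sprintf '%u', ~0);

# Same computation as the MAX_ACCUMULATE macro above, done exactly.
my $max_accumulate = ($uv_max - 9) / 10;

print "UV size        : $Config{uvsize} bytes\n";
print "UV_MAX         : $uv_max\n";
print "MAX_ACCUMULATE : $max_accumulate\n";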