Why do 4 different languages give 4 different results here?

Consider this (all commands run on a 64-bit Arch Linux system):

Perl (v5.24.0)

$ perl -le 'print 10190150730169267102/1000%10'
6

awk (GNU Awk 4.1.3)

$ awk 'BEGIN{print 10190150730169267102/1000%10}'
6

R (3.3.1)

> (10190150730169267102/1000)%%10
[1] 6

bc

$ echo 10190150730169267102/1000%10 | bc
7

Python 2 (2.7.12)

>>> print(10190150730169267102/1000%10)
7

Python 3 (3.5.2)

>>> print(10190150730169267102/1000%10)
8.0

So, Perl, gawk and R agree, as do bc and Python 2. Nevertheless, among the 6 tools tested, I got 4 different results. I understand that this has something to do with how very long integers are being rounded, but why do the different tools differ quite so much? I had expected that this would depend on the processor's ability to deal with large numbers, but it seems to depend on internal features (or bugs) of the language.

Could someone explain what is going on behind the scenes here? What are the limitations in each language and why do they behave quite so differently?

683

asked Aug 23 '16 11:08

terdon

1 Answer

You're seeing different results for two reasons:

1. The division step is doing two different things: in some of the languages you tried, it represents integer division, which discards the fractional part of the result and just keeps the integer part. In others it represents actual mathematical division (which following Python's terminology I'll call "true division" below), returning a floating-point result close to the true quotient.
2. In some languages (those with support for arbitrary precision), the large numerator value 10190150730169267102 is being represented exactly; in others, it's replaced by the nearest representable floating-point value.

The different combinations of the possibilities in 1. and 2. above give you the different results.

In detail: in Perl, awk, and R, we're working with floating-point values and true division. The value 10190150730169267102 is too large to store in a machine integer, so it's stored in the usual IEEE 754 binary64 floating-point format. That format can't represent that particular value exactly, so what gets stored is the closest value that is representable in that format, which is 10190150730169266176.0. Now we divide that approximation by 1000, again giving a floating-point result. The exact quotient, 10190150730169266.176, is again not exactly representable in the binary64 format, and we get the closest representable float, which happens to be 10190150730169266.0. Taking a remainder modulo 10 gives 6.
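The two rounding steps in this case can be replayed in Python 3, whose float is an IEEE 754 binary64 value on common platforms. This is only a sketch: the variable names are illustrative, and `math.ulp` (used to show the gap between adjacent floats) assumes Python 3.9 or later.

```python
import math

n = 10190150730169267102

# Step 1: the numerator is rounded to the nearest binary64 value.
stored = float(n)
print(int(stored))         # 10190150730169266176
print(math.ulp(stored))    # 2048.0: gap between adjacent floats near 1e19

# Step 2: the quotient is rounded again. Near 1e16 adjacent floats are
# 2 apart, so the fractional part .176 cannot be kept.
quotient = stored / 1000
print(int(quotient))       # 10190150730169266
print(math.ulp(quotient))  # 2.0

# Step 3: the remainder is computed exactly for these values.
print(quotient % 10)       # 6.0
```

The gap of 2048 explains why the stored numerator ends in ...266176 rather than ...267102: the original value is 926 away from it, which is less than half the gap.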

In bc and Python 2, we're working with arbitrary-precision integers and integer division. Both those languages can represent the numerator exactly. The division result is then 10190150730169267 (we're doing integer division, not true division, so the fractional part is discarded), and the remainder modulo 10 is 7. (This is oversimplifying a bit: the format that bc is using internally is somewhat closer to Python's Decimal type than to an arbitrary-precision integer type, but in this case the effect is the same.)
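In Python 3 the same integer behaviour is spelled `//`, so the bc and Python 2 result can be reproduced as follows. The `Decimal` part is only a rough analogy for bc truncating the quotient at its default scale of 0, not a description of bc's internals; the precision of 50 digits is an arbitrary choice for this example.

```python
from decimal import Decimal, ROUND_DOWN, getcontext

n = 10190150730169267102

# Python ints are arbitrary precision, so n is held exactly, and
# // (floor division) drops the fractional part of the quotient.
q = n // 1000
print(q)        # 10190150730169267
print(q % 10)   # 7

# Rough analogy for bc: compute an exact decimal quotient, then
# truncate it to an integer before taking the remainder.
getcontext().prec = 50   # assumed: more digits than this example needs
exact = Decimal(n) / Decimal(1000)
print(exact)             # 10190150730169267.102
truncated = exact.to_integral_value(rounding=ROUND_DOWN)
print(truncated % 10)    # 7
```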

In Python 3, we're working with arbitrary-precision integers and true division. The numerator is represented exactly, but the result of the division is the nearest floating-point value to the true quotient. In this case the exact quotient is 10190150730169267.102, and the closest representable floating-point value is 10190150730169268.0. Taking the remainder of that value modulo 10 gives 8.
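To see why the result rounds up to ...268 rather than down to ...266, the exact quotient can be compared against its two neighbouring floats using `fractions.Fraction`. This is a small sketch; the names `below` and `above` are illustrative, and the neighbours are written as integers because near 1e16 only even integers are representable in binary64.

```python
from fractions import Fraction

n = 10190150730169267102

exact = Fraction(n, 1000)   # the true quotient, held exactly
result = n / 1000           # Python 3 true division, rounded to a float

# Candidate floats on either side of the true quotient.
below, above = 10190150730169266, 10190150730169268
print(exact - below)        # 551/500, i.e. 1.102
print(above - exact)        # 449/500, i.e. 0.898, so this one is closer

print(int(result))          # 10190150730169268
print(result % 10)          # 8.0
```

Note that this differs from the Perl, awk and R case only in the numerator: there the value divided by 1000 has already been rounded once.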

Summary:

Perl, awk, R: floating-point approximations, true division
bc, Python 2: arbitrary-precision integers, integer division
Python 3: arbitrary-precision integers, true division
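All three rows of this summary can be reproduced from a single Python 3 session by choosing the numerator representation and the division operator explicitly. The helper below is hypothetical and exists only for illustration; it mimics the other tools' behaviour rather than running them.

```python
n = 10190150730169267102

def thousands_digit(exact_numerator, integer_division):
    """Hypothetical helper: evaluate n/1000 % 10 under one combination."""
    numerator = n if exact_numerator else float(n)
    if integer_division:
        quotient = numerator // 1000
    else:
        quotient = numerator / 1000
    return quotient % 10

print(thousands_digit(False, False))  # 6.0, like Perl, awk, R
print(thousands_digit(True, True))    # 7,   like bc, Python 2
print(thousands_digit(True, False))   # 8.0, like Python 3
```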

176

answered Nov 12 '22 09:11

Mark Dickinson

Tags: python, rounding, awk, perl, long-integer
