Collatz Conjecture Python - Incorrect Output Above 2 Trillion (Only!)

Tags:

I've written a basic script in Python3 that calculates the Collatz Conjecture. It takes a positive integer as an input and returns the number the steps until the sequence drops down to 1.

My script works perfectly for any integer inputs less than ~2 trillion, but above this threshold the outputs are too small.

As an example, here are some inputs, my script's output, and the actual correct output:

Integer Input          Script Output     Correct Output    989,345,275,647        1,348             1,348   1,122,382,791,663        1,356             1,356   1,444,338,092,271        1,408             1,408   1,899,148,184,679        1,411             1,411   2,081,751,768,559          385             1,437   2,775,669,024,745          388             1,440   3,700,892,032,993          391             1,443   3,743,559,068,799          497             1,549 `

Correct output values are based on this link: http://www.ericr.nl/wondrous/delrecs.html

My script's output is always exactly 1,052 less than the correct output for inputs above 2 trillion, but I have no idea as to what's causing this.

Can anyone explain what's wrong, and how to update/fix the script so that it works correctly for all inputs? I thought Python is able to accept arbitrarily big numbers without a problem ...

Thank you!

# Python Code for the Collatz Conjecture # Rules: Take any integer 'n' and assess: # If integer is even, divide by 2 (n/2) # If integer is odd, multiply by 3 and add 1 (3n+1) # Result: a list of all steps until 'n' goes down to 1  while True:     print("Please enter a positive integer:")     n = input("")     if n == 'q':         print("Until next time ...\n")         break     try:         n = int(n)         if n > 0:             i = 0             while n > 1:                 if n % 2 == 0:                     n = int(n/2)                     i += 1                 else:                     n = int((3*n)+1)                     i += 1             print("# of steps to reach '1' = ", str(i), "\n")         else:             print("Sorry, that's not a valid entry. Please try again!\n")     except ValueError:         print("Sorry, that's not a valid entry. Please try again!\n")

809

asked Aug 14 '18 20:08

ronster1000

Video Answer

1 Answers

This line:

n = int(n/2)

… converts n to a float, divides that float by 2, then converts back to an int by throwing away the fractional part.

For integers up to 2**52, converting to float is lossless, but for anything larger, it has to round to the nearest 53-bit number, which loses information.

Of course 2 trillion is well under that 2**53 limit for float precision—but the Collatz sequence starting at N frequently goes much, much higher than N. It's not at all implausible that many numbers around 2 trillion have sequences that go past 2**53, while very few numbers below it do. It's even possible that a whole long sequence of numbers starting at exactly 2 trillion go past 2**53 but not a single number below it does. But I have no idea how to prove such a thing without building the entire sequence for every number up to 2 trillion. (If there is a proof, it would probably lean heavily on the existing partial proofs of the conjecture under various different conditions, which are above my paygrade…)

Anyway, the solution is simple: you want to use integer division:

n = n // 2

Here's an example to demonstrate:

>>> n = 2**53 + 3 >>> n 9007199254740995 >>> int(n/2) 4503599627370498 >>> n//2 4503599627370497

To verify that this is actually happening in your code, try this:

def collatz(n):     overflow = False     i = 0     while n > 1:         if n > 2**53:             overflow=True         if n % 2 == 0:             n = int(n/2)             i += 1         else:             n = int((3*n)+1)             i += 1     return i, overflow  if __name__ == '__main__':     import sys     for arg in sys.argv[1:]:         num = int(arg.replace(',', ''))         result, overflow = collatz(num)         print(f'{arg:>30}: {result:10,} {overflow}')

When I run this:

$ python3 collatz.py 989,345,275,647 1,122,382,791,663 1,444,338,092,271 1,899,148,184,679 2,081,751,768,559 2,775,669,024,745 3,700,892,032,993 3,743,559,068,799

… it gives me:

           989,345,275,647:      1,348 False          1,122,382,791,663:      1,356 False          1,444,338,092,271:      1,408 False          1,899,148,184,679:      1,411 False          2,081,751,768,559:        385 True          2,775,669,024,745:        388 True          3,700,892,032,993:        391 True          3,743,559,068,799:        497 True

So, we went past 2**53 in exactly the same cases where we got the wrong answer.

And to verify the fix, change the int(n/2) to n//2:

           989,345,275,647:      1,348 False          1,122,382,791,663:      1,356 False          1,444,338,092,271:      1,408 False          1,899,148,184,679:      1,411 False          2,081,751,768,559:      1,437 True          2,775,669,024,745:      1,440 True          3,700,892,032,993:      1,443 True          3,743,559,068,799:      1,549 True

So, why is it always off by the same amount?

Well, that's mostly just a coincidence of the specific numbers you happen to be using.

When you pass 2**53 via 3n+1, you're going to convert either the last bit, or the last 2 bits, to 0, which means you normally cut off a big part of the chain and replace it with just 1 or 2 divisions. But there are obviously going to be a few numbers where the chain you end up jumping to is longer than the correct one. In fact, it only took me 3 tries to find one: 3,743,559,068,799,123 should take 326 steps, but it takes 370.

I suspect (but again, I can't even imagine how to prove) that many big numbers are going to end up in that same range around 375, a little shorter as they get (logarithmically) bigger. Why? Well, there's only so many numbers you can round to—and most of them are probably in cycles with each other you start doing truncating division. So, let's say that almost every number near 2**53 has a rounding cycle length of a bit over 50, and most numbers in the trillions range reach that 2**53 range in a bit over 300 steps… then most of them are going to end up around 375. (Those numbers are pulled out of thin air, of course, but you could do a Monte Carlo simulation to see how far from reality they actually are…)

answered Sep 22 '22 23:09

abarnert

Related questions
                            
                                looking for source code of from gen_nn_ops in tensorflow
                            
                                Recommendation for python form validation library [closed]
                            
                                How to nest records in an Avro schema?
                            
                                Converting byte string in unicode string
                            
                                Wrapping long y labels in matplotlib tight layout using setp
                            
                                django template if or statement
                            
                                Python Multiprocessing Lib Error (AttributeError: __exit__)
                            
                                Converting string 'yyyy-mm-dd' into datetime [duplicate]
                            
                                How to access pandas DataFrame datetime index using strings
                            
                                How to pick one key from a dictionary randomly
                            
                                Numpy, multiply array with scalar [duplicate]
                            
                                Custom authentication in Google App Engine
                            
                                Appending to the same list from different processes using multiprocessing
                            
                                Detect what a python string begins with [duplicate]
                            
                                Use IPython REPL in VS Code
                            
                                Convert byte array back to numpy array
                            
                                Celery: is there a way to write custom JSON Encoder/Decoder?
                            
                                What does clf mean in machine learning?
                            
                                How to exit Python script in Command Prompt?
                            
                                Seaborn - change bar color according to hue name

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Collatz Conjecture Python - Incorrect Output Above 2 Trillion (Only!)

Tags:

python

iteration

algorithm

python-3.x

collatz

ronster1000

People also ask

Video Answer

1 Answers

abarnert

Recent Activity

Donate For Us