
Storing large numbers in a numpy array

I have a dataset to which I'm trying to apply some arithmetic. The problem is that it produces relatively large numbers, and when I compute them with numpy, they're stored as 0.

The weird thing is that when I compute the numbers one at a time, they have an int value; they only become zeros when I compute them on the whole array with numpy.

import numpy as np
x = np.array([18,30,31,31,15])
10*150**x[0]/x[0]
Out[1]:36298069767006890

vector = 10*150**x/x
vector
Out[2]: array([0, 0, 0, 0, 0])

I have of course checked their types:

type(10*150**x[0]/x[0]) == type(vector[0])
Out[3]:True

How can I compute these large numbers using numpy without seeing them turned into zeros?

Note that if we remove the factor 10 at the beginning, the problem changes slightly (but I think the cause might be similar):

x = np.array([18,30,31,31,15])
150**x[0]/x[0]
Out[4]:311075541538526549

vector = 150**x/x
vector
Out[5]: array([-329406144173384851, -230584300921369396, 224960293581823801,
   -224960293581823801, -368934881474191033])

The negative numbers indicate that the largest value of the int64 type has been exceeded, don't they?

ysearka asked May 17 '16 09:05




2 Answers

As Nils Werner already mentioned, numpy's native C types cannot store numbers that large, but Python itself can, since its int objects use an arbitrary-length implementation. So what you can do is tell numpy not to convert the numbers to C types but to use the Python objects instead. This will be slower, but it will work.

In [14]: x = np.array([18,30,31,31,15], dtype=object)

In [15]: 150**x
Out[15]: 
array([1477891880035400390625000000000000000000L,
       191751059232884086668491363525390625000000000000000000000000000000L,
       28762658884932613000273704528808593750000000000000000000000000000000L,
       28762658884932613000273704528808593750000000000000000000000000000000L,
       437893890380859375000000000000000L], dtype=object)

In this case the numpy array will not store the numbers themselves but references to the corresponding int objects. When you perform arithmetic operations they won't be performed on the numpy array but on the objects behind the references.
I think you're still able to use most of the numpy functions with this workaround but they will definitely be a lot slower than usual.
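
A minimal sketch of applying this workaround to the full expression from the question. It assumes Python 3, where `/` on ints returns a float, so floor division `//` is used to keep exact integers; the name `vector` is taken from the question.

```python
import numpy as np

# dtype=object makes the array hold references to Python int objects
x = np.array([18, 30, 31, 31, 15], dtype=object)

# Each operation is dispatched element by element to Python's own
# arbitrary-precision int arithmetic, so nothing overflows.
vector = 10 * 150 ** x // x

print(vector.dtype)
print(vector[0])
```

Each element of `vector` is an ordinary Python int, so the values can be compared directly against the same expression evaluated with plain Python ints.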

But that's what you get when you're dealing with numbers that large :D
Maybe somewhere out there is a library that can deal with this issue a little better.

Just for completeness, if precision is not an issue, you can also use floats:

In [19]: x = np.array([18,30,31,31,15], dtype=np.float64)

In [20]: 150**x
Out[20]: 
array([  1.47789188e+39,   1.91751059e+65,   2.87626589e+67,
         2.87626589e+67,   4.37893890e+32])
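
To get a feel for what "if precision is not an issue" means here, the following sketch compares the float64 results with the exact Python integers. The names `approx`, `exact` and `rel_err` are only illustrative.

```python
import numpy as np

exponents = [18, 30, 31, 31, 15]
x = np.array(exponents, dtype=np.float64)

approx = 150 ** x                      # float64, about 15-17 significant digits
exact = [150 ** n for n in exponents]  # exact Python ints

# Relative error of each float result against the exact integer
rel_err = [abs(int(a) - e) / e for a, e in zip(approx, exact)]
print(rel_err)
```

The magnitudes are right, but the low-order digits are lost, so this only fits cases where an approximate value is acceptable.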
swenzel answered Oct 25 '22 02:10


150 ** 28 is way beyond what an int64 variable can represent: it is in the ballpark of 8e60, while the maximum possible value of an unsigned 64-bit integer is roughly 18e18 (and roughly 9.2e18 for a signed int64).

Python may be using an arbitrary length integer implementation, but NumPy doesn't.

As you deduced correctly, negative numbers are a symptom of an int overflow.
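
The limit can be checked directly with `np.iinfo`. This is a small sketch, assuming an explicit int64 array; the names `limit`, `exact` and `wrapped` are only for illustration.

```python
import numpy as np

limit = np.iinfo(np.int64).max   # largest value a signed int64 can hold
exact = 150 ** 18                # plain Python int, arbitrary precision

print(limit)
print(exact > limit)             # the true result does not fit in int64

x = np.array([18, 30, 31, 31, 15], dtype=np.int64)
wrapped = 150 ** x               # fixed-width arithmetic, overflows without an error
print(wrapped)
```

Whatever ends up in `wrapped` cannot equal the true values, because none of them fits into 64 bits.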

Nils Werner answered Oct 25 '22 04:10