Multiplication of two positive numbers gives a negative output in Python 3

Question

I have a DataFrame df1. Its first rows, as shown by df1.head(), are:

             wght          num_links 
id_y  id_x                      
 3     133   0.000203          2      
       186   0.000203          2 
 5     6     0.000203          2      
       98    0.000203          2      
       184   0.000203          2

I need to calculate a variable called thr,

thr = N*(N-1)*2,

where N is the number of rows of df1.

The problem is that when I calculate thr, Python returns a negative value (although all of the inputs are positive):

ipdb> df1['wght'].count()*(df1['wght'].count()-1)*2
-712569744

Possible hint

The number of rows N is

ipdb> df1['wght'].count() 
137736

therefore,

ipdb> 137736*137735*2
37942135920

Given that the maximum value an int32 can hold is 2147483647, I suspect that NumPy is computing thr as an int32 when it should be an int64. Does this make sense?
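
The suspicion can be checked by hand. Below is a minimal sketch in plain Python, assuming the product ends up in a signed 32-bit integer that wraps around on overflow. wrap_int32 is a hypothetical helper written for this illustration, not a NumPy function.

```python
def wrap_int32(value: int) -> int:
    """Return the signed 32-bit value that an exact integer wraps to."""
    value &= 0xFFFFFFFF                      # keep only the low 32 bits
    # Values with the top bit set represent negative numbers in two's complement.
    return value - (1 << 32) if value >= (1 << 31) else value


n = 137736                                   # row count reported in the question
exact = n * (n - 1) * 2                      # Python ints never overflow
wrapped = wrap_int32(exact)                  # what a 32-bit result would hold

print(exact)    # 37942135920
print(wrapped)  # -712569744
```

The wrapped value matches the negative number reported above, which is consistent with a 32-bit overflow.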

Please note that I have not included the code that generates df1, because the row count alone seems to be enough to show the problem:

ipdb> df1['wght'].count() 
137736

However, if it is needed to reproduce the error, let me know.

Thanks in advance.

MaxU - stop WAR against UA · Accepted Answer

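You are experiencing np.int32 overflow: df.column.count() returns a numpy.int32 here, and the product does not fit in 32 bits. Use len(df) instead, which returns a Python int of arbitrary precision. (Note that count() skips NaN values, so the two agree only when the column has no missing values.)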

Here is a small demo:

In [149]: x = pd.DataFrame(np.random.randint(0,100,size=(137736, 3)), columns=list('ABC'))

In [150]: x.A.count() * (x.A.count() - 1) * 2
Out[150]: -712569744

In [151]: len(x) * (len(x) - 1) * 2
Out[151]: 37942135920

In [153]: type(x.A.count())
Out[153]: numpy.int32

In [154]: type(len(x))
Out[154]: int

DomTomCat · Answer

If you get the type of count() (i.e. type(df1['wght'].count())) you'll receive:

<class 'numpy.int32'>

So try your computation with:

n = df1['wght'].count().astype(np.int64)
n*(n-1)*2
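
For illustration, here is a sketch of the same fix on a stand-in value rather than the real df1. A one-element int32 array is assumed in place of the count, so that the wraparound is reproduced regardless of the platform's default integer type. The variable names are made up for this example.

```python
import numpy as np

# Stand-in for the int32 row count from the question (assumption: not the real df1).
n32 = np.array([137736], dtype=np.int32)

# All operands stay int32, so the product silently wraps around.
overflowed = int((n32 * (n32 - 1) * 2)[0])

# Widening to int64 before multiplying leaves enough room for the result.
n64 = n32.astype(np.int64)
widened = int((n64 * (n64 - 1) * 2)[0])

# Converting to a Python int works as well, since Python ints are unbounded.
n_py = int(n32[0])
exact = n_py * (n_py - 1) * 2

print(overflowed)  # -712569744
print(widened)     # 37942135920
print(exact)       # 37942135920
```

The conversion has to happen before the multiplication: widening an already wrapped result does not recover the correct value.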

Amritesh Anand · Answer

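You can pass df1['wght'].count() to the int constructor to make sure it is a Python int (in Python 2 the equivalent was long, which no longer exists in Python 3):

N = int(df1['wght'].count())

Merely storing the result in a variable,

N = df1['wght'].count()

is not enough: N is still a numpy.int32, and its __mul__ method (which implements *) wraps around at 32 bits. Only the Python int class creates a larger result when required.

Python 3.x has "unified" int and long, so Python ints never overflow, but that does not apply to NumPy's fixed-width integer types.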
