
Checking that a pandas.Series.index contains a value

I (think I) know how to check if a value is contained in the index of a pandas Series, but I can't get it to work in the example below. Is it a bug perhaps?

First, I generate some random numbers:

import numpy as np
import pandas as pd

some_numbers = np.random.randint(0,4,size=10)
print(some_numbers)

Output:

[0 2 2 3 1 1 2 2 3 2]

Then, I create a Series with those numbers and compute their relative frequencies:

s = pd.Series(some_numbers)
gb = s.groupby(s).size() / len(s)
print(gb)

Output:

0    0.1
1    0.2
2    0.5
3    0.2
dtype: float64

So far, so good. But I do not understand the output of the next line of code:

1.3 in gb

Output:

True

Shouldn't the output be False? (I have pandas 0.20.3 on Python 3.6.2)

I know that I could use

1.3 in list(gb.index)

but this is not very efficient if the Series is large.

SIMPLER EXAMPLE TO SHOW THE BUG

import pandas as pd
s = pd.Series([.1,.2,.3])
print(s)

0    0.1
1    0.2
2    0.3
dtype: float64

3.4 in s

False

but, wait for it...

s = pd.Series([.1,.2,.3,.4])
print(s)

0    0.1
1    0.2
2    0.3
3    0.4
dtype: float64

3.4 in s

True

asked Jul 27 '18 18:07 by user3537808



1 Answer

I believe that the issue is that gb.index is an int64 index:

>>> gb.index
Int64Index([0, 1, 2, 3], dtype='int64')

>>> type(gb.index)
<class 'pandas.core.indexes.numeric.Int64Index'>

and so when you test membership of 1.3, that value is converted to an int. Some evidence for this: values up to 3.99999 return True, because converting them to int gives 3. However, 4.000001 in gb.index returns False, because converting 4.000001 to int gives 4, which is not in gb.index.
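To make the truncation hypothesis concrete, here is a minimal pure-Python sketch that mimics the behavior described above. It is not pandas' internal code: the helper truncated_lookup and the list index_labels are hypothetical names introduced only for illustration, and the sketch assumes the key is cast with int() before the membership test.

```python
# Hypothetical stand-in for the integer labels of gb.index
index_labels = [0, 1, 2, 3]

def truncated_lookup(value, labels):
    """Mimic a lookup that casts the key to int before testing membership."""
    return int(value) in labels

# The same probe values discussed in the answer
results = {v: truncated_lookup(v, index_labels) for v in (1.3, 3.99999, 4.000001)}
print(results)
```

If the hypothesis holds, 1.3 and 3.99999 are "found" because they truncate to 1 and 3, while 4.000001 truncates to 4 and is not.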

If you force it to a float index, you get False, because 1.3 is not in Float64Index([0.0, 1.0, 2.0, 3.0], dtype='float64'):

>>> 1.3 in gb.index.astype('float')
False
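As a self-contained sketch of this workaround, the snippet below rebuilds gb from the numbers printed in the question and checks membership in two ways that avoid any cast of the key: against a float copy of the index, and against the underlying NumPy array. The variable names float_index_check, array_check and exact_label_check are chosen here for illustration and do not come from the original post.

```python
import pandas as pd

# Same values as the question's printed output
s = pd.Series([0, 2, 2, 3, 1, 1, 2, 2, 3, 2])
gb = s.groupby(s).size() / len(s)

# Option 1: compare against a float copy of the index
float_index_check = 1.3 in gb.index.astype(float)

# Option 2: element-wise comparison on the underlying NumPy array
array_check = bool((gb.index.values == 1.3).any())

# A label that really exists is still found after the cast to float
exact_label_check = 1.0 in gb.index.astype(float)

print(float_index_check, array_check, exact_label_check)
```

Both options avoid building a Python list from the index, which was the efficiency concern raised in the question.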

Tested with pandas 0.21.1 and Python 3.6.3.

answered Nov 13 '22 12:11 by sacuL
