I have a weird data set:
   year   firms  age  survival
0  1977  564918    0       NaN
2  1978  503991    0       NaN
3  1978  413130    1  0.731310
5  1979  497805    0       NaN
6  1979  390352    1  0.774522
where I have cast the dtype of the first three columns to be integer:
>>> df.dtypes
year          int64
firms         int64
age           int64
survival    float64
But now I want to search in another table based on an index here:
idx = 331
otherDf.loc[df.loc[idx, 'age']]
Traceback (most recent call last):
(...)
KeyError: 8.0
This comes from
df.loc[idx, 'age']
8.0
Why does this keep returning a float value? And how can I perform the lookup in otherDf? I'm using pandas version 0.15.
You get back a float because each row contains a mix of float and int types. When you select a row by its index label with loc, the row comes back as a single Series, which can hold only one dtype, so the integers are cast to floats:
>>> df.loc[4]
year          1979.000000
firms       390352.000000
age              1.000000
survival         0.774522
Name: 4, dtype: float64
So choosing the age entry here with df.loc[4, 'age'] would yield 1.0.
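The upcast described above can be reproduced with a small sketch. It assumes the frame is rebuilt from the five sample rows in the question with a default 0..4 index (which is why label 4 is the last row), and that the integer columns are cast to int64 explicitly:

```python
import pandas as pd

# Assumption: sample rows from the question, rebuilt with a default 0..4 index.
df = pd.DataFrame({
    "year": [1977, 1978, 1978, 1979, 1979],
    "firms": [564918, 503991, 413130, 497805, 390352],
    "age": [0, 0, 1, 0, 1],
    "survival": [float("nan"), float("nan"), 0.731310, float("nan"), 0.774522],
}).astype({"year": "int64", "firms": "int64", "age": "int64"})

# Selecting a whole row yields one Series with a single common dtype.
row = df.loc[4]

print(df.dtypes["age"])  # the column itself is still integer
print(row.dtype)         # the row Series has been upcast to float64
print(row["age"])        # so the age entry taken from the row is a float
```

The column keeps its integer dtype throughout; only the row Series built by the selection is float.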
To get around this and return an integer, you could use loc to select from just the age column and not the whole DataFrame:
>>> df['age'].loc[4]
1
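As a rough sketch of how this could feed the lookup from the question: other_df below is a hypothetical stand-in for otherDf (its contents and the "label" column are invented for illustration), and df is a cut-down version of the sample data with a default 0..4 index.

```python
import pandas as pd

# Assumption: a reduced copy of the question's data with a default 0..4 index.
df = pd.DataFrame({
    "age": [0, 0, 1, 0, 1],
    "survival": [float("nan"), float("nan"), 0.731310, float("nan"), 0.774522],
})

# Hypothetical stand-in for otherDf, indexed by integer age.
other_df = pd.DataFrame({"label": ["new", "one year old"]}, index=[0, 1])

# Column first, then row: the value keeps the column's integer dtype.
age = df["age"].loc[4]

# Alternative: take the value from the upcast row and cast it back explicitly.
age_from_row = int(df.loc[4]["age"])

# Either integer can now be used as a label in the other table.
result = other_df.loc[age]
print(result["label"])
```

The explicit int(...) cast is only safe when the value is known not to be NaN, so selecting from the column first is usually the simpler choice.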
This was a bug in pandas up through version 0.19. It seems to have been fixed in version 0.20; see https://github.com/pandas-dev/pandas/issues/11617