I set up a DataFrame with a UInt64Index like so:
>>> import pandas
>>> df = pandas.DataFrame([[1,2,3],[4,5,9223943912072220999],[7,8,9]], columns=['a','b','c'])
>>> df = df.set_index('c')
>>> df
                     a  b
c
3                    1  2
9223943912072220999  4  5
9                    7  8
>>> df.index
UInt64Index([3, 9223943912072220999, 9], dtype='uint64', name=u'c')
Accessing elements by index value works for the smaller values:
>>> df.index[0]
3
>>> df.loc[3]
a    1
b    2
Name: 3, dtype: int64
But doing the same thing for the large value raises an error:
>>> df.index[1]
9223943912072220999
>>> df.loc[9223943912072220999]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/u1/mprager/.virtualenvs/jupyter/local/lib/python2.7/site-packages/pandas/core/indexing.py", line 1373, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/u1/mprager/.virtualenvs/jupyter/local/lib/python2.7/site-packages/pandas/core/indexing.py", line 1626, in _getitem_axis
    self._has_valid_type(key, axis)
  File "/home/u1/mprager/.virtualenvs/jupyter/local/lib/python2.7/site-packages/pandas/core/indexing.py", line 1514, in _has_valid_type
    error()
  File "/home/u1/mprager/.virtualenvs/jupyter/local/lib/python2.7/site-packages/pandas/core/indexing.py", line 1501, in error
    axis=self.obj._get_axis_name(axis)))
KeyError: u'the label [9223943912072220999] is not in the [index]'
I thought it might be some kind of dtype issue, but even df.loc[df.index[1]] gives the same error.
This is with pandas 0.22.0 on Python 2.7.9.
This could be a bug. 9223943912072220999 is too large to fit into a signed 64-bit integer such as a standard C signed long (maximum 2**63 - 1 = 9223372036854775807), and that appears to be what is causing problems with loc. One alternative would be to use df.index.get_loc to find the label's integer position, and then pass that position to iloc for position-based indexing:
i = df.index.get_loc(9223943912072220999)
df.iloc[i]
a    4
b    5
Name: 9223943912072220999, dtype: int64
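As a quick check of the size claim above, the label can be compared against the 64-bit limits using plain Python integers. This is only a sketch of the arithmetic; the constant names are made up for illustration and it does not show what pandas does internally.

```python
# 64-bit integer limits, computed with arbitrary-precision Python ints.
INT64_MAX = 2**63 - 1    # largest signed 64-bit value
UINT64_MAX = 2**64 - 1   # largest unsigned 64-bit value

label = 9223943912072220999

# The label overflows a signed 64-bit integer but fits in an unsigned one,
# which is why the index ends up with dtype uint64 rather than int64.
fits_int64 = label <= INT64_MAX
fits_uint64 = 0 <= label <= UINT64_MAX

print(fits_int64, fits_uint64)  # False True
```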
Another alternative would be to treat the index as an object array:
df.index = df.index.astype(object)
This allows you to work with arbitrarily large numbers (basically, anything that you can hash can now sit inside an object index):
df.loc[9223943912072220999]
a    4
b    5
Name: 9223943912072220999, dtype: int64
Note that, as far as alternatives go, this is one of the worse ones: an object index is likely to be less performant than a native uint64 index.
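Given that caveat, the get_loc plus iloc approach can be wrapped in a small helper so the index keeps its uint64 dtype. The sketch below is written for Python 3; row_by_label is a hypothetical name, not a pandas function, and the index is built with an explicit dtype because dtype inference may differ between pandas versions.

```python
import pandas as pd


def row_by_label(df, label):
    """Hypothetical helper: fetch a row by label via its integer position."""
    # get_loc returns an int for a unique label; for duplicated labels it
    # may return a slice or a boolean mask, which iloc also accepts.
    position = df.index.get_loc(label)
    return df.iloc[position]


# Explicit uint64 index so the example does not rely on dtype inference.
index = pd.Index([3, 9223943912072220999, 9], dtype="uint64", name="c")
df = pd.DataFrame({"a": [1, 4, 7], "b": [2, 5, 8]}, index=index)

row = row_by_label(df, 9223943912072220999)
print(row["a"], row["b"])
```

Whether df.loc itself handles such labels may depend on the pandas version in use, so going through the integer position is the more conservative route.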