
Why does pandas silently ignore .iloc[i, j] assignment with too many indexers?

Tags: python, pandas

Why does pandas behave differently when getting and when setting items in a Series with the wrong number of indexers?

import pandas as pd

df = pd.DataFrame({'a': [10]})
# df['a'] is a Series, so it accepts only one indexer

# both raise IndexingError, as expected
df['a'].iloc[0, 0]
df['a'].loc[0, 0]

# neither raises anything, which is not expected
df['a'].iloc[0, 0] = 1000 # silently does nothing (equivalent to pass)
df['a'].loc[0, 0] = 1000 # equivalent to df['a'].loc[0] = 1000

# pandas version 0.18.1, python 3.5

Edit: Reported.

max asked Nov 09 '22 11:11


1 Answer

Getting values

If the key is a tuple (as in your example), the __getitem__ method of the superclass shared by the loc and iloc indexers eventually calls _has_valid_tuple(self, key).

This method contains the following code:

for i, k in enumerate(key):
    if i >= self.obj.ndim:
        raise IndexingError('Too many indexers')

For a Series, ndim is 1, so the second element of the tuple triggers the IndexingError you would expect.
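The check can be tried in isolation. The following is a minimal stand-alone sketch, not pandas code: the local IndexingError class and the check_tuple_key and try_key helpers are hypothetical names introduced only for illustration, and ndim is passed in directly instead of being read from self.obj.

```python
class IndexingError(Exception):
    """Local stand-in for pandas' IndexingError (hypothetical copy)."""


def check_tuple_key(key, ndim):
    """Mimic the quoted loop: reject a tuple key with more entries than ndim."""
    for i, _k in enumerate(key):
        if i >= ndim:
            raise IndexingError('Too many indexers')
    return True


def try_key(key, ndim):
    """Return 'ok' if the key passes the check, else the error message."""
    try:
        check_tuple_key(key, ndim)
        return 'ok'
    except IndexingError as exc:
        return str(exc)


# A Series has ndim == 1, a DataFrame has ndim == 2.
series_result = try_key((0, 0), ndim=1)
frame_result = try_key((0, 0), ndim=2)
print(series_result)  # Too many indexers
print(frame_result)   # ok
```

The same two-element key is rejected for a one-dimensional object and accepted for a two-dimensional one, which is why df['a'].iloc[0, 0] fails on read.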

Setting values

The superclass's __setitem__ calls _get_setitem_indexer, which in turn calls _convert_to_indexer.

The superclass's implementation of _convert_to_indexer is a bit messy, but in this case it returns a NumPy array [0, 0].

The iloc indexer's class, however, overrides _convert_to_indexer. Its version returns the original tuple unchanged:

def _convert_to_indexer(self, obj, axis=0, is_setter=False):
    ...
    elif self._has_valid_type(obj, axis):
        return obj

As a result, the indexer variable is a NumPy array in the .loc case and a tuple in the .iloc case. That difference in type is what makes the subsequent superclass call to _setitem_with_indexer(indexer, value) behave differently for the two.
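The override mechanism described above can be sketched with a toy model. This is an illustration only, under stated assumptions: BaseIndexer, LocLike, ILocLike and last_indexer are hypothetical names, a plain list stands in for the NumPy array, and the toy only records the indexer instead of calling anything like _setitem_with_indexer.

```python
class BaseIndexer:
    """Toy stand-in for the shared superclass (not pandas code)."""

    def _convert_to_indexer(self, key):
        # The answer says the superclass version yields an array like [0, 0];
        # a plain list stands in for the NumPy array here.
        return list(key)

    def _get_setitem_indexer(self, key):
        # __setitem__ reaches _convert_to_indexer through this method.
        return self._convert_to_indexer(key)

    def __setitem__(self, key, value):
        # Record what would be handed on to the next step of the setter.
        self.last_indexer = self._get_setitem_indexer(key)


class LocLike(BaseIndexer):
    """Uses the inherited conversion."""


class ILocLike(BaseIndexer):
    """Overrides the conversion and hands the tuple back unchanged."""

    def _convert_to_indexer(self, key):
        return key


loc_like, iloc_like = LocLike(), ILocLike()
loc_like[0, 0] = 1000
iloc_like[0, 0] = 1000

loc_type = type(loc_like.last_indexer).__name__
iloc_type = type(iloc_like.last_indexer).__name__
print(loc_type, iloc_type)  # list tuple
```

Both assignments go through the same __setitem__, yet the downstream code receives a different kind of object depending on which class overrode the conversion step.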

Alex answered Nov 14 '22 21:11