slicing pandas DataFrame with negative index with ix() method

Question

DataFrame.ix() does not slice the DataFrame the way I expect when I use a negative index.

I have a DataFrame object and want to slice the last 2 rows.

    In [90]: df = pd.DataFrame(np.random.randn(10, 4))

    In [91]: df
    Out[91]: 
            0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

One way to do it:

    In [92]: df[-2:]
    Out[92]: 
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

Another way to do it:

    In [93]: df.ix[len(df)-2:, :]
    Out[93]: 
              0         1         2         3
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

Now I want to use negative indexing, but I run into a problem:

    In [94]: df.ix[-2:, :]
    Out[94]: 
              0         1         2         3
    0  1.985922  0.664665 -2.800102  1.695480
    1  0.580509  0.782473  1.032970  1.559917
    2  0.584387  1.798743  0.095950  0.071999
    3  1.956221  0.075530 -0.391008  1.692585
    4 -0.644979 -1.959265  0.749394 -0.437995
    5 -1.204964  0.653912 -1.426602  2.409855
    6  1.178886  2.177259 -0.165106  1.145952
    7  1.410595 -0.761426 -1.280866  0.609122
    8  0.110534 -0.234781 -0.819976  0.252080
    9  1.798894  0.553394 -1.358335  1.278704

How do I use negative indexing with DataFrame.ix() correctly? Thanks.

Wes McKinney · Accepted Answer

This is a bug:

    In [1]: df = pd.DataFrame(np.random.randn(10, 4))

    In [2]: df
    Out[2]: 
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

    In [3]: df.ix[-2:]
    Out[3]: 
              0         1         2         3
    0 -3.100926 -0.580586 -1.216032  0.425951
    1 -0.264271 -1.091915 -0.602675  0.099971
    2 -0.846290  1.363663 -0.382874  0.065783
    3 -0.099879 -0.679027 -0.708940  0.138728
    4 -0.302597  0.753350 -0.112674 -1.253316
    5 -0.213237 -0.467802  0.037350  0.369167
    6  0.754915 -0.569134 -0.297824 -0.600527
    7  0.644742  0.038862  0.216869  0.294149
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439

https://github.com/pydata/pandas/issues/2600

Note that df[-2:] will work:

    In [4]: df[-2:]
    Out[4]: 
              0         1         2         3
    8  0.101684  0.784329  0.218221  0.965897
    9 -1.482837 -1.325625  1.008795 -0.150439
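
For readers on newer pandas: `.ix` was deprecated in 0.20 and removed in 1.0, so the workaround above is the only part of this answer that still runs. A minimal sketch of how the same "last 2 rows" selection is usually written with the explicitly positional `.iloc` indexer (the variable name `last_two` is mine, not from the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4))

# .iloc is purely position-based, so negative positions count from the end.
last_two = df.iloc[-2:, :]

# tail() is a convenience method that selects the same trailing rows.
assert last_two.equals(df.tail(2))

print(last_two.index.tolist())  # [8, 9]
```

Because `.iloc` never looks at labels, this keeps working when the index is not the default 0..n-1 range.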

Zelazny7 · Answer

ix's main purpose is to allow numpy-like indexing with support for row and column labels, so I'm not sure your use case is what it was designed for. Here are a couple of ways I can think of, mostly trivial:

    In [142]: df.ix[:][-2:]
    Out[142]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

    In [161]: df.ix[df.index[-2:],:]
    Out[161]:
              0         1         2         3
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720

I don't think ix supports negative indexing at all. It seems to just ignore it altogether:

    In [181]: df.ix[-100:,:]
    Out[181]:
              0         1         2         3
    0 -1.144137 -1.042034 -2.158838  0.674055
    1 -0.424184  1.237318 -1.846130  0.575357
    2 -0.844974 -0.541060  2.197364 -0.031898
    3  0.846263  1.244450 -1.570566 -0.477919
    4 -0.193445  0.171045 -0.235587 -1.185583
    5  1.361539 -1.107389 -1.321081 -0.776407
    6  0.505907 -1.364414 -2.093770  0.144016
    7 -0.888465 -0.329153  0.491264 -0.363472
    8  0.386882 -0.836112 -0.108250 -0.433797
    9  0.642468 -0.399255 -0.911456 -0.497720
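
One plausible reading of the output above is that the negative number is not ignored but treated as a label: on a sorted integer index, a slice starting at -100 simply means "every label from -100 upward", which is every row. The sketch below reproduces that with `.loc`, the label-based indexer in current pandas; I am assuming `.ix` behaved the same way on an integer index, which the output above suggests but does not prove.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 4))

# Label-based slice: start at label -100 on a sorted index 0..9.
# No such label exists, but every existing label lies after it,
# so the slice covers the whole frame.
by_label = df.loc[-100:, :]

# Position-based slice: the last two rows, whatever their labels are.
by_position = df.iloc[-2:, :]

print(len(by_label), len(by_position))
```

The first length equals the full row count, the second is 2, which matches the difference between the `ix` output and `df[-2:]` shown in this thread.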

Edit: From the pandas documentation we have:

Label-based indexing with integer axis labels is a thorny topic. It has been discussed heavily on mailing lists and among various members of the scientific Python community. In pandas, our general viewpoint is that labels matter more than integer locations. Therefore, with an integer axis index only label-based indexing is possible with the standard tools like .ix. The following code will generate exceptions:

    s = Series(range(5))
    s[-1]
    df = DataFrame(np.random.randn(5, 4))
    df
    df.ix[-2:]

This deliberate decision was made to prevent ambiguities and subtle bugs (many users reported finding bugs when the API change was made to stop “falling back” on position-based indexing).
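
To make the ambiguity the documentation mentions concrete, here is a small sketch using the current `.loc` (label) and `.iloc` (position) indexers rather than `.ix`. The series `t` and its index are made up for illustration.

```python
import pandas as pd

s = pd.Series(range(5))

# On an integer index, s[-1] is a label lookup, and no label -1 exists.
try:
    s[-1]
    raised = False
except KeyError:
    raised = True

# Positional access has to be requested explicitly.
last = s.iloc[-1]

# When labels and positions disagree, the two indexers return different values.
t = pd.Series([10, 20, 30], index=[2, 1, 0])
by_label = t.loc[0]      # value stored under label 0
by_position = t.iloc[0]  # value stored in the first position

print(raised, last, by_label, by_position)
```

With an index like `[2, 1, 0]`, a single indexer that silently fell back from labels to positions could return either 30 or 10 for the key `0`, which is the kind of subtle bug the quoted passage refers to.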

Tags: slice, indexing, pandas, dataframe

Julia He
