Select multiple sections of rows by index in pandas

Question

I have a large DataFrame containing a GPS path and some attributes. I only need to analyse a few sections of the path, and I would like to subset just those sections into a new DataFrame. I can subset one section at a time, but the idea is to get them all at once and keep the original index.

The problem is similar to:

import pandas as pd
df = pd.DataFrame({'A': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
                   'B': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']},
                  index=range(10, 20))

I want to get something like:

cdf = df.loc[[11:13] & [17:20]] # SyntaxError: invalid syntax

Desired outcome:

    A  B
11  1  b
12  2  c
13  3  d
17  7  h
18  8  i
19  9  j

I know this example is easy to solve with cdf = df.loc[[11,12,13,17,18,19],:], but the real data has thousands of rows and some entries have already been removed, so listing every label is not practical.

unutbu · Accepted Answer

You could use np.r_ (with NumPy imported as np) to concatenate the slices:

In [16]: df.loc[np.r_[11:13, 17:20]]
Out[16]: 
    A  B
11  1  b
12  2  c
17  7  h
18  8  i
19  9  j

Note, however, that df.loc[A:B] selects labels A through B with B included. np.r_[A:B] returns an array of A through B with B excluded. To include B you would need to use np.r_[A:B+1].
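The endpoint difference can be checked directly. This is a minimal sketch that rebuilds the example frame from the question; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(10), 'B': list('abcdefghij')},
                  index=range(10, 20))

# Label-based slice: both endpoints are included.
loc_labels = df.loc[11:13].index.tolist()    # [11, 12, 13]

# np.r_ follows Python slice semantics: the stop value is excluded.
r_exclusive = np.r_[11:13].tolist()          # [11, 12]

# Adding 1 to the stop value reproduces the inclusive .loc behaviour.
r_inclusive = np.r_[11:13 + 1].tolist()      # [11, 12, 13]
```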

When passed a slice, such as df.loc[A:B], df.loc ignores labels that are not in df.index. In contrast, when passed an array, such as df.loc[np.r_[A:B]], df.loc does not tolerate missing labels: older versions of pandas add a new row filled with NaNs for each value in the array that is not in df.index, and pandas 1.0 and later raise a KeyError instead.
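A small sketch of the difference, using a hypothetical frame (gappy) with two labels removed. What the list lookup does depends on the installed pandas: as far as I know, version 1.0 and later raise a KeyError where older releases added NaN rows, so the code records whichever happens instead of assuming one.

```python
import pandas as pd

df = pd.DataFrame({'A': range(10), 'B': list('abcdefghij')},
                  index=range(10, 20))

# Hypothetical frame with labels 13 and 14 already removed.
gappy = df.drop([13, 14])

# A slice only returns the labels that exist inside the range.
slice_labels = gappy.loc[11:15].index.tolist()

# An explicit list that contains a missing label (13) behaves differently.
try:
    gappy.loc[[11, 12, 13]]
    outcome = 'reindexed'   # older pandas: a NaN-filled row is added for 13
except KeyError:
    outcome = 'KeyError'    # pandas 1.0 and later
```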

Thus to produce the desired result, you would need to adjust the right endpoint of the slices and use isin to test for membership in df.index:

In [26]: df.loc[df.index.isin(np.r_[11:14, 17:21])]
Out[26]: 
    A  B
11  1  b
12  2  c
13  3  d
17  7  h
18  8  i
19  9  j
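For many sections, the same idea can be wrapped in a small helper. This is only a sketch: select_label_ranges is a hypothetical name, not a pandas function, and it assumes an integer index and ranges given as inclusive (start, stop) pairs.

```python
import numpy as np
import pandas as pd

def select_label_ranges(df, ranges):
    """Return rows whose index label falls in any inclusive (start, stop) range."""
    # Build one array of candidate labels; the +1 makes each stop inclusive.
    wanted = np.r_[tuple(slice(start, stop + 1) for start, stop in ranges)]
    # isin keeps only labels that really exist, so removed rows are harmless.
    return df.loc[df.index.isin(wanted)]

df = pd.DataFrame({'A': range(10), 'B': list('abcdefghij')},
                  index=range(10, 20))

# Drop label 18 to mimic "some entries already removed".
cdf = select_label_ranges(df.drop(18), [(11, 13), (17, 20)])
```

Here cdf keeps the original index labels 11, 12, 13, 17 and 19. Another option that also preserves the index is pd.concat([df.loc[11:13], df.loc[17:20]]), which uses the inclusive label slices directly.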

Tags: python, slice, pandas

tomasz74
