Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas Loc select by index as well as boolean condition in single expression

Tags:

python

pandas

I have a simplified Dataframe which can be set up as follows:

indexes =['01/10/2017', '28/10/2018', '27/10/2019', '30/10/2019']
cols = ['Period', 'A', 'B', 'C']
df= pd.DataFrame(index = indexes, columns= cols)
df.Period = 1
df = pd.concat([df, 2*df.copy(), 3*df.copy()])
df.sort_index()

The Dataframe looks like:

    Period        A      B       C
01/10/2017  1   NaN     NaN     NaN
01/10/2017  2   NaN     NaN     NaN
01/10/2017  3   NaN     NaN     NaN
27/10/2019  1   NaN     NaN     NaN
27/10/2019  2   NaN     NaN     NaN
27/10/2019  3   NaN     NaN     NaN
28/10/2018  1   NaN     NaN     NaN
28/10/2018  2   NaN     NaN     NaN
28/10/2018  3   NaN     NaN     NaN
30/10/2019  1   NaN     NaN     NaN
30/10/2019  2   NaN     NaN     NaN
30/10/2019  3   NaN     NaN     NaN

I want to find the rows that are in this list:

FwdTimeChangeDates = ['28/10/2018', '27/10/2019']

with Period that is > 2.

I want to +=2 to the Period with those conditions (So Period 3-->5, and 4-->6, etc.).

How do I filter based on the two conditions?

df.loc[FwdTimeChangeDates] Gives:

Period A B C 28/10/2018 1 NaN NaN NaN 28/10/2018 2 NaN NaN NaN 28/10/2018 3 NaN NaN NaN 27/10/2019 1 NaN NaN NaN 27/10/2019 2 NaN NaN NaN 27/10/2019 3 NaN NaN NaN

and

df.loc[df.Period>2]

Gives

Period A B C 01/10/2017 3 NaN NaN NaN 28/10/2018 3 NaN NaN NaN 27/10/2019 3 NaN NaN NaN 30/10/2019 3 NaN NaN NaN

and I want:

Period A B C 28/10/2018 3 NaN NaN NaN 27/10/2019 3 NaN NaN NaN

But I can't join the two conditions with:

df.loc[FwdTimeChangeDates & df.Period>1] or df.loc[(FwdTimeChangeDates) & (df.Period>1)]

like image 450
A H Avatar asked Nov 07 '17 15:11

A H


People also ask

Can we perform boolean indexing on a DataFrame?

The Boolean values like True & false and 1&0 can be used as indexes in panda dataframe. They can help us filter out the required records. In the below exampels we will see different methods that can be used to carry out the Boolean indexing operations.

What is boolean indexing in DataFrame in pandas?

Boolean indexing is defined as a very important feature of numpy, which is frequently used in pandas. Its main task is to use the actual values of the data in the DataFrame. We can filter the data in the boolean indexing in different ways, which are as follows: Access the DataFrame with a boolean index.

What is the difference between ILOC () and loc ()?

The main distinction between the two methods is: loc gets rows (and/or columns) with particular labels. iloc gets rows (and/or columns) at integer locations.


2 Answers

Combine the two conditions, use isin for the first.

df[df.index.isin(['28/10/2018', '27/10/2019']) & (df.Period > 2)]

            Period    A    B    C
28/10/2018       3  NaN  NaN  NaN
27/10/2019       3  NaN  NaN  NaN 
like image 116
cs95 Avatar answered Nov 14 '22 22:11

cs95


If the dataframe has a multiindex:

df.loc[(df.index.isin(FwdTimeChangeDates, level=0)) & (df.Period > 2), 'Period']
like image 26
Michael C Avatar answered Nov 14 '22 21:11

Michael C