
Pandas filter dataframe rows with a specific year

I have a dataframe df with a Date column. I want to create two new dataframes: one containing all rows of df where the year equals some_year, and another containing all rows of df where the year does not equal some_year. I know you can do df.ix['2000-1-1' : '2001-1-1'], but getting all of the rows which are not in 2000 that way requires creating 2 extra dataframes and then concatenating/joining them.

Is there some way like this?

include = df[df.Date.year == year]
exclude = df[df['Date'].year != year]

This code doesn't work, but is there any similar sort of way?

user3494047 asked Oct 22 '17 19:10



2 Answers

You can use the datetime accessor (.dt) once the column has been converted to datetime.

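import pandas as pd

# Convert the column to datetime so that the .dt accessor is available
df['Date'] = pd.to_datetime(df['Date'])

# Compare the year component to select matching and non-matching rows
include = df[df['Date'].dt.year == year]
exclude = df[df['Date'].dt.year != year]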
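As a minimal sketch of the approach above on concrete data: the sample rows, the value column, and the choice of year = 2000 are assumptions made up for illustration, and only the Date column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the 'Date' column name is from the question.
df = pd.DataFrame({
    'Date': ['1999-12-31', '2000-01-01', '2000-06-15', '2001-01-01'],
    'value': [1, 2, 3, 4],
})

# A Series has no .year attribute of its own, and .dt only works on
# datetime-like values, so the string column is converted first.
df['Date'] = pd.to_datetime(df['Date'])

year = 2000  # assumed example value

include = df[df['Date'].dt.year == year]
exclude = df[df['Date'].dt.year != year]

print(include['value'].tolist())  # [2, 3]
print(exclude['value'].tolist())  # [1, 4]
```

This also suggests why the attempt in the question fails: df.Date.year looks up year on the Series itself rather than on each element, which is what the .dt accessor is for.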
Vaishali answered Oct 23 '22 08:10


You can simplify this by building the mask once and inverting it with ~. For the condition, use Series.dt.year, and cast year with int in case it is a string:

mask = df['Date'].dt.year == int(year)
include = df[mask]
exclude = df[~mask]
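The mask-and-invert idea can be wrapped in a small helper, sketched below. The function name split_by_year, its column parameter, and the sample data are hypothetical and not part of the original answer; the sketch assumes the column is already datetime.

```python
import pandas as pd


def split_by_year(df, year, column='Date'):
    """Return (include, exclude): rows whose year matches, and the rest."""
    # int() lets callers pass the year as a string such as '2000'
    mask = df[column].dt.year == int(year)
    # ~mask flips every True/False, so the two results partition df
    return df[mask], df[~mask]


# Hypothetical sample data with an already-converted datetime column
df = pd.DataFrame({
    'Date': pd.to_datetime(
        ['1999-12-31', '2000-01-01', '2000-06-15', '2001-01-01']
    ),
    'value': [1, 2, 3, 4],
})

include, exclude = split_by_year(df, '2000')
```

Because the mask is computed once, the two results always cover every row of df exactly once. Rows with a missing date (NaT) do not compare equal to any year, so they end up in exclude.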
jezrael answered Oct 23 '22 08:10