I have a dataframe df and it has a Date column. I want to create two new data frames: one which contains all of the rows from df where the year equals some_year, and another which contains all of the rows of df where the year does not equal some_year. I know you can do df.ix['2000-1-1' : '2001-1-1'], but getting all of the rows which are not in 2000 that way requires creating 2 extra data frames and then concatenating/joining them.
Is there some way like this?
include = df[df.Date.year == year]
exclude = df[df['Date'].year != year]
This code doesn't work, but is there any similar sort of way?
You can also use df[df['Date'].dt.strftime('%Y') == '2021'] to filter by year.
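A minimal sketch of the strftime approach, assuming a small made-up frame (the `value` column and the dates are illustrative only). Because strftime('%Y') produces strings, the comparison has to be against a string such as '2021' rather than the integer 2021.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2020-12-31", "2021-01-01", "2021-06-15"]),
    "value": [1, 2, 3],
})

# strftime('%Y') formats each date's year as a string, so compare to '2021'.
in_2021 = df[df["Date"].dt.strftime("%Y") == "2021"]
```

Comparing df['Date'].dt.year against an integer selects the same rows without going through strings.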
Filter rows by condition: you can use df[df["Courses"] == 'Spark'] to filter rows by a condition in a pandas DataFrame. Note that this expression returns a new DataFrame with the selected rows.
You can use the datetime accessor (.dt).
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])
include = df[df['Date'].dt.year == year]
exclude = df[df['Date'].dt.year != year]
You can simplify it by building the mask once and inverting it with ~. For the condition, use Series.dt.year, and use int to cast the year if it is a string:
mask = df['Date'].dt.year == int(year)
include = df[mask]
exclude = df[~mask]
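For reference, here is the mask approach as an end-to-end sketch. The sample rows, the `value` column, and the string-typed `year` are assumptions made up for illustration; only the mask logic comes from the answer above.

```python
import pandas as pd

# Hypothetical sample data; the dates start out as plain strings.
df = pd.DataFrame({
    "Date": ["1999-12-31", "2000-03-01", "2000-11-20", "2001-01-01"],
    "value": [10, 20, 30, 40],
})
year = "2000"  # assumed to arrive as a string, hence the int() cast below

# Convert to datetime64 so the .dt accessor is available.
df["Date"] = pd.to_datetime(df["Date"])

# Build the boolean mask once, then reuse it and its inverse.
mask = df["Date"].dt.year == int(year)
include = df[mask]    # rows whose year equals `year`
exclude = df[~mask]   # all remaining rows
```

Since exclude is defined as the inverse of the same mask, every row lands in exactly one of the two frames. Rows with a missing date (NaT) do not match the year, so they end up in exclude.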