I have a dataframe with a date column that I update daily. I'd like to create a copy of it with just the past 30 days of data.
I tried the following syntax based on what I know about doing this in R:
df[df[date]>dt.date.today()-30]
The date column is not the index but I'm not opposed to making it so if that helps!
Thanks!
To filter rows based on dates, first convert the date column in the DataFrame to the datetime64 type. Then use DataFrame.loc[] or DataFrame.query() from pandas to specify a filter condition.
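As a rough sketch of those two steps, the snippet below converts a string column to datetime64 and then filters it with both loc and query. The sample data, the column names `date` and `value`, and the fixed `cutoff` are assumptions made up for illustration; for a rolling 30-day window one would probably compute the cutoff as `pd.Timestamp.now() - pd.Timedelta(days=30)` instead.

```python
import pandas as pd

# Hypothetical sample data; the "date" column starts out as plain strings.
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-02-15", "2024-03-01"],
    "value": [1, 2, 3],
})

# Step 1: convert the column to datetime64 so comparisons use real dates.
df["date"] = pd.to_datetime(df["date"])

# A fixed cutoff keeps the example reproducible (assumed value).
cutoff = pd.Timestamp("2024-02-01")

# Step 2a: filter with a boolean mask passed to .loc; .copy() gives an independent copy.
recent_loc = df.loc[df["date"] > cutoff].copy()

# Step 2b: the same filter with .query; "@" refers to a local Python variable.
recent_query = df.query("date > @cutoff").copy()
```

Both results should contain the same rows; which form to use is mostly a matter of taste.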
You can use df[df["Courses"] == 'Spark'] to filter the rows of a pandas DataFrame by a condition. Note that this expression returns a new DataFrame containing only the selected rows. You can also write the same statement with the comparison value stored in a variable.
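A minimal sketch of that pattern, with the comparison value held in a variable. The `Courses` column comes from the text above; the `Fee` column and all the values are invented for the example.

```python
import pandas as pd

# Hypothetical data; only the "Courses" column name comes from the text.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark"],
    "Fee": [20000, 25000, 22000],
})

# Store the value to match in a variable, then build a boolean mask from it.
value = "Spark"
spark_rows = df[df["Courses"] == value]
```

The original `df` is left unchanged; `spark_rows` holds only the matching rows and keeps their original index labels.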
Using Loc to Filter With Multiple Conditions
The loc indexer in pandas accesses groups of rows or columns by label or by boolean mask. Wrap each condition in parentheses and combine them with the & operator; the result is a pd.DataFrame containing only the rows that satisfy every condition.
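The following sketch shows two conditions combined through loc. The data and the `Fee` threshold are assumptions chosen for illustration. The parentheses matter because `&` binds more tightly than comparison operators in Python.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark", "Pandas"],
    "Fee": [20000, 25000, 22000, 24000],
})

# Each condition is parenthesized, then combined element-wise with &.
filtered = df.loc[(df["Courses"] == "Spark") & (df["Fee"] > 21000)]
```

Using `|` in place of `&` would keep rows that satisfy either condition instead of both.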
You can use the pandas.Series.between() method to select DataFrame rows between two dates. This method returns a boolean Series indicating whether each element lies in the specified range (both endpoints are included by default).
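A short sketch of between() on a datetime64 column. The dates, the column names, and the `start` and `end` bounds are made up for the example.

```python
import pandas as pd

# Hypothetical data; the "date" column is converted to datetime64 up front.
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-10", "2024-03-01"]),
    "value": [10, 20, 30, 40],
})

# Assumed range bounds; both endpoints are inclusive by default.
start = pd.Timestamp("2024-01-20")
end = pd.Timestamp("2024-02-10")

# between() returns a boolean Series that can be used directly as a row mask.
mask = df["date"].between(start, end)
in_range = df[mask]
```

Here the rows dated exactly on `start` and `end` are included in `in_range`, since the default range is closed on both sides.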
Try this:
import datetime
import pandas as pd
# Keep only rows from the last 30 days; the column must already be datetime64.
cutoff = datetime.datetime.now() - pd.to_timedelta("30day")
df[df.the_date_column > cutoff]
Update: Edited as suggested by Josh.