I have a DataFrame with a column of event dates (the dates are not unique). I need to select all rows that fall within a given period. I tried the following:
start_day = datetime.date(datetime.strptime(start_day, '%d.%m.%Y')) #change user data to date format
end_day = datetime.date(datetime.strptime(end_day, '%d.%m.%Y'))
df = df[df['Transaction_date'].between(start_day, end_day)]
The type of the Transaction_date column is datetime64[ns].
When I run the code for the period 01/01/2020 to 31/01/2020, part of the data for the specified period is lost (information is displayed for only 21 days; it should be for 31 days).
I can see that the DataFrame contains data for the entire requested period. Can you please help me find my mistake?
Use df.dates1 - df.dates2 to find the difference between the two dates, then convert the result to months.
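As a rough illustration of that subtraction, here is a minimal sketch. The column names dates1 and dates2 come from the sentence above; the sample values are made up. Subtracting the columns gives a timedelta, which has no exact "month" unit, so this sketch counts whole calendar months from the year and month parts instead, which is only one of several reasonable definitions.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the text above.
df = pd.DataFrame({
    'dates1': pd.to_datetime(['2020-03-15', '2021-01-01']),
    'dates2': pd.to_datetime(['2020-01-31', '2020-11-30']),
})

# Plain subtraction yields a timedelta Series; .dt.days gives whole days.
days = (df['dates1'] - df['dates2']).dt.days

# Calendar-month difference, ignoring the day of the month.
months = ((df['dates1'].dt.year - df['dates2'].dt.year) * 12
          + (df['dates1'].dt.month - df['dates2'].dt.month))

print(days.tolist())
print(months.tolist())
```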
There are two possible solutions:
1. Use a boolean mask, then use df.loc[mask].
2. Set the date column as a DatetimeIndex, then use df[start_date:end_date].
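Both approaches can be sketched as follows. The sample rows are a shortened, made-up version of the data in this thread, and the sketch uses .loc for the index slice, which should behave the same as plain df[start_date:end_date] on a sorted DatetimeIndex.

```python
import pandas as pd

df = pd.DataFrame({
    'Transaction_date': pd.to_datetime(
        ['2019-12-31', '2020-01-10', '2020-01-31', '2020-02-01']),
    'Value': [10, 10, 10, 10],
})
start_date = pd.Timestamp('2020-01-01')
end_date = pd.Timestamp('2020-01-31')

# Option 1: boolean mask (both bounds inclusive)
mask = (df['Transaction_date'] >= start_date) & (df['Transaction_date'] <= end_date)
by_mask = df.loc[mask]

# Option 2: DatetimeIndex, sorted, then slice by label (both bounds inclusive)
indexed = df.set_index('Transaction_date').sort_index()
by_index = indexed.loc[start_date:end_date]

print(by_mask)
print(by_index)
```

Sorting the index before slicing matters: label slicing on an unsorted DatetimeIndex can raise an error or return unexpected rows.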
I used the following test DataFrame:
Transaction_date Value
0 2019-12-31 10
1 2020-01-10 10
2 2020-01-15 10
3 2020-01-20 10
4 2020-01-25 10
5 2020-01-28 10
6 2020-01-29 10
7 2020-01-30 10
8 2020-01-31 10
9 2020-02-01 10
10 2020-02-01 10
Start / end dates are:
start_day = '01.01.2020'
end_day = '31.01.2020'
The code is:
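# Convert start / end dates to datetime (the strings are day-first)
start_day = pd.to_datetime(start_day, format='%d.%m.%Y')
end_day = pd.to_datetime(end_day, format='%d.%m.%Y')
I noticed that you use the datetime module for the conversion. I would use the dedicated pandas function (to_datetime) instead, because it returns Timestamp values that compare directly with a datetime64[ns] column. It can recognize many commonly used date formats on its own, but day-first strings such as 05.01.2020 are ambiguous and are read month-first by default, so it is safer to pass format='%d.%m.%Y' (or dayfirst=True).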
To get the actual result, run:
df[df['Transaction_date'].between(start_day, end_day)]
The result is:
Transaction_date Value
1 2020-01-10 10
2 2020-01-15 10
3 2020-01-20 10
4 2020-01-25 10
5 2020-01-28 10
6 2020-01-29 10
7 2020-01-30 10
8 2020-01-31 10
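So as you can see, the rows outside January (index 0, 9 and 10) are dropped, and every January row, including the one for 2020-01-31, is returned.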
Repeat my example on your computer. Maybe the source of your problem (that the result does not contain all dates from January) is somewhere else?
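To make the example easy to repeat, here is a self-contained sketch that rebuilds the test DataFrame above and applies the same filter. The two checks at the end are only suggestions for narrowing down the problem on the real data; nothing in the question confirms that either of them is the actual cause.

```python
import pandas as pd

# Rebuild the test DataFrame shown above.
df = pd.DataFrame({
    'Transaction_date': pd.to_datetime([
        '2019-12-31', '2020-01-10', '2020-01-15', '2020-01-20', '2020-01-25',
        '2020-01-28', '2020-01-29', '2020-01-30', '2020-01-31',
        '2020-02-01', '2020-02-01']),
    'Value': [10] * 11,
})

# Parse the day-first user input with an explicit format.
start_day = pd.to_datetime('01.01.2020', format='%d.%m.%Y')
end_day = pd.to_datetime('31.01.2020', format='%d.%m.%Y')

# between() includes both bounds by default.
result = df[df['Transaction_date'].between(start_day, end_day)]
print(result)

# Check 1: how many distinct days ended up in the result?
n_days = result['Transaction_date'].dt.normalize().nunique()
print(n_days)

# Check 2: does the column carry a time of day? If it does, rows later than
# 00:00 on end_day fall outside the range, because end_day is midnight.
has_time = (df['Transaction_date'] != df['Transaction_date'].dt.normalize()).any()
print(has_time)
```

If the number of distinct days is smaller than expected on the real data, it may also be worth checking how the Transaction_date column itself was parsed, since a day/month swap at load time would move rows to other months.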