The data is given as follows:
return
2010-01-04 0.016676
2010-01-05 0.003839
...
2010-01-05 0.003839
2010-01-29 0.001248
2010-02-01 0.000134
...
What I want is to extract the value for the last day of each month that actually appears in the data:
2010-01-29 0.001248
2010-02-28 ......
If I use pandas resample directly, i.e., df.resample('M').last(), I get the correct rows but with the wrong index: it automatically uses the calendar month end as the index, not the last date present in the data.
2010-01-31 0.001248
2010-02-28 ......
How can I get the correct answer in a Pythonic way?
An assumption made here is that your date data is part of the index. If not, I recommend setting it first.
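If the dates are in a regular column rather than the index, a minimal sketch of setting them as the index might look like this. The column name "date" and the sample frame are assumptions for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical frame where the dates live in an ordinary column named "date".
df = pd.DataFrame({
    "date": ["2010-01-04", "2010-01-05", "2010-01-29", "2010-02-01"],
    "return": [0.016676, 0.003839, 0.001248, 0.000134],
})

# Parse the strings into timestamps, move them into the index,
# and sort chronologically so "last row per month" is well defined.
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date").sort_index()
```

Sorting matters here because tail(1) picks the last row of each group by position, not by date.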
I don't think the resampling or grouper functions would do. Let's group on the month number instead and call DataFrameGroupBy.tail (this relies on the index being sorted chronologically):
df.groupby(df.index.month).tail(1)
If your data spans multiple years, you'll need to group on the year and month. Using a single grouper created from strftime:
df.groupby(df.index.strftime('%Y-%m')).tail(1)
Or, using multiple groupers:
df.groupby([df.index.year, df.index.month]).tail(1)
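To see why the year has to be part of the key, here is a small sketch on made-up data spanning two years. The values and dates are invented for illustration only.

```python
import pandas as pd

# Made-up returns covering January and February 2010 plus January 2011.
idx = pd.to_datetime([
    "2010-01-04", "2010-01-29", "2010-02-01", "2010-02-26",
    "2011-01-03", "2011-01-31",
])
df = pd.DataFrame({"return": [0.01, 0.02, 0.03, 0.04, 0.05, 0.06]}, index=idx)
df = df.sort_index()  # tail(1) depends on chronological order

# Group on a "YYYY-MM" label: one row per (year, month).
by_label = df.groupby(df.index.strftime("%Y-%m")).tail(1)

# Same grouping expressed with two separate keys.
by_parts = df.groupby([df.index.year, df.index.month]).tail(1)

# Grouping on the month number alone merges January 2010 with January 2011,
# so only one January row survives.
month_only = df.groupby(df.index.month).tail(1)
```

Both by_label and by_parts keep the original dates (2010-01-29, 2010-02-26, 2011-01-31) as the index, whereas month_only returns just two rows.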
Note: if your index is not a DatetimeIndex, as assumed here, you'll need to replace df.index with pd.to_datetime(df.index, errors='coerce') above.
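As a rough sketch of that substitution, assuming the index holds date strings (the sample frame below is illustrative):

```python
import pandas as pd

# Illustrative frame whose index is plain strings, not a DatetimeIndex.
df = pd.DataFrame(
    {"return": [0.016676, 0.003839, 0.001248, 0.000134]},
    index=["2010-01-04", "2010-01-05", "2010-01-29", "2010-02-01"],
)

# Convert the labels on the fly; unparseable entries would become NaT.
dates = pd.to_datetime(df.index, errors="coerce")

# Group on the parsed year and month; the original string labels are kept.
last_rows = df.groupby([dates.year, dates.month]).tail(1)
```

The result still carries the original string index, so if you need real timestamps afterwards it may be simpler to assign the converted dates back to df.index first.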