I have a Dataframe, df, with the following column:
df['ArrivalDate'] = ... 936 2012-12-31 938 2012-12-29 965 2012-12-31 966 2012-12-31 967 2012-12-31 968 2012-12-31 969 2012-12-31 970 2012-12-29 971 2012-12-31 972 2012-12-29 973 2012-12-29 ...
The elements of the column are pandas.tslib.Timestamp.
I want to just include the year and month. I thought there would be simple way to do it, but I can't figure it out.
Here's what I've tried:
df['ArrivalDate'].resample('M', how = 'mean')
I got the following error:
Only valid with DatetimeIndex or PeriodIndex
Then I tried:
df['ArrivalDate'].apply(lambda(x):x[:-2])
I got the following error:
'Timestamp' object has no attribute '__getitem__'
Any suggestions?
Edit: I sort of figured it out.
df.index = df['ArrivalDate']
Then, I can resample another column using the index.
But I'd still like a method for reconfiguring the entire column. Any ideas?
Except the above formula, you can also apply this formula: =TEXT(A2, "mmm") & "-" & TEXT(A2, "yyyy"). 2. In above formulas, A2 indicates the date cell that you want to use, and the separator “-” is used to separate the month and year, you can change it to any other delimiters you need.
If you want new columns showing year and month separately you can do this:
df['year'] = pd.DatetimeIndex(df['ArrivalDate']).year df['month'] = pd.DatetimeIndex(df['ArrivalDate']).month
or...
df['year'] = df['ArrivalDate'].dt.year df['month'] = df['ArrivalDate'].dt.month
Then you can combine them or work with them just as they are.
The df['date_column']
has to be in date time format.
df['month_year'] = df['date_column'].dt.to_period('M')
You could also use D
for Day, 2M
for 2 Months etc. for different sampling intervals, and in case one has time series data with time stamp, we can go for granular sampling intervals such as 45Min
for 45 min, 15Min
for 15 min sampling etc.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With