
Replacing datetime conditionally - python

I have a 'Posting Date' column in a dataframe, with values formatted like '2017-03-01'. Its dtype is datetime64[ns]. I want to replace every value after '2017-03-31' with '2017-03-31' and leave all other values unchanged.

When I type df['Posting Date'] > '2017-03-31', it correctly shows all the rows where the condition is met, so I guess the date filtering works.

However, when I use numpy.where to write the condition like this:

df['Posting Date'] = np.where(df['Posting Date'] > '2017-03-31', '2017-03-31', df['Posting Date'])

it raises an invalid type promotion error.

I also tried df.loc, and an error occurs there as well:

df.loc[df['Posting Date']>'2017-03-31','Posting Date']='2017-03-31'

ValueError: invalid literal for int() with base 10: '2017-03-31'

I'm wondering why the error occurs. How can I replace the dates correctly? Any method that works is fine.

Lavender Pan asked Dec 02 '25 20:12


1 Answer

This happens because you are trying to put a string into a column with a datetime dtype. Pass a datetime to np.where instead, i.e.

df['Posting Date'] = np.where(df['Posting Date']>'2017-03-31',pd.to_datetime(['2017-03-31']),df['Posting Date'])

Example:

import numpy as np
import pandas as pd

# The sample dates are written day-first, so say so explicitly when parsing
df = pd.DataFrame({'Posting Date': pd.to_datetime(['20-4-2017','20-4-2017','20-4-2017','20-3-2017','20-2-2017'], dayfirst=True)})
df['Posting Date'] = np.where(df['Posting Date']>'2017-03-31',pd.to_datetime(['2017-03-31']),df['Posting Date'])

Output:

Posting Date
0   2017-03-31
1   2017-03-31
2   2017-03-31
3   2017-03-20
4   2017-02-20

A better option, posted by @pirSquared in a comment, uses clip, i.e.

df['Posting Date'] = df['Posting Date'].clip(upper=pd.Timestamp('2017-03-31')) 
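
For comparison, here is a minimal self-contained sketch of the clip approach next to two other ways of capping the dates. The sample data mirrors the example above, written as ISO strings. The names cutoff, clipped, masked and via_loc are only illustrative, and Series.mask and the boolean-index assignment are alternatives that the answer itself does not mention. In each case the replacement value is a pd.Timestamp rather than a string, so the column keeps its datetime dtype.

```python
import pandas as pd

# Illustrative cutoff; a Timestamp (not a string) keeps the datetime dtype
cutoff = pd.Timestamp("2017-03-31")

df = pd.DataFrame({
    "Posting Date": pd.to_datetime(
        ["2017-04-20", "2017-04-20", "2017-04-20", "2017-03-20", "2017-02-20"]
    )
})

# 1. clip: cap every value at the cutoff
clipped = df["Posting Date"].clip(upper=cutoff)

# 2. mask: replace values where the condition is True
masked = df["Posting Date"].mask(df["Posting Date"] > cutoff, cutoff)

# 3. boolean-index assignment on a copy, using a Timestamp as the new value
via_loc = df["Posting Date"].copy()
via_loc[via_loc > cutoff] = cutoff

print(clipped)
```

All three should give the same result on this data. clip is the shortest when the rule is a simple upper or lower bound, while mask or index assignment generalises to other conditions.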
Bharath answered Dec 05 '25 10:12


