I am preparing a pandas df for output, and would like to remove the NaN and NaT in the table, and leave those table locations blank. An example would be
mydataframesample
col1 col2 timestamp
a b 2014-08-14
c NaN NaT
would become
col1 col2 timestamp
a b 2014-08-14
c
Most of the values are dtypes object, with the timestamp column being datetime64[ns]. In order to fix this, I attempted to use panda's mydataframesample.fillna(' ')
to effectively leave a space in the location. However, this doesn't work with the datetime types. In order to get around this, I'm trying to convert the timestamp column back to object or string type.
Is it possible to remove the NaN/NaT without doing the type conversion? If not, how do I do the type conversion (tried str() and astype(str) but difficulty with datetime being the original format)?
I had the same issue: This does it all in place using pandas apply function. Should be the fastest method.
import pandas as pd
df['timestamp'] = df['timestamp'].apply(lambda x: x.strftime('%Y-%m-%d')if not pd.isnull(x) else '')
if your timestamp field is not yet in datetime
format then:
import pandas as pd
df['timestamp'] = pd.to_datetime(df['timestamp']).apply(lambda x: x.strftime('%Y-%m-%d')if not pd.isnull(x) else '')
This won't win any speed awards, but if the DataFrame is not too long, reassignment using a list comprehension will do the job:
df1['date'] = [d.strftime('%Y-%m-%d') if not pd.isnull(d) else '' for d in df1['date']]
import numpy as np
import pandas as pd
Timestamp = pd.Timestamp
nan = np.nan
NaT = pd.NaT
df1 = pd.DataFrame({
'col1': list('ac'),
'col2': ['b', nan],
'date': (Timestamp('2014-08-14'), NaT)
})
df1['col2'] = df1['col2'].fillna('')
df1['date'] = [d.strftime('%Y-%m-%d') if not pd.isnull(d) else '' for d in df1['date']]
print(df1)
yields
col1 col2 date
0 a b 2014-08-14
1 c
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With