
Remove dtype datetime NaT

Tags: python, pandas

I am preparing a pandas df for output, and would like to remove the NaN and NaT in the table, and leave those table locations blank. An example would be

mydataframesample 

col1    col2     timestamp
a       b        2014-08-14
c       NaN      NaT

would become

col1    col2     timestamp
a       b        2014-08-14
c       

Most of the columns have dtype object, while the timestamp column is datetime64[ns]. To fix this, I tried pandas' mydataframesample.fillna(' ') to leave a space in those locations. However, this doesn't work with the datetime column. To get around this, I'm trying to convert the timestamp column back to object or string type.

Is it possible to remove the NaN/NaT without doing the type conversion? If not, how do I do the type conversion? I tried str() and astype(str), but ran into difficulty because the column was originally datetime.

ding asked Aug 05 '14 14:08


2 Answers

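I had the same issue. This uses the pandas apply function to format each valid timestamp as a string and map NaT to an empty string, then reassigns the result to the column. Note that the column becomes object dtype afterwards. It should be fast enough for most DataFrames.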

import pandas as pd
# Format valid timestamps as 'YYYY-MM-DD'; replace NaT with an empty string
df['timestamp'] = df['timestamp'].apply(lambda x: x.strftime('%Y-%m-%d') if not pd.isnull(x) else '')

If your timestamp column is not yet in datetime format, convert it first:

import pandas as pd
# Parse to datetime first, then format valid values and blank out NaT
df['timestamp'] = pd.to_datetime(df['timestamp']).apply(lambda x: x.strftime('%Y-%m-%d') if not pd.isnull(x) else '')
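
For comparison, here is a minimal sketch of a vectorized variant that uses the .dt.strftime accessor instead of apply. The sample frame below is rebuilt from the question's example for illustration, and the explicit notna() mask is an assumption on my part to make the NaT handling independent of how strftime represents missing values. As with apply, the column ends up holding strings.

```python
import pandas as pd

# Sample frame modelled on the question (rebuilt here for illustration)
df = pd.DataFrame({
    'col1': ['a', 'c'],
    'col2': ['b', None],
    'timestamp': pd.to_datetime(['2014-08-14', None]),
})

# Remember which rows hold real timestamps before converting
valid = df['timestamp'].notna()

# Format the whole column at once, then blank out the rows that were NaT
df['timestamp'] = df['timestamp'].dt.strftime('%Y-%m-%d').where(valid, '')

# Ordinary NaN/None in the other column can simply be filled
df['col2'] = df['col2'].fillna('')

print(df)
```

Whether this is noticeably faster than apply depends on the size of the frame, so it is worth timing on your own data.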
Alexander McFarlane answered Oct 13 '22 19:10


This won't win any speed awards, but if the DataFrame is not too long, reassignment using a list comprehension will do the job:

df1['date'] = [d.strftime('%Y-%m-%d') if not pd.isnull(d) else '' for d in df1['date']]

import numpy as np
import pandas as pd
Timestamp = pd.Timestamp
nan = np.nan
NaT = pd.NaT
df1 = pd.DataFrame({
    'col1': list('ac'),
    'col2': ['b', nan],
    'date': (Timestamp('2014-08-14'), NaT)
    })

df1['col2'] = df1['col2'].fillna('')
df1['date'] = [d.strftime('%Y-%m-%d') if not pd.isnull(d) else '' for d in df1['date']]

print(df1)

yields

  col1 col2        date
0    a    b  2014-08-14
1    c                 
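
On the question of avoiding the type conversion altogether: if the output is a CSV file, one option is to leave the DataFrame untouched and let the writer blank out the missing values. The sketch below assumes CSV is an acceptable output format, which the question does not state, and passes na_rep and date_format explicitly so the behaviour is visible. The sample frame is rebuilt from the question for illustration.

```python
import pandas as pd

# Sample frame modelled on the question (rebuilt here for illustration)
df = pd.DataFrame({
    'col1': ['a', 'c'],
    'col2': ['b', None],
    'timestamp': pd.to_datetime(['2014-08-14', None]),
})

# With no path given, to_csv returns the CSV text as a string.
# na_rep controls how NaN and NaT are written; '' leaves the cells empty.
csv_text = df.to_csv(index=False, na_rep='', date_format='%Y-%m-%d')

print(csv_text)

# The DataFrame itself is unchanged: the column is still datetime64
print(df['timestamp'].dtype)
```

This keeps the datetime dtype in memory and only affects the written text, so later date arithmetic on the column still works.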
unutbu answered Oct 13 '22 19:10