How to convert all columns in Pandas DataFrame to 'object' while ignoring NaN?

Question

I have a dataframe for which I want every column to be in string format. So I do this:

 df = df.astype(str)

The problem is that this converts every NaN entry to the string 'nan', so isnull returns False for those cells. Is there a way to convert to string but keep the missing entries as they are?

sacuL · Accepted Answer

When you call astype(str), the resulting dtype is object, the dtype pandas uses for columns that can hold mixed Python values. So one option is to convert with astype(str), as you were doing, and then replace the 'nan' strings with actual NaN (which is inherently a float), so that methods such as isnull detect them again:

df.astype(str).replace('nan', np.nan)

Example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'col1': ['x', 2, np.nan, 'z']})
>>> df
  col1
0    x
1    2
2  NaN
3    z

# Note the mixed str, int and null values:
>>> df.values
array([['x'],
       [2],
       [nan],
       ['z']], dtype=object)

>>> df2 = df.astype(str).replace('nan', np.nan)

# Note that now you have only strings and null values:
>>> df2.values
array([['x'],
       ['2'],
       [nan],
       ['z']], dtype=object)
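
One caveat with the replace approach is that a cell genuinely containing the text 'nan' would also become a missing value. A minimal sketch of an alternative, assuming it is acceptable to mask on the nulls of the original frame instead of matching the string; the helper name to_str_keep_nan is hypothetical and used only for illustration:

```python
import numpy as np
import pandas as pd


def to_str_keep_nan(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: stringify every cell but leave missing cells missing."""
    # where() keeps the stringified value wherever the original cell was not null
    # and puts NaN in every other position.
    return frame.astype(str).where(frame.notna())


# The last row holds the literal text 'nan', not a missing value.
df = pd.DataFrame({'col1': ['x', 2, np.nan, 'nan']})
df2 = to_str_keep_nan(df)

# Boolean list marking which cells are still missing after the conversion.
null_mask = df2['col1'].isnull().tolist()
```

Here only the originally missing cell is flagged by isnull, while the literal 'nan' text survives as an ordinary string.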

Alexander · Answer

Convert your null values to empty strings, then cast the dataframe as string type.

df.replace(np.nan, '').astype(str)

Note that isnull will not flag these empty strings; on the converted frame you could instead test for 'nulls' via:

df.apply(lambda s: s.str.len() == 0)
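
Put together, the empty-string approach might look like the sketch below. The sample frame and the variable names as_text and is_empty are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['x', 2, np.nan, 'z']})

# Swap missing values for empty strings first, then cast everything to str.
as_text = df.replace(np.nan, '').astype(str)

# A cell counts as "null" here when its string length is zero.
is_empty = as_text.apply(lambda s: s.str.len() == 0)
```

The trade-off is that the result contains no real missing values, so isnull returns False everywhere and the length check takes its place.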

Tags:

python

pandas

Catiger3331
