
pandas: convert multiple columns to string

I have some columns ['a', 'b', 'c', etc.] (a and c are float64 while b is object)

I would like to convert all columns to strings while preserving NaNs.

I tried df[['a', 'b', 'c']] = df[['a', 'b', 'c']].astype(str), but that left blanks for the float64 columns.

Currently I am going through one by one with the following:

df['a'] = df['a'].apply(str)
df['a'] = df['a'].replace('nan', np.nan)

Is the best way to use .astype(str) and then replace '' with np.nan? Side question: is there a difference between .astype(str) and .apply(str)?

Sample Input: (dtypes: a=float64, b=object, c=float64)

a, b, c, etc.
23, 'a42', 142, etc.
51, '3', 12, etc.
NaN, NaN, NaN, etc.
24, 'a1', NaN, etc.

Desired output: (dtypes: a=object, b=object, c=object)

a, b, c, etc.
'23', 'a42', '142', etc.
'51', '3', '12', etc.
NaN, NaN, NaN, etc.
'24', 'a1', NaN, etc.

As3adTintin asked May 21 '26 04:05


2 Answers

list(df) returns the column names as a list:

lst = list(df)

This converts all of those columns to string type:

df[lst] = df[lst].astype(str)
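
On pandas versions where .astype(str) turns a missing value into the literal string 'nan', the conversion above does not by itself preserve NaN. A minimal sketch of one way to put the NaNs back, assuming the sample data from the question (the names `missing` and `converted` are only illustrative):

```python
import numpy as np
import pandas as pd

# Sample data from the question: a and c are float64, b is object.
df = pd.DataFrame({
    'a': [23.0, 51.0, np.nan, 24.0],
    'b': ['a42', '3', np.nan, 'a1'],
    'c': [142.0, 12.0, np.nan, np.nan],
})

cols = list(df)

# Remember where the missing values are before converting.
missing = df[cols].isna()

# Convert to string, then blank out (set to NaN) every originally missing cell.
converted = df[cols].astype(str).mask(missing)
```

The float columns come out as '23.0' rather than '23', because that is how Python renders a float as a string. If the trailing '.0' is unwanted, the values need to be cast to integers before the string conversion.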

Raj answered May 23 '26 17:05


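import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [23.0, 51.0, np.nan, 24.0],
    'b': ["a42", "3", np.nan, "a1"],
    'c': [142.0, 12.0, np.nan, np.nan]})

# Rebuild each column: keep NaN as NaN, keep strings unchanged, and turn
# every other (numeric) value into the string of its integer part.
for col in df:
    df[col] = [np.nan if (not isinstance(val, str) and np.isnan(val)) else
               (val if isinstance(val, str) else str(int(val)))
               for val in df[col].tolist()]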

>>> df
     a    b    c
0   23  a42  142
1   51    3   12
2  NaN  NaN  NaN
3   24   a1  NaN

>>> df.values
array([['23', 'a42', '142'],
       ['51', '3', '12'],
       [nan, nan, nan],
       ['24', 'a1', nan]], dtype=object)
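
The same result can probably also be reached column by column instead of value by value. A rough sketch, assuming (as the loop above does) that the float columns hold only whole numbers and that the index is unique; `to_str_keep_nan` is a hypothetical helper name, not a pandas function:

```python
import numpy as np
import pandas as pd


def to_str_keep_nan(col):
    """Hypothetical helper: stringify non-missing values, leave NaN alone."""
    kept = col.dropna()
    if pd.api.types.is_float_dtype(col):
        # Assumption: the floats are whole numbers, so this only drops ".0".
        kept = kept.astype(int)
    # Reindexing back to the original index reintroduces NaN for the rows
    # that were dropped (this relies on the index being unique).
    return kept.astype(str).reindex(col.index)


df = pd.DataFrame({
    'a': [23.0, 51.0, np.nan, 24.0],
    'b': ['a42', '3', np.nan, 'a1'],
    'c': [142.0, 12.0, np.nan, np.nan],
})

result = pd.DataFrame({name: to_str_keep_nan(df[name]) for name in df})
```

Casting to int truncates any fractional part, so this only suits columns that are integers stored as floats because of the NaNs.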

Alexander answered May 23 '26 18:05


