How to merge multiple pandas column object type values into one column while ignoring "None"?

Tags: python, pandas

Starting dataframe:

df = pd.DataFrame({'col1': ['one', 'None', 'None'], 'col2': ['None', 'None', 'six'], 'col3': ['None', 'eight', 'None']})

End goal:

pd.DataFrame({'col4': ['one', 'eight', 'six']})

What I tried to do:

df['col1'].map(str)+df['col2'].map(str)+df['col3'].map(str)

How can I merge multiple pandas object-type columns into one column while ignoring the "None" values? In this dataset, each cell of the final dataframe will never hold more than one value.

pr338 asked Dec 18 '22 00:12


1 Answer

You have string Nones, not actual null values, so you'll need to replace them first.
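
To see why the replacement step matters, here is a minimal sketch (the variable names `cleaned`, `n_missing_before`, and `n_missing_after` are just for illustration) that counts missing values before and after swapping the placeholder strings for real nulls:

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

# The literal string 'None' is ordinary text, so pandas does not count it as missing.
n_missing_before = int(df.isna().sum().sum())

# Mask every cell equal to 'None', turning it into a real missing value.
cleaned = df.mask(df.eq('None'))
n_missing_after = int(cleaned.isna().sum().sum())

print(n_missing_before, n_missing_after)
```

Only after this step do null-aware methods such as fillna and notnull see the placeholders as gaps.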

Option 1
replace/mask/where + fillna + agg

df.replace('None', np.nan).fillna('').agg(''.join, axis=1).to_frame('col4')

Or,

df.mask(df.eq('None')).fillna('').agg(''.join, axis=1).to_frame('col4')

Or,

df.where(df.ne('None')).fillna('').agg(''.join, axis=1).to_frame('col4')

    col4
0    one
1  eight
2    six
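
For reuse, Option 1 could be wrapped in a small function. This is only a sketch: `merge_row_strings` and its parameters are hypothetical names, not part of pandas, and it uses the mask variant shown above.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

def merge_row_strings(frame, placeholder='None', name='col4'):
    """Hypothetical helper: join each row's strings, skipping the placeholder."""
    return (
        frame.mask(frame.eq(placeholder))  # placeholder -> real missing value
             .fillna('')                   # missing -> empty string
             .agg(''.join, axis=1)         # concatenate the cells of each row
             .to_frame(name)
    )

result = merge_row_strings(df)
print(result)
```

Note that if a row ever held two real values, ''.join would glue them together with no separator, so this relies on the "at most one value per row" guarantee from the question.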

Option 2
replace + pd.notnull

v = df.replace('None', np.nan).values.ravel()
pd.DataFrame(v[pd.notnull(v)], columns=['col4'])

    col4
0    one
1  eight
2    six
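
One caveat with Option 2: flattening drops the row structure, so the result only lines up with the original rows when every row has exactly one real value. A minimal sketch with a guard for that assumption (the names `cleaned` and `counts` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['one', 'None', 'None'],
    'col2': ['None', 'None', 'six'],
    'col3': ['None', 'eight', 'None'],
})

cleaned = df.mask(df.eq('None'))

# Flattening is only row-aligned if each row contributes exactly one value.
counts = cleaned.notna().sum(axis=1)
if not (counts == 1).all():
    raise ValueError('every row must contain exactly one non-null value')

# Row-major flatten, then keep the non-null entries in order.
v = cleaned.to_numpy().ravel()
result = pd.DataFrame(v[pd.notnull(v)], columns=['col4'])
print(result)
```

If some rows can be entirely empty, Option 1 is the safer choice, since it keeps one output row per input row.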

Option 3
A solution leveraging Divakar's excellent justify function:

pd.DataFrame(justify(df.values, invalid_val='None')[:, 0], columns=['col4'])

    col4
0    one
1  eight
2    six

Reference
(Note: the function needs a small modification to work with string data; the changed line is marked with a comment below.)

def justify(a, invalid_val=0, axis=1, side='left'):    
    """
    Justifies a 2D array

    Parameters
    ----------
    a : ndarray
        Input array to be justified
    invalid_val : scalar
        Value that marks an empty cell
    axis : int
        Axis along which justification is to be made
    side : str
        Direction of justification. It could be 'left', 'right', 'up', 'down'
        It should be 'left' or 'right' for axis=1 and 'up' or 'down' for axis=0.

    """

    if invalid_val is np.nan:
        mask = ~np.isnan(a)
    else:
        mask = a!=invalid_val
    justified_mask = np.sort(mask,axis=axis)
    if side in ('up', 'left'):
        justified_mask = np.flip(justified_mask,axis=axis)
    out = np.full(a.shape, invalid_val, dtype='<U8')    # change to be made is here; '<U8' holds at most 8 characters, so widen it for longer strings
    if axis==1:
        out[justified_mask] = a[mask]
    else:
        out.T[justified_mask.T] = a.T[mask.T]
    return out
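
The core trick inside justify can be seen on its own. The following is a stripped-down sketch of the axis=1, side='left' case applied to the question's data, not a replacement for the full function:

```python
import numpy as np

a = np.array([
    ['one', 'None', 'None'],
    ['None', 'None', 'eight'],
    ['None', 'six', 'None'],
])

# True wherever the cell holds a real value.
mask = a != 'None'

# Sorting booleans puts False first; flipping pushes each row's True cells to the left.
justified_mask = np.flip(np.sort(mask, axis=1), axis=1)

# Start from an array full of placeholders, then drop the real values
# into the left-justified slots (row-major order is the same on both sides).
out = np.full(a.shape, 'None', dtype=a.dtype)
out[justified_mask] = a[mask]

first_col = out[:, 0]
print(first_col)
```

Here the output array reuses the input's dtype, which sidesteps the fixed-width '<U8' choice; that is a design choice of this sketch rather than something taken from the original function.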
cs95 answered Dec 27 '22 01:12