Concatenate cells into a string with separator pandas python

Tags: python, string, concatenation, pandas

Given the following:

df = pd.DataFrame({'col1': ["a", "b"],
                   'col2': ["ab", np.nan], 'col3': ["w", "e"]})

I would like to create a column that joins the contents of all three columns into one string, separated by the character "*", while ignoring NaN.

For example, I would like to get:

a*ab*w
b*e

Any ideas?

Update: I just realised there are a few additional requirements. The method needs to work with ints and floats, and it must also handle special characters (e.g., letters of the Spanish alphabet).

437

asked May 01 '15 08:05

Bastien

2 Answers

In [68]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().values.tolist()), axis=1)
df
Out[68]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
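
As a self-contained sketch of the approach above (the imports and the helper name join_non_null are assumptions added for illustration, not part of the original answer), the snippet below also shows why plain joining stops working once a cell holds a number: str.join only accepts strings.

```python
import numpy as np
import pandas as pd

def join_non_null(row, sep="*"):
    # Hypothetical helper: drop NaN cells, then join the remaining strings.
    return sep.join(row.dropna().tolist())

df = pd.DataFrame({"col1": ["a", "b"],
                   "col2": ["ab", np.nan],
                   "col3": ["w", "e"]})
result = df.apply(join_non_null, axis=1)

# A non-string cell makes str.join raise TypeError.
mixed = pd.DataFrame({"col1": ["a"], "col2": [3]})
try:
    mixed.apply(join_non_null, axis=1)
    join_failed = False
except TypeError:
    join_failed = True
```

Here result holds one joined string per row, and join_failed records that the integer cell could not be joined without converting it to str first.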

UPDATE

If you have ints or floats, you can convert them to str first:

In [74]:

df = pd.DataFrame({'col1' : ["a","b",3],
            'col2'  : ["ab",np.nan, 4], 'col3' : ["w","e", 6]})
df
Out[74]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
In [76]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[76]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
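
One caveat with floats, shown as a hedged sketch: when a purely numeric column contains NaN, pandas stores it as float64, so astype(str) renders a whole number as 3.0 rather than 3. The helper fmt_cell below is a hypothetical name, and printing whole-number floats as integers is an assumption about the desired output, not something the question specifies.

```python
import numpy as np
import pandas as pd

def fmt_cell(value):
    # Hypothetical helper: render whole-number floats without the trailing ".0".
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)

df = pd.DataFrame({"col1": ["a", "b"],
                   "col2": [3, np.nan]})  # col2 is stored as float64

# Plain conversion keeps the float formatting.
plain = df.apply(lambda x: "*".join(x.dropna().astype(str)), axis=1)

# Converting cell by cell lets us choose the formatting.
tidy = df.apply(lambda x: "*".join(fmt_cell(v) for v in x.dropna()), axis=1)
```

If the trailing ".0" is acceptable for your data, the simpler astype(str) version is enough.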

Another update

In [81]:

df = pd.DataFrame({'col1' : ["a","b",3,'ñ'],
            'col2'  : ["ab",np.nan, 4,'ü'], 'col3' : ["w","e", 6,'á']})
df
Out[81]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
3    ñ    ü    á

In [82]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)

df
Out[82]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
3    ñ    ü    á   ñ*ü*á

The same code also works with Spanish characters.
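
An alternative sketch, not taken from the answer above, builds the joined strings in plain Python instead of calling apply on every row. It assumes the same sample data; whether it is faster depends on the frame and is not measured here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b", 3, "ñ"],
                   "col2": ["ab", np.nan, 4, "ü"],
                   "col3": ["w", "e", 6, "á"]})

# Build each joined string in plain Python, skipping missing values
# and converting every remaining cell to str.
df["new_col"] = [
    "*".join(str(v) for v in row if pd.notna(v))
    for row in df.itertuples(index=False)
]
```

Because new_col is assigned only after the list comprehension has run, the new column itself is not included in the joined values.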

137

answered Oct 10 '22 04:10

EdChum

You can use dropna() to discard the missing values in each row before joining:

df['col4'] = df.apply(lambda row: '*'.join(row.dropna()), axis=1)

UPDATE:

Since you also need to handle numbers and special characters, you can use astype(unicode) on Python 2. On Python 3, where the unicode type no longer exists, use astype(str) instead.

In [37]: df = pd.DataFrame({'col1': ["a", "b"], 'col2': ["ab", np.nan], "col3": [3, u'\xf3']})

In [38]: df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)
Out[38]: 
0    a*ab*3
1       b*ó
dtype: object

In [39]: df['col4'] = df.apply(lambda row: '*'.join(row.dropna().astype(unicode)), axis=1)

In [40]: df
Out[40]: 
  col1 col2 col3    col4
0    a   ab    3  a*ab*3
1    b  NaN    ó     b*ó
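
For readers on Python 3, a minimal sketch of the same session follows. It assumes only that unicode is replaced by str, which is already a Unicode type on Python 3; the rest mirrors the transcript above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b"],
                   "col2": ["ab", np.nan],
                   "col3": [3, "\xf3"]})  # "\xf3" is the character "ó"

# astype(str) converts the integer and leaves the accented string intact.
df["col4"] = df.apply(lambda row: "*".join(row.dropna().astype(str)), axis=1)
```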

answered Oct 10 '22 02:10

Anish Shah
