I have columns in a dataframe (imported from a CSV) containing text like this.
"New york", "Atlanta", "Mumbai"
"Beijing", "Paris", "Budapest"
"Brussels", "Oslo", "Singapore"
I want to collapse/merge all the columns into one single column, like this
New york Atlanta
Beijing Paris Budapest
Brussels Oslo Singapore
How to do it in pandas?
A faster (but uglier) version is with .cat
:
df[0].str.cat(df.ix[:, 1:].T.values, sep=' ')
0 New york Atlanta Mumbai
1 Beijing Paris Budapest
2 Brussels Oslo Singapore
Name: 0, dtype: object
On a larger (10kx5) DataFrame:
%timeit df.apply(" ".join, axis=1)
10 loops, best of 3: 112 ms per loop
%timeit df[0].str.cat(df.ix[:, 1:].T.values, sep=' ')
100 loops, best of 3: 4.48 ms per loop
Suppose you have a DataFrame
like so:
>>> df
0 1 2
0 New york Atlanta Mumbai
1 Beijing Paris Budapest
2 Brussels Oslo Singapore
Then, a simple use of the pd.DataFrame.apply
method will work nicely:
>>> df.apply(" ".join, axis=1)
0 New york Atlanta Mumbai
1 Beijing Paris Budapest
2 Brussels Oslo Singapore
dtype: object
Note, I have to pass axis=1
so that it is applied across the columns, rather than down the rows. I.e:
>>> df.apply(" ".join, axis=0)
0 New york Beijing Brussels
1 Atlanta Paris Oslo
2 Mumbai Budapest Singapore
dtype: object
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With