I have the following pandas DataFrame:
import pandas as pd
df = pd.DataFrame({0: ["a", "b", "c", "d"], 1: ["e", "f", "g", None], 2: ["h", None, None, None]})
   0     1     2
0  a     e     h
1  b     f  None
2  c     g  None
3  d  None  None
I would like to create a new DataFrame with a single column, where each row is the concatenation of that row's strings, using "," as the separator:
0
0 a,e,h
1 b,f
2 c,g
3 d
For a single row I could use
df.iloc[0,:].str.cat(sep=",")
but how can I apply this to the whole DataFrame, ideally without using a for-loop?
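Use the DataFrame.append() method to concatenate DataFrames along rows. For example, df.append(df1) appends df1 to the df DataFrame. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0; pd.concat([df, df1]) is the replacement.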
Concatenating string columns in small datasets: for relatively small datasets (up to 100 to 150 rows) you can use the pandas.Series.str.cat() method, which concatenates the strings in a Series using the specified separator (by default the separator is the empty string '').
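To make the str.cat() behaviour concrete, here is a minimal sketch on a made-up Series (the name `s` and its values are just for illustration). When called without `others`, missing values are skipped unless `na_rep` is given:

```python
import pandas as pd

# Hypothetical example Series with one missing value
s = pd.Series(["a", None, "h"])

# Join all strings in the Series with a comma; the missing value is omitted
joined = s.str.cat(sep=",")

# Without sep, the strings are joined with the empty string
joined_default = s.str.cat()

print(joined)
print(joined_default)
```

This is also why the single-row call in the question produces "b,f" rather than "b,f,None" for the second row.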
merge() for combining data on common columns or indices; join() for combining data on a key column or an index; concat() for combining DataFrames across rows or columns.
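As a rough illustration of the difference between these, here is a small sketch with two made-up frames (the column names `key` and `x` are assumptions for the example, not from the question):

```python
import pandas as pd

# Two hypothetical frames sharing a "key" column
df = pd.DataFrame({"key": [1, 2], "x": ["a", "b"]})
df1 = pd.DataFrame({"key": [2, 3], "x": ["c", "d"]})

# concat stacks the frames on top of each other (row-wise by default)
rows = pd.concat([df, df1], ignore_index=True)

# merge matches rows on the common "key" column (inner join by default)
merged = df.merge(df1, on="key", suffixes=("_left", "_right"))

print(rows)
print(merged)
```

Note that none of these join strings within a row, which is what the question asks for; they combine whole DataFrames.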
Stacking removes nulls by default. Follow up with a groupby on level=0:
df.stack().groupby(level=0).apply(','.join)
0 a,e,h
1 b,f
2 c,g
3 d
dtype: object
To reproduce the OP's output, use to_frame:
df.stack().groupby(level=0).apply(','.join).to_frame(0)
0
0 a,e,h
1 b,f
2 c,g
3 d
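One caveat: the answer above relies on stack() dropping missing values. As far as I understand, newer pandas releases are changing the stack implementation so that existing missing values may be kept, in which case ','.join would fail on them. A sketch that does not depend on that default, by dropping the nulls explicitly:

```python
import pandas as pd

df = pd.DataFrame({0: ["a", "b", "c", "d"],
                   1: ["e", "f", "g", None],
                   2: ["h", None, None, None]})

result = (
    df.stack()              # long format, indexed by (row label, column label)
      .dropna()             # drop missing values explicitly rather than relying on stack()
      .groupby(level=0)     # group by the original row label
      .apply(",".join)      # join the remaining strings of each row
      .to_frame(0)          # single-column DataFrame with column label 0
)

print(result)
```

On versions where stack() already drops nulls, the extra dropna() is a no-op, so it should be safe either way.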
for i, r in df.iterrows():
    print(r.str.cat(sep=","))
Or, as a new DataFrame:
ndf = pd.DataFrame([r.str.cat(sep=",") for i, r in df.iterrows()])
print(ndf)
0
0 a,e,h
1 b,f
2 c,g
3 d
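If you want to avoid writing the loop yourself, the same str.cat call can be passed to DataFrame.apply with axis=1. This is only a sketch; apply still iterates over the rows in Python internally, so it is tidier rather than faster:

```python
import pandas as pd

df = pd.DataFrame({0: ["a", "b", "c", "d"],
                   1: ["e", "f", "g", None],
                   2: ["h", None, None, None]})

# Apply str.cat to each row (axis=1); missing values are skipped by str.cat
ndf = df.apply(lambda row: row.str.cat(sep=","), axis=1).to_frame(0)

print(ndf)
```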