I am trying to concatenate multiple pandas DataFrame columns, with a different token between each pair of columns.
For example, my dataset looks like this:
import pandas as pd

dataframe = pd.DataFrame({'col_1': ['aaa', 'bbb', 'ccc', 'ddd'],
                          'col_2': ['name_aaa', 'name_bbb', 'name_ccc', 'name_ddd'],
                          'col_3': ['job_aaa', 'job_bbb', 'job_ccc', 'job_ddd']})
I want to output something like this:
features
0 aaa <0> name_aaa <1> job_aaa
1 bbb <0> name_bbb <1> job_bbb
2 ccc <0> name_ccc <1> job_ccc
3 ddd <0> name_ddd <1> job_ddd
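Explanation:
Join the columns of each row with "<{}>" as the separator, where {} is a counter that starts at 0 and increases by one for each separator. No token should follow the last column.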
What I've tried so far:
I don't want to modify the original DataFrame, so I created two new DataFrames:
features_df = pd.DataFrame()
final_df = pd.DataFrame()
for iters in range(len(dataframe.columns)):
    features_df[dataframe.columns[iters]] = dataframe[dataframe.columns[iters]] + ' ' + "<{}>".format(iters)
final_df['features'] = features_df[features_df.columns].agg(' '.join, axis=1)
The issue I am facing is that this appends <2> after the last column, but I want the output shown above. This also doesn't feel like the pandas way to do the task. How can I make it more efficient?
You can concatenate two or more text/string columns in a pandas DataFrame simply by using the + operator.
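As a minimal sketch of the + operator approach, assuming a cut-down two-column version of the question's data and a fixed " <0> " token (both chosen here only for illustration):

```python
import pandas as pd

# Cut-down sample data, assumed for illustration only.
dataframe = pd.DataFrame({
    'col_1': ['aaa', 'bbb'],
    'col_2': ['name_aaa', 'name_bbb'],
})

# + on string columns concatenates element-wise; a plain str is
# broadcast to every row.
joined = dataframe['col_1'] + ' <0> ' + dataframe['col_2']
print(joined.tolist())
```

This works as long as every column involved holds strings; other dtypes would need .astype(str) first.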
Concat function concatenates dataframes along rows or columns. We can think of it as stacking up multiple dataframes. Merge combines dataframes based on values in shared columns. Merge function offers more flexibility compared to concat function because it allows combinations based on a condition.
When concatenating datasets vertically, assuming the DataFrames have the same column names in the same order, we can simply use the pandas.concat() function to perform the concatenation.
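To make the difference between concat and merge concrete, here is a small sketch. The frames left and right and their columns key and x are hypothetical and not part of the question's data:

```python
import pandas as pd

# Two hypothetical frames with identical column names and order.
left = pd.DataFrame({'key': ['a', 'b'], 'x': [1, 2]})
right = pd.DataFrame({'key': ['b', 'c'], 'x': [3, 4]})

# concat stacks the rows of both frames; ignore_index renumbers them 0..3.
stacked = pd.concat([left, right], ignore_index=True)

# merge combines rows whose 'key' values match (inner join by default),
# so only key 'b' survives here.
merged = left.merge(right, on='key', suffixes=('_left', '_right'))

print(stacked)
print(merged)
```

Note that neither function joins strings within a row, so they do not directly solve the token problem in the question.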
from itertools import chain
# For each row, interleave every value with its ' <i> ' token, drop the trailing token, then join.
dataframe['features'] = dataframe.apply(lambda x: ''.join([*chain.from_iterable((v, f' <{i}> ') for i, v in enumerate(x))][:-1]), axis=1)
print(dataframe)
Prints:
col_1 col_2 col_3 features
0 aaa name_aaa job_aaa aaa <0> name_aaa <1> job_aaa
1 bbb name_bbb job_bbb bbb <0> name_bbb <1> job_bbb
2 ccc name_ccc job_ccc ccc <0> name_ccc <1> job_ccc
3 ddd name_ddd job_ddd ddd <0> name_ddd <1> job_ddd
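The apply-based approach above calls a Python function once per row and adds the result to the original DataFrame. As an alternative sketch, one could loop over the columns instead and use the vectorized + operator, which leaves the original untouched. The helper name join_with_tokens is hypothetical, and the sketch assumes the DataFrame has at least one column:

```python
import pandas as pd

dataframe = pd.DataFrame({'col_1': ['aaa', 'bbb', 'ccc', 'ddd'],
                          'col_2': ['name_aaa', 'name_bbb', 'name_ccc', 'name_ddd'],
                          'col_3': ['job_aaa', 'job_bbb', 'job_ccc', 'job_ddd']})


def join_with_tokens(df):
    """Join all columns row-wise with ' <i> ' between column i and column i + 1."""
    cols = list(df.columns)
    # Start from the first column; astype(str) guards against non-string dtypes.
    result = df[cols[0]].astype(str)
    # Tokens are only placed between columns, so none trails the last one.
    for i, col in enumerate(cols[1:]):
        result = result + f' <{i}> ' + df[col].astype(str)
    return result.rename('features')


# Build a separate one-column DataFrame; the original is not modified.
final_df = join_with_tokens(dataframe).to_frame()
print(final_df)
```

The loop runs once per column rather than once per row, so it should scale better on tall DataFrames, though that is worth measuring on real data.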