I am trying to concat multiple Pandas DataFrame columns with different tokens.
For example, my dataset looks like this :
dataframe = pd.DataFrame({'col_1' : ['aaa','bbb','ccc','ddd'], 
                          'col_2' : ['name_aaa','name_bbb','name_ccc','name_ddd'], 
                          'col_3' : ['job_aaa','job_bbb','job_ccc','job_ddd']})
I want to output something like this:
    features
0   aaa <0> name_aaa <1> job_aaa
1   bbb <0> name_bbb <1> job_bbb
2   ccc <0> name_ccc <1> job_ccc
3   ddd <0> name_ddd <1> job_ddd
Explanation :
concat each column with "<{}>" where {} will be increasing numbers.
What I've tried so far:
I don't want to modify original DataFrame so I created two new dataframe:
features_df = pd.DataFrame()
final_df    = pd.DataFrame()
for iters in range(len(dataframe.columns)):
    features_df[dataframe.columns[iters]] = dataframe[dataframe.columns[iters]] + ' ' + "<{}>".format(iters)
final_df['features'] = features_df[features_df.columns].agg(' '.join, axis=1)
There is an issue I am facing, It's adding <2> at last but I want output like above, also this is not panda's way to do this task, How I can make it more efficient?
By use + operator simply you can concatenate two or multiple text/string columns in pandas DataFrame.
Concat function concatenates dataframes along rows or columns. We can think of it as stacking up multiple dataframes. Merge combines dataframes based on values in shared columns. Merge function offers more flexibility compared to concat function because it allows combinations based on a condition.
When concatenating datasets vertically, assuming the dataframes have the same column names and the order of the columns is the same, we can simply use the pandas. concat() method to perform the concatenation.
from itertools import chain
dataframe['features'] = dataframe.apply(lambda x: ''.join([*chain.from_iterable((v, f' <{i}> ') for i, v in enumerate(x))][:-1]), axis=1)
print(dataframe)
Prints:
  col_1     col_2    col_3                      features
0   aaa  name_aaa  job_aaa  aaa <0> name_aaa <1> job_aaa
1   bbb  name_bbb  job_bbb  bbb <0> name_bbb <1> job_bbb
2   ccc  name_ccc  job_ccc  ccc <0> name_ccc <1> job_ccc
3   ddd  name_ddd  job_ddd  ddd <0> name_ddd <1> job_ddd
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With