Adding two columns in Python

I am trying to add two columns and create a new one. This new column should become the first column in the dataframe or the output csv file.

column_1 column_2
84       test
65       test

Output should be

column         column_1 column_2
trial_84_test   84      test
trial_65_test   65      test

I tried the methods below, but they did not work:

sum = str(data['column_1']) + data['column_2']

data['column']=data.apply(lambda x:'%s_%s_%s' % ('trial' + data['column_1'] + data['column_2']),axis=1)

Help is surely appreciated.

asked Dec 13 '25 10:12 by New User


2 Answers

Create sample data:

import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

Method 1: Use assign to create new column, and then reorder.

>>> df.assign(column=['trial_{}_{}'.format(*cols) for cols in df.values])[['column'] + df.columns.tolist()]
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 2: Create a new series and then concatenate.

s = pd.Series(['trial_{}_{}'.format(*cols) for cols in df.values], index=df.index, name='column')
>>> pd.concat([s, df], axis=1)
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 3: Insert the new values at the first index of the dataframe (i.e. column 0).

df.insert(0, 'column', ['trial_{}_{}'.format(*cols) for cols in df.values])
>>> df
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 3 (alternative way to create the values for the new column):

df.insert(0, 'column', df.astype(str).apply(lambda row: 'trial_' + '_'.join(row), axis=1))
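One thing to watch with Method 3: insert changes the dataframe in place, so the two variants are alternatives and should not be run one after the other on the same df. A minimal sketch, assuming a fresh copy of the sample data (the name labels is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

# Build the labels row by row after casting every value to str.
labels = df.astype(str).apply(lambda row: 'trial_' + '_'.join(row), axis=1)

# insert() modifies df in place and returns None.
df.insert(0, 'column', labels)
print(df.columns.tolist())

# A second insert with the same column name is rejected by default.
try:
    df.insert(0, 'column', labels)
except ValueError as exc:
    print('second insert failed:', exc)
```

If the column may already exist, dropping it first or assigning with df['column'] = ... and reordering afterwards avoids the error.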

By the way, sum is a built-in function (not a keyword), so using it as a variable name shadows the built-in; pick a different name.
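The first attempt in the question has a second problem besides the variable name: str() applied to a whole Series returns a single string, not one string per row. A small sketch of the difference, reusing the question's name data (whole and per_row are illustrative names):

```python
import pandas as pd

data = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

# str() on a Series returns one string holding the printed Series,
# not a Series of per-row strings.
whole = str(data['column_1'])
print(type(whole))

# .astype(str) converts each element, so concatenation works row by row.
per_row = data['column_1'].astype(str) + '_' + data['column_2']
print(per_row.tolist())
```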

answered Dec 15 '25 23:12 by Alexander


Do not use lambda for this, as it is just a thinly veiled loop. Here is a vectorised solution. Care needs to be taken to convert non-string values to str type.

df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']

# reindex_axis was removed in pandas 1.0; reindex with axis=1 does the same job
df = df.reindex(sorted(df.columns), axis=1)  # sort columns alphabetically

Result:

          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test
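
Sorting alphabetically puts the new column first here only because 'column' happens to sort before 'column_1' and 'column_2'. As a hedged sketch of a variant that does not depend on the names, the order can be listed explicitly, and since the question also mentions a CSV file, the result can be written without the row index. The in-memory buffer is an assumption made to keep the example self-contained; a file path would work the same way.

```python
import io
import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

# Vectorised string concatenation; the integer column is cast to str first.
df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']

# Name the column order explicitly instead of sorting alphabetically.
df = df[['column', 'column_1', 'column_2']]

# Write CSV text without the row index; the buffer stands in for a file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```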

answered Dec 16 '25 00:12 by jpp


