I have a dataframe with several columns that each store lists:
df1.ix[1:3]
DateTime Col1 Col2
2018-01-02 [1, 2] [11, 21]
2018-01-03 [3, 4] [31, 41]
I want to sum the lists in df1 element-wise across the columns to get:
DateTime sumCol
2018-01-02 [12, 23]
2018-01-03 [34, 45]
I tried numpy.sum(df1, axis=1), but that concatenates the lists instead of summing them element-wise.
Edit: My original dataframe has more than 2 columns.
Using a list comprehension and np.array:
df.assign(sumCol=[np.array(x) + np.array(y) for x, y in zip(df.Col1, df.Col2)])
DateTime Col1 Col2 sumCol
0 2018-01-02 [1, 2] [11, 21] [12, 23]
1 2018-01-03 [3, 4] [31, 41] [34, 45]
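As a self-contained sketch of the same idea (the DataFrame is rebuilt here from the values in the question, and the results are converted back to plain lists with tolist(), which is my own choice rather than part of the original snippet), the difference between adding lists and adding arrays looks like this:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; DateTime is kept as plain strings here.
df = pd.DataFrame({
    "DateTime": ["2018-01-02", "2018-01-03"],
    "Col1": [[1, 2], [3, 4]],
    "Col2": [[11, 21], [31, 41]],
})

# "+" on two Python lists concatenates them.
concatenated = df.Col1[0] + df.Col2[0]

# "+" on two NumPy arrays adds them element-wise.
# tolist() converts each result back to a plain list (an optional choice).
result = df.assign(
    sumCol=[(np.array(x) + np.array(y)).tolist()
            for x, y in zip(df.Col1, df.Col2)]
)
print(result)
```

This is why a plain sum over the list columns concatenates: the lists themselves define + as concatenation, so they have to be turned into arrays first.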
If the arrays are always the same length:
df.assign(sumCol=[np.stack([x,y]).sum(0) for x, y in zip(df.Col1, df.Col2)])
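A small sketch of what np.stack does here, and why the equal-length condition matters. The variable names are only for illustration:

```python
import numpy as np

x, y = [1, 2], [11, 21]

# np.stack places the lists as rows of a 2-D array, shape (2, 2).
stacked = np.stack([x, y])

# Summing along axis 0 adds the rows position by position.
total = stacked.sum(0)

# Lists of different lengths cannot be stacked; NumPy raises ValueError.
try:
    np.stack([[1, 2], [3, 4, 5]])
    ragged_ok = True
except ValueError:
    ragged_ok = False
```

With equal-length lists, total holds the element-wise sum. With ragged lists the stack call fails, so the data would need padding or another approach first.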
To apply this to many columns, select them with iloc and zip the transposed values so that each tuple holds the lists from one row:
zip(*df.iloc[:, 1:].values.T)
Here is an example on a wider DataFrame:
A B C D
0 1 [1, 2] [1, 2] [1, 2]
1 2 [3, 4] [3, 4] [3, 4]
2 3 [5, 6] [5, 6] [5, 6]
Using zip with df.values to sum every list column:
df.assign(sumCol=[np.stack(a).sum(0) for a in zip(*df.iloc[:, 1:].values.T)])
A B C D sumCol
0 1 [1, 2] [1, 2] [1, 2] [3, 6]
1 2 [3, 4] [3, 4] [3, 4] [9, 12]
2 3 [5, 6] [5, 6] [5, 6] [15, 18]
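If you would rather pick the list columns by name than by position, the same technique can be wrapped in a small helper. This is only a sketch: sum_list_columns and list_cols are hypothetical names I made up, and it assumes every list in a row has the same length.

```python
import numpy as np
import pandas as pd

# Wider example rebuilt from the values shown above.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [[1, 2], [3, 4], [5, 6]],
    "C": [[1, 2], [3, 4], [5, 6]],
    "D": [[1, 2], [3, 4], [5, 6]],
})

def sum_list_columns(frame, columns):
    """Element-wise sum of the lists in `columns`, one result per row.

    Hypothetical helper; assumes all lists in a row share one length.
    """
    # zip over the selected columns yields one tuple of lists per row.
    rows = zip(*(frame[c] for c in columns))
    # Stack each row's lists into a 2-D array and sum down axis 0.
    return [np.stack(row).sum(0).tolist() for row in rows]

list_cols = ["B", "C", "D"]  # assumed: the columns that hold lists
result = df.assign(sumCol=sum_list_columns(df, list_cols))
print(result)
```

Selecting by name avoids accidentally including a non-list column such as A, which positional slicing with iloc would pick up if the column order changed.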