
Summing multiple lists stored in dataframe

I have a dataframe with lists stored in its columns:

df1.iloc[1:3]
DateTime      Col1     Col2 
2018-01-02    [1, 2]   [11, 21]
2018-01-03    [3, 4]   [31, 41]

I want to sum the lists in df1 element-wise, row by row, to get:

DateTime      sumCol 
2018-01-02    [12, 23]
2018-01-03    [34, 45]

I tried numpy.sum(df1, axis=1), but that concatenates the lists instead of summing them element-wise.

Edit: My original dataframe has more than 2 columns.

Zanam asked Feb 18 '26 15:02


1 Answer

Using a list comprehension and np.array:

df.assign(sumCol=[np.array(x) + np.array(y) for x, y in zip(df.Col1, df.Col2)])

     DateTime    Col1      Col2    sumCol
0  2018-01-02  [1, 2]  [11, 21]  [12, 23]
1  2018-01-03  [3, 4]  [31, 41]  [34, 45]
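
To run this end to end, here is a minimal sketch that rebuilds the question's frame by hand and applies the same comprehension. The way `df` is constructed is an assumption about how the data looks; only the cell values come from the question. The `.tolist()` call is an optional extra so the new cells stay plain lists rather than NumPy arrays.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame({
    "DateTime": ["2018-01-02", "2018-01-03"],
    "Col1": [[1, 2], [3, 4]],
    "Col2": [[11, 21], [31, 41]],
})

# Element-wise sum of the two list columns, one row at a time.
sum_col = [(np.array(x) + np.array(y)).tolist() for x, y in zip(df.Col1, df.Col2)]
result = df.assign(sumCol=sum_col)
print(result)
```

Plain Python lists concatenate under `+`, which is why the lists are converted to arrays before adding.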

If the lists are always the same length, you can also stack them and sum along axis 0:

df.assign(sumCol=[np.stack([x,y]).sum(0) for x, y in zip(df.Col1, df.Col2)])

To apply this to many columns, you can select them with iloc and zip the transposed values so that each tuple holds one row:

zip(*df.iloc[:, 1:].values.T)
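
It may help to see what that expression yields before summing. The sketch below uses a small made-up frame (the column names and values are assumptions for illustration) and only materializes the zip into a list.

```python
import pandas as pd

# Illustrative frame: one scalar column followed by two list columns.
df = pd.DataFrame({
    "A": [1, 2],
    "B": [[1, 2], [3, 4]],
    "C": [[10, 20], [30, 40]],
})

# .values is a 2-D object array of shape (rows, list columns). Transposing it
# and unpacking into zip regroups the cells so each tuple holds one row.
rows = list(zip(*df.iloc[:, 1:].values.T))
print(rows)  # [([1, 2], [10, 20]), ([3, 4], [30, 40])]
```

Each tuple can then be passed to np.stack and summed along axis 0.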

Here is an example on a wider DataFrame:

   A       B       C       D
0  1  [1, 2]  [1, 2]  [1, 2]
1  2  [3, 4]  [3, 4]  [3, 4]
2  3  [5, 6]  [5, 6]  [5, 6]

Using zip with df.values:

df.assign(sumCol=[np.stack(a).sum(0) for a in zip(*df.iloc[:, 1:].values.T)])

   A       B       C       D    sumCol
0  1  [1, 2]  [1, 2]  [1, 2]    [3, 6]
1  2  [3, 4]  [3, 4]  [3, 4]   [9, 12]
2  3  [5, 6]  [5, 6]  [5, 6]  [15, 18]
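
If selecting columns by position is fragile in your data, the same idea can be wrapped in a small function that takes column names. This is only a sketch: `sum_list_columns` is a hypothetical helper name, not a pandas or NumPy function, and it assumes every list in a row has the same length.

```python
import numpy as np
import pandas as pd

def sum_list_columns(frame, columns):
    """Hypothetical helper: element-wise sum of the named list columns, per row."""
    # Iterating over the 2-D object array yields one row of lists at a time.
    return [np.stack(list(row)).sum(0).tolist() for row in frame[columns].values]

# Same wide frame as in the example above.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [[1, 2], [3, 4], [5, 6]],
    "C": [[1, 2], [3, 4], [5, 6]],
    "D": [[1, 2], [3, 4], [5, 6]],
})

result = df.assign(sumCol=sum_list_columns(df, ["B", "C", "D"]))
print(result)
```

Naming the columns explicitly avoids accidentally including a non-list column such as A in the sum.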

user3483203 answered Feb 21 '26 06:02