Row-wise sum of all columns except one with pandas

I have several tables in a PostgreSQL database that look more or less like this:

gid      col2       col1        col3
6        15         45          77
1        15         45          57
2        14         0.2         42
3        12         6           37
4        9          85          27
5        5          1           15

The values and the column names differ from table to table (I created the tables in a loop in Python).

For each table, I would like to add another column called sum that holds, for every row, the sum of all columns except gid. The goal is something like this:

gid      col2       col1        col3      sum
6        15         45          77        137
1        15         45          57        117
2        14         0.2         42        56.2
3        12         6           37        55
4        9          85          27        121
5        5          1           15        21

I cannot refer to the columns by name: the only name that stays the same is gid.

Any ideas on how to do this with Python (pandas, numpy) or psql?

Glori P. asked May 16 '17 13:05



1 Answer

Use drop to remove gid, then sum across each row with sum(axis=1):

df['sum'] = df.drop('gid', axis=1).sum(axis=1)
print(df)
   gid  col2  col1  col3    sum
0    6    15  45.0    77  137.0
1    1    15  45.0    57  117.0
2    2    14   0.2    42   56.2
3    3    12   6.0    37   55.0
4    4     9  85.0    27  121.0
5    5     5   1.0    15   21.0
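
To reproduce the output above without a database connection, here is a minimal sketch that rebuilds the sample table from the question by hand. Typing the values into a DataFrame is an assumption made for illustration; in practice the frame would come from PostgreSQL.

```python
import pandas as pd

# Sample data copied from the question (entered by hand for illustration).
df = pd.DataFrame({
    "gid":  [6, 1, 2, 3, 4, 5],
    "col2": [15, 15, 14, 12, 9, 5],
    "col1": [45, 45, 0.2, 6, 85, 1],
    "col3": [77, 57, 42, 37, 27, 15],
})

# Drop the only column whose name is known, then sum what is left row by row.
df["sum"] = df.drop(columns="gid").sum(axis=1)
print(df)
```

Only the name gid appears in the code, so the same line works whatever the other columns are called.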

If gid is always the first column, use iloc to select every column except the first and sum those:

df['sum'] = df.iloc[:, 1:].sum(axis=1)
print(df)
   gid  col2  col1  col3    sum
0    6    15  45.0    77  137.0
1    1    15  45.0    57  117.0
2    2    14   0.2    42   56.2
3    3    12   6.0    37   55.0
4    4     9  85.0    27  121.0
5    5     5   1.0    15   21.0
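
Both one-liners above sum every remaining column, so running them a second time on the same frame would add the earlier sum column into the new total. Since the question builds many tables in a loop, a small helper may be convenient. The sketch below is one possible way to do it: the names add_row_sum and tables and the two toy frames are hypothetical, not part of the question.

```python
import pandas as pd

def add_row_sum(df, key="gid", out="sum"):
    """Return a copy of df with a row-wise sum of all numeric columns except key."""
    # Leave out the key and any earlier result column so a re-run does not double count.
    values = df.drop(columns=[key, out], errors="ignore")
    result = df.copy()
    result[out] = values.sum(axis=1, numeric_only=True)
    return result

# Two hypothetical tables with different column names, standing in for the real ones.
tables = {
    "t1": pd.DataFrame({"gid": [1, 2], "a": [1.0, 2.0], "b": [10, 20]}),
    "t2": pd.DataFrame({"gid": [1, 2], "x": [5, 6], "y": [7, 8], "z": [9, 10]}),
}

# Apply the helper to every table without naming any column other than gid.
tables = {name: add_row_sum(t) for name, t in tables.items()}
```

Returning a copy keeps the original frames untouched, and numeric_only=True skips text columns if a table happens to contain any.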

jezrael answered Oct 03 '22 14:10