
How to sum up a column in numpy

I have a DataFrame that I convert into an array (this is a test scenario, because I have problems with the results in pandas). Now I want to sum up one column.

I have the following code:

import sys
import pandas as pd
import numpy as np
import os
from tkinter import *


#data_rbu = np.genfromtxt('tmp_fakt_daten.csv', delimiter=',', dtype=None)
data_rbu = pd.read_excel('tmp_fakt_daten.xlsx')
array_rbu = data_rbu.as_matrix()
print(array_rbu)
summe1 = np.sum(array_rbu, axis=9, dtype=float)
print(summe1)

In this array I want to sum up the columns KW_WERT and NETTO_EURO.

After executing the code I get this error:

Traceback (most recent call last):
  File "C:\Users\----------\[INPROGRESS] Faktura_sylvia\csv_einlesen bzgl. float\test2.py", line 12, in <module>
    summe1 = np.sum(array_rbu, axis=9, dtype=float)
  File "C:\Users\---------\Winpython\python-3.4.3\lib\site-packages\numpy\core\fromnumeric.py", line 1724, in sum
    out=out, keepdims=keepdims)
  File "C:\Users\----------\Winpython\python-3.4.3\lib\site-packages\numpy\core\_methods.py", line 32, in _sum
    return umr_sum(a, axis, dtype, out, keepdims)
ValueError: 'axis' entry is out of bounds

I understand that the problem is the axis number, but I don't know exactly what I'm doing wrong. I checked the documentation for numpy.sum...

Hope you can help me!

Damian

Damian asked Jan 20 '26 19:01


2 Answers

As you said, the values are in an array:

In[10]:arr
Out[10]: 
array([['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
        1001380363.0, 'B60ETS', 0.15, 18.9, 'SDH'],
       ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
        1001380363.0, 'B60ETS', 0.145, 18.27, 'SDH'],
       ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
        1001380363.0, 'B60ETS', 0.145, 18.27, 'SDH'],
       ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
        1001380363.0, 'B60ETS', 0.15, 18.9, 'SDH'],
       ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
        1001380363.0, 'B60ETS', 0.15, 18.9, 'SDH'],
       ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
        1001380363.0, 'B60ETS', 0.145, 18.27, 'SDH'],
       ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
        1001380363.0, 'B60ETS', 0.15, 18.9, 'SDH'],
       ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'E',
        1001380594.0, 'B60ETS', 3.011, 252.92, 'DSLAM/MSAN']], dtype=object)

You can sum it using arr.sum:

sum_arr=arr.sum(axis=0)

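With axis=0 the sum runs down each column, giving one total per column, which you can then access by the column's index. (Your array is 2-D, so only axis 0 and axis 1 exist; that is why axis=9 is out of bounds.) In your case, for the columns KW_WERT and NETTO_EURO, you get the sums as: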

In[25]:sum_arr[7]
Out[25]: 4.046

In[26]:sum_arr[8]
Out[26]: 383.33
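
A possible variation, offered only as a sketch: because the array has dtype=object, summing over axis=0 also "adds" (concatenates) the string columns. Selecting just the numeric columns and casting them to float first avoids that. The rows below are a made-up subset in the same layout as the array above, and the positions 7 and 8 are assumed to correspond to KW_WERT and NETTO_EURO.

```python
import numpy as np

# Made-up sample rows in the same layout as the array above (not the real file).
arr = np.array([
    ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
     1001380363.0, 'B60ETS', 0.15, 18.9, 'SDH'],
    ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'B',
     1001380363.0, 'B60ETS', 0.145, 18.27, 'SDH'],
    ['ZPAF', '2015-12-10', '2015-12-31', 'T-HOME ICP', 'E',
     1001380594.0, 'B60ETS', 3.011, 252.92, 'DSLAM/MSAN'],
], dtype=object)

# A 2-D array only has axis 0 (down the rows) and axis 1 (across the columns).
print(arr.ndim, arr.shape)

# Assumed positions: column 7 is KW_WERT, column 8 is NETTO_EURO.
numeric = arr[:, [7, 8]].astype(float)

# Sum down the rows: one total per selected column.
col_sums = numeric.sum(axis=0)
kw_wert_sum, netto_euro_sum = col_sums
print(kw_wert_sum, netto_euro_sum)

# Asking for an axis that does not exist reproduces the error from the question.
try:
    numeric.sum(axis=9)
    axis_error_raised = False
except ValueError as exc:
    axis_error_raised = True
    print("axis error:", exc)
```

Casting to float also means a stray non-numeric value in those columns fails loudly instead of being silently concatenated.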
shivsn answered Jan 23 '26 08:01


Do it directly in pandas:

data_rbu = pd.read_excel('tmp_fakt_daten.xlsx')
summe1 = data_rbu['KW_WERT'] + data_rbu['NETTO_EURO'] # gets you a series
summe1.sum() # gets you the total sum (if that's what you are after)
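
If a separate total per column is wanted rather than one combined figure, something along these lines might work. This is a sketch only: the DataFrame is built from made-up values because the Excel file is not available, the TYP column is hypothetical, and only the names KW_WERT and NETTO_EURO come from the question.

```python
import pandas as pd

# Made-up stand-in for pd.read_excel('tmp_fakt_daten.xlsx').
data_rbu = pd.DataFrame({
    'KW_WERT': [0.15, 0.145, 3.011],
    'NETTO_EURO': [18.9, 18.27, 252.92],
    'TYP': ['SDH', 'SDH', 'DSLAM/MSAN'],  # hypothetical non-numeric column
})

# Select the two numeric columns and sum each one down its rows.
totals = data_rbu[['KW_WERT', 'NETTO_EURO']].sum()
print(totals['KW_WERT'])
print(totals['NETTO_EURO'])

# Adding the two per-column totals gives the combined total.
combined = totals.sum()
print(combined)
```

Selecting the columns by name avoids depending on their position, which can change if the spreadsheet layout changes.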
Julien answered Jan 23 '26 08:01


