
graphlab SFrame sum all values in a column

How do I sum all the values in a column of a GraphLab SFrame? I looked in the official documentation, but sum() is documented only for SArray (doc), without any example.

Prashant Bhanarkar asked Sep 01 '16 17:09



2 Answers

>>> import graphlab as gl
>>> sf = gl.SFrame({'foo':[1,2,3], 'bar':[4,5,6]})
>>> sf
Columns:
        bar     int
        foo     int

Rows: 3

Data:
+-----+-----+
| bar | foo |
+-----+-----+
|  4  |  1  |
|  5  |  2  |
|  6  |  3  |
+-----+-----+
[3 rows x 2 columns]
>>> sf['foo'].sum()
6
Adrien Renaud answered Nov 14 '22 17:11



I think the OP's question was more about how to do this across all columns (or a list of columns) at once. Here is a comparison between pandas and graphlab.

# imports
import graphlab as gl    
import pandas as pd
import numpy as np

# generate data
data = np.random.randint(0,10,size=100).reshape(10,10)
col_names = list('ABCDEFGHIJ')

# make dataframe and sframe
df = pd.DataFrame(data, columns=col_names)
sf = gl.SFrame(df)

# get sum for all columns (pandas).  Returns a series.
df.sum().sort_values(ascending=False)

D    65
A    61
J    59
B    50
H    46
G    46
I    45
F    43
C    37
E    36

# sf.sum() does not work: sum() is available on a column (SArray), not on the SFrame
# get sum for each of the columns (graphlab)
for col in col_names:
    print("%s %s" % (col, sf[col].sum()))

A 61
B 50
C 37
D 65
E 36
F 43
G 46
H 46
I 45
J 59

I had the same question. Pandas provides an easy interface for applying an aggregating function across the rows or columns of a dataframe, but I could not find an equivalent for an SFrame. The only way I could think of was to iterate over a list of columns.

Is there a better way?
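
For what it's worth, the loop above can be wrapped in a small helper that returns all the sums at once. This is only a sketch, not a built-in SFrame feature: column_sums is a hypothetical name, and it assumes nothing beyond frame[col].sum() working for each column, which is the call already used in the loop above.

```python
def column_sums(frame, columns):
    """Return a dict mapping each column name to the sum of that column.

    column_sums is a hypothetical helper, not part of graphlab or pandas.
    It only assumes that frame[col].sum() works, as it does for the
    SFrame columns and pandas columns used above.
    """
    return {col: frame[col].sum() for col in columns}


def sorted_sums(sums):
    """Return (column, sum) pairs ordered from largest sum to smallest."""
    return sorted(sums.items(), key=lambda pair: pair[1], reverse=True)


# Usage with the sf and col_names defined above (requires graphlab):
# sums = column_sums(sf, col_names)
# for col, total in sorted_sums(sums):
#     print("%s %s" % (col, total))
```

It still iterates over the columns under the hood, so it is a convenience rather than a real answer to the question, but it gives roughly the same result as df.sum().sort_values(ascending=False) in pandas.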

Randall Goodwin answered Nov 14 '22 16:11
