Filtering and displaying values in a GraphLab SFrame?

Question

I started working with GraphLab for my machine learning class a week ago. I am still very new to GraphLab, and I read through the API documentation but couldn't find the solution I was looking for. I have a dataset with multiple columns, e.g. bedrooms, bathrooms, square feet, zipcode, etc. These are the features, and my goal is to use various ML algorithms to predict the price of a house. Right now, I am supposed to find the average price of the houses with zipcode 93038. I broke the problem down into smaller pieces. First, I tried to create a filter that extracts only the prices of the houses with zipcode 93038. This is what I have tried so far:

import graphlab
sf = graphlab.SFrame('home_data.gl')
sf[(sf['zipcode']=='93038')]

This showed me all the columns for the rows with zipcode 93038, but I only want to display the price and zipcode columns for those rows. I tried many different ways but couldn't figure it out.

Also, let's say I want to find the mean of the prices for zipcode 93038. How do I do that?

Thanks in advance.

Adrien Renaud · Accepted Answer

You could try:

import graphlab as gl
sf = gl.SFrame({'price':[1,4,2],'zipcode':['93038','93038','93037']})

# Filtering: keep only the rows whose zipcode is '93038'
filter_sf = sf[sf['zipcode'] == '93038']

# Displaying: select just the price and zipcode columns
print(filter_sf[['price', 'zipcode']])

# Averaging a column
print(filter_sf['price'].mean())
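To make explicit what the filter above does: comparing a column to a value yields a mask with one entry per row, and indexing with that mask keeps only the rows where it is true. The following is a plain-Python sketch of that logic on the same toy data. It is not GraphLab code, and the helper names `filter_rows`, `select_columns`, and `column_mean` are made up for illustration.

```python
# Plain-Python stand-in for the SFrame steps above (no GraphLab needed).
# Columns are stored as equal-length lists, keyed by column name.
data = {'price': [1, 4, 2], 'zipcode': ['93038', '93038', '93037']}


def filter_rows(columns, mask):
    """Keep only the rows whose mask entry is true (hypothetical helper)."""
    return {name: [v for v, keep in zip(values, mask) if keep]
            for name, values in columns.items()}


def select_columns(columns, names):
    """Return only the named columns (hypothetical helper)."""
    return {name: columns[name] for name in names}


def column_mean(values):
    """Arithmetic mean of a non-empty list of numbers (hypothetical helper)."""
    return sum(values) / float(len(values))


# One boolean per row, analogous to sf['zipcode'] == '93038'
mask = [z == '93038' for z in data['zipcode']]

filtered = filter_rows(data, mask)
shown = select_columns(filtered, ['price', 'zipcode'])
mean_price = column_mean(filtered['price'])

print(shown)
print(mean_price)
```

Note that the zipcode values in this example are strings, so the comparison is against '93038' and not the integer 93038. If the column held integers, the string comparison would match no rows.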

naman1994 · Answer

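Use a groupby operation to compute the mean price per zipcode, then topk() to get the zipcode with the highest mean (sf is the SFrame from the question):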

import graphlab.aggregate as agg
sf_ = sf.groupby(key_columns = 'zipcode', operations={'Mean by ZipCode' : agg.MEAN('price')})
sf_.topk('Mean by ZipCode', k=1)
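As a rough illustration of what the grouped approach computes, here is a plain-Python sketch on the toy data from the accepted answer. It is not GraphLab code, and `mean_by_key` is a hypothetical helper. It also shows that taking the top entry returns whichever zipcode has the highest mean, which is not necessarily 93038, so a specific zipcode is looked up by its key instead.

```python
from collections import defaultdict

# Toy data borrowed from the accepted answer
prices = [1, 4, 2]
zipcodes = ['93038', '93038', '93037']


def mean_by_key(keys, values):
    """Group values by key and return the mean of each group (hypothetical helper)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


means = mean_by_key(zipcodes, prices)

# Mean for one specific zipcode: look it up by key
mean_93038 = means['93038']

# Rough analogue of taking the top row by mean: the zipcode with the highest mean
top_zip = max(means, key=means.get)

print(mean_93038)
print("%s %s" % (top_zip, means[top_zip]))
```

On this toy data the two happen to coincide, because 93038 also has the highest mean. On a real dataset they generally will not, which is why filtering on the zipcode is the more direct route to the number asked for.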

Tags: machine-learning, graphlab, sframe

Lesley
