How does statistical calculation of "similar products/music/..." from customer buying/listening behaviour work?

Question

I mean product suggestions on Amazon or, more specifically, similar-band recommendations on Last.fm.

Given that you can store the complete listening/buying behaviour of your users (WHO listened to WHAT, and how OFTEN?), how do you calculate which bands are similar to any given band, and how similar?

I've found some pages on Wikipedia (Association rule learning, Affinity analysis), but I'd like to get some information from a programmer's point of view, preferably with some pseudocode or Python code.

Given that I have

    dic = {
        "Alice"   : { "AC/DC" : 2, "The Raconteurs" : 3, "Mogwai" : 1 },
        "Bob"     : { "The XX" : 4, "Lady Gaga" : 3, "Mogwai" : 1, "The Raconteurs" : 1 },
        "Charlie" : { "AC/DC" : 7, "Lady Gaga" : 7 }
    }

where the numbers are play counts, how would I iterate over this to find the similarity of the bands?

nikow · Accepted Answer

The book "Programming Collective Intelligence: Building Smart Web 2.0 Applications" is a classic and uses Python. Among other things it also deals with recommendation engines.
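To make this concrete for the data in the question, here is a minimal sketch of one common item-based approach: treat each band as a sparse vector of play counts per user and compare bands by cosine similarity. This is not taken from the book; the names `plays`, `invert`, `cosine` and `band_similarities` are made up for illustration, and cosine is only one of several reasonable measures (Pearson correlation or Jaccard overlap are others).

```python
from itertools import combinations
from math import sqrt

# Same data as in the question (renamed from `dic`; play counts per user).
plays = {
    "Alice":   {"AC/DC": 2, "The Raconteurs": 3, "Mogwai": 1},
    "Bob":     {"The XX": 4, "Lady Gaga": 3, "Mogwai": 1, "The Raconteurs": 1},
    "Charlie": {"AC/DC": 7, "Lady Gaga": 7},
}


def invert(user_plays):
    """Turn {user: {band: count}} into {band: {user: count}}."""
    by_band = {}
    for user, bands in user_plays.items():
        for band, count in bands.items():
            by_band.setdefault(band, {})[user] = count
    return by_band


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    # Only users who played both bands contribute to the dot product.
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def band_similarities(user_plays):
    """Return {(band_x, band_y): similarity} for every unordered pair."""
    by_band = invert(user_plays)
    return {
        (x, y): cosine(by_band[x], by_band[y])
        for x, y in combinations(sorted(by_band), 2)
    }


sims = band_similarities(plays)
for (x, y), score in sorted(sims.items(), key=lambda kv: -kv[1]):
    print(f"{x} ~ {y}: {score:.3f}")
```

With only three users the scores are not meaningful, and raw play counts let heavy listeners dominate; on real data you would probably normalise counts per user (or take logarithms) before comparing.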

ars · Answer

You might find the Association Rules widget (among others) in Orange helpful in getting started. Another useful package, available with source, is pysuggest, which implements a number of recsys/collaborative filtering algorithms.
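For a feel of what association rules measure, here is a rough plain-Python sketch that does not use Orange or pysuggest and does not reflect their APIs. It ignores play counts, treats each user's set of bands as a "basket", and computes support and confidence for every pairwise rule "listens to A, so also listens to B". The function name `pair_rules` and the variable `plays` are hypothetical.

```python
from itertools import permutations

# Same data as in the question; only the presence of a band is used here.
plays = {
    "Alice":   {"AC/DC": 2, "The Raconteurs": 3, "Mogwai": 1},
    "Bob":     {"The XX": 4, "Lady Gaga": 3, "Mogwai": 1, "The Raconteurs": 1},
    "Charlie": {"AC/DC": 7, "Lady Gaga": 7},
}


def pair_rules(user_plays):
    """Return {(a, b): (support, confidence)} for each rule a -> b.

    support    = share of users who listened to both a and b
    confidence = share of a's listeners who also listened to b
    """
    n_users = len(user_plays)
    item_count = {}
    pair_count = {}
    for bands in user_plays.values():
        basket = set(bands)
        for band in basket:
            item_count[band] = item_count.get(band, 0) + 1
        # Ordered pairs, because a -> b and b -> a have different confidence.
        for a, b in permutations(basket, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    return {
        (a, b): (n / n_users, n / item_count[a])
        for (a, b), n in pair_count.items()
    }


rules = pair_rules(plays)
support, confidence = rules[("AC/DC", "Mogwai")]
print(f"AC/DC -> Mogwai: support={support:.2f}, confidence={confidence:.2f}")
```

Real association-rule miners such as Apriori also handle rules with more than one band on the left-hand side and prune by minimum support; this sketch covers only the pairwise case.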

Tags: python, statistics, similarity, data-mining

Felix Dombek
