We're trying to find similarity between items (and later users) where the items are ranked in various lists by users (think Rob, Barry and Dick in High Fidelity). A lower index in a given list implies a higher rating.
I suppose a standard approach would be to use the Pearson correlation and then invert the indexes in some way.
However, as I understand it, the aim of the Pearson correlation is to compensate for differences between users who typically rate things higher or lower but have similar relative ratings.
It seems to me that if the lists are continuous (although of arbitrary length) it's not an issue that the ratings implied from the position will be skewed in this way.
I suppose in this case a Euclidean-based similarity would suffice. Is this the case? Would using the Pearson correlation have a negative effect and find correlation that isn't appropriate? What similarity measure might best suit this data?
Additionally, while we want position in the list to have an effect, we don't want to penalise rankings that are too far apart. Two users who both feature an item in a list, even at very different ranks, should still be considered similar.
Jaccard Similarity looks better in your case. To include the rank you mentioned, you can take a bag-of-items approach.
Using your example of (Rob, Barry, Dick) with their ratings being (3, 2, 1) respectively, you insert Rob 3 times into user a's bag:
Rob, Rob, Rob.
Then for Barry, you do it twice. The bag now looks like this:
Rob, Rob, Rob, Barry, Barry.
Finally, you put Dick into the bag once:
Rob, Rob, Rob, Barry, Barry, Dick
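The bag construction above can be sketched in Python. This is only an illustration: the helper name bag_from_ranking is made up here, and the weighting (an item at 0-based position i in a list of length n is inserted n - i times) is an assumption that happens to reproduce the (3, 2, 1) ratings in the example. Any other decreasing weight would work the same way.

```python
from collections import Counter

def bag_from_ranking(ranked_items):
    """Turn a ranked list (best item first) into a bag (multiset).

    Assumed weighting: the item at 0-based position i is inserted
    len(ranked_items) - i times, so the top item appears most often.
    """
    n = len(ranked_items)
    return Counter({item: n - i for i, item in enumerate(ranked_items)})

# User a ranks Rob first, Barry second, Dick third.
bag_a = bag_from_ranking(["Rob", "Barry", "Dick"])
print(list(bag_a.elements()))
# ['Rob', 'Rob', 'Rob', 'Barry', 'Barry', 'Dick']
```

A Counter is used because it stores each item with its multiplicity, which is what the bag represents.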
Suppose another user b has a bag of [Dick, Dick, Barry]. You calculate the Jaccard similarity as below:
a ∩ b = [Dick, Barry]
a ∪ b = [Rob, Rob, Rob, Barry, Barry, Dick, Dick]
That is, the similarity is the number of items in the intersection divided by the number of items in the union, here 2/7 ≈ 0.29.
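As a minimal sketch of that calculation, assuming the bags are held as Python Counter objects: the & operator keeps the minimum count of each item (bag intersection) and | keeps the maximum count (bag union). The function name multiset_jaccard is just an illustrative choice, and returning 0.0 for two empty bags is a convention picked for this sketch.

```python
from collections import Counter

def multiset_jaccard(bag_a, bag_b):
    """Jaccard similarity on bags: |a ∩ b| / |a ∪ b| with multiplicities."""
    intersection = bag_a & bag_b  # per-item minimum of the two counts
    union = bag_a | bag_b         # per-item maximum of the two counts
    union_size = sum(union.values())
    if union_size == 0:
        return 0.0  # both bags empty; defined as 0 here by convention
    return sum(intersection.values()) / union_size

a = Counter({"Rob": 3, "Barry": 2, "Dick": 1})
b = Counter({"Dick": 2, "Barry": 1})

print(multiset_jaccard(a, b))  # 2 / 7, roughly 0.2857
```

An item that appears in both bags always has a minimum count of at least 1, so it adds to the intersection regardless of how far apart the two ranks are.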
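This similarity measure does NOT heavily penalize rankings that are far apart: any item that appears in both lists still contributes to the intersection, however different its positions. This matches your requirement: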
Two users both featuring an item in a list with very differing ranking should still be considered similar.