Spark MLLib Collaborative Filtering with new user

Question

I'm trying out the Collaborative Filtering algorithm implemented in Spark and am running into the following issue:

Suppose I train a model with the following data:

u1|p1|3
u1|p2|3
u2|p1|2
u2|p2|3

Now if I test it with the following data:

u1|p1|1
u3|p1|2
u3|p2|3

I never see any ratings for the user 'u3', presumably because that user does not appear in the training data. Is this because of the cold start issue? I was under the impression that this issue would apply only to a new product. In this case, I would have expected a prediction for 'u3' since 'u1' and 'u2' in the training data have similar rating information to 'u3'. Is this the distinction between model-based and memory-based collaborative filtering?

stholzm · Accepted Answer

I assume you are talking about the ALS algorithm?

'u3' is not part of your training set, so your model does not know anything about that user. All one could do is maybe return the mean rating over all users.
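To make the mean-rating fallback concrete, here is a minimal plain-Python sketch (not Spark code) that computes the global mean and the per-product mean from the training data in the question. The function name `mean_ratings` is made up for illustration, and whether to fall back to the global or the per-product mean is a design choice the answer leaves open.

```python
from collections import defaultdict

# Training data from the question: (user, product, rating).
training = [
    ("u1", "p1", 3.0),
    ("u1", "p2", 3.0),
    ("u2", "p1", 2.0),
    ("u2", "p2", 3.0),
]

def mean_ratings(ratings):
    """Return (global mean, per-product means) for (user, product, rating) tuples."""
    totals = defaultdict(lambda: [0.0, 0])  # product -> [rating sum, rating count]
    for _user, product, rating in ratings:
        totals[product][0] += rating
        totals[product][1] += 1
    global_mean = sum(r for _, _, r in ratings) / len(ratings)
    product_means = {p: s / n for p, (s, n) in totals.items()}
    return global_mean, product_means

global_mean, product_means = mean_ratings(training)
print(global_mean)    # 2.75
print(product_means)  # {'p1': 2.5, 'p2': 3.0}
```

For the unseen user 'u3', such a baseline would predict 2.5 for 'p1' and 3.0 for 'p2' (or 2.75 for both with the global mean), regardless of anything 'u3' has rated.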

Looking into the Spark 1.3.0 Scala code: the MatrixFactorizationModel returned by ALS.train() tries to look up the user and the product in its feature vectors when you call predict(). I get a NoSuchElementException when I try to predict a rating for an unknown user. It is just implemented that way.
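The lookup behaviour can be illustrated with a toy model in plain Python. This is a sketch of the idea only, not Spark's implementation: the factor values and the names `user_features`, `product_features`, `predict` and `predict_or_default` are hypothetical, and Python raises `KeyError` where the Scala code raises `NoSuchElementException`.

```python
# Hypothetical latent factors, standing in for what a trained model would hold.
# Only users and products seen during training have an entry.
user_features = {"u1": [1.0, 0.5], "u2": [0.8, 0.6]}
product_features = {"p1": [2.0, 1.0], "p2": [2.5, 1.0]}

def predict(user, product):
    """Dot product of the two factor vectors; raises KeyError for unknown ids."""
    u = user_features[user]
    p = product_features[product]
    return sum(a * b for a, b in zip(u, p))

def predict_or_default(user, product, default):
    """Return the prediction, or `default` if either id was not in the training set."""
    if user not in user_features or product not in product_features:
        return default
    return predict(user, product)

print(predict("u1", "p1"))                   # 2.5
print(predict_or_default("u3", "p1", 2.75))  # 2.75, since 'u3' has no factors
```

Guarding the call this way means the caller decides what an unknown user should get, for example a mean rating, instead of hitting the exception.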

Tags:

apache-spark

apache-spark-mllib

collaborative-filtering

Navin Viswanath