
How to augment matrix factors in Spark ALS recommender? [duplicate]

I am new to machine learning and to Apache Spark.
I followed the tutorial at https://databricks-training.s3.amazonaws.com/movie-recommendation-with-mllib.html#augmenting-matrix-factors and successfully built the application. Since today's web applications need to serve real-time recommendations, I would like my model to handle new data as it keeps arriving on the server. The tutorial says:

A better way to get the recommendations for you is training a matrix factorization model first and then augmenting the model using your ratings.

How do I do that? I am using Python to develop my application. Also, how do I persist the model so that I can reuse it, and how might I interface it with a web service? Thank you.

AnishM asked Feb 25 '15 16:02


People also ask

Is ALS matrix factorization?

Yes. Alternating Least Squares (ALS) is a matrix factorization algorithm that is designed to run in parallel. ALS is implemented in Apache Spark ML and built for large-scale collaborative filtering problems.

What is regParam in ALS?

regParam specifies the regularization parameter in ALS (the spark.ml ALS estimator defaults to 0.1). implicitPrefs specifies whether to use the explicit feedback ALS variant or one adapted for implicit feedback data (defaults to false, which means using explicit feedback).

What is rank in ALS?

rank is the number of features to use (also referred to as the number of latent factors). iterations is the number of iterations of ALS to run. ALS typically converges to a reasonable solution in 20 iterations or fewer. lambda specifies the regularization parameter in ALS (the same role that regParam plays in spark.ml).
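
To make rank, iterations, and lambda concrete, here is a minimal pure-Python sketch of ALS on a handful of made-up ratings. It is not Spark's implementation: the function names (`solve`, `update_factors`, `als`, `rmse`) and the toy data are assumptions for illustration only, and Spark's distributed version differs in many details.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_factors(ratings_by_row, fixed, rank, lam):
    """Holding `fixed` constant, solve (F^T F + lam*I) x = F^T r per row."""
    out = {}
    for row_id, observed in ratings_by_row.items():
        a = [[lam if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other_id, rating in observed:
            f = fixed[other_id]
            for i in range(rank):
                b[i] += f[i] * rating
                for j in range(rank):
                    a[i][j] += f[i] * f[j]
        out[row_id] = solve(a, b)
    return out


def als(ratings, rank=2, iterations=10, lam=0.1, seed=0):
    """Alternate between solving for user factors and item factors."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    # Random starting point for the item factors; `rank` values per item.
    items = {i: [rng.random() for _ in range(rank)] for i in by_item}
    users = {}
    for _ in range(iterations):
        users = update_factors(by_user, items, rank, lam)
        items = update_factors(by_item, users, rank, lam)
    return users, items


def rmse(ratings, users, items):
    """Root mean squared error of the factor model on the given ratings."""
    se = 0.0
    for u, i, r in ratings:
        pred = sum(x * y for x, y in zip(users[u], items[i]))
        se += (r - pred) ** 2
    return (se / len(ratings)) ** 0.5


# Toy (user, item, rating) triples, invented for this example.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 5.0), (1, 1, 5.0), (1, 2, 1.0),
    (2, 0, 1.0), (2, 1, 1.0), (2, 2, 5.0),
    (3, 0, 1.0), (3, 2, 4.0),
]
users, items = als(ratings, rank=2, iterations=20, lam=0.1)
print("training RMSE:", rmse(ratings, users, items))
```

Raising rank gives each user and item more latent factors to fit with, while a larger lambda pulls the factors toward zero and trades training fit for less overfitting.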

What is the significance of alternating least squares in collaborative filtering?

Matrix factorization with the Alternating Least Squares (ALS) algorithm, which is a type of collaborative filtering, is used to address overfitting on sparse data and to increase prediction accuracy. The overfitting problem arises because the user-item rating matrix is sparse.


1 Answer

I don't think online learning is possible for ALS in Spark. That means you can't update the model incrementally as new data arrives in real time. However, you can still use the already trained model to get predictions.

Also, refer to: How to update Spark MatrixFactorizationModel for ALS
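
One workaround in the spirit of the "augmenting" idea quoted in the question is to keep the trained item factors fixed and solve a small regularized least-squares problem for the new user's factor vector only (often called fold-in). Below is a minimal pure-Python sketch of that step, not something Spark provides as a ready-made call. The function names (`solve`, `fold_in_user`, `recommend`) are hypothetical and the item factors are hand-written for illustration. With a PySpark `MatrixFactorizationModel` they could presumably be collected from `model.productFeatures()` instead.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fold_in_user(item_factors, new_ratings, lam=0.1):
    """Solve (V^T V + lam*I) x = V^T r over the items the new user rated."""
    rank = len(next(iter(item_factors.values())))
    a = [[lam if i == j else 0.0 for j in range(rank)] for i in range(rank)]
    b = [0.0] * rank
    for item_id, rating in new_ratings:
        f = item_factors[item_id]
        for i in range(rank):
            b[i] += f[i] * rating
            for j in range(rank):
                a[i][j] += f[i] * f[j]
    return solve(a, b)


def recommend(user_vec, item_factors, seen, top_n=2):
    """Rank unseen items by the dot product of user and item factors."""
    scores = {
        item_id: sum(x * y for x, y in zip(user_vec, f))
        for item_id, f in item_factors.items()
        if item_id not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hand-written item factors standing in for a trained model (assumption).
item_factors = {
    "A": [1.0, 0.0],
    "B": [0.0, 1.0],
    "C": [1.0, 1.0],
    "D": [0.9, 0.1],
    "E": [0.0, 0.9],
}
# Ratings from a user who was not in the training data.
new_ratings = [("A", 5.0), ("B", 1.0)]

user_vec = fold_in_user(item_factors, new_ratings, lam=0.1)
top = recommend(user_vec, item_factors, seen={"A", "B"}, top_n=2)
print(user_vec, top)
```

This only estimates the new user's vector. The item factors never change, so the full model would still need to be retrained periodically as new ratings accumulate.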

pissall answered Nov 05 '22 17:11