I have a MatrixFactorizationModel object. If I recommend products to a single user right after constructing the model through ALS.train(...), it takes 300 ms (for my data and hardware). But if I save the model to disk and load it back, the same recommendation takes almost 2000 ms. Spark also warns:
15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor does not have a partitioner. Prediction on individual records could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor is not cached. Prediction could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor does not have a partitioner. Prediction on individual records could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor is not cached. Prediction could be slow.
How can I create or set a partitioner and cache the user and product factors after loading the model? The following approach didn't help:
model.userFeatures().cache();
model.productFeatures().cache();
I also tried repartitioning those RDDs and creating a new model from the repartitioned versions, but that didn't help either.
In Scala you don't need the parentheses: userFeatures is an RDD of (Int, Array[Double]) and takes no parameters.
This should help:
model.userFeatures.cache
model.productFeatures.cache
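One thing worth checking: cache only marks an RDD for persistence, and nothing is stored until an action runs on it. Below is a minimal sketch, not a confirmed fix for the slowdown described above, that partitions and caches both factor RDDs, forces them to materialize with count, and builds a new model from them. The names WarmModel, loadWarm, path and numPartitions are hypothetical, and it assumes a Spark version where MatrixFactorizationModel.load is available.

```scala
import org.apache.spark.{HashPartitioner, SparkContext}
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel

// Hypothetical helper: load a saved model and rebuild it from
// partitioned, cached, and already materialized factor RDDs.
object WarmModel {
  def loadWarm(sc: SparkContext, path: String, numPartitions: Int): MatrixFactorizationModel = {
    val loaded = MatrixFactorizationModel.load(sc, path)

    // Give both factor RDDs an explicit partitioner and mark them for caching.
    val partitioner = new HashPartitioner(numPartitions)
    val users = loaded.userFeatures.partitionBy(partitioner).cache()
    val products = loaded.productFeatures.partitionBy(partitioner).cache()

    // cache() is lazy: run an action so the data is actually held in memory
    // before the first recommendation is requested.
    users.count()
    products.count()

    // Build a new model on top of the prepared RDDs.
    new MatrixFactorizationModel(loaded.rank, users, products)
  }
}
```

Usage would look like `val model = WarmModel.loadWarm(sc, "/path/to/model", 8)` followed by `model.recommendProducts(userId, 10)`. The path and partition count here are placeholders. Timing the second call rather than the first gives a fairer comparison with the freshly trained model.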