How to recommend top 10 products in Spark ALS for all the users?

How can we get the top 10 recommended products for every user in PySpark? I understand there are methods like recommendProducts, which recommends products for a single user, and predictAll, which predicts the rating for each {user, item} pair. But is there an efficient way to output the top 10 items for each user, across all users?

asked Jul 21 '15 16:07


1 Answer

I wrote this function, which multiplies the user features by the product features one partition at a time so that the work is distributed. For each user it then takes the predicted rating of every product, sorts the products by rating, and outputs the 8 top recommended products.

import numpy as np

# Collect the product feature matrix on the driver and broadcast it.
# `model` is the trained ALS model and `sc` the active SparkContext.
productFeatures = model.productFeatures().collect()
productArray = [x[0] for x in productFeatures]
productFeaturesArray = [x[1] for x in productFeatures]
productArrayBroadcast = sc.broadcast(productArray)
# Transposed to (rank x products) so it can be multiplied by user features
productFeaturesArrayBroadcast = sc.broadcast(np.array(productFeaturesArray).T)

def func(iterator):
    # Gather the user ids and feature vectors of this partition
    userArray = []
    userFeaturesArray = []
    for x in iterator:
        userArray.append(x[0])
        userFeaturesArray.append(x[1])
    if not userArray:
        return []  # empty partition, nothing to recommend
    # (users x rank) . (rank x products) = predicted rating of every product
    userRecommendationArray = np.array(userFeaturesArray).dot(productFeaturesArrayBroadcast.value)
    products = productArrayBroadcast.value
    mappedUserRecommendationArray = []
    for i in range(len(userArray)):
        # Take the top 8 recommendations for the user, highest rating first
        top = np.argsort(userRecommendationArray[i])[::-1][:8]
        sort_apps = '|'.join(str(products[j]) for j in top)
        mappedUserRecommendationArray.append((userArray[i], sort_apps))
    return mappedUserRecommendationArray

recommendations = model.userFeatures().repartition(2000).mapPartitions(func)
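
To check the per-partition logic without a cluster, the same multiply-then-rank step can be tried in plain NumPy. This is only a rough sketch: the helper name top_k_per_user and the toy feature values are made up for illustration and are not part of Spark.

```python
import numpy as np

def top_k_per_user(user_ids, user_features, product_ids, product_features, k=8):
    """Return [(user_id, [product_id, ...])] with the k highest-scoring products per user."""
    # (users x rank) . (rank x products) -> predicted rating of every product
    scores = np.asarray(user_features).dot(np.asarray(product_features).T)
    k = min(k, scores.shape[1])
    result = []
    for i, user in enumerate(user_ids):
        # argsort is ascending, so reverse it to put the highest score first
        top = np.argsort(scores[i])[::-1][:k]
        result.append((user, [product_ids[j] for j in top]))
    return result

# Toy data (hypothetical): 2 users, 3 products, rank 2
users = [1, 2]
U = [[1.0, 0.0], [0.0, 1.0]]
products = [10, 20, 30]
P = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
print(top_k_per_user(users, U, products, P, k=2))
# [(1, [10, 30]), (2, [20, 30])]
```

With many products, np.argpartition followed by a sort of just the k selected entries would avoid sorting the full row for each user.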
answered Oct 08 '22 01:10