 

How to apply Spark ALS for Implicit data

I have a dataset of purchase history like this:

+---+-----------+---------+
|usn|    page_id|    click|
+---+-----------+---------+
| 11| 9000001012|       10|
|169| 2010008901|      100|
|169| 9000001007|        4|
|169| 2010788901|        1|
|169| 8750001007|        4|
|169| 9003601012|       10|
|169| 9000001007|        4|
|613| 9000050601|        8|
|613| 9000011875|        3|
|613| 2010010401|        6|
|613| 9000001007|        4|
|613| 2010008801|        1|
|836| 9000050601|       20|
|916| 9000050601|       10|
|916| 9000562601|       30|
|916| 9000001007|        4|
|916| 9000001012|       10|
+---+-----------+---------+

I have read the Spark docs (http://spark.apache.org/docs/latest/ml-collaborative-filtering.html), but I don't know how to use collaborative filtering for implicit preferences on this problem.

Now I want to apply ALS for implicit preferences to this dataset. How do I do it? Can I also treat this dataset as explicit data?

Please help me use it, and give me an example in Python for implicit preferences if you have one.

Phong Nguyen asked May 29 '26 01:05



1 Answer

My answer is a little late, but the main thing is to scale the values of 'click'. In my case, this worked:

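from pyspark.sql import Window
from pyspark.sql import functions as F
from pyspark.sql.types import DecimalType

# Min-max scale "click" per user to roughly 0..10; the small offset keeps scores above zero.
# Caution: a user whose clicks are all equal gives a zero denominator.
ww = Window.partitionBy("usn")
click_min = F.min("click").over(ww)
click_max = F.max("click").over(ww)
scaled_score = (
    0.00001 + 10 * (F.col("click") - click_min) / (click_max - click_min)
).cast(DecimalType(7, 5))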
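
To make the arithmetic concrete, here is a minimal plain-Python sketch of the same per-user min-max formula, applied to a few rows from the question. The function name `scale_clicks` and the choice to return `None` when a user has only one distinct click value are assumptions for illustration, not part of the original answer.

```python
from collections import defaultdict

# A few (usn, page_id, click) rows taken from the question.
rows = [
    (169, 2010008901, 100),
    (169, 9000001007, 4),
    (169, 2010788901, 1),
    (613, 9000050601, 8),
    (613, 9000011875, 3),
    (613, 2010008801, 1),
]


def scale_clicks(rows, offset=0.00001, top=10):
    """Min-max scale clicks within each user; returns (usn, page_id, score)."""
    clicks_by_user = defaultdict(list)
    for usn, _, click in rows:
        clicks_by_user[usn].append(click)

    scaled = []
    for usn, page_id, click in rows:
        lo, hi = min(clicks_by_user[usn]), max(clicks_by_user[usn])
        if hi == lo:
            # Assumption: the formula is undefined here, so mark the score as missing.
            score = None
        else:
            score = round(offset + top * (click - lo) / (hi - lo), 5)
        scaled.append((usn, page_id, score))
    return scaled


for row in scale_clicks(rows):
    print(row)
```

For user 169 the clicks 100, 4 and 1 map to about 10.00001, 0.30304 and 0.00001, so each user's most clicked page ends up near 10 regardless of how active that user is overall.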

After defining a strategy for the most visited page_id values, remember that the values you model should reflect the client's tastes.

Victor Villacorta answered May 31 '26 13:05



