
500px.com Ranking Algorithm

I was recently wondering how http://500px.com calculates its "Pulse" rating. The "Pulse" is a score from 1 to 100 based on the popularity of the photo.

I think it might use some of the following criteria:

  • Number of likes
  • Number of "favorites"
  • Number of comments
  • Total views
  • maybe the time since the photo was uploaded
  • maybe some other non-obvious criteria like the user's follower count, user rank, camera model or similar

How would I implement an algorithm like this?

Any advice on how to implement an algorithm with these criteria (and maybe some code) would be appreciated too.

alex asked Dec 07 '22 15:12



2 Answers

I don't know much about the site, but systems like this generally work the same way: they normalize a set of values, weight them, and combine them into a single comparable value.

Define your list of rules, weight them based on importance, then combine them all to get your final value.

In this case it would be something like:

  1. Total number of visits = 10%
  2. Total number of likes = 10%
  3. Number of likes / number of visits = 40% (popularity = percentage of visitors that liked it)
  4. Number of likes in last 30 days = 20% (current popularity)
  5. Author rating = 20%

Now we need to normalize the values for those rules. The raw numbers are on very different scales, so the normalization will be different for each rule; the goal is a workable common range, say 0 to 100.

Example normalizations for the above:

  1. = percentage of visits out of 50,000 visits (a good number of visits)

    (visits / 50000) * 100

  2. = percentage of likes out of 10,000 likes (a good number of likes)

    (likes / 10000) * 100

  3. = percentage of visitors that liked it

    (likes / visits) * 100

  4. = percentage of likes in last 30 days out of 1,000 likes (a good number of likes for a 30 day period)

    (likesIn30Days / 1000) * 100

  5. = arbitrary rating of the author

Make sure each of these is capped at 100 (if a value comes out higher, bring it back down to 100). Then combine them according to their weights, where ruleN is the normalized value of rule N:

Popularity = (rule1 * 0.1) + (rule2 * 0.1) + (rule3 * 0.4) + (rule4 * 0.2) + (rule5 * 0.2)
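
The steps above can be sketched in a few lines of Python. This is only an illustration of the weighted-sum idea, not 500px's actual algorithm: the function names (`clamp`, `percent_of`, `popularity`) and the `WEIGHTS` table are made up for the example, the targets (50,000 visits, 10,000 likes, 1,000 recent likes) are the rough guesses from the list above, and the author rating is assumed to already be on a 0 to 100 scale.

```python
def clamp(value, low=0.0, high=100.0):
    """Keep a score inside the low..high range."""
    return max(low, min(high, value))


def percent_of(value, target):
    """Scale value to 0..100 relative to target, capped at 100."""
    if target <= 0:
        # Avoid dividing by zero, e.g. a photo with no visits yet.
        return 0.0
    return clamp(value / target * 100.0)


# Weights from the rule list; they add up to 1.0.
WEIGHTS = {
    "visits": 0.10,
    "likes": 0.10,
    "like_rate": 0.40,
    "recent_likes": 0.20,
    "author": 0.20,
}


def popularity(visits, likes, likes_in_30_days, author_rating):
    """Combine the five normalized rule values into one 0..100 score."""
    scores = {
        "visits": percent_of(visits, 50_000),
        "likes": percent_of(likes, 10_000),
        "like_rate": percent_of(likes, visits),
        "recent_likes": percent_of(likes_in_30_days, 1_000),
        "author": clamp(author_rating),  # assumed to be 0..100 already
    }
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    print(popularity(visits=25_000, likes=5_000,
                     likes_in_30_days=500, author_rating=80))
```

Keeping the weights in one table makes it easy to tune them later without touching the normalization code.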

This is all off the top of my head and rough. There are obviously much more efficient ways to do this, as you don't need to normalize to a percentage at every stage, but I hope it helps you get the gist.

Update

I've not really got any references or extra reading. I've never really worked with it as a larger concept, only in small implementations.

I think most of what you will find to read is about ranking methodology and theory in general, because your implementation will be very different depending on your rules and data format. It seems like a huge concept, but it will probably come down to around 10 lines of code, not counting aggregating your data.

Paystey answered Dec 30 '22 07:12



You may also want to refer to the following:

  • How Reddit ranking algorithms work
  • How Hacker News ranking algorithm works
  • How to Build a Popularity Algorithm You can be Proud of
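
The question also lists time since upload as a possible factor, and the articles above deal with that kind of time decay. As a rough sketch, here is the commonly cited form of the Hacker News formula, where a score shrinks as the item gets older. The function name `time_decayed_score` is made up for this example, the `gravity` default of 1.8 is the value usually quoted for Hacker News, and nothing here is known to be what 500px uses.

```python
def time_decayed_score(points, age_hours, gravity=1.8):
    """Commonly cited Hacker News style score: older items sink.

    points    -- votes or likes (the submitter's own vote is subtracted)
    age_hours -- hours since the item was posted
    gravity   -- how fast the score decays with age (assumed default)
    """
    return (points - 1) / (age_hours + 2) ** gravity


if __name__ == "__main__":
    # The same number of likes ranks lower as the photo ages.
    for hours in (1, 12, 48):
        print(hours, time_decayed_score(points=200, age_hours=hours))
```

A value like this could replace or supplement the "likes in last 30 days" rule, as long as it is normalized to the same range as the other rules before weighting.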
user799188 answered Dec 30 '22 07:12
