Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Popularity decay algorithm for popular website posts

I'm looking for an algorithm to sort website results by popularity.. like Reddit's so the older a post the less power it's votes/score has.

Here is the generally accepted solution as used by reddit:

t = (time of entry post) - (Dec 8, 2005)
x = upvotes - downvotes

y = {1 if x > 0, 0 if x = 0, -1 if x < 0)
z = {1 if x < 1, otherwise x}

rank = log(z) + (y * t)/45000

I've been over Reddit's algorithm and although it will fit for one situation, what I really need is two algorithms, one for Popular posts and another for Upcoming posts:

  • Popular Posts
  • Upcoming Posts

Popular will decay slower, giving more weight to slightly older posts where Upcoming posts will focus more on popular posts today, dropping off sharply after N hours/days/etc.

I'm writing this using Sphinx expressions so I can't write a hugly complicated algo and I only have access to the following functions:

http://sphinxsearch.com/docs/current.html#numeric-functions

So I have the following data per post:

  • Post age in seconds
  • Post score

Here is my current solution:

Exponent = 0.01 (Popular), 0.5 (Upcoming)
SecondsSincePublised = abs(CurTimeInSecondsSinceDate-PubTimeInSecondsSinceDate)
Rank = (log10(PostScore)*10000) / pow(SecondsSincePublised,Exponent) 

Although this solution does work its not ideal. A new and popular post in the last couple of hours often ranks high in both popular and upcoming which is not really what I want.

Can anyone suggest another algorithm that I can modify an exponent component to adjust the decay?

like image 586
antfx Avatar asked Oct 07 '13 07:10

antfx


2 Answers

Have you tried ranking algorithm used by Hacker news? It is simple to implement.

Score = (P-1) / (T+2)^G

where,
P = points of an item (and -1 is to negate submitters vote)
T = time since submission (in hours)
G = Gravity, defaults to 1.8 in news.arc

You can vary Gravity to adjust the decay.

For more information refer to How Hacker News ranking algorithm works

like image 114
Rozuur Avatar answered Nov 07 '22 12:11

Rozuur


Did you try using different decay functions for "popular" and "upcoming"? for example, use an exponential decay rate for "upcoming" and polynomial decay rate for "popular", this way, after a few hours (if optimised correctly), there's a very little chance a post will score high on upcoming. while in polynomial decay functions the relation between adjacent times are getting smaller, this is not the case with an exponential decay function.

Here's an example (the parameter 0.01 and 1.0005 are arbitrary and should be optimised according to your goal).

Popular:

SecondsSincePublised = abs(CurTimeInSecondsSinceDate-PubTimeInSecondsSinceDate)
Rank = (log10(PostScore)*10000) / pow(SecondsSincePublised,0.01)

Upcoming:

SecondsSincePublised = abs(CurTimeInSecondsSinceDate-PubTimeInSecondsSinceDate)
Rank = (log10(PostScore)*10000) / pow(1.0005,SecondsSincePublised)
like image 5
Ron Teller Avatar answered Nov 07 '22 11:11

Ron Teller