Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Item rankings, order by confidence using Reddit Ranking Algorithms

I am interested to use this ranking class, based off of an article by Evan Miller to rank a table I have that has upvotes and downvotes. I have a system very similar to Stack Overflow's up/down voting system for an events site I am working on, and by using this ranking class I feel as though results will be more accurate. My question is how do I order by the function 'hotness'?

private function _hotness($upvotes = 0, $downvotes = 0, $posted = 0) {
    $s = $this->_score($upvotes, $downvotes);
    $order = log(max(abs($s), 1), 10);

    if($s > 0) {
        $sign = 1;
    } elseif($s < 0) {
        $sign = -1;
    } else {
        $sign = 0;
    }

    $seconds = $posted - 1134028003;

    return round($order + (($sign * $seconds)/45000), 7);
}

I suppose each time a user votes I could have a column in my table that has the hotness data recalculated for the new vote, and order by that column on the main page. But I am interested to do this more on-the-fly incorporating the function above, and I am not sure if that is possible.

From Evan Miller, he uses:

SELECT widget_id, ((positive + 1.9208) / (positive + negative) - 
                   1.96 * SQRT((positive * negative) / (positive + negative) + 0.9604) / 
                          (positive + negative)) / (1 + 3.8416 / (positive + negative)) 
       AS ci_lower_bound FROM widgets WHERE positive + negative > 0 
       ORDER BY ci_lower_bound DESC;

But I rather not do this calculation in the sql as I feel this is messy and difficult to change down the line if I utilize this code on multiple pages .etc.

like image 647
Alex Avatar asked Mar 10 '13 00:03

Alex


1 Answers

Accessing the corresponding "Posts" table for anything (reading, writing, sorting, comparing, etc.) is extremely quick and thus relying on the database is the "most on-the-fly" alternative you have for non-temporary data storage (memory/sessions are still quicker but, logically, cannot be used to store this information).

You should be more worried about building a good ranking algorithm delivering the results you want (you are proposing two different systems, delivering different results) and working on making the whole code and the code-database communication as efficient as possible.

In principle, small codes with iterative simple orders offer the quickest and most reliable solution for this kind of situations. Example:

  1. Ranking function (like the first one you propose or any other one built on the ranking rules you want) called every time a vote is given. It writes to the corresponding column(s) in the "Posts" table (the simpler the query, the better: you can create a ranking system as complex as you wish, but try to rely on PHP rather than on queries).

  2. Every time a comparison between posts is required, the "Posts" table is read with a simple SELECT ordering the records by ranking (you can have various "assessing columns" (e.g., up-votes, down-votes, further considerations); but better having one with the definitive ranking).

like image 101
varocarbas Avatar answered Sep 17 '22 19:09

varocarbas