Webmasters who want to weight content by user feedback often find it hard to pick the right algorithm. In my experience there is no out-of-the-box solution, because every site has its own requirements. The first step is to put a solid foundation in place, then refine it over time until the recipe is right. The approach below is based on a binomial proportion confidence interval: instead of the raw share of positive votes, it uses the lower bound of a confidence interval around that share, so items with only a few votes are scored conservatively. It is a PHP implementation that uses the Wilson score interval to weight the feedback.

class Rating
{
  // Lower bound of the Wilson score interval for $positive out of $total votes.
  public static function ratingAverage($positive, $total, $power = 0.05)
  {
    if ($total == 0)
      return 0;
 
    // z-score for the chosen level (about 1.96 when $power is 0.05)
    $z = self::pnormaldist(1 - $power / 2);
    $p = 1.0 * $positive / $total;
    $s = ($p + $z*$z/(2*$total) - $z * sqrt(($p*(1-$p)+$z*$z/(4*$total))/$total))/(1+$z*$z/$total);
    return $s;
  }
 
  // Approximates the inverse of the standard normal CDF: the z-score for quantile $qn.
  public static function pnormaldist($qn)
  {
    $b = array(
      1.570796288, 0.03706987906, -0.8364353589e-3,
      -0.2250947176e-3, 0.6841218299e-5, 0.5824238515e-5,
      -0.104527497e-5, 0.8360937017e-7, -0.3231081277e-8,
      0.3657763036e-10, 0.6936233982e-12);
 
    if ($qn < 0.0 || 1.0 < $qn)
      return 0.0;
 
    if ($qn == 0.5)
      return 0.0;
 
    $w1 = $qn;
 
    if ($qn > 0.5)
      $w1 = 1.0 - $w1;
 
    $w3 = - log(4.0 * $w1 * (1.0 - $w1));
    $w1 = $b[0];
 
    for ($i = 1;$i <= 10; $i++)
      $w1 += $b[$i] * pow($w3,$i);
 
    if ($qn > 0.5)
      return sqrt($w1 * $w3);
 
    return - sqrt($w1 * $w3);
  }
}
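
For reference, the expression that ratingAverage evaluates can be written out as a formula. This is only a transcription of the code above, writing \hat{p} for $positive / $total, n for $total, and z for the value returned by pnormaldist(1 - $power / 2); these symbols are introduced here for illustration and do not appear in the code:

```latex
\[
s = \frac{\hat{p} + \dfrac{z^2}{2n}
      - z\sqrt{\dfrac{\hat{p}(1-\hat{p}) + \dfrac{z^2}{4n}}{n}}}
     {1 + \dfrac{z^2}{n}}
\]
```

As a quick check, a single positive vote (\hat{p} = 1, n = 1) reduces this to 1 / (1 + z^2), which is roughly 0.2065 for z close to 1.96.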
 

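The function takes three parameters: the number of positive votes, the total number of votes, and the power. Despite its name, $power acts as a significance level: the z-score is taken at the 1 - $power / 2 quantile, so 0.10 gives a 95% chance that the true proportion lies above your lower bound, 0.05 gives a 97.5% chance, and so on. Sample usage: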

sample(1,0);
sample(100,50);
sample(250,100);
sample(1000,500);
 
function sample($p,$n)
{
  // print positive votes, negative votes and the score, one row per call
  echo $p." ".$n." ".Rating::ratingAverage($p,$p+$n)."\n";
}
 

Output:

Positive  Negative  Score
1         0         0.20654931654388
100       50        0.58789756740385
250       100       0.6648317184611
1000      500       0.6424116916199
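
To see why the lower bound is more useful than the raw share of positive votes, here is a minimal sketch that ranks a few items by it. It is self-contained for convenience: wilsonLowerBound is a hypothetical stand-in for Rating::ratingAverage with z fixed at 1.96 (an assumption, close to what pnormaldist returns for the default power), and the item labels are made up for illustration.

```php
<?php
// Hypothetical stand-in for Rating::ratingAverage, with z fixed at 1.96
// so that this block runs on its own.
function wilsonLowerBound(int $positive, int $total, float $z = 1.96): float
{
    if ($total === 0) {
        return 0.0;
    }
    $p  = $positive / $total;
    $z2 = $z * $z;
    return ($p + $z2 / (2 * $total)
        - $z * sqrt(($p * (1 - $p) + $z2 / (4 * $total)) / $total))
        / (1 + $z2 / $total);
}

// Made-up items: [label, positive votes, negative votes].
$items = [
    ['one vote',  1,   0],
    ['150 votes', 100, 50],
    ['350 votes', 250, 100],
];

// Sort descending by lower bound.
usort($items, function (array $a, array $b): int {
    $sa = wilsonLowerBound($a[1], $a[1] + $a[2]);
    $sb = wilsonLowerBound($b[1], $b[1] + $b[2]);
    return $sb <=> $sa;
});

foreach ($items as [$label, $pos, $neg]) {
    printf("%-10s %.4f\n", $label, wilsonLowerBound($pos, $pos + $neg));
}
```

Sorted by raw ratio, the single-vote item would come first with a perfect 1.0; sorted by the lower bound it drops to last place, behind both items that have far more votes.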

Sites like Reddit, Digg, and the like also have a certain "freshness" element. The solution above might be a working model for ranking across the entire lifetime of the site, but for a front page you will need to implement some form of "gravity". This can be done by taking the raw score and dividing it by a function of the item's age so that it decays over time, like so:

 
class Rating
{
  ...
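  // $time is the age of the item and must be greater than 0.
  public static function gravityRating($positive, $total, $time, $power = 0.05)
  {
    if ($total == 0)
      return 0;
    // divide the raw score by the square root of the age
    return (Rating::ratingAverage($positive, $total, $power) / pow($time,0.5));
  }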
  ...
}
 
sample(100,50,0.5);
sample(100,50,1);
sample(100,50,4);
sample(100,50,8);
sample(100,50,24);
 
function sample($p,$n,$time)
{
  echo Rating::gravityRating($p,$p+$n,$time)."\n";
}
 

In the example above, $time represents the age (in hours) and you can see the decay in the output:

0.83141271310867
0.58789756740385
0.29394878370192
0.20785317827717
0.12000408843024

My recommendation would be to "cap" the time so that decay stops after a fixed period, such as 12 or 24 hours; that way the initial boost given to fresh content wears off and scores normalize quickly. The rate of decay can, of course, be made as fast or as slow as you want, and the weighting you apply will vary from site to site. Depending on the volatility of your content, a front page whose "freshness" window spans a week calls for a week-long decay rather than a 12-hour one. Hopefully the code above is enough to get started with content rating, and it helps webmasters make more intelligent use of user feedback than the traditional "5-Star Rating".
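
One way the cap could look is sketched below. The function name cappedGravity, the 24-hour default cap, the adjustable exponent, and the half-hour floor (which also avoids dividing by zero for brand-new items) are all assumptions made for illustration; none of them is part of the Rating class above.

```php
<?php
// Hypothetical helper: decays an already-computed score by age,
// but stops decaying once the age reaches $capHours.
function cappedGravity(
    float $score,
    float $ageHours,
    float $capHours = 24.0,
    float $exponent = 0.5
): float {
    // Assumption: ages below half an hour are treated as half an hour,
    // which also prevents a division by zero at age 0.
    $age = max($ageHours, 0.5);
    // Stop the decay at the cap.
    $age = min($age, $capHours);
    return $score / pow($age, $exponent);
}

// The score stops shrinking once the age passes the 24-hour cap.
foreach ([1, 4, 24, 48, 168] as $hours) {
    printf("%3d h  %.4f\n", $hours, cappedGravity(0.5879, $hours));
}
```

For a front page with a week-long window, passing 168 as $capHours and a smaller $exponent would stretch the decay accordingly.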
