Introduction
In this article I discuss the voting system that I believed was used on the CodeProject website and compare it with the "standard" voting system.
I will still refer to it as the CodeProject voting system, but note that the system actually used on the CodeProject website is discussed in the following article: "Is the CodeProject's Voting system really smart?".
Nevertheless, this article is still relevant, as it compares two different voting systems that are widely used on websites.
Standard voting model
Usually, when we want to estimate the quality of an article, we create a voting model for it. The structure of that model is usually as follows:
- Each person can vote only once
- A vote is a number from the set {1,2,3,4,5}
- The average value is calculated as
$$S_N = \frac{1}{N}\sum_{i=1}^{N} v_i \qquad (1)$$
where $S_N$ is considered to be the current grade of the article
Here $N$ is the number of received votes and $v_i$ is the value of the $i$-th vote. We therefore have a stochastic process $S_1, S_2, \dots$ that is defined by formula (1), and the $v_i$ are discrete random variables that take values from the set {1,2,3,4,5}. The distribution of each $v_i$ is unknown, but the variables are independent of one another and all of them have the same distribution.
Now let us introduce another voting model, the one used on the CodeProject website. As we will see, it seems to differ seriously from the standard one, although it turns out to be pretty much the same.
CodeProject voting model
The voting system used on the CodeProject website differs from the one presented above. On this website the average value is defined by the following formula
$$C_N = \frac{C_{N-1} + v_N}{2}, \qquad C_1 = v_1 \qquad (2)$$
where $C_N$ is the article grade after $N$ votes and $v_N$ is the value of the $N$-th vote. In other words, the new grade is calculated as the average of the previous article grade and the new vote value.
The difference between this voting model and the standard model is that this one calculates the grade as the average of the latest vote and the previous grade, whereas the standard one calculates the average over all votes.
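The two models can be sketched side by side in a few lines of Python. This is a minimal illustration, not CodeProject's actual code: the function names are hypothetical, and the sketch assumes that the first grade in the CodeProject-style model is simply the first vote.

```python
from fractions import Fraction


def standard_grade(votes):
    """Standard model: arithmetic mean of all votes."""
    return Fraction(sum(votes), len(votes))


def codeproject_grade(votes):
    """CodeProject-style model: average of the previous grade and the new vote.

    Assumption: the grade after the first vote equals that vote.
    """
    grade = Fraction(votes[0])
    for vote in votes[1:]:
        grade = (grade + vote) / 2
    return grade


votes = [5, 5, 1]
print(standard_grade(votes))     # 11/3
print(codeproject_grade(votes))  # 3
```

Exact fractions are used instead of floats so that the two grades can be compared without rounding noise.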
In the following section we state the questions that will help us investigate the real difference between these models.
Compare voting models
It is clear that in the standard voting model, described in the first part of the article, every vote contributes equally to the average grade. For the CodeProject voting model this statement is more than doubtful.
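To make this doubt concrete, the recurrence can be unrolled. The notation here is an assumption made for illustration: $v_i$ is the $i$-th vote, $C_N$ is the CodeProject-style grade after $N$ votes, and the first grade is assumed to equal the first vote ($C_1 = v_1$).

```latex
C_N = \frac{C_{N-1} + v_N}{2}
    = \frac{v_N}{2} + \frac{v_{N-1}}{4} + \dots + \frac{v_2}{2^{N-1}} + \frac{v_1}{2^{N-1}}
```

Under these assumptions the latest vote carries weight 1/2 and each earlier vote counts half as much as the one after it (except the first vote, which counts as much as the second), whereas the standard model gives every vote the weight 1/N.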
There are at least two questions about these models that we might want to answer:
- If we consider the same votes $v_1, \dots, v_N$, what will be the difference between the standard grade $S_N$ and the CodeProject grade $C_N$?
- Is the order of the votes significant for the $S_N$ and $C_N$ values?
Let us try to answer these two questions.
Does the standard grade $S_N$ equal the CodeProject grade $C_N$ for the same votes $v_i$, and is the order of the votes significant?
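To answer the question stated in the title, let us try mathematical induction, writing $S_N$ for the standard grade, $C_N$ for the CodeProject grade and $v_i$ for the votes:
- Prove that $S_1 = C_1$: this is obvious, as $S_1 = v_1$ and $C_1 = v_1$.
- Assume that $S_N = C_N$
- Prove that $S_{N+1} = C_{N+1}$
Note that $S_{N+1} = \frac{N S_N + v_{N+1}}{N+1}$ and $C_{N+1} = \frac{C_N + v_{N+1}}{2}$. Thus we have the following formula that we need to prove:
$$\frac{N S_N + v_{N+1}}{N+1} = \frac{C_N + v_{N+1}}{2}$$
Hence we have the following system of equalities:
$$\begin{cases} S_N = C_N \\ 2\,(N S_N + v_{N+1}) = (N+1)\,(C_N + v_{N+1}) \end{cases}$$
By multiplying the first equality by $N+1$ and subtracting it from the second one we have
$$(N-1)\,S_N + 2\,v_{N+1} = (N+1)\,v_{N+1}$$
from which in turn we have
$$(N-1)\,S_N = (N-1)\,v_{N+1}$$
that is true when $v_{N+1} = S_N$, as $N - 1 > 0$ for $N > 1$. (For $N = 1$ the equality holds for any votes, since $S_2 = C_2 = (v_1 + v_2)/2$.)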
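This is an important result, as we now understand that:
- The equality $S_N = C_N$ between the standard and the CodeProject grades holds only on special occasions
- The value of $C_N$ depends on the order of the votes $v_i$. Because $S_N$ doesn't depend on their order, we have either $S_N = C_N$ in case of $S_{N-1} = C_{N-1}$ and $v_N = S_{N-1}$, or possibly $S_N \ne C_N$ otherwise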
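The order dependence is easy to observe on a small example. The following sketch (with hypothetical function names, and assuming the first CodeProject-style grade equals the first vote) computes both grades for every ordering of the same three votes.

```python
from fractions import Fraction
from itertools import permutations


def standard_grade(votes):
    """Arithmetic mean of all votes."""
    return Fraction(sum(votes), len(votes))


def codeproject_grade(votes):
    """Average of the previous grade and the new vote (first grade = first vote)."""
    grade = Fraction(votes[0])
    for vote in votes[1:]:
        grade = (grade + vote) / 2
    return grade


votes = (1, 3, 5)
# Collect the distinct grades produced by all orderings of the same votes.
std_grades = {standard_grade(p) for p in permutations(votes)}
cp_grades = {codeproject_grade(p) for p in permutations(votes)}

print(sorted(std_grades))  # one value: 3
print(sorted(cp_grades))   # three values: 5/2, 3, 7/2
```

For these votes the standard grade is 3 in every order, while the CodeProject-style grade ranges from 5/2 (when the vote 1 comes last) to 7/2 (when the vote 5 comes last).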
We can now suspect that the CodeProject grade $C_N$ is not as good as the standard grade $S_N$ for estimating the average grade, as the order of the votes $v_i$ influences its value.
Is the CodeProject grade $C_N$ really bad?
To understand this, let us define the following random variable:
. To understand the difference of influence on the
value we might be interested in the value of
where
and
.
From the last formula we see what was obvious before. The influence of
depends on the values of
and
. Thus we come to probability theory, as the CodeProject grade $C_N$ is a random variable. Moreover, as $C_N$ depends only on the previous grade $C_{N-1}$ and the new vote, this is a Markov process.
If
then the late vote influences the average grade less than the early vote. From
we have
and this is quite obvious but leads us nowhere, as it only shows that a vote cast at one stage or another may influence the average grade in different ways, depending on the preceding average grade values.
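That is why we recall that the standard grade $S_N$ and the CodeProject grade $C_N$ are random variables that depend on the sequence of independent random variables $v_1, v_2, \dots$ with equal distributions. To compare $S_N$ and $C_N$ let us use their mean values $E[S_N]$ and $E[C_N]$.
Let us assume that $E[v_i] = \mu$ for every $i$, which is true as the $v_i$ have equal distributions. Then
$$E[S_N] = \frac{1}{N}\sum_{i=1}^{N} E[v_i] = \mu$$
Also note that $E[C_1] = E[v_1] = \mu$ and thus, by induction,
$$E[C_N] = \frac{E[C_{N-1}] + E[v_N]}{2} = \frac{\mu + \mu}{2} = \mu$$
From these equalities we see that the expected value is the same for $S_N$ and $C_N$, and hence they can be considered much the same from the point of view of estimating the average grade.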
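The equality of the expected values can be checked exactly on a small case by enumerating every possible vote sequence. This is only a sketch: the distribution `DIST` is a hypothetical example (the article treats the real distribution as unknown), the function names are illustrative, and the first CodeProject-style grade is assumed to equal the first vote.

```python
from fractions import Fraction
from itertools import product

# Hypothetical vote distribution, chosen only for illustration.
DIST = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(2, 10),
        4: Fraction(3, 10), 5: Fraction(3, 10)}


def standard_grade(votes):
    """Arithmetic mean of all votes."""
    return Fraction(sum(votes), len(votes))


def codeproject_grade(votes):
    """Average of the previous grade and the new vote (first grade = first vote)."""
    grade = Fraction(votes[0])
    for vote in votes[1:]:
        grade = (grade + vote) / 2
    return grade


def expected(grade, n):
    """Exact expected grade over all sequences of n independent votes."""
    total = Fraction(0)
    for seq in product(DIST, repeat=n):
        prob = Fraction(1)
        for vote in seq:
            prob *= DIST[vote]
        total += prob * grade(seq)
    return total


mu = sum(vote * p for vote, p in DIST.items())
print(mu, expected(standard_grade, 4), expected(codeproject_grade, 4))
```

For this distribution all three printed values coincide, which matches the derivation; the enumeration grows as 5^n, so it is only practical for small n.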
Conclusions
In this article I tried to show that the article voting system used on the CodeProject website is quite good at estimating the average article grade, although this may not be obvious at first.
The advantage of this approach from the software point of view is that there is no need to store all votes, only the last average. This significantly reduces the resources needed to calculate the average grades of articles.
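The storage argument can be illustrated with a small sketch. The class name `RunningGrade` and its method are hypothetical, and the first grade is assumed to equal the first vote; the point is only that a single number is kept per article.

```python
class RunningGrade:
    """Keeps only the current grade, never the individual votes."""

    def __init__(self):
        self.grade = None  # no votes received yet

    def add_vote(self, vote):
        if self.grade is None:
            # Assumption: the first grade is simply the first vote.
            self.grade = float(vote)
        else:
            # New grade is the average of the previous grade and the new vote.
            self.grade = (self.grade + vote) / 2
        return self.grade


article = RunningGrade()
for vote in (5, 5, 1):
    article.add_vote(vote)
print(article.grade)  # 3.0
```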
At the same time, further investigation may show that the variances of the standard grade $S_N$ and the CodeProject grade $C_N$ are different, and it might be interesting to estimate and compare their values.
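One possible starting point for such an investigation is an exact enumeration on a small case. The sketch below is an assumption-laden illustration, not a general result: it assumes that each vote value from 1 to 5 is equally likely, uses only four votes, assumes the first CodeProject-style grade equals the first vote, and uses hypothetical function names.

```python
from fractions import Fraction
from itertools import product

VOTES = (1, 2, 3, 4, 5)  # assumption: every vote value is equally likely


def standard_grade(votes):
    """Arithmetic mean of all votes."""
    return Fraction(sum(votes), len(votes))


def codeproject_grade(votes):
    """Average of the previous grade and the new vote (first grade = first vote)."""
    grade = Fraction(votes[0])
    for vote in votes[1:]:
        grade = (grade + vote) / 2
    return grade


def variance(grade, n):
    """Exact variance of the grade over all equally likely sequences of n votes."""
    values = [grade(seq) for seq in product(VOTES, repeat=n)]
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


print(variance(standard_grade, 4))     # 1/2
print(variance(codeproject_grade, 4))  # 11/16
```

For this particular setup the two variances do differ; other distributions and other numbers of votes would need their own check.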
See also
For an analysis of the voting system that is actually used on the CodeProject website, read the following article: "Is the CodeProject's Voting system really smart?".