Abstract
This article covers five discrete probability distributions commonly used in computer simulations, gaming, artificial intelligence decision making, and environment modeling. The covered distributions are the Binomial, Geometric, NegativeBinomial, HyperGeometric, and Poisson probability distributions, along with an abstract probability base class. The classes differ from routine implementations of the same formulas through a few implementation tricks that extend the range of allowable inputs. A comparison with Monte Carlo techniques for estimating probabilities is also given.
Overview
If you are a computer science or mathematics major, it is likely you will take a course on probability and statistics. Probability is a large area of study, but the basic concepts can be learned by anyone. The following five probability classes are likely the first you will encounter in your studies and fall under the heading of 'discrete' probabilities, meaning that the events they model either happen or they don't. I will give a brief introduction to each distribution and an example of when it might be used. Discrete probabilities use a lot of factorials and often require multiplying very large numbers by very small numbers. Examples of the problems this causes, and the workaround, are discussed, along with a very common alternative many programmers use: generating probabilities from random numbers.
Probability Basics
A probability is the chance of an event happening, and it is always between 0.0 and 1.0 inclusive. A probability can ALWAYS be defined as "the number of ways the event can happen in an experiment" divided by "the total number of ways the experiment can turn out". Rolling a die is the most common example next to flipping a coin. The chance of rolling any particular number is one divided by the number of sides on the die, so 1/6 for a normal die; the chance of landing heads on a flipped coin is 1/2. These are pretty obvious observations, but they are the basis for more meaningful questions, like: what is the chance of rolling the same number 3 times in a row? That is not so obvious. The five probability distributions we are going to see all depend on prior knowledge of the chance of a discrete event such as rolling a die, flipping a coin, hitting a ball, shooting a target, the failure of a device, a lottery, or drilling for oil, to name a few. If we know the likelihood of a discrete event, then we can start to ask how this event plays out under certain circumstances.
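The "same number 3 times in a row" question can already be answered by multiplying per-roll chances: the first roll can land on anything, and each later roll must match it. A minimal sketch (the function name is my own, not part of the article's classes):

```cpp
#include <cassert>
#include <cmath>

// Chance of rolling the same number `times` times in a row on a fair die:
// the first roll is free, and each of the remaining rolls must match it.
double same_number_in_a_row(int sides, int times)
{
    double p = 1.0;
    for(int i = 1; i < times; ++i)
        p *= 1.0 / sides;   // each repeat must match the first roll
    return p;
}
```

For a six-sided die and three rolls this gives (1/6) * (1/6) = 1/36.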
The Binomial Probability
The binomial probability models the probability of a number of successful trials, Y (also called a 'random variable'), out of a total number of trials, N. Each trial has the EXACT same chance of success or failure. Take rolling a die: the chance to roll a one is 1/6. If we call that success, denoted by p, then p = 1/6, and the chance to not roll a one must be 1.0 - p, denoted by q. The symbol p means success and the symbol q means failure. Let's say we have five dice, or we are rolling the same die 5 times, and we want to know the probability that a one comes up exactly 3 out of the 5 times, 3 being the number of successful trials of rolling a one. The full equation for the Binomial Probability is
P(Y) = N! / ((N-Y)! * Y!) * p^Y * q^(N-Y)
and for this example N = 5, Y = 3, p = 1/6, and q = 5/6. The probability is notated like so: P(Y=3). If we wanted to know the probability of 'three or less' ones in five rolls, we would write P(Y<=3). This is called a cumulative probability and is simply the sum P(Y=0) + P(Y=1) + P(Y=2) + P(Y=3). The abstract class Probability contains a member variable called m_RVT, meaning 'random variable type', which is an enumeration for the comparison sign in P(Y ? y). The GetResult method looks at this enumeration and calls the ComputeResult function in the derived class for each P(Y=y) in the cumulative probability.
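As a quick numeric check of the formula before the class code, here is a naive pmf helper (my own, not part of the article's classes). It is fine for small N, but its factorial-style terms overflow quickly, which is exactly what the classes below are designed to avoid:

```cpp
#include <cassert>
#include <cmath>

// Naive binomial pmf: C(N,Y) * p^Y * q^(N-Y).
// C(N,Y) is built incrementally; only suitable for small N.
double binomial_pmf(int N, int Y, double p)
{
    double c = 1.0;
    for(int i = 0; i < Y; ++i)       // c becomes C(N,Y)
        c = c * (N - i) / (i + 1);
    return c * pow(p, Y) * pow(1.0 - p, N - Y);
}
```

For the dice example, binomial_pmf(5, 3, 1.0/6.0) gives C(5,3) * (1/6)^3 * (5/6)^2 = 250/7776, about 3.2%.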
double Probability::GetResult() throw (ProbabilityException)
{
    if(m_probability != UNDEFINED)
        return m_probability;
    try{
        m_probability = 0.0;
        int i = 0;
        switch(m_RVT)
        {
        case EQUAL:
            m_probability = ComputeResult();
            break;
        case LESSTHAN:
            for(SetRV(i); i < m_RV; SetRV(++i))
                m_probability += ComputeResult();
            break;
        case GREATERTHANOREQUAL:
            for(SetRV(i), m_probability = 1.0; i < m_RV; SetRV(++i))
                m_probability -= ComputeResult();
            break;
        case GREATERTHAN:
            for(SetRV(i), m_probability = 1.0; i <= m_RV; SetRV(++i))
                m_probability -= ComputeResult();
            break;
        case LESSTHANOREQUAL:
            for(SetRV(i); i <= m_RV; SetRV(++i))
                m_probability += ComputeResult();
            break;
        case NOTEQUAL:
            m_probability = 1.0 - ComputeResult();
            break;
        default:
            throw ProbabilityException(
                "Probability error- Random Variable type is unset");
        }
    }
    catch(ProbabilityException&)   // catch by reference, not by value
    {
        m_probability = UNDEFINED;
        SetRV(m_RV);
        throw;                     // rethrow the original exception
    }
    SetRV(m_RV);
    return m_probability;
}
The definition of the abstract class Probability is small and fairly straightforward:
class Probability
{
public:
    enum RandomVariableType { EQUAL, LESSTHAN, GREATERTHAN,
        LESSTHANOREQUAL, GREATERTHANOREQUAL, NOTEQUAL };
    Probability(int rv=0, RandomVariableType rvt=EQUAL, double prob=UNDEFINED)
        :m_RV(rv), m_RVT(rvt), m_probability(prob){}
    virtual ~Probability(){}
    class ProbabilityException
    {
    public:
        ProbabilityException(const char* whatString):What(whatString){}
        inline const char* what() const { return What; }
    protected:
        const char* What;
    };
    virtual double GetResult() throw (ProbabilityException);
    virtual double GetExpectedValue() const = 0;
    virtual double GetVariance() const = 0;
protected:
    virtual double ComputeResult() = 0;
    virtual void SetRV(int Y) = 0;
    RandomVariableType m_RVT;
    double m_probability;
    int m_RV;
};
Each specialized probability class constructor will call this base constructor like so:
BinomialProbability::BinomialProbability(int N, int Y, double p,
                                         Probability::RandomVariableType rvt)
    :Probability(Y, rvt),   // the base class is always initialized first
     m_trials(N),
     m_successes(Y),
     m_chance_of_success(p)
{
    assert(Y <= N && Y >= 0);
    assert(p >= 0.0 && p <= 1.0);
}
Tackling Factorials and Large Multiplications
Any factorial over 12! promptly overflows a 32-bit unsigned int. All of the probability classes involve multiplying a very large number by a very small number to get an end value between 0.0 and 1.0 inclusive. I use this knowledge to my advantage when computing the result so that overflow is not an issue. Each equation in a class's ComputeResult() method is broken up into its individual factors. If the running result is over 1.0, then we want to divide by something or multiply by a fraction; if the running result is under 1.0, then we want to multiply by a factor. This keeps the result floating around the value of 1.0 as it is being computed, and it also reduces the loss of significant digits. Most of the probability classes need to compute a combination, which is a factorial divided by two other factorials. A handy cancellation of terms occurs in a combination that reduces the number of multiplications dramatically. Many uses of factorials require division by other factorials or multiplication by small numbers, so looking at the whole equation and reordering it based on the expected range of the result is a good general way to evaluate expressions whose intermediate results would otherwise be very large or very small.
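The cancellation mentioned above can be shown in isolation. N!/((N-Y)!Y!) reduces to just Y (or N-Y, whichever is smaller) numerator factors, and dividing as we go keeps the intermediate values small instead of ever forming a full factorial. A standalone sketch (the function name is mine, not part of the article's classes):

```cpp
#include <cassert>
#include <cmath>

// Combination C(n,k) = n!/((n-k)!k!) with the factorial cancellation:
// only k factors survive, and we divide after each multiply so the
// running value never approaches a full factorial.
double combination(int n, int k)
{
    if(k > n - k) k = n - k;   // C(n,k) == C(n,n-k); use the smaller range
    double result = 1.0;
    for(int i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;   // each prefix is an integer
    return result;
}
```

For example, C(20,4) needs only four multiplications and four divisions instead of computing 20!, which would overflow any integer type.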
double BinomialProbability::ComputeResult() throw (ProbabilityException)
{
    if(m_trials == 0)
        return 0.0;
    double result = 1.0;
    int Y = m_successes;
    int N = m_trials;
    double P = m_chance_of_success;
    double Q = 1.0 - P;
    int range = 0, np = 0, nq = 0, nnumer = 0, ndenom = 0;
    assert(Y <= N && Y >= 0);
    assert(P <= 1.0 && P >= 0.0);
    if(Y == 0)
        return pow(Q, N);
    if(Y == N)
        return pow(P, Y);
    // C(N,Y) == C(N,N-Y); use the smaller range of factors
    if(Y < N - Y)
        range = Y;
    else
        range = N - Y;
    np = Y;            // powers of P left to multiply in
    nq = N - Y;        // powers of Q left to multiply in
    ndenom = range;    // denominator factors of the combination
    nnumer = N;        // numerator factors count down to N-range
    while(np > 0 || nq > 0 || ndenom > 0 || nnumer > (N - range)){
        if(result >= 1.0 || nnumer == (N - range)){
            // result is large: divide, or multiply by a fraction
            if(ndenom > 0){
                result /= ndenom;
                --ndenom;
            }
            else if(nq > 0){
                result *= Q;
                --nq;
            }
            else if(np > 0){
                result *= P;
                --np;
            }
            else{
                throw(ProbabilityException(
                    "Binomial Probability computation error- "
                    "check success percentage between 0 and 1"));
            }
        }
        else if(result < 1.0 || np == 0){
            // result is small: multiply by a numerator factor
            if(nnumer > (N - range)){
                result *= nnumer;
                --nnumer;
            }
            else{
                throw(ProbabilityException(
                    "Binomial Probability computation error- "
                    "unknown error"));
            }
        }
        else{
            throw(ProbabilityException(
                "Binomial Probability computation error- "
                "possible value infinity or NaN"));
        }
    }
    return result;
}
Monte Carlo Methods
One of the most intuitive ways of estimating a probability is to use a random number generator and count the number of hits and misses. Again we will figure the probability of rolling exactly three ones out of five trials. Consider the following code.
int Accuracy = 1000;
int N = 5;
int Y = 3;
int successes = 0;
for(int i = 0; i < Accuracy; i++)
{
    int hits = 0;
    for(int j = 0; j < N; j++)
    {
        int die = rand() % 6 + 1;   // roll one die
        if(die == 1)
            hits++;
    }
    if(hits == Y)
        successes++;
}
double result = 1.0 * successes / Accuracy;
To estimate the chance of this experiment you must run the rolling of 5 dice many times, count how many times exactly 3 ones were rolled, and divide by the number of times the experiment was run. The problem with this method is that there is no guarantee of consistency. If the random number generator is not as random as one would like, then accuracy is moot. To increase accuracy you must run the experiment many times; to get a decent result you would usually run it at least 1000 times, and to get accuracy to more decimal places you have to increase the value of Accuracy further. In complex situations all this takes a lot of CPU cycles, proportionately more for the accuracy you need.
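That cost can be made concrete by wrapping the experiment in a function (the function name and seed below are my own, not from the article). The estimate's standard error shrinks only as 1/sqrt(Accuracy), so each extra decimal digit of accuracy costs roughly 100 times more experiments:

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Monte Carlo estimate of P(exactly Y ones in N rolls of a fair die).
// Standard error is roughly sqrt(p*(1-p)/Accuracy), so precision is
// bought with CPU cycles: 100x more runs per extra decimal digit.
double estimate(int N, int Y, int Accuracy)
{
    int successes = 0;
    for(int i = 0; i < Accuracy; i++)
    {
        int hits = 0;
        for(int j = 0; j < N; j++)
            if(rand() % 6 + 1 == 1)   // did this roll land a one?
                hits++;
        if(hits == Y)
            successes++;
    }
    return 1.0 * successes / Accuracy;
}
```

With 200,000 runs the estimate typically lands within a few tenths of a percent of the exact value 250/7776, while the closed-form classes give it exactly in constant time.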
The Geometric Probability
This probability class models the number of the trial on which an event first succeeds (or fails). Let's assume that a weapon has a 2% chance of misfiring in competition. What is the chance the weapon will misfire on the 4th shot? Here we only care about the 4th shot, P(Y=4), but we could use the cumulative P(Y<=4) to see the chance of a misfire before the 5th shot. When we say P(Y=4) we are asking "what is the chance of a fire, fire, fire, then a misfire?" For our example the arguments are simply:
GeometricProbability gp(4, .02);
GeometricProbability gpc(4, .02, Probability::LESSTHANOREQUAL);
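Under the hood, the geometric pmf is just q^(y-1) * p: y-1 failures followed by one success. A standalone sketch (the function name is mine, not the article's API) verifies the misfire example directly:

```cpp
#include <cassert>
#include <cmath>

// Geometric pmf: the first y-1 trials fail (each with chance q = 1-p)
// and trial y succeeds.
double geometric_pmf(int y, double p)
{
    return pow(1.0 - p, y - 1) * p;   // fire, fire, fire, then misfire
}
```

So P(Y=4) with p = 0.02 is 0.98^3 * 0.02, roughly 1.88%.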
The Negative Binomial Probability
This probability class is similar to the Geometric and models the number of the trial on which the Kth success (or failure) occurs. Like the last example, we could ask what is the chance that the weapon misfires for the third time on the 20th shot. This means that there are two misfires somewhere within the first 19 shots and that the 20th shot is the third misfire.
NegativeBinomialProbability np(20,3,.02);
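The pmf being computed here is C(y-1, k-1) * p^k * q^(y-k): the kth success lands exactly on trial y, so the other k-1 successes fall somewhere in the first y-1 trials. A standalone sketch (names are mine, not the article's API):

```cpp
#include <cassert>
#include <cmath>

// Negative binomial pmf: C(y-1, k-1) * p^k * q^(y-k).
// The kth success occurs exactly on trial y.
double neg_binomial_pmf(int y, int k, double p)
{
    double c = 1.0;
    for(int i = 0; i < k - 1; ++i)       // c becomes C(y-1, k-1)
        c = c * (y - 1 - i) / (i + 1);
    return c * pow(p, k) * pow(1.0 - p, y - k);
}
```

With k = 1 this reduces to the geometric pmf, which is a handy sanity check.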
The HyperGeometric Probability
This class models a more complex event: choosing from two sets. It is also the distribution that models modern day lotteries. Suppose we have a bag with 3 red chips and 5 black chips, and we want to know the chance of picking exactly one red chip in three picks. The bag is dark and picking from the bag is assumed random. The HyperGeometric probability takes 4 arguments.
- N - the population size: 8 chips total, 3 red and 5 black.
- n - the sample size: we are making 3 picks (this assumes chips are not put back into the bag after picking!).
- r - the set of items we are interested in: we want red chips and there are 3 in the bag.
- y - the number of items from that set we want in our sample: we only care about getting EXACTLY one. If we wanted '1 or more' or '2 or less', it would be a cumulative probability.
HyperGeometricProbability hgp(8,3,3,1);
A more practical example might be finding the probability of an event in a legal case. Twenty workers, three of whom are minorities, are assigned to fill two types of positions: 16 regular labor positions and 4 heavy labor positions. The minorities claim in a lawsuit that they were all deliberately chosen for heavy labor positions, while the employer contends the assignments are random. To figure out the chance that all the minorities are randomly chosen for heavy labor, we set the population size N=20; the sample size n=4, the number of heavy labor positions; r=3, the number of minorities; and y=3, the case where all the minorities are chosen.
HyperGeometricProbability hgp(20,4,3,3);
The Poisson Probability
The Poisson probability models events based around a known average. The average is based on events that can happen at any time and are considered instantaneous (meaning two events don't happen at the exact same time). For example, the number of typing errors per page or the number of traffic accidents at a street intersection in a year has an average that is Poisson-"ish". If we know that a certain intersection has an average of 7 traffic accidents a year and we want to print the probability that there will be 5 or fewer accidents this year...
cout << PoissonProbability(5, 7, Probability::LESSTHANOREQUAL).GetResult();
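The Poisson pmf is lambda^y * e^(-lambda) / y!, and the cumulative probability just sums those terms. A standalone sketch (the function name is mine) that avoids forming lambda^y and y! separately by computing each term from the previous one:

```cpp
#include <cassert>
#include <cmath>

// Cumulative Poisson probability P(Y <= y) for average rate lambda.
// Each pmf term is the previous one times lambda/i, which sidesteps
// large powers and factorials entirely.
double poisson_cdf(int y, double lambda)
{
    double term = exp(-lambda);   // P(Y=0)
    double sum = term;
    for(int i = 1; i <= y; ++i)
    {
        term *= lambda / i;       // P(Y=i) = P(Y=i-1) * lambda / i
        sum += term;
    }
    return sum;
}
```

For the intersection example, P(Y<=5) with lambda = 7 comes out to roughly 0.30: a 30% chance of 5 or fewer accidents this year.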
Set Theory and Composite events
One thing not included with these probability classes is the set theory involved in combining probabilities to answer your own questions. To model more complex events you need to understand the concepts of independence and mutual exclusivity between sub-events. Probability has addition and multiplication rules for independent and mutually exclusive events; these involve the concepts of union and intersection in set theory. It is beyond the scope of this article to explain them in detail; the reader should learn about Bayes' theorem as well. There are numerous examples and explanations on the web.
For example, the chance that a seven is rolled with two dice takes a bit more thought. Each die is rolled independently of the other, so for every pair of random variables Y1 and Y2 where Y1 + Y2 = 7 we multiply the individual probabilities, then add the results because the pairs are mutually exclusive. The notation is P(Y1=a) intersect P(Y2=7-a), which by independence equals P(Y1=a) * P(Y2=7-a).
int Y1, Y2;
int Roll = 7;
double p = 1.0/6.0;
double result = 0.0;
for(Y1 = 1; Y1 <= 6; Y1++)
{
    Y2 = Roll - Y1;   // the face the second die must show
    // Each die is a single trial; "success" is landing the required face.
    BinomialProbability bp1(1, 1, p);
    BinomialProbability bp2(1, 1, p);
    result += bp1.GetResult() * bp2.GetResult();
}
cout << result;   // 6 * (1/36) = 1/6
Also included with each probability class are the Expected Value (or mean) and the Variance of the distribution. These can be retrieved with the GetExpectedValue and GetVariance functions.
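For reference, these have simple closed forms; for the binomial distribution they are E[Y] = N*p and Var(Y) = N*p*q. A tiny sketch (these helper names are mine; the article's classes expose the values through the methods just mentioned):

```cpp
#include <cassert>
#include <cmath>

// Closed forms for the binomial distribution's mean and variance.
double binomial_mean(int N, double p)     { return N * p; }
double binomial_variance(int N, double p) { return N * p * (1.0 - p); }
```

For the five-roll dice example, the expected number of ones is 5/6 and the variance is 25/36.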