# A Suite of Discrete Probability Classes

**Joe Nellis**on

**February 20th, 2003**

### WEBINAR:

On-Demand

Building the Right Environment to Support AI, Machine Learning and Deep Learning

__Environment:__Any, C++

### Abstract

This article covers five discrete probability distributions most common for use in computer simulations, gaming, artificial intelligence decision making, and environment modeling. The covered distributions are the Binomial, Geometric, NegativeBinomial, HyperGeometric, and Poisson probability distributions, along with an abstract probability class. The classes differ from other routine mathematical attempts at probability computation with a few implementation tricks that extend the range of allowable inputs. A comparison with Monte Carlo techniques to probability computation is given.

### Overview

If you are a computer science or mathematics major, it is likely you will take a course on probability and statistics. Probability is a large area of study, but basic concepts can be learned by anyone. The following five probability classes are most likely the first you will encounter in studies and fall under the heading of 'discrete' probabilities, meaning that the events they model happen or they don't. I will give a brief introduction to each probability and an example of when it might be used. Discrete probabilities use a lot of factorials or often require multiplying very large numbers with very small numbers. An example of some of the problems this causes and the work around is discussed, along with a very common alternative many programmers use to generate probabilities using random numbers.

### Probability Basics

### The Binomial Probability

The binomial probability models the probability of a number of successful trials, Y (also called a 'random variable'), out of a total number of trials (N). Each trial has the EXACT same chance of success or failure. For example, when rolling a die, the chance to roll a one is 1/6. If we say that success, denoted by p, p=1/6, is the chance to not roll a one must be 1.0 - p, denoted by q. The symbol p means success and the symbol q means failure. Let's say we have five dice or we are rolling the same die over five times and we want to know the probabilty that a one comes up exactly three out of the five times, three being the number of successful trials of rolling a one. The full equation for the Binomial Probability is:

P(Y) = N! / ((N-Y)! * Y!) * pand for this example N = 5, Y = 3, p=1/6, and q=5/6. The probability is notated like so: P(Y=3). If we wanted to know the probability of at least three or three or less ones in five rolls we would say P(Y<=3). This is called a cumulative probability and is simply the addition of P(Y=0)+P(Y=1)+P(Y=2)+P(Y=3). The abstract class Probability contains a member variable called m_RVT, meaning 'random variable type', which is an enumeration as to the sign of P(Y ? y). The GetResult() method looks at this enumeration and calls the ComputeResult() function in the derived class for each P(Y=y) in the cumulative probability.^{Y}* q^{(N-Y)}

double Probability::GetResult() throw (ProbabilityException) { if(m_probability != UNDEFINED ) return m_probability; try{ m_probability = 0.0; int i = 0; switch(m_RVT) { case EQUAL: m_probability = ComputeResult(); break; case LESSTHAN: for(SetRV(i); i < m_RV; SetRV(++i)) m_probability += ComputeResult(); break; case GREATERTHANOREQUAL: for(SetRV(i),m_probability = 1.0; i < m_RV; SetRV(++i)) m_probability -= ComputeResult(); break; case GREATERTHAN: for(SetRV(i),m_probability = 1.0; i <= m_RV; SetRV(++i)) m_probability -= ComputeResult(); break; case LESSTHANOREQUAL: for(SetRV(i); i <= m_RV; SetRV(++i)) m_probability += ComputeResult(); break; case NOTEQUAL: m_probability = 1.0 - ComputeResult(); return m_probability; default: throw ProbabilityException("Probability error - Random Variable type is unset"); } } catch(ProbabilityException pe) { m_probability = UNDEFINED; SetRV(m_RV); throw pe; } SetRV(m_RV); return m_probability; }

The definition for the abstract class Probability is small and fairly straightforward:

```
class Probability
{
public:
enum RandomVariableType { EQUAL, LESSTHAN, GREATERTHAN,
LESSTHANOREQUAL, GREATERTHANOREQUAL,
NOTEQUAL };
/* Constructor Parameters:
rv - The Random Variable value.
rvt - Random Variable type, whether it is cumulative, and
which way it is cumulative.
prob - The probability of the event. This can be set
beforehand if it is known and the GetResult function
will return it.
*/
Probability( int rv=0, RandomVariableType rvt=EQUAL,
double prob=UNDEFINED)
:m_RV(rv), m_RVT(rvt), m_probability(prob){};
virtual ~Probability(){};
class ProbabilityException
{
public:
ProbabilityException(const char* whatString):What(whatString){}
inline const char* what() const { return What; }
protected:
const char* What;
};
virtual double GetResult() throw (ProbabilityException) ;
virtual double GetExpectedValue() const = 0;
virtual double GetVariance() const = 0;
protected:
virtual double ComputeResult() = 0;
virtual void SetRV(int Y) = 0;
RandomVariableType m_RVT;
double m_probability;
int m_RV;
};
```

Each specialized probability class constructor will call this base constructor, like so:

BinomialProbability::BinomialProbability(int N, int Y, double p, Probability::RandomVariableType rvt) :m_trials(N), m_successes(Y), m_chance_of_success(p), Probability(Y, rvt) { assert(Y<=N && Y >=0); assert(p >=0.0 && p<=1.0); }

### Tackling Factorials and Large Multiplications

Any factorial over 12! promptly causes an overflow in an unsigned double. All of the probability classes involve multiplying a very large number by a very small number to get an end value between 0.0 and 1.0 inclusive. I use this knowledge to my advantage when computing the result so that overflow is not an issue. Each equation in the class's ComputeResult() method is broken up into its individual factors. If the running result is over 1.0, we know that we want to divide by something or multiply by a fraction. If the running result is under 1.0, we want to multiply by factor. This behavior causes the result to float around the value of 1.0 as it is being computed. This also reduces loss of significant digits. Most of the probability classes have a combination that needs computing, which is basically a factorial divided by two other factorials. An interesting and very handy cancelation of digits occurs in a combination that reduces the number of multiplications dramatically. Many uses of factorials often require division by other factorials or multiplication by small numbers, so this general method of looking at the whole equation, optimizing it from what the range of the result will be is a good way to calculate factorials or other equations that normally have intermediate results of very large or small numbers.

double BinomialProbability::ComputeResult() throw (ProbabilityException) { if(m_trials == 0) return 0.0; // initialize some variables double result = 1.0; int Y = m_successes; int N = m_trials; double P = m_chance_of_success; double Q = 1.0 - P; int range = 0, np =0, nq = 0, nnumer = 0, ndenom = 0; // validate assert( Y<=N && Y >=0); assert( P <= 1.0 && P >=0.0); // check optimizations if(Y == 0){ return result = pow(Q,N); } if(Y == N){ return result = pow(P,Y); } // reorder the factorials to account for cancellations // in numerator and denominator. if(Y < N-Y){ range = Y; // N-Y! cancels out } else{ range = N-Y; // Y! cancels out } np = Y; nq = N-Y; ndenom = range; nnumer = N; while(np > 0 || nq > 0 || ndenom > 0 || nnumer >(N-range)){ // If the result is greater than one, we want to divide by // a denominator digit or multiply by percentage p or q. // If we are out of numerator digits, finish multiplying with // our powers of p or q or dividing by a denom digit. if(result >= 1.0 || nnumer ==(N-range)){ if(ndenom > 0){ //m_resut *= (1.0/ndenom); result /= ndenom; --ndenom; } else if(nq > 0){ result *= Q; --nq; } else if(np > 0){ result *= P; --np; } else { throw(ProbabilityException ("Binomial Probability computation error - check success percentage between 0 and 1")); } } // If the result is less than one, we want to multiply // by a numerator digit. If we are out of denominator digits, // powers of p or powers of q then multiply rest of result // by numerator digits. else if(result < 1.0 || np==0 ){ if(nnumer >(N-range)){ result *= nnumer; --nnumer; } else{ throw(ProbabilityException ("Binomial Probability computation error - unknown error")); } } else{ throw(ProbabilityException ("Binomial Probability computation error - possible value infinity or NaN")); } } return result; }

### Monte Carlo Methods

One of the most intuitive ways for figuring probability is by using a random number generator and then counting the number of hits or misses. Again, we will try to figure out the probability of rolling exactly three ones out of five trials. Consider the following code.

int Accuracy = 1000; int N = 5; int Y = 3; int successess = 0; for(int i =0; i < Accuracy; i++) { int hits = 0; for(int j =0;j < N ; j++) { die = rand()%6 +1; // give a random number from 1 to 6 if(die == 1) // check for a one rolled hits++; } if(hits == Y) // check to see if 3 hits successes++; } double result = 1.0 * successes/Accuracy;

To get the percentage of chance of this experiment, you must run the rolling of five dice many times and then count how many times that three ones were rolled and divide by the number of times the experiment was done. The problem with this method is that there is no guarantee for consistency. If the random number generator is not as random as one would like, accuracy is moot. To increase accuracy with this method, you must run the experiment many times. Usually, to get a decent result, you would run the experiment at least 1000 times. To get accuracy to more decimal places, you have to increase the value of Accuracy. In complex situations, all this takes a lot of CPU cycles, proportionately more for the accuracy you need.

### The Geometric Probability

This probability class models the number of the trial for which an event succeeds (or fails). Let's assume that a weapon has a 2% chance of misfiring in competition. What is the chance the weapon will misfire on the 4th shot? Here, we only care about the 4th shot P(Y=4), but we could use the cumulative P(Y<=4) to see the chance of a misfire before the 5th shot. When we say P(Y=4), we are saying "what is the chance for a fire, fire, fire, then a misfire?" For our example, the arguments are simply:

```
GeometricProbability gp(4, .02);
// and for cumulative
GeometricProbability gpc(4, .02, Probability::RandomVariableType
::LESSTHANOREQUAL);
```

### The Negative Binomial Probability

This probabilty class is similar to the Geometric and models the number of the trial for which the Kth event succeeds (or fails). Like the last example, we could ask what is the chance that the weapon misfires three times by the 20th shot. This means that there will be two misfires within the first 19 shots and that the last shot is the third misfire.

NegativeBinomialProbability np(20,3,.02);

### The HyperGeometric Probability

This class models a more complex event choosing from two sets. It is also the probability that models modern day lotteries. Suppose we have a bag with three red chips and five black chips. We want to know the chance of picking one red chip out of three picks. The bag is dark and picking from the bag is assumed random. The HyperGeometric probability considers four arguments.

- N—the population size, which is eight chips total, three red and five black
- n—the sample size, we are making three picks (this assumes chips are not put back into the bag after picking!!)
- r—the set of items we are interested in. We want red chips and there are three in the set.
- y—the number of items from the sample set we are interested in. We only care about getting EXACTLY one. If we wanted to consider one or more or two or less or three or less, it would be a cumulative probability.

HyperGeometricProbability hgp(8,3,3,1);

A more practical example might be finding the probability of an event in a legal case. 20 workers consisting of three minorities are chosen from to fill two types of positions, 16 regular labor positions and four heavy labor positions. The minorities claim in a lawsuit that all of them are chosen for heavy labor positions while the employer contends the assignments are random. To figure out what the chance that all minorities are chosen for the heavy labor job, we set our population size, N=20, the sample size is, n=4, for the number of heavy labor positions, the set of items we are interested in is r=3 for the number of minorities and we are interested in when y=3 or when all the minorities are chosen.

HyperGeometricProbability hgp(20,4,3,3);

### The Poisson Probability

The Poisson probability models events based around a known average. The average is based on events that can happen at any time and are considered instantaneous (meaning two events don't happen at the EXACT same time). For example, the number of typing errors per page or the number of traffic accidents at a street intersection in a year have an average that is Poisson-"ish". If we know that a certain intersection has an average of seven traffic accidents a year and we want to output the probability that there will be five or fewer accidents this year...

cout<< PoissonProbability(5,7,Probability ::RandomVariableType ::LESSTHANOREQUAL).GetResult();

### Set Theory and Composite Events

One of the things not included with these probability classes is the set theory involved in mixing the probabilities to get your own relevant answers. To model more complex events, you need to understand the concept of Independence and Mutual Exclusiveness between sub events. Probability has addition and mulitplication rules for independency and mutual exclusiveness between events. These involve the concepts of Union and Intersection in Set theory. It is beyond the scope of this article to explain it in detail. The reader should learn about the Bayes theorem as well. There are numerous examples and explanations on the Web.

For example, the chance that a seven is rolled with two dice involves a bit more thought. To show modeling of composite events, we would want to multiply the probabilities of Random Variables, Y1 and Y2, where Y1 + Y2 = 7, because the each event of rolling a die is independent of the other. The notation being P(Y1 = a) intersect P(Y2 = 7-a) which is P(Y1=a) * P(Y2=7-a).

```
int Y1, Y2;
int N = 6;
int Roll = 7;
double p = 1.0/6.0;
double result = 0.0;
for(Y1=1; Y1<=6; Y1++)
{
Y2 = Roll - Y1;
BinomialProbability bp1( N, Y1, p);
BinomialProbability bp2( N, Y2, p);
// The answer is all the ways a seven can be rolled so we
// accumulate each independent
// probability of P(y1=a)*P(y2=7-a)
result += bp1.GetResult() * bp2.GetResult()
}
cout << result;
```

Also included with each probability is the Expected Value or Mean of the distribution as well as the Variance. These can be retrieved with the GetExpectedValue() and GetVariance() functions.

### Downloads

Download demo project - 81 KbDownload source - 19 Kb

## Comments

## comment about: "A Suite of Discrete Probability Classes"

Posted byGerson do Nascimentoon08/25/2018 07:00amDear Joe Nellis, your work is sensational. Please make more posts like this, such as predictive analytics, predictive models (predictive models), machine learning, deep learning, data science ... etc. Your work is wonderful. Congratulations on the post.

Reply