Primality Test

gbenussi

4.25/5 (4 votes)

25 Sep 2012CPOL8 min read

52.5K

168

This article show how we optimize a Primality Test for know if a number is prime or not, and also presents an algorithms (Sieve of Eratosthenes) for calculating the prime numbers for a numbers less than or equal a given numbers efficiently.

Download algorithms from this article - 1 Kb

Introduction

In this article I'll show you how the brute force algorithm for calculate if a number is primer or not can be optimized. I'll show step by step, with proofs, how this is done.

Calculate prime numbers rapidly is often need in programming contest, where speed is crucial. Additionally i explain a algorithm that is usefult and rapidly when we have to know what numbers of a set of numbers are primes.

This article is divide into two parts, the first parts shows how to know if a particularly number is prime or not. The second part shows how to know what numbers of a set of numbers are primes.

I say from now that this article covers mainly the mathematical part of the methods for know if a number is prime or not. So if you're not friend of mathematics, you get bored.

Note 1: I will use C++ code for the examples.

Note 2: We consider that 1 is not prime.

First Part:

Brute Force Algorithm

By definition, a prime number is a number that has no positive divisors other than 1 and itself. By this definition born the most easy method to calculate if a number "number" is prime or not is. Consist in check for each number between 2 and number-1 if number is divisible at least by one of them. We call this method IsPrimeByBruteForce, which receive one parameter, the number to evaluate his primality, and returns a boolean value indicating if is prime or not.

Here is the code for IsPrimeByBruteForce:

C++

bool IsPrimeByBruteForce(int number)
{
    if(number<2)
        return false;
    for(int i=2; i<number; i++)
    {
        //A number "number" is divisible by another number "i"
        //if the rest of the division of number divided i equals to zero
        if(number%i==0)
            return false;
    }
    //If no exist a number between 2 and number-1 that divides number
    return true;
}

Now becomes the question: ¿Can this be optimized?

The answer is: Yes (because for this i'm writing this article), and this can be done by not checking the divisibility of number for each number between 2 and number-1. Here becomes the second method that we will see.

Optimizing the algorithm

I said before that we can optimized the I<code>sPrimeByBruteForce algorithm by not checking the divisibility of number for each number between 2 and number-1. So, for what numbers?. The answer to this question is that we have to check the divisibility of number only for each number between 2 and number/2.

The proof of this as follows:

If a number "number" is not prime, then exists at least two numbers (we call them a and b) that multiplied equals number.

(1)

I say that we have to check the divisibility of number for each number between 2 and number-1. So if we are right, in the above equation if "a" is greater than number/2, then "b" must be less than or equal than number/2. Then we have to prove:

If we prove this, we can be sure that only we need to check each number between 2 and number/2 to prove if number is prime or not. Because "a" or "b" (in the above ecuation) necessarily is less than or equal than number/2 (if we are right).

Now suppose that a is greater than number/2. That is

(2)

Then, a is at least number/2 + 1 (Because number/2 + 1 is greater than number/2) Now, we suposse that a is number/2 + 1.

Then we have (replacing (2) in (1)):

Then we have:

(3)

Now we have to prove that

Now, we suppose:

(4)

Replacing (3) in (4) and continue solving the equation:

Then we have prove that if number>=2, then "a" or "b" is less than or equal to number/2.

Another thing that i want to have in mind is the two possible cases of the value of number/2 due to if number is odd or even. Because the division of our program calculate the floor value of the division (e.g. 5/2 = 2).

Case 1: If number is even, for example 6, then number/2 is 3 with no remainder, so we do not worry about this case.

Case 2: If number is odd, for example 5, then number/2 is 2 with residue 1. In this case, we have to prove his divisibility until 2, because if we multiply 3 * 2 (floor(number/2) * 2), the result is major than number.

Said this, we conclude that is sufficient to check de divisibility until floor(numbe/2) that is what gives the computer.

So, if we want to check if a number is prime or not, we only have to check if is divisible for each number between 2 and floor(number/2). We call this algorithm "IsPrimeOptimizationOne", that receives the same parameters and returns the same type that "IsPrimeByBruteForce" method. We define this function here:

C++

bool IsPrimeOptimizationOne(int number)
{
    if(number<2)
        return false;
    if(number==2)
        return true;
    for(int i=2;i<=number/2;i++)
    {
        if(number%i==0)
            return false;
    }
    //If no exist a number between 2 and number/2 that divides number
    return true;
}

The question than becomes now is: can the method above be optimized? The answer is yes.

We now prove that we only need to check the divisibility of number until square root of number (we call this sqrt(number)). We prove this below (in the same way that out first prove):

(5)

(6)

Now, we replacing the value of b (6), in our inequation (5):

Then we prove that for number>=0, we can prove his primality only checking his divisibility for each number between 2 and sqrt(number).

You maybe ask how i know that this method is best that the first one. We prove this here:

We have proved that if number>=2, then the square root of number is less than or equal that number/2. So, if we check until sqrt(number), we do less comparisons than if we check until number/2. The algorithm for this is equal than "IsPrimeOptimizationOne" except the number until we check. We call this algorithm "IsPrimeOptimizationTwo", with the same parameters and returns that "IsPrimeByBruteForce" and "IsPrimeOptimizationOne" methods. The algorithm is defined here:‎

C++

bool IsPrimeOptimizationTwo(int number)
{
    if(number<2)
        return false;
    if(number==2)
        return true;
    //sqrt() function is defined in cmath library (#include&ltcmath&gt)
    for(int i=2;i<=sqrt(number);i++)
    {
        if(number%i==0)
            return false;
    }
    //If no exist a number between 2 and number/2 that divides number
    return true;
}

One last optimization is based in the fact that if number is even, then it's not prime. If a number is divisible by 2, then is divisible by 4, 6, 8, ..., number/2. So we only need to check if is divisible by 2, otherwise we need to check only number of the form 2*k + 1 (i.e. odd numbers). So in a new algorithm based in "IsPrimeOptimizationTwo", we check before the for loop if number is even. If it is, then is not prime. Otherwise we checking the divisibility of number by i, staring i from 3 until sqrt(number), increasing i by two in each cycle. We call this algorithm "IsprimeOptimizationThree", that is defined here:

C++

bool IsPrimeOptimizationThree(int number)
{
    if(number<2)
        return false;
    if(number==2)
        return true;
    if(number%2==0)
        return false;
    for(int i=3;i<=sqrt(number);i += 2)
    {
        if(number%i==0)
            return false;
    }
    return true;
}

That's all the optimization that i know that is possible to know if a number is prime or not, if you know another optimization, please comment.

Second Part

Note: I have received many comments because exists a previous article that covers this method, if you want see it, here is the link to the article: Finding prime numbers. I mention briefly this here because can be useful knowing the method and when use it.

Previously we see algorithms for know if a particular number is prime or not. But if we need to know all numbers prime for a set of numbers. We can call the funcion "IsPrimeOptimizationThree" for each number in the set, but this is inefficient because we repeat the same divisions for each number. For example, if we check if the set {13,14,26}, we divide the thre numbers by 2, the three numbers by 3, etc.

For this case, the best algorithm to use is an algorithm called Sieve of Eratosthenes. This algorithm avoids making repeated divisions. For explaint this algorithm, i will use a image that i liked from Wikipedia:

Note: All the images below are from Wikipedia, specifically, this article: Sieve of Eratosthenes

We want to know all the prime numbers less than or equal to 120. First, we have all the numbers in gray:

Then we discard all numbers multiples of 2, because they aren't prime numbers.

Will do the same thing for all multiples of 3.

The next number in gray is 5, so we do the same thing for all multiples of 5.

We do the same until the min number in gray is less than or equal sqrt(number) (in this case sqrt(120) is aproximately 10.9544, his floor value is 10).

You maybe ask how this can be implemented. Basically will create an array of booleans values initialized to false (in gray) and for each number i between 2 and sqrt(number) we mark the multiples of i to true (colour).

The algorithm described above is called "SieveOfEratosthenes" and receives as parameter a number "number" and returns an array (really a pointer of type bool) "prime" of size "number" containing for each number i a boolean value prime[i] indicating if the number i is prime or not. This algorithm is defined below.

C++

bool * SieveOfEratosthenes(int number)
{
    bool * prime = new bool[number+1];
    for(int i=0;i<=number;i++)
        prime[i] = true;
    //0 and 1 are not prime, and we mark because we do not check in the algorithm
    //(because will cause bad results, if we divide by 0, we go to hell, and if we divide one by one, we will mark all numbers a nonprime.
    prime[0] = false;
    prime[1] = false;
    int squareRoot = sqrt(number);
    for(int i=2;i<=squareRoot;i++)
    {
        //If is gray (Is prime)
        if(prime[i])
        {
            //We start j by the next multiple of i (that is: 2*i), and we increase it by i each time until j is less than or equal to sqrt(number)
            for(int j=2*i;j<=number; j += i)
                prime[j] = false;
        }
    }
    return prime;
}

The resulting array is an array that if the value of primes[i] is true, then i is prime.

An example of use of this algorithm here:

C++

int main()
{
    int number = 30;
    bool * primes;
    primes = SieveOfEratosthenes(number);
    for(int i=0;i<number;i++)
    {
        if(primes[i])
            cout << i << " ";
    }
    return 0;
}
//Produces:
//2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113

Conclusion

We saw an analysis of how we can optimize and algorithm that prove that out method is valid and performed less operations that another algorithms also presented.

That was all, I hope that this article be interesting and/or helpful for you. I'll open for news algorithms for optimizing the primality test. So if you know someone, please comment.-

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)