Manipulating Prime Numbers

26. Manipulating Prime Numbers#

In this topic, we will manipulate prime numbers. The goal is to work on for loops, tuples, lists, and slicing, while also intuitively exploring the properties (complexity, randomization) of algorithms and their impact on the programs we develop.

26.1. Primality of an Integer#

Write a function is_prime that determines whether an integer is prime or not. This function must be deterministic, meaning that for a given input, it always produces the same result. It should be based on the algorithm of integer division by the first \(\sqrt(n)\) integers.

from math import sqrt

def is_prime(n):
    pass

26.1.1. Counting Prime Numbers#

We want to represent the quantity of prime numbers among integers. Write a function that returns a list of \(7\) elements, where the \(i\)th element represents the quantity of prime numbers between \(1\) and \(10^i\).

def count_primes(k):
    pass

quantity = count_primes(6)
quantity

We can represent the number of primes using the following function.

def plot_primes_count(quantity, k):
    x = [10**i for i in range(0, k+1)]
    plt.xscale("log")
    plt.yscale("log")
    plt.title('Quantity of Primes in Integer')
    plt.ylabel('Number of Primes')
    plt.xlabel('Range')
    plt.plot(x, quantity)
    plt.show()  

#plot_primes_count(quantity, 6)
    

26.2. Generating Prime Numbers#

In cryptography, we often need to generate very large prime numbers. Let’s attempt to do the same. Let’s see what we can achieve by using the is_prime function.

Write a function generate_primes, which takes as parameters n (the number of prime numbers to generate), min_value (the minimum value for the generated prime numbers), and prime_test_fun (a function that tests whether a number is prime). The function returns a list of n randomly generated prime numbers between min_value and 10 * min_value.

The prime_test_fun parameter is necessary because later we will study a second method to test a number’s primality.

def generate_primes(n, min_value, prime_test_fun):
    pass

generate_primes(10, 100000, is_prime)

26.2.1. Limit of This Method#

Find the limit of your machine. From what min_value does your computer start to take some time, say 3 or 4 seconds, for n=1000? On a recent computer, we can typically reach min_value=10**8.

Note: The numbers we will manipulate are tiny compared to those used in cryptography. Prime numbers between 10**8 and 10**9 require 30 bits, whereas in cryptography, we typically look for numbers with 1024, 2048, or 4096 bits…

primes = generate_primes(1000, 10**8, is_prime)

Fermat’s Test

Fermat’s test is a method to determine whether a number is prime or not. It relies on advanced mathematics that we will not explore here. However, it is relatively simple to describe. Here is the pseudocode for its algorithm.

fonction FermatPrimalityTest(N)
    randomly choose a positive integer a < N
    if a**(N-1) ≡ 1 mod N then
        return Yes (N is probably prime)
     else
        return No (N is definitely composite)

This method is a Monte Carlo type: if it detects that a number is composite, it is correct; however, it may be wrong if it claims that the number is prime.

Initially, we will focus on a slightly simpler version where the value of a is not randomly determined but is always 2.

fonction FermatPrimalityTest2(N)
    if 2**(N-1) ≡ 1 mod N then
        return Yes (N is probably prime)
     else
        return No (N is definitely composite)

Write a function fermat_test(n, a) that implements this method. The function takes as parameters an integer to test and the base a.

def fermat_test(n, a):
    pass

fermat_test(37, 2) # True
fermat_test(38, 2) # False
fermat_test(341, 2) # True, but... 341 = 11*31!
fermat_test(341, 3) # False, we found a witness, 341 is not prime.

26.2.2. Repeating the Method#

Numbers for which the algorithm is wrong are called base-2 pseudoprimes. There are only 22 such numbers, less than 10,000, that are base-2 pseudoprimes. The first values are 341, 561, 645, and 1105. The probability that this algorithm makes an error on an integer N tends to 0 as the number of bits in N increases. For a 1024-bit number declared prime by the algorithm with a = 2, there is a 1 in \(10^{41}\) chance that it is not prime.

If this probability seems too high, we can repeat the method by changing the base. The PGP test consecutively uses the following four bases: 2, 3, 5, and 7.

Write a function gpg_test that takes an integer n to test and a number k of repetitions. The function returns true if after k repetitions of the primality test, there is still no proof that the number is not prime.

import matplotlib.pyplot as plt
import random

def gpg_test(n, k=4):
    pass

gpg_test(341, k=4)

26.2.3. Representing the Error#

We can use the following code to generate numbers from 1 to 10**5 and count the number of times the function makes a mistake. We will plot the results on a curve based on the number of repetitions we perform.

def test_gpg_test(k):
    faults = []
    for n in range(10, 10**5):
        if gpg_test(n, k=k) :
            if not is_prime(n):
                faults.append(n)
    return faults


def plot_gpg_faults():
    plt.xscale("log")
    plt.yscale("log")
    plt.title('Number of integer failing at GPG test\n(i.e. pseudo-primes for base 2, 3, 5 and 7)')
    plt.ylabel('Number of pseudo-primes')
    plt.xlabel('Integer value')
    for k in range(1, 5):
        faults = test_gpg_test(k)
        plt.plot(faults, range(1, 1+len(faults)), label=f"False prime for k={k+1}")
    plt.legend(loc='upper left')
    plt.show()       

#plot_gpg_faults()

26.3. Comparison#

We have seen two methods for determining whether a number is prime or not. The second method is not always correct; it may sometimes be wrong. Is this trade-off acceptable? To answer this, we need to study the execution time of both methods. To do this, we will “simply” measure the execution time.

The following code contains two functions that you can reuse.

from random import randint
from timeit import default_timer as timer


def generate_statistics():
    x, y, y2 = [], [], []
    for i in range(1, 8):
        v = 10**i
        x.append(v)
        start = timer()
        generate_primes(10000, v, is_prime)
        end = timer()
        y.append(end-start)
        
        start = timer()
        generate_primes(10000, v, gpg_test)
        end = timer()
        y2.append(end-start)

    return x, y, y2

def plot_times(x, y, y2):
    plt.plot(x, y, label="Classic Prime Test")
    plt.plot(x, y2, label="Fermat Prime Test")
    plt.yscale("log")
    plt.xscale("log")
    plt.title('Time to Generate of 1000 Primes')
    plt.legend()
    plt.ylabel('Running time (seconds)')
    plt.xlabel('Range of search between x and 10*x')
    plt.show() 


#x, y, y2 = generate_statistics()
#plot_times(x, y, y2)

26.3.1. Very Large Numbers#

Python allows us to handle arbitrarily large integers. Therefore, it is entirely possible to test the primality of very large numbers. On the machine of the author of this topic, it is possible to generate a prime number larger than \(10^{500}\) in one minute.

a = 300
b = 304
x = random.randint(10**a, 10**b)
#print(x)
#print(gpg_test(x))

#generate_primes(1, 10**500, gpg_test)