Prime Polynomial Approximation

June 5, 2024

Prime numbers

Prime numbers are a fundamental concept in mathematics, with applications in various fields such as cryptography, number theory, and computer science. While prime numbers have unique properties that make them interesting to study, they can also be challenging to work with due to their unpredictable distribution. In this article, we explore how to approximate prime numbers using polynomial functions in Python with NumPy, a powerful library for numerical computing.

Why Approximate Prime Numbers?

The distribution of prime numbers is a well-known problem in mathematics, with no simple formula to generate all prime numbers. While there are algorithms to efficiently check if a number is prime, finding the nth prime number or generating a list of prime numbers within a range can be computationally expensive. Approximating prime numbers using polynomial functions offers a way to estimate prime numbers without explicitly checking each number for primality.

Polynomial Approximation of Prime Numbers

Polynomial functions are widely used in mathematics to approximate complex functions or data points. In the context of prime numbers, we can use polynomial functions to generate a sequence of numbers that closely resemble prime numbers. By fitting a polynomial function to a set of known prime numbers, we can predict the next prime number in the sequence.

Let's consider a simple example of approximating prime numbers using a polynomial function. We start by generating a list of prime numbers using a sieve algorithm or any other method. Next, we fit a polynomial function to the list of prime numbers using NumPy's polyfit function. The resulting polynomial function can then be used to predict prime numbers beyond the list of known primes.

import numpy as np

# Generate a list of prime numbers
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
# The next prime number is 31

# Fit a polynomial function to the list of prime numbers
coefficients = np.polyfit(range(len(primes)), primes, deg=len(primes)-1)
poly_func = np.poly1d(coefficients)

# show the loss of the polynomial function
loss = np.square(poly_func(range(len(primes))) - primes).sum()
print(f"Loss: {loss}")

# Predict the next prime number in the sequence
next_prime = poly_func(len(primes))
print(f"The next prime number is: {next_prime}")

Here are the results: Loss: 6.680285609990136e-20 The next prime number is: -426.0000000029478.

As we can see the loss is very small, but instead of predicting the next prime number, we get a negative number. So our polynomial function has zero predictive power.

A basic neural network approach

here I try to fit a polynomial function to approximate a sequence of prime numbers. The code generates a set of input data x ranging from 0 to 1 with 303 data points. The output data y is constructed as an array of prime numbers from 2 to 2000, with a total of 2000 elements.

I then define a function generate_polynomial_features to generate polynomial features up to a specified degree. The is_prime function checks if a number is prime.

After generating the input features X by applying the generate_polynomial_features function, I randomly initialize weights for the polynomial model.

In the training loop, I perform forward pass to compute predicted y, calculate the loss (mean squared error), and then perform backpropagation to update the weights. This process repeats for 2,000,000 iterations.

Finally, the resulting polynomial function is printed in the form of y = w0 + w1 * x^1 + w2 * x^2 + ... + wn * x^n, where w0, w1, ..., wn are the learned weights.

import numpy as np
import math

# Function to generate polynomial features
def generate_polynomial_features(x, degree):
    features = np.zeros((len(x), degree))
    for i in range(degree):
        features[:, i] = np.power(x, i + 1)
    return features

# y should be an array of prime numbers 2000 elements long
def is_prime(n):
    """Function to check if a number is prime"""
    if n <= 1:
        return False
    if n <= 3:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

# Create random input and output data
x = np.linspace(0, 1, 303)

# Generate y as an array of prime numbers
y = np.array([i for i in range(2, 2002) if is_prime(i)])
# y = np.sin(x)

# Define the degree of the polynomial
degree = 303

# Generate polynomial features
X = generate_polynomial_features(x, degree)

# Randomly initialize weights
weights = np.random.randn(degree)

learning_rate = 1e-3

for t in range(2000000):
    # Forward pass: compute predicted y
    y_pred =, weights)

    # Compute and print loss
    loss = np.square(y_pred - y).sum()
    if t % 100 == 30:
        print(t, loss)

    # Backpropagation to compute gradients of weights with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    gradients =, grad_y_pred)

    # Update weights
    weights -= learning_rate * gradients

# Print the result
result = "y = "
for i in range(degree):
    if i == 0:
        result += f"{weights[i]}"
        result += f" + {weights[i]} * x^{i+1}"

# print(result)

The best loss achieved during training was 11017.60, which occurred at iteration 1999930. This loss represents how well the polynomial function fits the sequence of prime numbers. Despite being relatively high, it's important to note that prime numbers can be quite irregularly distributed, making them challenging to model accurately with a polynomial function.

Damn it! I was not able to predict the next prime number. These prime numbers are so unpredictable haha. Of course, I knew that from the start, and this is just a fun experiment. I hope you enjoyed it as much as I did. If you have any questions or suggestions, feel free to reach out. Happy coding!