Calculating Linear or Polynomial Regression in Python

This program performs a linear or polynomial regression on a given dataset and visualizes the result with a graph.

It uses two external Python modules:

  • numpy for mathematical operations and calculating the polynomial coefficients.
  • matplotlib for generating and displaying the graph.

So, the first step is to import these libraries into the program.

import numpy as np
import matplotlib.pyplot as plt

Next, I define two arrays, `x` and `y`, which represent the observed data (5 pairs of values).

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 4, 6])

Then, I calculate the polynomial regression using the np.polyfit(x, y, n) function, where `n` is the degree of the polynomial for the regression curve.

In this case, I choose a third-degree polynomial.

coefficients = np.polyfit(x, y, 3)

The np.polyfit(x, y, 3) function computes the four coefficients of the third-degree polynomial that best fits the data points in \(x\) and \(y\) in the least-squares sense.

Here are the coefficients, listed from the highest power down to the constant term:

print(coefficients)

[ 0.16666667 -1.57142857  5.26190476 -2.        ]
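To make the "best fit" more concrete, here is a minimal sketch that reproduces the same coefficients by solving the least-squares problem directly. The use of np.vander() and np.linalg.lstsq() is an illustrative addition and not part of the original program; the names `A` and `c_lstsq` are hypothetical.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 4, 6])

coefficients = np.polyfit(x, y, 3)

# Design matrix with columns x^3, x^2, x^1, x^0
# (decreasing powers, the same order as the output of np.polyfit)
A = np.vander(x, 4)

# Ordinary least-squares solution of A @ c ≈ y
c_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

# Both approaches should agree up to rounding
print(np.allclose(coefficients, c_lstsq))
```

The first coefficient multiplies x³ and the last one is the constant term, so the fitted curve is roughly y = 0.1667x³ - 1.5714x² + 5.2619x - 2.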

After obtaining the coefficients, I pass them to np.poly1d() to build a polynomial object that can be called like a function.

polynomial = np.poly1d(coefficients)

Then, I generate the predicted values from the polynomial curve for the same \(x\) values.

y_pred_poly = polynomial(x)
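The predicted values can also be used to measure how well the curve fits the data. As a minimal sketch (the names `residuals`, `ss_res`, `ss_tot`, and `r_squared` are illustrative additions, not part of the original program), the following computes the residuals and the coefficient of determination R²:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 4, 6])

polynomial = np.poly1d(np.polyfit(x, y, 3))
y_pred_poly = polynomial(x)

# Difference between each observed value and the value on the curve
residuals = y - y_pred_poly

# R² = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(residuals)
print(r_squared)
```

An R² close to 1 means the curve explains most of the variation in the observed `y` values.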

Finally, I display a graph that shows the observed data points as a scatter plot (blue dots) and the third-degree polynomial curve in green.

plt.figure(figsize=(8, 6))
plt.scatter(x, y, color='blue', label='Observed Data') # Observed points
x_smooth = np.linspace(x.min(), x.max(), 100) # 100 evenly spaced points for a smooth curve
plt.plot(x_smooth, polynomial(x_smooth), color='green', label='Third-degree Polynomial Curve')
plt.title("Third-degree Polynomial Regression", fontsize=14)
plt.xlabel("x", fontsize=12)
plt.ylabel("y", fontsize=12)
plt.legend()
plt.grid(True)
plt.show()

The result is a graph that shows how the third-degree polynomial fits the observed data points.

[Figure: third-degree polynomial regression of the data]

What about linear regression?

Simply set the degree of the fitted polynomial to 1, then rebuild the polynomial with np.poly1d() and redraw the graph.

coefficients = np.polyfit(x, y, 1)

In this case, np.polyfit(x, y, 1) calculates the two coefficients of a first-degree polynomial, that is, the slope and the intercept of the regression line that fits the data points in \(x\) and \(y\).

[Figure: linear regression example]
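For a first-degree fit, the two coefficients can be unpacked directly. This is a small sketch; the names `slope`, `intercept`, and `line` are illustrative and do not appear in the original program.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 4, 6])

# polyfit returns the highest power first: [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)

# Callable regression line built from the two coefficients
line = np.poly1d([slope, intercept])

print(f"y = {slope:.2f}x + {intercept:.2f}")
print(line(x))  # predicted values on the regression line
```

For this dataset the regression line is approximately y = 0.90x + 1.30.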

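As you increase the degree of the polynomial, the curve fits the observed points more closely, although a closer fit to the sample does not necessarily mean better predictions for new data.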

For example, if I use a sixth-degree polynomial:

coefficients = np.polyfit(x, y, 6)

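The regression curve will pass through the observed data points exactly, up to rounding. Note that with only five points a fourth-degree polynomial is already enough to do this. A sixth-degree polynomial has seven coefficients, more than the number of data points, so the fit is not unique and NumPy warns that it may be poorly conditioned.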

[Figure: sixth-degree polynomial regression]
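To see how the fit tightens as the degree grows, one option is to compare the sum of squared residuals for several degrees. The sketch below is an illustrative addition (the name `sse_by_degree` is hypothetical). It stops at degree 4 because, with five data points, that is the highest degree whose coefficients are uniquely determined.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 4, 6])

# Sum of squared residuals for each polynomial degree
sse_by_degree = {}
for degree in range(1, 5):
    fitted = np.poly1d(np.polyfit(x, y, degree))
    sse_by_degree[degree] = float(np.sum((y - fitted(x)) ** 2))

for degree, sse in sse_by_degree.items():
    print(f"degree {degree}: SSE = {sse:.4f}")
```

The error shrinks at every step and is essentially zero at degree 4, where the curve passes through all five points.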

And so on.

 
 
