Pollard's ρ Method

Pollard's ρ method is a factorization algorithm designed to find a nontrivial divisor of a composite number \( n \).

The key idea is to construct a sequence of numbers \( x_0, x_1, x_2, \dots \) using an iterative function, and then detect repeated values within the sequence.

When a repetition is found, we compute the GCD of the difference between two terms and \( n \); this often reveals a proper divisor of \( n \).

Note. The method is named after the Greek letter ρ (rho) because the structure of the sequence resembles a “tail” leading into a cycle-mirroring the shape of the letter ρ. It is particularly effective when \( n \) has small prime factors and leverages the pigeonhole principle to detect collisions in a sequence defined modulo \( n \).

How does it work?

To factor a composite number \( n \), follow these steps:

  1. Choose an iterative function
    Select a polynomial function \( f(x) \) to generate the sequence. A common choice is: \[ f(x) = x^2 + 1 \] The function should be computationally efficient and non-injective modulo \( n \).
  2. Generate the sequence
    Choose an initial value \( x_0 \), then define the sequence recursively: \[ x_1 = f(x_0) \mod n \] \[ x_2 = f(x_1) \mod n \] \[ x_3 = f(x_2) \mod n \\ \vdots \] Compute several terms (a handful is usually enough).

    Note. Since the set \( \mathbb{Z}_n \) is finite, the sequence is guaranteed to eventually repeat, forming a cycle. As soon as a duplicate value appears modulo \( n \), further iterations will only cycle through the same values-so continuing beyond that point is unnecessary.

  3. Search for a divisor using the GCD
    Compute the greatest common divisor of the difference between two terms \( x_i - x_j \) and \( n \): \[ GCD(|x_i - x_j|, n) \] If the result is greater than 1 but less than \( n \), then you've found a proper factor of \( n \). Otherwise, return to the previous step and generate more terms if needed.

Why does the method work?

It hinges on the pigeonhole principle: in any finite set (such as \( \mathbb{Z}_n \)), if you generate enough elements, some are bound to repeat.

Once a repetition occurs, the difference between the corresponding values often shares a nontrivial factor with \( n \), which can be extracted via the GCD.

Note. This is often referred to as the “slow” version of Pollard’s ρ method, because it first generates a potentially long sequence before checking for divisors-possibly wasting time if a factor was already detectable early on. There is also a “fast” variant that computes the GCD at each step, allowing it to discover a factor as soon as one emerges.

A practical example

Let’s factor the number \( n = 111 \).

We pick a function and an initial value. For example:

$$ f(x) = x^2 + 1 $$

and set the starting point:

$$ x_0 = 1 $$

Now compute the first few terms of the sequence modulo 111:

\[ x_1 = f(1) = 1^2 + 1 = 2 \pmod{111} \]

\[ x_2 = f(2) = 2^2 + 1 = 5 \pmod{111} \]

\[ x_3 = f(5) = 5^2 + 1 = 26 \pmod{111} \]

\[ x_4 = f(26) = 26^2 + 1 = 677 \equiv 11 \pmod{111} \]

\[ x_5 = f(11) = 11^2 + 1 = 122 \equiv 11 \pmod{111} \]

At this point, the sequence begins to repeat: \( x_5 \equiv x_4 \mod 111 \), indicating the start of a cycle.

Now compute \( GCD(|x_i - x_j|, n) \) for a pair \( i, j \) such that \( x_i - x_j \) may reveal a shared factor with \( n \):

\[ GCD(x_2 - x_1, 111) = GCD(5 - 2, 111) = GCD(3, 111) = 3 \]

Since \( 3 \) is greater than 1 and less than 111, it is a proper divisor of 111.

Once one factor is known, finding the other is straightforward:

$$ \frac{111}{3} = 37 $$

So, \( 37 \) is the other prime factor of the number.

Hence, the prime factorization of 111 is:

\[ 111 = 3 \times 37 \]

The factorization is complete.

Example 2

Let’s attempt to factor the number 8051.

\[ n = 8051 \]

We'll use the same function as in the previous example, now working modulo 8051:

$$ f(x) = x^2 + 1 \mod 8051 $$

We'll begin with the initial value:

$$ x_0 = 2 $$

Now we generate the first few terms of the sequence \( x_i \):

$$ x_0 = 2 $$

$$ x_1 = f(2) = 2^2 + 1 = 5 $$

$$ x_2 = f(5) = 5^2 + 1 = 26 $$

$$ x_3 = f(26) = 26^2 + 1 = 677 $$

$$ x_4 = f(677) = 458330 \mod 8051 = 7474 $$

$$ x_5 = f(7474) = 7474^2 + 1 \mod 8051 = 2839 $$

$$ x_6 = f(2839) = 2839^2 + 1 = 8059922 \mod 8051 = 871 $$

Once we have a few terms, we start computing GCDs between the absolute differences \( |x_i - x_j| \) and \( n = 8051 \).

A good starting point is to compare with \( x_0 \), as the computations are often simpler. That said, any other base term would work equally well.

$$ \text{GCD}(x_1 - x_0, 8051) = \text{GCD}(5 - 2, 8051) = \text{GCD}(3, 8051) = 1 $$

$$ \text{GCD}(x_2 - x_0, 8051) = \text{GCD}(26 - 2, 8051) = \text{GCD}(24, 8051) = 1 $$

$$ \text{GCD}(x_3 - x_0, 8051) = \text{GCD}(677 - 2, 8051) = \text{GCD}(675, 8051) = 1 $$

$$ \text{GCD}(x_4 - x_0, 8051) = \text{GCD}(7474 - 2, 8051) = \text{GCD}(7472, 8051) = 1 $$

$$ \text{GCD}(x_5 - x_0, 8051) = \text{GCD}(2839 - 2, 8051) = \text{GCD}(2837, 8051) = 1 $$

$$ \text{GCD}(x_6 - x_0, 8051) = \text{GCD}(871 - 2, 8051) = \text{GCD}(869, 8051) = 1 $$

All GCDs are 1, so no factor has been found yet.

Let’s now compute GCDs using differences from \( x_1 = 5 \):

$$ \text{GCD}(x_2 - x_1, 8051) = \text{GCD}(26 - 5, 8051) = \text{GCD}(21, 8051) = 1 $$

$$ \text{GCD}(x_3 - x_1, 8051) = \text{GCD}(677 - 5, 8051) = \text{GCD}(672, 8051) = 1 $$

$$ \text{GCD}(x_4 - x_1, 8051) = \text{GCD}(7474 - 5, 8051) = \text{GCD}(7469, 8051) = 97 $$

There it is a nontrivial factor of 8051: 97.

To find the cofactor, simply divide 8051 by 97:

$$ \frac{8051}{97} = 83 $$

So, the prime factorization of 8051 is:

$$ 8051 = 83 \times 97 $$

Factorization complete.

Accelerated Version of Pollard’s Rho Method

The fast (or “optimized”) version of Pollard’s ρ method improves efficiency by computing the greatest common divisor (GCD) during the sequence generation itself.

This allows the algorithm to detect a nontrivial factor as soon as it emerges, without having to build a long sequence in advance.

How does it work?

The algorithm proceeds as follows:

  1. Choose a simple function to generate the sequence, such as \( f(x) = x^2 + 1 \mod n \), and initialize both \( x = x_0 \) and \( y = x_0 \).
  2. Generate two sequences in parallel-one advancing at a regular pace (the tortoise), the other twice as fast (the hare):
    - \( x_i = f(x_{i-1}) \mod n \)    (tortoise: 1 step per iteration)
    - \( y_i = f(f(y_{i-1})) \mod n \)    (hare: 2 steps per iteration)
  3. At each step, compute the GCD of the difference between the two sequences:
    $$ d = \text{GCD}(|x - y|, n) $$
  4. Check the result:
    - If \( d = 1 \), continue with the next iteration
    - If \( 1 < d < n \), a nontrivial factor has been found
    - If \( d = n \), the attempt has failed-restart with a different function or a new initial value

Example

Let’s revisit the factorization of \( n = 8051 \), this time using the fast variant of Pollard’s ρ method (also known as the tortoise and hare approach).

We define the function:

$$ f(x) = x^2 + 1 \mod 8051 $$

and choose the starting value:

$$ x = 2 $$

$$ y = 2 $$

Now the iterations begin.

Iteration 1

Compute:

$$ x_1 = f(2) = 2^2 + 1 = 5 $$

$$ y_1 = f(f(2)) = f(5) = 5^2 + 1 = 26 $$

Then evaluate:

$$ d = \text{GCD}(|5 - 26|, 8051) = \text{GCD}(21, 8051) = 1 $$

No factor found-continue.

Iteration 2

Compute the next values:

$$ x_2 = f(5) = 5^2 + 1 = 26 $$

$$ y_2 = f(f(26)) = f(677) = 677^2 + 1 = 458330 \mod 8051 = 7474 $$

And check:

$$ d = \text{GCD}(|26 - 7474|, 8051) = \text{GCD}(7448, 8051) = 1 $$

Still no factor-proceed.

Iteration 3

Compute:

$$ x_3 = f(26) = 26^2 + 1 = 677 $$

$$ y_3 = f(f(7474)) = f(2839) = 2839^2 + 1 = 8059922 \mod 8051 = 871 $$

Then:

$$ d = \text{GCD}(|871 - 677|, 8051) = \text{GCD}(194, 8051) = 97 $$

Success! We've found a nontrivial factor of 8051: 97, after only three iterations.

To find the cofactor, divide:

$$ \frac{8051}{97} = 83 $$

Therefore, the prime factorization of 8051 is:

$$ 8051 = 83 \times 97 $$

The same result as before, but reached more efficiently.

Remarks

Some observations on this method:

  • Advantages and limitations of Pollard’s rho method
    This algorithm is highly efficient for integers with small prime factors. It avoids exhaustive trial division up to \( \sqrt{n} \), and is easy to implement with low computational overhead for numbers of moderate size. However, it doesn't always yield a factor immediately-especially if the generated cycle fails to reveal one. It tends to perform poorly when all prime factors of \( n \) are large, and may require several attempts with different functions \( f(x) \) and initial values before succeeding.

And so on.

 
 

Please feel free to point out any errors or typos, or share suggestions to improve these notes. English isn't my first language, so if you notice any mistakes, let me know, and I'll be sure to fix them.

FacebookTwitterLinkedinLinkedin
knowledge base

Factorization

Calculator