Pollard's ρ Method

Pollard's ρ method is a factorization algorithm designed to find a nontrivial divisor of a composite number $ n $.

The key idea is to construct a sequence of numbers $ x_0, x_1, x_2, \dots $ using an iterative function, and then detect repeated values within the sequence.

When a repetition is found, we compute the GCD of the difference between two terms and $ n $; this often reveals a proper divisor of $ n $.

Note. The method is named after the Greek letter ρ (rho) because the structure of the sequence resembles a “tail” leading into a cycle-mirroring the shape of the letter ρ. It is particularly effective when $ n $ has small prime factors and leverages the pigeonhole principle to detect collisions in a sequence defined modulo $ n $.

How does it work?
A practical example
Accelerated Version of Pollard’s Rho Method
Remarks

How does it work?

To factor a composite number $ n $, follow these steps:

Choose an iterative function
Select a polynomial function $ f(x) $ to generate the sequence. A common choice is: \[ f(x) = x^2 + 1 \] The function should be computationally efficient and non-injective modulo $ n $.
Generate the sequence
Choose an initial value $ x_0 $, then define the sequence recursively: \[ x_1 = f(x_0) \mod n \] \[ x_2 = f(x_1) \mod n \] \[ x_3 = f(x_2) \mod n \\ \vdots \] Compute several terms (a handful is usually enough).
Note. Since the set $ \mathbb{Z}_n $ is finite, the sequence is guaranteed to eventually repeat, forming a cycle. As soon as a duplicate value appears modulo $ n $, further iterations will only cycle through the same values-so continuing beyond that point is unnecessary.
Search for a divisor using the GCD
Compute the greatest common divisor of the difference between two terms $ x_i - x_j $ and $ n $: \[ GCD(|x_i - x_j|, n) \] If the result is greater than 1 but less than $ n $, then you've found a proper factor of $ n $. Otherwise, return to the previous step and generate more terms if needed.

Why does the method work?

It hinges on the pigeonhole principle: in any finite set (such as $ \mathbb{Z}_n $), if you generate enough elements, some are bound to repeat.

Once a repetition occurs, the difference between the corresponding values often shares a nontrivial factor with $ n $, which can be extracted via the GCD.

Note. This is often referred to as the “slow” version of Pollard’s ρ method, because it first generates a potentially long sequence before checking for divisors-possibly wasting time if a factor was already detectable early on. There is also a “fast” variant that computes the GCD at each step, allowing it to discover a factor as soon as one emerges.

A practical example

Let’s factor the number $ n = 111 $.

We pick a function and an initial value. For example:

$$ f(x) = x^2 + 1 $$

and set the starting point:

$$ x_0 = 1 $$

Now compute the first few terms of the sequence modulo 111:

\[ x_1 = f(1) = 1^2 + 1 = 2 \pmod{111} \]

\[ x_2 = f(2) = 2^2 + 1 = 5 \pmod{111} \]

\[ x_3 = f(5) = 5^2 + 1 = 26 \pmod{111} \]

\[ x_4 = f(26) = 26^2 + 1 = 677 \equiv 11 \pmod{111} \]

\[ x_5 = f(11) = 11^2 + 1 = 122 \equiv 11 \pmod{111} \]

At this point, the sequence begins to repeat: $ x_5 \equiv x_4 \mod 111 $, indicating the start of a cycle.

Now compute $ GCD(|x_i - x_j|, n) $ for a pair $ i, j $ such that $ x_i - x_j $ may reveal a shared factor with $ n $:

\[ GCD(x_2 - x_1, 111) = GCD(5 - 2, 111) = GCD(3, 111) = 3 \]

Since $ 3 $ is greater than 1 and less than 111, it is a proper divisor of 111.

Once one factor is known, finding the other is straightforward:

$$ \frac{111}{3} = 37 $$

So, $ 37 $ is the other prime factor of the number.

Hence, the prime factorization of 111 is:

\[ 111 = 3 \times 37 \]

The factorization is complete.

Example 2

Let’s attempt to factor the number 8051.

\[ n = 8051 \]

We'll use the same function as in the previous example, now working modulo 8051:

$$ f(x) = x^2 + 1 \mod 8051 $$

We'll begin with the initial value:

$$ x_0 = 2 $$

Now we generate the first few terms of the sequence $ x_i $:

$$ x_0 = 2 $$

$$ x_1 = f(2) = 2^2 + 1 = 5 $$

$$ x_2 = f(5) = 5^2 + 1 = 26 $$

$$ x_3 = f(26) = 26^2 + 1 = 677 $$

$$ x_4 = f(677) = 458330 \mod 8051 = 7474 $$

$$ x_5 = f(7474) = 7474^2 + 1 \mod 8051 = 2839 $$

$$ x_6 = f(2839) = 2839^2 + 1 = 8059922 \mod 8051 = 871 $$

Once we have a few terms, we start computing GCDs between the absolute differences $ |x_i - x_j| $ and $ n = 8051 $.

A good starting point is to compare with $ x_0 $, as the computations are often simpler. That said, any other base term would work equally well.

$$ \text{GCD}(x_1 - x_0, 8051) = \text{GCD}(5 - 2, 8051) = \text{GCD}(3, 8051) = 1 $$

$$ \text{GCD}(x_2 - x_0, 8051) = \text{GCD}(26 - 2, 8051) = \text{GCD}(24, 8051) = 1 $$

$$ \text{GCD}(x_3 - x_0, 8051) = \text{GCD}(677 - 2, 8051) = \text{GCD}(675, 8051) = 1 $$

$$ \text{GCD}(x_4 - x_0, 8051) = \text{GCD}(7474 - 2, 8051) = \text{GCD}(7472, 8051) = 1 $$

$$ \text{GCD}(x_5 - x_0, 8051) = \text{GCD}(2839 - 2, 8051) = \text{GCD}(2837, 8051) = 1 $$

$$ \text{GCD}(x_6 - x_0, 8051) = \text{GCD}(871 - 2, 8051) = \text{GCD}(869, 8051) = 1 $$

All GCDs are 1, so no factor has been found yet.

Let’s now compute GCDs using differences from $ x_1 = 5 $:

$$ \text{GCD}(x_2 - x_1, 8051) = \text{GCD}(26 - 5, 8051) = \text{GCD}(21, 8051) = 1 $$

$$ \text{GCD}(x_3 - x_1, 8051) = \text{GCD}(677 - 5, 8051) = \text{GCD}(672, 8051) = 1 $$

$$ \text{GCD}(x_4 - x_1, 8051) = \text{GCD}(7474 - 5, 8051) = \text{GCD}(7469, 8051) = 97 $$

There it is a nontrivial factor of 8051: 97.

To find the cofactor, simply divide 8051 by 97:

$$ \frac{8051}{97} = 83 $$

So, the prime factorization of 8051 is:

$$ 8051 = 83 \times 97 $$

Factorization complete.

Accelerated Version of Pollard’s Rho Method

The fast (or “optimized”) version of Pollard’s ρ method improves efficiency by computing the greatest common divisor (GCD) during the sequence generation itself.

This allows the algorithm to detect a nontrivial factor as soon as it emerges, without having to build a long sequence in advance.

How does it work?

The algorithm proceeds as follows:

Choose a simple function to generate the sequence, such as $ f(x) = x^2 + 1 \mod n $, and initialize both $ x = x_0 $ and $ y = x_0 $.
Generate two sequences in parallel-one advancing at a regular pace (the tortoise), the other twice as fast (the hare):
- $ x_i = f(x_{i-1}) \mod n $ (tortoise: 1 step per iteration)
- $ y_i = f(f(y_{i-1})) \mod n $ (hare: 2 steps per iteration)
At each step, compute the GCD of the difference between the two sequences:
$$ d = \text{GCD}(|x - y|, n) $$
Check the result:
- If $ d = 1 $, continue with the next iteration
- If $ 1 < d < n $, a nontrivial factor has been found
- If $ d = n $, the attempt has failed-restart with a different function or a new initial value

Example

Let’s revisit the factorization of $ n = 8051 $, this time using the fast variant of Pollard’s ρ method (also known as the tortoise and hare approach).

We define the function:

$$ f(x) = x^2 + 1 \mod 8051 $$

and choose the starting value:

$$ x = 2 $$

$$ y = 2 $$

Now the iterations begin.

Iteration 1

Compute:

$$ x_1 = f(2) = 2^2 + 1 = 5 $$

$$ y_1 = f(f(2)) = f(5) = 5^2 + 1 = 26 $$

Then evaluate:

$$ d = \text{GCD}(|5 - 26|, 8051) = \text{GCD}(21, 8051) = 1 $$

No factor found-continue.

Iteration 2

Compute the next values:

$$ x_2 = f(5) = 5^2 + 1 = 26 $$

$$ y_2 = f(f(26)) = f(677) = 677^2 + 1 = 458330 \mod 8051 = 7474 $$

And check:

$$ d = \text{GCD}(|26 - 7474|, 8051) = \text{GCD}(7448, 8051) = 1 $$

Still no factor-proceed.

Iteration 3

Compute:

$$ x_3 = f(26) = 26^2 + 1 = 677 $$

$$ y_3 = f(f(7474)) = f(2839) = 2839^2 + 1 = 8059922 \mod 8051 = 871 $$

Then:

$$ d = \text{GCD}(|871 - 677|, 8051) = \text{GCD}(194, 8051) = 97 $$

Success! We've found a nontrivial factor of 8051: 97, after only three iterations.

To find the cofactor, divide:

$$ \frac{8051}{97} = 83 $$

Therefore, the prime factorization of 8051 is:

$$ 8051 = 83 \times 97 $$

The same result as before, but reached more efficiently.

Remarks

Some observations on this method:

Advantages and limitations of Pollard’s rho method
This algorithm is highly efficient for integers with small prime factors. It avoids exhaustive trial division up to $ \sqrt{n} $, and is easy to implement with low computational overhead for numbers of moderate size. However, it doesn't always yield a factor immediately-especially if the generated cycle fails to reveal one. It tends to perform poorly when all prime factors of $ n $ are large, and may require several attempts with different functions $ f(x) $ and initial values before succeeding.

And so on.