
When you first implement PCA from scratch, one algorithm looks tempting: Power Iteration. It’s intuitive, it makes mathematical sense, and it works perfectly on paper.
But in production? It can silently give you completely wrong results — and you might never notice.
In this post, we’ll break down exactly why Power Iteration is a trap, what happens under the hood, and why numpy.linalg.eigh is the tool you should reach for instead. No prior deep math knowledge required.
Quick Background: What Are We Actually Doing in PCA?
PCA finds the directions in your data where the most variance (spread) exists. Mathematically, this means finding the eigenvectors and eigenvalues of the covariance matrix.
- Eigenvectors tell you the direction of each principal component.
- Eigenvalues tell you how much variance exists in that direction.
The eigenvector with the largest eigenvalue is your first principal component (PC1), the one that captures the most information.
The Tempting Approach: Power Iteration + Deflation
Power Iteration is a classic algorithm for finding eigenvectors. Here is how it works at a high level:
- Start with a random vector.
- Multiply it by the matrix repeatedly.
- Eventually, it converges to the eigenvector with the largest eigenvalue.
Once you have the first eigenvector, you use a technique called Deflation to “remove” it from the matrix, then repeat the process for the next one.
The deflation formula looks like this:
C_new = C - λ * (v * vᵀ)
Where λ is the eigenvalue, v is the eigenvector you just found, and C is your covariance matrix.
Sounds logical, right? Remove what you found, then search again. Simple.
The problem is what happens inside your computer when you actually run this.
Problem 1: Floating-Point Noise
Let’s walk through a concrete example.
Imagine your dataset lies perfectly on the line y = x in 2D space:
X = | 1 1 |
| 2 2 |
| 3 3 |
After mean subtraction, the covariance matrix is:
C = | 1 1 |
| 1 1 |
All the variance in this dataset (total = 1 + 1 = 2, the sum of the diagonal) lives in exactly one direction. The second eigenvalue should be a perfect 0.0.
Step 1 — Power Iteration finds the first component:
λ₁ = 2
v₁ = [0.707, 0.707]
Great. Now deflation:
C_new = | 1 1 | - 2 * | 0.5 0.5 | = | 0 0 |
| 1 1 | | 0.5 0.5 | | 0 0 |
On paper, the result is a zero matrix. Perfect.
But here is what Python actually gives you:
C_new ≈ | 1.11e-16 -2.22e-16 |
| -2.22e-16 1.11e-16 |
Why? Because 0.707 is actually 1 / sqrt(2), an irrational number. Your computer cannot store infinite decimal places. So instead of a perfect zero, you get tiny rounding leftovers — numbers around 10⁻¹⁶.
This is called floating-point noise, and it is completely normal in numerical computing. The issue is what Power Iteration does next.
It does not know these are rounding errors. It treats them as real data. So it runs the iteration again on this noise, amplifies it, and hands you back a garbage eigenvector that represents nothing real in your data.
Your code runs without errors. No exceptions are thrown. The output just happens to be completely wrong.
Problem 2: Sign Ambiguity
There is a second, subtler problem.
Eigenvectors are not unique in sign — if v is an eigenvector, so is -v. Both are mathematically valid. Power Iteration will randomly converge to one or the other depending on its starting point.
During deflation, this sign choice affects the next iteration. Small sign differences compound across components. By the time you reach the third or fourth principal component, the errors can be significant.
The Right Tool: np.linalg.eigh
Here is the fix, and it is simple: use numpy.linalg.eigh instead.
The h at the end stands for Hermitian, which in practice means symmetric matrices — exactly what a covariance matrix always is (C = Cᵀ).
Why eigh and not just eig?
eig is the general-purpose solver. It works for any matrix, but it does not take advantage of structure. eigh is specifically designed for symmetric matrices and uses a fundamentally different algorithm under the hood: LAPACK's dsyevd (Divide-and-Conquer).
The key difference: eigh decomposes the entire matrix at once rather than finding one component, deflating, finding the next, and so on.
Because there is no deflation step, there is no accumulation of floating-point noise and no sign ambiguity.
Power Iteration + Deflation np.linalg.eigh Algorithm Iterative, sequential LAPACK dsyevd (all-at-once) Deflation Yes, after every component None Floating-point noise Accumulates Does not occur Sign ambiguity Yes No Zero eigenvalues Returns ~1e-16 noise Returns exact 0.0 Complex eigenvalues Possible Mathematically impossible
The Code
Let’s see both approaches side by side:
"""
PCA: Power Iteration Deflation Trap vs np.linalg.eigh
"""
import numpy as np
def demonstrate_floating_point_noise() -> None:
"""
Shows how deflation creates floating-point noise on a rank-deficient
covariance matrix, and how eigh handles it cleanly.
"""
print("Experiment: The Floating-Point Trap in PCA")
print("=" * 50)
# Rank-deficient covariance matrix (data lies perfectly on y = x)
C = np.array([[1.0, 1.0],
[1.0, 1.0]])
# Analytically known: lambda_1 = 2, v_1 = [1/sqrt(2), 1/sqrt(2)]
lambda_1 = 2.0
v_1 = np.array([1 / np.sqrt(2), 1 / np.sqrt(2)])
# Deflation: C_new = C - lambda_1 * outer(v_1, v_1)
outer_product = np.outer(v_1, v_1)
C_new = C - (lambda_1 * outer_product)
print("\n1. After manual deflation (Power Iteration approach):")
print(" Expected: [[0, 0], [0, 0]]")
print(" Actual:\n")
print(repr(C_new))
print("\n" + "-" * 50)
# The correct approach for symmetric matrices
eigenvalues, _ = np.linalg.eigh(C)
print("\n2. Using np.linalg.eigh:")
print(" Note: eigh returns eigenvalues in ascending order")
print(f" Eigenvalues: {repr(eigenvalues)}")
print("\n" + "=" * 50)
if __name__ == "__main__":
demonstrate_floating_point_noise()
Output:
Experiment: The Floating-Point Trap in PCA
==================================================
1. After manual deflation (Power Iteration approach):
Expected: [[0, 0], [0, 0]]
Actual:
array([[ 1.11022302e-16, -2.22044605e-16],
[-2.22044605e-16, 1.11022302e-16]])
--------------------------------------------------
2. Using np.linalg.eigh:
Note: eigh returns eigenvalues in ascending order
Eigenvalues: array([0., 2.])
==================================================
Power Iteration gives you 1.11e-16 where the answer should be 0. eigh gives you exactly 0.0.
One Small Thing to Know About eigh
eigh returns eigenvalues in ascending order (smallest to largest). In PCA you typically want them in descending order (largest first). Just reverse them before use:
eigenvalues, eigenvectors = np.linalg.eigh(C)
# Reverse to get descending order
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]
Key Takeaways
- Power Iteration + Deflation is theoretically correct but numerically fragile. It accumulates floating-point errors and can produce garbage eigenvectors without raising any errors.
- PCA’s covariance matrix is always symmetric. eigh is built for exactly this case.
- eigh uses a divide-and-conquer algorithm that decomposes the whole matrix at once — no deflation, no noise accumulation, no sign ambiguity.
- In production PCA, always use np.linalg.eigh.
The difference between the two is not just performance. It is correctness.
PCA in Production: Why Power Iteration Fails and np.linalg.eigh is the Right Choice was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.