Jan 12, 2019       tags: | math notes | visualisation |

Power iteration algorithm — a visualization

The power method is a simple iterative algorithm that can be used to find the eigenvector of a matrix () associated with the largest absolute eigenvalue. Given some vector , the next best approximation of the eigenvector is given by the normalized product of and

Why does it work? Let’s consider the simplest case of a diagonizable matrix . Using eigendecomposition,


where is the matrix of eigenvectors, and is the the diagonal matrix of eigenvalues

We can ignore normalization for now since it does not affect the direction of the resulting vector. Then power iteration simplifies to calculating where is some starting vector

Given some vector , , where can be interpreted as presenting vector in the eigenbasis.

Since we don’t care about the legnth of the eigenvector at this point, we are free to scale by any number. Let us scale it by

At a sufficiently large n, all entries but one become close enough to zero. The entry corresponding to the largest eigenvalue in remains 1. This matrix has the property of stripping vector (presented in the eigenbasis!) of all coordinates but the one corresponding to the eigenvector with the largest eigenvalue thus giving us a vector collinear with that eigenvector. Finally, applying matrix to this new vector presents it in our original basis.

What if we picked normal to the eigenvector thought? The power iteration algorithm would fail, since vector produces 0 as the coordinate corresponding to the dominant eigenvector. Applying will produce a zero vector. Fortunately, if vector is picked at random, it is unlikely for it to be normal to the dominant eigenvector.

Is it possible to use power iteration to find a non-dominant eigenvector? Yes. Substituting with makes it possible to obtain the eigenvector associated with the smallest eigenvalue. It is not difficult to show that the eigendecomposition of this new matrix is . Scaling by will result in a matrix that nullifies all vector coordinates except the one associated with the least significant eigenvector.

In the canvas below (powered by vtvt) you can visualize 20 iterations of the power method. Vectors and set the columns of , and the eigenvectors are calculated analytially and updated whenever the matrix changes. Vector sets the initial iteration of the power method. Note how much worse convergence becomes whenever the eigenvalues become similar.

Suggested post:

The real reason you use the MSE and cross-entropy loss functions
If you learned machine learning from MOOCs, there's a good chance you haven't been taught the true significance of the mean squared error and cross-entropy loss functions.