Something perhaps not so well emphasized in the literature is that, unlike eigenvalues, singular values are not invariant under similarity transforms. More precisely, given a square matrix $A$ and an invertible matrix $P$, it is not true in general that
$$\sigma_i(PAP^{-1}) = \sigma_i(A) \quad \text{for all } i$$
when $A$ is not normal. As an easy example, one can pick a non-normal $A$ and an invertible $P$ for which it is easy to check that the characteristic polynomial of $A^*A$ differs from that of $(PAP^{-1})^*(PAP^{-1})$, so that the singular values differ.

When $A$ is normal, i.e., $AA^* = A^*A$, the singular values are simply the absolute values of the eigenvalues of $A$. This is easily seen from the fact that, since $A$ and $A^*$ commute, they have the same set of eigenvectors, which in turn agree with those of $A^*A$. This fact is also needed to show that a normal $A$ is unitarily diagonalizable, an easy exercise.

The fact that a commuting family $\mathcal{F}$ of diagonalizable matrices can be simultaneously diagonalized (i.e., its elements share a complete set of eigenvectors) requires some care: it is easiest when all the eigenvalues are distinct. Now say one element has an eigenspace $V$ of dimension $k > 1$. Then $V$ is also an invariant subspace of the other elements of $\mathcal{F}$, i.e., all elements of $\mathcal{F}$ can be thought of as mutually commuting linear operators on $V$. They must share a common eigenvector, as one sees by looking at a minimal nontrivial invariant subspace (necessarily one-dimensional) within $V$ (see Lemma 4.2.1 of http://www.math.tamu.edu/~dallen/m640_03c/lectures/chapter4.pdf).
Now, by passing to an invariant complementary subspace of this 1-dimensional subspace, we conclude by induction that all elements of $\mathcal{F}$ have a common complete set of eigenvectors (note that an individual element may also admit a complete set of eigenvectors that is not common to the other elements, as when one of its eigenspaces has dimension greater than $1$).
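To make the opening claim concrete, here is a minimal sketch in plain Python. The matrices `A` and `P` are illustrative choices of mine (a non-normal upper triangular $A$ and a shear $P$ that happens to diagonalize it), and the helper functions are hypothetical names that only handle real $2 \times 2$ matrices. It shows that $A$ and $B = PAP^{-1}$ share eigenvalues but not singular values, and that for the normal matrix $B$ the singular values are the absolute values of the eigenvalues.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(X):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def eigenvalues(X):
    """Eigenvalues of a real 2x2 matrix, assuming they are real."""
    (a, b), (c, d) = X
    tr, det = a + d, a * d - b * c
    root = math.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

def singular_values(X):
    """Singular values of a real 2x2 matrix, largest first.

    Their squares are the roots of t^2 - tr(X^T X) t + det(X)^2,
    the characteristic polynomial of X^T X.
    """
    (a, b), (c, d) = X
    tr = a * a + b * b + c * c + d * d
    det = a * d - b * c
    root = math.sqrt(max(tr * tr - 4 * det * det, 0.0))
    return [math.sqrt((tr + root) / 2), math.sqrt(max((tr - root) / 2, 0.0))]

# Illustrative choices (assumptions, not taken from the text above).
A = [[1, 1], [0, 2]]    # not normal: A A^T != A^T A
P = [[1, -1], [0, 1]]   # invertible, not orthogonal
B = matmul(matmul(P, A), inverse(P))  # B = P A P^{-1} = diag(1, 2)

print("eigenvalues of A:    ", eigenvalues(A))
print("eigenvalues of B:    ", eigenvalues(B))
print("singular values of A:", singular_values(A))  # roughly 2.288 and 0.874
print("singular values of B:", singular_values(B))  # 2 and 1, i.e. |eigenvalues|
```

The product of the singular values is $|\det A| = 2$ in both cases, as it must be; only the individual values move.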
Aside: Schur, Jacobson, and Mirzakhani, in separate papers spanning many years, gave shorter and shorter proofs of the fact that the maximum cardinality of a linearly independent family of mutually commuting complex $n \times n$ matrices is $\lfloor n^2/4 \rfloor + 1$. To me it seems a bit surprising that there are research-level results on even such a basic topic.
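For what it is worth, the bound $\lfloor n^2/4 \rfloor + 1$ is attained by a standard construction: the identity together with the matrix units $E_{ij}$ whose single nonzero entry lies in the upper-right $\lfloor n/2 \rfloor \times \lceil n/2 \rceil$ block. The sketch below (plain Python, with hypothetical helper names) only verifies that this family commutes and has the right size; it says nothing about the hard direction, namely that no larger family exists.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def unit_matrix(n, i, j):
    """The matrix unit E_ij: 1 in position (i, j), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commuting_family(n):
    """Identity plus all E_ij with i < n // 2 <= j.

    These are linearly independent: each E_ij has its single nonzero entry
    in a different off-diagonal position, and only the identity has
    nonzero diagonal entries.
    """
    half = n // 2
    family = [identity(n)]
    for i in range(half):
        for j in range(half, n):
            family.append(unit_matrix(n, i, j))
    return family

def all_commute(family):
    return all(matmul(X, Y) == matmul(Y, X) for X in family for Y in family)

for n in range(2, 7):
    family = commuting_family(n)
    print(n, len(family), n * n // 4 + 1, all_commute(family))
```

The family commutes because the product of any two of the chosen matrix units is zero: the column index of the first never matches the row index of the second.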
Obviously I learned this stuff earlier as an undergraduate (perhaps in much haste). The real motivation came from reading a paper of Gromov, “Entropy, Homology, and Semialgebraic geometry (after Y. Yomdin)”, where he introduced the notions of topological entropy and logarithmic volume growth of a self-map $f: X \to X$ on a smooth manifold, defined by means of the supremum over the whole manifold of the norm of the derivative $D_x f$, viewed as a linear operator. In particular he defined the quantity
$$\lim_{n \to \infty} \frac{1}{n} \log \|Df^n\|,$$
where $\|Df^n\| = \sup_x \|D_x f^n\|$, and the supremum is over $x \in X$.
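He then says that $\|Df\|$ depends on the metric chosen, whereas $\lim_{n \to \infty} \frac{1}{n} \log \|Df^n\|$ does not. I was puzzled by this. But the first statement is now clear from the linear algebra above: a different choice of metric corresponds to a similarity transform, and if $D_x f$ is not normal at every point, then its norm may change as well. To prove the second statement, consider an invertible $P$ and a square matrix $A$. Then
$$(PAP^{-1})^n = P A^n P^{-1}.$$
It is clear that conjugation by $P$ and $P^{-1}$ affects the norm of $A^n$ less and less: since $\|P A^n P^{-1}\| \le \|P\| \, \|P^{-1}\| \, \|A^n\|$, and symmetrically $\|A^n\| \le \|P^{-1}\| \, \|P\| \, \|P A^n P^{-1}\|$, the norm changes by at most a constant factor independent of $n$, which becomes negligible after taking $\frac{1}{n} \log$ and letting $n \to \infty$.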
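A quick numerical sanity check of this last point, as a sketch under simplifying assumptions: I replace the derivative by a single constant $2 \times 2$ matrix `A` and the change of metric by a fixed invertible `P`, both arbitrary illustrative choices, and use the spectral norm. The two growth rates differ by at most $\frac{1}{n} \log(\|P\| \, \|P^{-1}\|)$, so the gap closes as $n$ grows.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(X):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def spectral_norm(X):
    """Largest singular value of a real 2x2 matrix."""
    (a, b), (c, d) = X
    tr = a * a + b * b + c * c + d * d
    det = a * d - b * c
    root = math.sqrt(max(tr * tr - 4 * det * det, 0.0))
    return math.sqrt((tr + root) / 2)

def growth_rate(M, n):
    """(1/n) * log ||M^n||, with M^n computed by repeated multiplication."""
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        power = matmul(power, M)
    return math.log(spectral_norm(power)) / n

# Illustrative choices (assumptions): a non-normal A with spectral radius 2,
# and an invertible P standing in for the change of metric.
A = [[2.0, 1.0], [0.0, 0.5]]
P = [[1.0, 3.0], [2.0, 1.0]]
B = matmul(matmul(P, A), inverse(P))  # B = P A P^{-1}

# Condition number of P: the constant factor in the norm comparison.
kappa = spectral_norm(P) * spectral_norm(inverse(P))

for n in (1, 10, 100):
    rate_a, rate_b = growth_rate(A, n), growth_rate(B, n)
    print(n, rate_a, rate_b, abs(rate_a - rate_b), math.log(kappa) / n)
```

Both columns of rates approach $\log 2$, the logarithm of the spectral radius, while the norms themselves at $n = 1$ are visibly different.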