Monday, February 6, 2012

The Mystery of the Complete Set of Commuting Observables (CSCO) - Part 1

I was thinking about joint probabilities in quantum mechanics. In order to have a joint probability distribution of two quantities for an arbitrary state, their corresponding observables should commute. Non-commutation prohibits the existence of a complete set of simultaneous eigenvectors, leads to uncertainty relations and all the complicated stuff.

Commuting Operators and Their Simultaneous Eigenvectors
Let's start with the simple proof that commuting operators share eigenvectors. Take two commuting operators: \( \left[A,B\right] = 0 \). Say \( \psi \) is an eigenvector of \( A \): \( A\psi=a\psi\). Check the relation between \(B\) and \(\psi\): \( \left[A,B\right]\psi = AB\psi - BA\psi \) \( = AB\psi - Ba\psi = (A-a)B\psi = 0\) \( \Rightarrow A(B\psi) = a(B\psi) \) (Eq. 1).

"Remember your linear algebra!" course: If \(\psi\) is an eigenvector of \(A\) with the eigenvalue \(a\), then \(\psi' = c\psi\) (for any nonzero \(c\)) is an eigenvector too, with the same eigenvalue. The parameter \(c\) defines a set of eigenvectors with the same direction but different magnitudes, all belonging to the same eigenvalue. But this set does not mean degeneracy. To have degenerate eigenvectors, two vectors belonging to the same eigenvalue have to have different directions (be linearly independent).

In this light, (Eq. 1) means that \( B\psi \) is also an eigenvector of \(A\) with the eigenvalue \(a\) (unless \(B\psi = 0\)), possibly with a different magnitude than \(\psi\). How can it be? There are two cases:

Either i) \(B\) only changes the magnitude of the vector \(\psi\). Changing only the magnitude is the idiosyncrasy of operator-eigenvector relations. Hence \(\psi\) is an eigenvector of \(B\) too: \(B\psi=b\psi\). Then (Eq. 1) \( \Rightarrow A(b\psi) = a(b\psi) \) \( \Rightarrow A\psi' = a\psi' \) with \(\psi' = b\psi\).

Or, ii) We have the more complicated case of "eigenvalue \(a\) of \(A\) is degenerate".
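
Here is a small numerical check of (Eq. 1), written as a sketch rather than anything general. The matrices `A` and `B` are hypothetical, hand-picked so that they commute, and the eigenvalue 1 of `A` is chosen to be degenerate on purpose. The point is that (Eq. 1) holds even though \(B\psi\) points in a different direction than \(\psi\), which is exactly case ii).

```python
# Hypothetical 3x3 example; A and B are hand-picked so that [A, B] = 0.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(X, v):
    """Matrix acting on a vector."""
    return [sum(x * c for x, c in zip(row, v)) for row in X]

A = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 2]]   # the eigenvalue 1 is doubly degenerate
B = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 3]]

AB, BA = matmul(A, B), matmul(B, A)
commutator = [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

psi = [1, 0, 0]                 # A psi = 1 * psi
a = 1
B_psi = matvec(B, psi)          # not parallel to psi
lhs = matvec(A, B_psi)          # A (B psi)
rhs = [a * c for c in B_psi]    # a (B psi)

print("commutator :", commutator)
print("B psi      :", B_psi)
print("Eq. 1 holds:", lhs == rhs)
```

Here \(\psi\) itself is not an eigenvector of `B`, yet \(B\psi\) still belongs to the eigenvalue \(a = 1\) of `A`.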

Again "remember your linear algebra!": If \(\psi_1\) and \(\psi_2\) are two eigenvectors with two different eigenvalues, then their sum (or any other linear combination with both coefficients nonzero) is not an eigenvector. \(A\psi_1=a_1\psi_1\), \(A\psi_2=a_2\psi_2\) \( \Rightarrow A\left(c_1\psi_1+c_2\psi_2\right) \) \( = \left(a_1 c_1\psi_1+a_2 c_2\psi_2\right) \), which cannot be written as \( a_n \left(c_1\psi_1+c_2\psi_2\right) \) for any single number \(a_n\) when \(a_1 \neq a_2\).

But if \(\psi_1\) and \(\psi_2\) are two degenerate eigenvectors with the same eigenvalue, then their linear combinations are eigenvectors with the same eigenvalue too. \(A\psi_1=a\psi_1\), \(A\psi_2=a\psi_2\) \( \Rightarrow A\left(c_1\psi_1+c_2\psi_2\right) \) \( = \left(a c_1\psi_1+a c_2\psi_2\right) \) \( = a \left(c_1\psi_1+c_2\psi_2\right) = a \psi_3 \). One can call the space spanned by these two degenerate eigenvectors an "eigensubspace", \( \varepsilon_a \). [2] (Now, besides the eigen prefix, we are also borrowing the power of the German language in building compound words.)
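
The two "remember your linear algebra!" statements can be checked with a few lines of code. This is only a sketch: the diagonal matrix `A`, the vectors, and the helper names (`eigenvalue_or_none`, `combine`) are all made up for the illustration.

```python
from fractions import Fraction

def matvec(X, v):
    """Matrix acting on a vector."""
    return [sum(x * c for x, c in zip(row, v)) for row in X]

def eigenvalue_or_none(X, v):
    """Return lam if X v = lam v for the nonzero integer vector v, else None."""
    w = matvec(X, v)
    i = next(k for k, c in enumerate(v) if c != 0)   # first nonzero component
    lam = Fraction(w[i], v[i])
    return lam if all(wk == lam * vk for wk, vk in zip(w, v)) else None

def combine(c1, u, c2, v):
    """Linear combination c1*u + c2*v."""
    return [c1 * x + c2 * y for x, y in zip(u, v)]

# Hypothetical operator: eigenvalue 1 is degenerate, eigenvalue 2 is not.
A = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 2]]
psi_1 = [1, 0, 0]   # eigenvalue 1
psi_2 = [0, 1, 0]   # eigenvalue 1
chi   = [0, 0, 1]   # eigenvalue 2

degenerate_mix = combine(2, psi_1, 3, psi_2)   # stays inside the eigensubspace
mixed          = combine(2, psi_1, 3, chi)     # mixes two different eigenvalues

print(eigenvalue_or_none(A, degenerate_mix))   # an eigenvector again
print(eigenvalue_or_none(A, mixed))            # not an eigenvector
```

The first combination is an eigenvector with the same eigenvalue as \(\psi_1\) and \(\psi_2\); the second one is not an eigenvector at all.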

An arbitrary \( \psi_3 \) may not be an eigenvector of \(B\). But one can try to find eigenvectors of \(B\) living in \( \varepsilon_a \) by constructing them from \(\psi_1\) and \(\psi_2\). (Assume the \(\psi_i\) are orthonormal, or apply the Gram–Schmidt process to them and use the resulting orthonormal \(\psi_i''\) instead.) The eigenvectors of \(B\) will satisfy \(B\psi_3 = b\psi_3 \), that is, \( B\left(d_1\psi_1+d_2\psi_2\right) = b\left(d_1\psi_1+d_2\psi_2\right) \) (Eq. 2). Using the orthonormality of the \(\psi_i\), one can get the matrix elements of \(B\) by sandwiching it between them: \(B_{ij} = \left\langle \psi_i | B | \psi_j \right\rangle\). Writing (Eq. 2) as \(d_1 B \left| \psi_1 \right\rangle + d_2 B \left| \psi_2 \right\rangle \) \(= d_1 b \left| \psi_1 \right\rangle + d_2 b \left| \psi_2 \right\rangle\) and hitting it from the left first with \( \left\langle \psi_1 \right| \) and then with \( \left\langle \psi_2 \right| \), one gets two equations: \(d_1 B_{11} + d_2 B_{12} = b d_1 \) and \( d_1 B_{21} + d_2 B_{22} = b d_2 \). We have 2 equations and 2 unknowns (\(d_1\) and \(d_2\)), with \(b\) still to be determined. Expressing these equations in matrix form
\[
\begin{pmatrix}
B_{11} - b & B_{12} \\
B_{21} & B_{22} - b
\end{pmatrix}
\begin{pmatrix} d_1 \\ d_2 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]
A nontrivial solution for the \(d\)s exists only if the determinant is zero: \( \left(B_{11} - b\right)\left(B_{22} - b\right)-B_{12}B_{21}=0 \) (Eq. 3). From this quadratic equation in \(b\), one gets

ii.i) either two distinct \(b\) values, each one giving a different \(\left(d_1, d_2\right)\), that is, a different linear combination of \(\psi_1\) and \(\psi_2\), hence a different eigenvector of \(B\). This way the degeneracy is resolved. \(A\) had degenerate eigenvectors (which span \(\varepsilon_a\)) for the eigenvalue \(a\). We found two nondegenerate eigenvectors of \(B\) in \(\varepsilon_a\), with eigenvalues \(b_1\) and \(b_2\).

ii.ii) or \(b_1\) and \(b_2\) are equal. Our search for distinct eigenvalues of \(B\) in \(\varepsilon_a\) failed. For a hermitian \(B\), every \(\psi_3\) in \(\varepsilon_a\) is then a degenerate eigenvector of \(B\) too, with the eigenvalue \(b\), so \(\varepsilon_a\) lies inside \(\varepsilon_b\). But no worries, there is definitely a third operator \(C\) which commutes with both \(A\) and \(B\) and has nondegenerate eigenvectors in \(\varepsilon_a\). (I don't know why, yet).

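These can be generalized to higher dimensions. For an \(N\)-dimensional \(\varepsilon_a\), (Eq. 3) becomes an equation of degree \(N\) in \(b\), and \(\varepsilon_a\) splits into as many smaller eigensubspaces as there are distinct roots. The degeneracy is fully lifted only if all \(N\) roots are distinct.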
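
For the two-dimensional case, (Eq. 3) can be solved by hand or with the quadratic formula. The sketch below does the latter. It assumes real matrix elements and real roots (true for a real symmetric \(B\)), and the function names `solve_secular_2x2` and `coefficients` are made up for this post.

```python
import math

def solve_secular_2x2(B11, B12, B21, B22):
    """Roots b of (B11 - b)(B22 - b) - B12*B21 = 0 (Eq. 3).

    Assumes real matrix elements and real roots.
    """
    trace = B11 + B22
    det = B11 * B22 - B12 * B21
    root = math.sqrt(trace**2 - 4 * det)
    return ((trace - root) / 2, (trace + root) / 2)

def coefficients(B11, B12, B21, B22, b):
    """One unnormalized solution (d1, d2) of the matrix equation for a root b."""
    if B12 != 0:
        return (B12, b - B11)      # solves (B11 - b) d1 + B12 d2 = 0
    if B21 != 0:
        return (b - B22, B21)      # solves B21 d1 + (B22 - b) d2 = 0
    # B is already diagonal in the psi_1, psi_2 basis
    return (1.0, 0.0) if math.isclose(b, B11) else (0.0, 1.0)

# Case ii.i): hypothetical matrix elements giving two distinct roots.
b1, b2 = solve_secular_2x2(0, 1, 1, 0)
print(b1, coefficients(0, 1, 1, 0, b1))
print(b2, coefficients(0, 1, 1, 0, b2))

# Case ii.ii): both roots coincide, so B cannot lift the degeneracy.
print(solve_secular_2x2(2, 0, 0, 2))
```

In the first example the two roots give the combinations \(\psi_1 - \psi_2\) and \(\psi_1 + \psi_2\) (up to normalization); in the second, \(B\) is a multiple of the identity on \(\varepsilon_a\) and there is nothing to resolve.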

Why the fuss? What is the big deal with these simultaneous eigenvectors, and resolving the degeneracy business?
In QM, observables are hermitian operators, and the eigenvectors of a hermitian operator form a complete orthogonal basis if there is no degeneracy.

In the nondegenerate case i) \( A\left|\psi\right\rangle = a\left|\psi\right\rangle \) and \(B\left|\psi\right\rangle=b\left|\psi\right\rangle\). According to the convention of labeling eigenvectors by their corresponding eigenvalues, we can label \( \left|\psi\right\rangle \) as \( \left|a\right\rangle \), \( \left|b\right\rangle \) or \(\left|a,b\right\rangle\). If \(A\) operates on \(\left|a,b\right\rangle\) the eigenvalue will be \(a\), and if \(B\) operates on \(\left|a,b\right\rangle\) the eigenvalue will be \(b\). The eigenvectors of \(A\) and \(B\) are the same, but they correspond to different eigenvalues for \(A\) and \(B\).

If there is degeneracy, eigenvectors belonging to different eigenvalues are still orthogonal, and the degenerate ones span subspaces, \(\varepsilon_{a_{d}}\), one for each degenerate eigenvalue \(a_d\). Any vector in \(\varepsilon_{a_{d}}\) is another eigenvector. One can use the Gram-Schmidt process to find \(N\) orthonormal vectors in each \(N\)-dimensional eigensubspace \(\varepsilon_{a_{d}}\). Actually it is always possible to build a complete orthonormal basis from eigenvectors of \(A\) anyhow (whether or not some of them are degenerate). [3] The contribution of adding \(B\) is to attach a second eigenvalue to the vectors in \(\varepsilon_{a_{d}}\), which can tell them apart and eliminate the degeneracy. (We don't like degeneracies in this town.)

In the degenerate case ii.i) \(B\) is used for trying to lift the degeneracy. Again we could label \( \left|\psi\right\rangle \) as \( \left|a_d\right\rangle \), but this time \( \left|\psi\right\rangle \) is not unique. And, in general, it is not an eigenvector of \(B\). But it is possible to find eigenvectors of \(B\) in the eigensubspace \(\varepsilon_a\) spanned by the degenerate eigenvectors of \(A\) corresponding to the degenerate eigenvalue \(a_d\). If distinct eigenvalues of \(B\) are found there, we can label the simultaneous eigenvectors as such: \( \left|a_d, b_1\right\rangle \), \( \left|a_d, b_2\right\rangle \) etc. Now they are uniquely labeled, distinct vectors. If \(A\) hits any of these vectors, \(a_d\) will be the eigenvalue. If \(B\) hits them, \(b_1\), \(b_2\) etc. will be the eigenvalues. Any vector can be written in this orthonormal basis as \( \left|\phi\right\rangle = \sum_{a,b} c_{a,b} \left|a,b\right\rangle \), summing over each possible combination of the eigenvalues \(a\) and \(b\). \(A\) and \(B\) form a CSCO.
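
To make the \( \left|a,b\right\rangle \) labeling concrete, here is a sketch with a hypothetical pair of commuting 3x3 matrices. The joint eigenvectors are found by hand and left unnormalized so that the arithmetic stays exact; the helper names and the test vector `phi` are assumptions of this example, and all vectors are real, so no complex conjugation is needed in the inner product.

```python
from fractions import Fraction

def matvec(X, v):
    """Matrix acting on a vector."""
    return [sum(x * c for x, c in zip(row, v)) for row in X]

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))

def eigenvalue(X, v):
    """Eigenvalue of X for the eigenvector v; raises if v is not one."""
    w = matvec(X, v)
    i = next(k for k, c in enumerate(v) if c != 0)
    lam = Fraction(w[i], v[i])
    if any(wk != lam * vk for wk, vk in zip(w, v)):
        raise ValueError("not an eigenvector")
    return lam

# Hypothetical commuting pair; the eigenvalue a = 1 of A is degenerate.
A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
B = [[0, 1, 0], [1, 0, 0], [0, 0, 3]]

# Simultaneous eigenvectors, found by hand (unnormalized, mutually orthogonal).
joint = [[1, 1, 0], [1, -1, 0], [0, 0, 1]]

# Label each vector by its pair of eigenvalues (a, b).
labels = {(eigenvalue(A, v), eigenvalue(B, v)): v for v in joint}

# Expand an arbitrary vector phi in the |a, b> basis and rebuild it.
phi = [3, 1, 4]
coeffs = {ab: Fraction(dot(v, phi), dot(v, v)) for ab, v in labels.items()}
rebuilt = [sum(coeffs[ab] * v[k] for ab, v in labels.items()) for k in range(3)]

for ab, v in labels.items():
    print("label", tuple(int(x) for x in ab), "vector", v, "coefficient", coeffs[ab])
print("rebuilt == phi:", rebuilt == phi)
```

No two vectors share the same pair \((a, b)\), even though two of them share the same \(a\). That uniqueness of the labels is what makes `A` and `B` together a CSCO in this toy example.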

In the degenerate case ii.ii), although again we got our basis, \(B\) could not lift the degeneracy. But there is surely an operator \(C\) which can do the job. (Why?!)

Next I will talk about the number of operators in a CSCO and give some simple examples.

[1] http://faculty.physics.tamu.edu/herschbach/commuting%20observables%20and%20simultaneous%20eigenfunctions.pdf
[2] http://eecourses.technion.ac.il/046241/files/Rec2.pdf
[3] http://www.pa.msu.edu/~mmoore/Lect4_BasisSet.pdf
