Lectures week 3 — MTH 496

We now turn to the mathematical structure of quantum theory. We will not devote much time to the physical bases of the theory, but we do need a little motivation. Please read the summary on pages 15-22 of the text for more details.

Physical Bases of Quantum Mechanics

The two basic observations that you should note are

  1. The idea of uncertainty, and Heisenberg’s uncertainty principle. The outcome of a measurement is never precisely determined by the state of a quantum system. Furthermore, if we prepare a state in which some quantity (e.g., the position of a particle) is specified with very little uncertainty, then another quantity (e.g., the momentum of the particle) will have a high degree of uncertainty in the resulting state.
  2. The idea of wave-particle duality, namely that quantum particles exhibit properties of waves (like diffraction and interference) and of particles.

In fact the two ideas are linked: uncertainty results from the wave-like nature of particles, as we will now see. For instance, to measure the position of an electron, we could use light of a certain wavelength {\lambda}. In principle this would allow us to measure the position of the electron with a precision, i.e., an uncertainty, of order {\lambda}. However, in so doing we will have caused the electron to interact with the photons in the light. Photons are the particle-like description of light and carry a momentum (according to Einstein’s analysis of the photoelectric effect) of magnitude {h/\lambda}, where {h} is Planck’s constant

\displaystyle  h=6.626068\times10^{-34}\frac{\text{m}^{2}\text{kg}}{\text{s}}. \ \ \ \ \ (1)

The result is that the momentum of the electron has an uncertainty of order {h/\lambda} after we measure its position. (The value (1) is standard. The textbook uses {h} to denote another quantity, namely {\frac{h}{2\pi}}, which is standardly denoted {\hbar}; in that notation the photon momentum reads {2\pi\hbar/\lambda}.)
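As a rough numerical illustration of this estimate, the following sketch evaluates the photon momentum {h/\lambda} (with {h} the standard Planck constant of (1), equivalently {2\pi\hbar/\lambda}) for a single wavelength. The choice of 500 nm visible light and the function name `photon_momentum` are assumptions made for the example only.

```python
# Planck's constant in J*s (= m^2 kg / s), the value quoted in (1).
H = 6.626068e-34

def photon_momentum(wavelength_m: float) -> float:
    """Momentum magnitude p = h / lambda of a photon, in kg m / s."""
    return H / wavelength_m

# Hypothetical choice: green light with a wavelength of 500 nm.
wavelength = 500e-9
dp = photon_momentum(wavelength)

print(f"position uncertainty of order {wavelength:.1e} m")
print(f"momentum uncertainty of order {dp:.3e} kg m/s")
# In this heuristic estimate the product of the two uncertainties is h itself.
print(f"product of order {wavelength * dp:.3e} J s")
```

Shortening the wavelength improves the position measurement but raises the momentum kick in exactly the same proportion, so the product stays of order {h}.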

To understand the physics, it is important to note that uncertainty in quantum mechanics is very different from the fuzziness we introduced with densities on phase space in classical mechanics. Heisenberg’s uncertainty principle results from a fundamental and inescapable interaction of a system with our measuring apparatus. Later we will see that there are pure states in quantum mechanics, but that there is still uncertainty for observables in these states.

States and observables

Henceforth we will consider a quantum system. As yet we have not defined what such a thing is, but we will continue to suppose that we have two objects:

  1. A collection of observables {\mathcal{A}}.
  2. A collection of states {\Omega}.

Observables are properties of the system that we can measure, and states describe the configuration of the system. Fundamental to the uncertainty principle is that it may not be possible to measure two observables simultaneously; for instance, it is not possible to measure both the position and the momentum of a particle at once. However, we may measure a single observable {a\in\mathcal{A}}. As above we assume a pairing between states and observables

\displaystyle  \left\langle a\middle|\omega\right\rangle =\text{ average value of }a\text{ in state }\omega.

Given an observable {a\in\mathcal{A}} and a continuous real-valued function {f:\mathbb{R}\rightarrow\mathbb{R}} we can define the observable {f(a)} by the following rule:

If we measure {a} to have value {a_{0}\in\mathbb{R}}, then {f(a)} has value {f\left(a_{0}\right)}.

Clearly {f(a)} and {a} are (by definition) simultaneously measurable. The mapping

\displaystyle  L_{\omega,a}\left(f\right)=\left\langle f(a)\middle|\omega\right\rangle

is a linear mapping on the space of continuous real-valued functions on the real line and can be written

\displaystyle  \left\langle f(a)\middle|\omega\right\rangle =\int_{\mathbb{R}}f(t)\omega_{a}(dt)

for some probability measure {\omega_{a}} on {\mathbb{R}}. In particular

\displaystyle  \left\langle a\middle|\omega\right\rangle =\int_{\mathbb{R}}t\omega_{a}(dt).

Physically, you should imagine that we can produce a very large number {N} of identical systems all in state {\omega} and not interacting with one another. If we measure the observable {a} in each of these systems obtaining the values {a_{1},\ldots,a_{N}} then we should have

\displaystyle  \frac{f\left(a_{1}\right)+\cdots+f\left(a_{N}\right)}{N}\approx\left\langle f(a)\middle|\omega\right\rangle .
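The approximation above can be checked with a small simulation. This is only a sketch: the discrete measure {\omega_{a}} (three outcomes with the probabilities listed) and the function {f(t)=t^{2}} are hypothetical choices, and the pseudo-random draws stand in for measurements on {N} independent copies of the system.

```python
import random

# Hypothetical discrete measure omega_a: possible outcomes of a and their probabilities.
values = [-1.0, 1.0, 2.0]
probs = [0.2, 0.5, 0.3]

def f(t):
    """The continuous function applied to the observable; here f(t) = t^2."""
    return t * t

# Exact average <f(a)|omega> = integral of f against omega_a (a finite sum here).
exact = sum(p * f(t) for t, p in zip(values, probs))

# Simulate measuring a in N non-interacting copies of the system, all in state omega.
rng = random.Random(0)          # fixed seed so the run is reproducible
N = 100_000
samples = rng.choices(values, weights=probs, k=N)

# Empirical average (f(a_1) + ... + f(a_N)) / N.
empirical = sum(f(a_i) for a_i in samples) / N

print("exact average:    ", exact)
print("empirical average:", empirical)
```

The two numbers agree up to a statistical error that shrinks like {1/\sqrt{N}}.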

Mathematically, the above structure amounts to assuming that

There is an action of continuous real-valued functions {f} on the set of observables in such a way that {(f\circ g)(a)=f\left(g(a)\right)}, and each state {\omega\in\Omega} gives an assignment of a probability measure {\omega_{a}} on {\mathbb{R}} to each observable {a} in such a way that

\displaystyle  \omega_{f(a)}\left(E\right)=\omega_{a}\left(f^{-1}(E)\right) \ \ \ \ \ (2)

for any interval {E\subset\mathbb{R}}.

In particular, note that {\omega_{f(a)}\left(E\right)=0} if {E} is disjoint from the range of {f}, so that {f^{-1}\left(E\right)=\emptyset}. Thus, for example,

\displaystyle  \left\langle f\left(a\right)\middle|\omega\right\rangle \ge0\quad\text{if }f\ge0.
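Condition (2) says that {\omega_{f(a)}} is the push-forward of {\omega_{a}} under {f}. The sketch below checks this for a hypothetical discrete measure, stored as a dictionary from outcomes to probabilities, and for {f(t)=t^{2}}; the helper names are illustrative and not taken from the text.

```python
from fractions import Fraction

# Hypothetical discrete measure omega_a as {outcome: probability}.
omega_a = {-1: Fraction(1, 5), 1: Fraction(1, 2), 2: Fraction(3, 10)}

def f(t):
    return t * t

def pushforward(measure, g):
    """Measure of g(a): each outcome t moves to g(t); masses at equal images add up."""
    out = {}
    for t, p in measure.items():
        out[g(t)] = out.get(g(t), 0) + p
    return out

def mass(measure, lo, hi):
    """Mass that the measure assigns to the closed interval [lo, hi]."""
    return sum(p for t, p in measure.items() if lo <= t <= hi)

def mass_of_preimage(measure, g, lo, hi):
    """omega_a(g^{-1}([lo, hi])): mass of the outcomes t with g(t) in [lo, hi]."""
    return sum(p for t, p in measure.items() if lo <= g(t) <= hi)

omega_fa = pushforward(omega_a, f)

# Check (2) on a few intervals; the last one is disjoint from the range of f.
for lo, hi in [(0, 2), (3, 5), (-3, -1)]:
    assert mass(omega_fa, lo, hi) == mass_of_preimage(omega_a, f, lo, hi)

# Average of f(a); it is nonnegative because f >= 0.
mean_f = sum(s * p for s, p in omega_fa.items())
print(omega_fa, mean_f)
```

Exact rational arithmetic is used so that the two sides of (2) can be compared with `==` rather than up to rounding.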

The set of states {\Omega} is not a vector space; however, it does make sense to define the convex combination of states {\omega^{0}} and {\omega^{1}} as follows. Given {\lambda\in[0,1]}, let {\omega^{\lambda}} denote the state determined by

\displaystyle  \left\langle a\middle|\omega^{\lambda}\right\rangle =\left(1-\lambda\right)\left\langle a\middle|\omega^{0}\right\rangle +\lambda\left\langle a\middle|\omega^{1}\right\rangle ,

or equivalently that

\displaystyle  \omega_{a}^{\lambda}=(1-\lambda)\omega_{a}^{0}+\lambda\omega_{a}^{1}.

That this does define a state follows if we assume

Completeness of {\Omega}. The set of states is complete. That is, any rule assigning probability measures to observables and satisfying (2) is a state.

We will use the suggestive notation {(1-\lambda)\omega^{0}+\lambda\omega^{1}} for the state {\omega^{\lambda}}.
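For discrete measures the convex combination can be computed outcome by outcome. The following sketch uses two hypothetical measures and {\lambda=1/3}, and checks that the mixture is again a probability measure whose mean is the corresponding mixture of the means; the names `mix` and `mean` are illustrative.

```python
from fractions import Fraction

def mix(m0, m1, lam):
    """Return (1 - lam) * m0 + lam * m1 for discrete measures {outcome: probability}."""
    out = {}
    for t, p in m0.items():
        out[t] = out.get(t, 0) + (1 - lam) * p
    for t, p in m1.items():
        out[t] = out.get(t, 0) + lam * p
    return out

def mean(m):
    """Average value: the integral of t against the measure m."""
    return sum(t * p for t, p in m.items())

# Two hypothetical measures for the same observable a in states omega^0 and omega^1.
omega0 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
omega1 = {1: Fraction(1, 4), 3: Fraction(3, 4)}
lam = Fraction(1, 3)

omega_lam = mix(omega0, omega1, lam)

# The mixture has total mass one, and its mean is the mixture of the means.
assert sum(omega_lam.values()) == 1
assert mean(omega_lam) == (1 - lam) * mean(omega0) + lam * mean(omega1)
print(omega_lam, mean(omega_lam))
```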

In addition to completeness, we shall assume that

  1. If two states are different then they must differ in some observable way.
  2. If two observables are distinct then it is possible to prepare a state in which they have distinct averages.

Mathematically this amounts to the following

Mean values separate states and observables. The pairing {\left\langle a\middle|\omega\right\rangle } separates points in {\mathcal{A}} and in {\Omega}. That is

  1. if {\left\langle a\middle|\omega_{1}\right\rangle =\left\langle a\middle|\omega_{2}\right\rangle } for every {a\in\mathcal{A}} then {\omega_{1}=\omega_{2}}, and
  2. if {\left\langle a\middle|\omega\right\rangle =\left\langle b\middle|\omega\right\rangle } for every {\omega\in\Omega} then {a=b}.

The set of observables is not assumed to be an algebra: we have no good definition of {ab} unless we can simultaneously measure {a} and {b}. However, the separability assumptions do allow us to make {\mathcal{A}} a vector space. We have already defined {\lambda a} for {\lambda\in\mathbb{R}} and {a\in\mathcal{A}}: it is the result of applying the function {f_{\lambda}(t)=\lambda t} to {a}. To define the addition of two observables we take

\displaystyle  \left\langle a+b\middle|\omega\right\rangle :=\left\langle a\middle|\omega\right\rangle +\left\langle b\middle|\omega\right\rangle . \ \ \ \ \ (3)

This is a little deceptive. After all, how do I know that there is an observable that gives the right hand side when paired with states? For this to work we need another completeness assumption, namely

Completeness of {\mathcal{A}}. The set of observables is complete in the following sense. If {L:\Omega\rightarrow\mathbb{R}} is a linear map, in that

\displaystyle  L((1-\lambda)\omega^{0}+\lambda\omega^{1})=\left(1-\lambda\right)L(\omega^{0})+\lambda L(\omega^{1})

for any states {\omega^{0}} and {\omega^{1}} and {\lambda\in[0,1]}, then {L\left(\omega\right)=\left\langle a\middle|\omega\right\rangle } for some observable {a\in\mathcal{A}}.

This allows us to make sense of (3) as a definition, although it does not allow us to compute the measure {\omega_{a+b}} or the averages {\left\langle f(a+b)\middle|\omega\right\rangle } in any explicit way.

The sum does allow for a sort-of product on the set of observables, namely

\displaystyle  a\circ b=\frac{\left(a+b\right)^{2}-\left(a-b\right)^{2}}{4}.

Although this product is commutative, it does not (necessarily) obey the associative law.
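To see concretely how associativity can fail, here is a small sketch under an assumption that is not made in the text above: observables are modelled by real symmetric {2\times2} matrices, with sums and squares computed as matrix sums and squares. In that model the formula reduces to {a\circ b=\frac{1}{2}(ab+ba)}. The matrices `a`, `b`, `c` and all helper names are hypothetical choices for the example.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(x, y, sign=1):
    """Entrywise x + sign * y."""
    return [[x[i][j] + sign * y[i][j] for j in range(2)] for i in range(2)]

def square(x):
    return matmul(x, x)

def circ(a, b):
    """a o b = ((a + b)^2 - (a - b)^2) / 4, the formula from the text."""
    diff = add(square(add(a, b)), square(add(a, b, -1)), -1)
    return [[Fraction(v, 4) for v in row] for row in diff]

# Assumed matrix model: three symmetric matrices playing the role of observables.
a = [[1, 0], [0, -1]]
b = [[0, 1], [1, 0]]
c = [[0, 1], [1, 0]]

left = circ(circ(a, b), c)    # (a o b) o c
right = circ(a, circ(b, c))   # a o (b o c)

assert circ(a, b) == circ(b, a)   # the product is commutative
assert left != right              # but not associative for this triple
print(left, right)
```

Here {a\circ b=0}, so {(a\circ b)\circ c=0}, while {b\circ c} is the identity matrix and hence {a\circ(b\circ c)=a}.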


The uncertainty or standard deviation of an observable {a} in state {\omega} is the quantity

\displaystyle  \Delta_{\omega}a=\sqrt{\left\langle \left(a-\left\langle a\middle|\omega\right\rangle \right)^{2}\middle|\omega\right\rangle }=\sqrt{\left\langle a^{2}\middle|\omega\right\rangle -\left\langle a\middle|\omega\right\rangle ^{2}}.
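Both expressions for {\Delta_{\omega}a} can be evaluated directly from the measure {\omega_{a}}. The sketch below does this for a hypothetical discrete measure and for a point mass, which has zero uncertainty; the function names are illustrative.

```python
import math

def mean(measure, g=lambda t: t):
    """Integral of g against a discrete measure given as {outcome: probability}."""
    return sum(p * g(t) for t, p in measure.items())

def uncertainty_centered(measure):
    """sqrt(<(a - <a>)^2>), the first expression for the uncertainty."""
    m = mean(measure)
    return math.sqrt(mean(measure, lambda t: (t - m) ** 2))

def uncertainty_moments(measure):
    """sqrt(<a^2> - <a>^2), the second expression for the uncertainty."""
    m = mean(measure)
    # max(...) guards against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(0.0, mean(measure, lambda t: t * t) - m * m))

omega_a = {-1.0: 0.2, 1.0: 0.5, 2.0: 0.3}   # hypothetical spread-out measure
point_mass = {3.0: 1.0}                      # all mass at the single value 3

print(uncertainty_centered(omega_a), uncertainty_moments(omega_a))
print(uncertainty_centered(point_mass), uncertainty_moments(point_mass))
```

The two expressions agree because expanding the square gives {\left\langle a^{2}\middle|\omega\right\rangle -2\left\langle a\middle|\omega\right\rangle ^{2}+\left\langle a\middle|\omega\right\rangle ^{2}}.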

This definition is equally valid in classical or quantum mechanics; note, however, that {\Delta_{\omega}a=0} for every observable if {\omega} is a pure state in classical mechanics. On the other hand, Heisenberg’s uncertainty principle states that for a state describing a quantum particle, such as an electron,

\displaystyle  \Delta_{\omega}x\Delta_{\omega}p\ge\frac{\hbar}{2}

where {x} and {p} are the position and momentum observables and {\hbar=\frac{h}{2\pi}}.
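To get a feeling for the size of this bound, the following sketch computes the smallest momentum uncertainty compatible with a given position uncertainty. The position uncertainty of {10^{-10}} m (roughly the size of an atom) and the electron mass of about {9.109\times10^{-31}} kg are assumptions introduced for the example.

```python
import math

H = 6.626068e-34             # Planck's constant from (1), in J s
HBAR = H / (2 * math.pi)     # reduced Planck constant
ELECTRON_MASS = 9.109e-31    # kg (assumed value, not given in the text)

def min_momentum_uncertainty(delta_x: float) -> float:
    """Smallest Delta p allowed by Delta x * Delta p >= hbar / 2, in kg m / s."""
    return HBAR / (2 * delta_x)

# Hypothetical state: an electron localized to about one atomic diameter.
delta_x = 1e-10
delta_p = min_momentum_uncertainty(delta_x)
delta_v = delta_p / ELECTRON_MASS   # corresponding velocity uncertainty in m / s

print(f"Delta p >= {delta_p:.3e} kg m/s")
print(f"Delta v >= {delta_v:.3e} m/s")
```

For an electron the resulting velocity uncertainty is of order several hundred kilometers per second, which is why the bound matters at atomic scales and is negligible for macroscopic objects.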

