A linear operator $\hat{A}$ on a vector space is a function that maps vectors to vectors, $|\psi \rangle \rightarrow \hat{A}|\psi\rangle$ such that \[\hat{A} \left ( a |\psi \rangle + b|\phi \rangle \right ) = a\hat{A} |\psi\rangle + b\hat{A} |\phi \rangle .\]
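To make this concrete, here is a minimal numerical sketch (not part of the notes' formalism): in a finite-dimensional space a linear operator is just a matrix, kets are column vectors, and the linearity property holds automatically for matrix-vector multiplication. The specific matrix and vectors below are arbitrary random choices.

```python
import numpy as np

# In a finite-dimensional space, a linear operator is a matrix and
# kets are complex column vectors; linearity is matrix algebra.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))  # operator A-hat
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)          # |psi>
phi = rng.standard_normal(3) + 1j * rng.standard_normal(3)          # |phi>
a, b = 2.0 + 1.0j, -0.5j                                            # scalars

lhs = A @ (a * psi + b * phi)          # A( a|psi> + b|phi> )
rhs = a * (A @ psi) + b * (A @ phi)    # a A|psi> + b A|phi>
assert np.allclose(lhs, rhs)           # linearity holds
```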

Given a linear operator $\hat{A}$ on an inner product space, we can also define an action of a linear operator on a dual vector. The dual vector $\langle \psi | \hat{A}$ is defined to be the unique dual vector such that \[\left ( \langle \psi | \hat{A}\right ) |\phi \rangle = \langle \psi | \left ( \hat{A} |\phi \rangle \right ), \] for all vectors $|\phi \rangle$.

Note that, given this definition, an expression like $\langle \psi | \hat{A} | \phi \rangle$ is unambiguous. We will always get the same result regardless of whether we first apply $\hat{A}$ to $|\phi \rangle$ and then take the inner product with $|\psi \rangle$, or first apply the operator $\hat{A}$ to the dual vector $\langle \psi |$ and then apply it to $|\phi \rangle$.
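The unambiguity of $\langle \psi | \hat{A} | \phi \rangle$ can also be checked numerically in the finite-dimensional picture, where $\langle \psi |$ is the conjugated row vector and both orders of evaluation reduce to associativity of matrix multiplication. The vectors below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
phi = rng.standard_normal(3) + 1j * rng.standard_normal(3)

bra_psi = psi.conj()                  # <psi| as a conjugated row vector
ket_first = bra_psi @ (A @ phi)       # <psi| ( A|phi> )
bra_first = (bra_psi @ A) @ phi       # ( <psi|A ) |phi>
assert np.isclose(ket_first, bra_first)
```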

Examples of Linear Operators

  • The identity operator $\hat{I}$: \[\hat{I} \ket{\psi} = \ket{\psi}, \qquad \text{for all }\ket{\psi}\]
  • The position operator $\hat{x}$: We define its action on the position basis as \[\hat{x} \ket{x} = x\ket{x},\] where $\hat{x}$ is an operator on the left hand side and $x$ is a scalar on the right hand side. We can deduce its action on an arbitrary vector $\ket{\psi}$ by writing it in the position basis as \[\ket{\psi} = \int_{-\infty}^{+\infty} \psi(x) \ket{x}\,\D x,\] and then \[\hat{x}\ket{\psi} = \int_{-\infty}^{+\infty} x\psi(x) \ket{x}\,\D x.\] Note, when we are working exclusively in the position basis it is common to abuse notation and write \[\hat{x}\psi(x) = x\psi(x),\] which can be confusing because $\hat{x}$ acts on abstract vectors $\ket{\psi}$ rather than their components $\psi(x)$ in the position basis. However, it causes no ambiguity when only this basis is being used.
  • The derivative operator $\hat{\frac{\D}{\D x}}$: We again define its action in the position basis. If $\ket{\psi} = \int_{-\infty}^{+\infty} \psi(x) \ket{x}\,\D x$ then \[\hat{\frac{\D}{\D x}}\ket{\psi} = \int_{-\infty}^{+\infty} \frac{\D\psi}{\D x} \ket{x}\,\D x.\] Again, abusing notation, we sometimes write \[\hat{\frac{\D}{\D x}} \left ( \psi (x) \right ) = \frac{\D\psi}{\D x},\] or, even more naughtily, we might forgo the hat on the operator and write \[\frac{\D}{\D x} \left ( \psi (x) \right ) = \frac{\D\psi}{\D x}.\]
  • The linear momentum operator $\hat{p}$: This is defined in the same way in the momentum representation as $\hat{x}$ is defined in the position representation, i.e. \[\hat{p} \ket{p} = p\ket{p}.\] If we write a general vector in the momentum basis as \[\ket{\psi} = \int_{-\infty}^{+\infty} \phi(p) \ket{p}\,\D p,\] then \[\hat{p}\ket{\psi} = \int_{-\infty}^{+\infty} p\phi(p) \ket{p} \,\D p.\] Once we have understood the connection between the position and momentum bases in more detail, we will prove that \[\hat{p} = -i\hbar \hat{\frac{\D}{\D x}}.\]
  • The parity operator $\hat{\mathcal{P}}$: This operator is very useful for understanding the symmetries of one-dimensional quantum systems, and we will be using it in module 4. It is defined by its action on the position basis as \[\hat{\mathcal{P}}\ket{x} = \ket{-x}.\] From this, we can determine its action on the coefficients $\psi(x)$ of a vector $\ket{\psi}$ in the position basis. If $\ket{\psi} = \int_{-\infty}^{+\infty} \psi(x) \ket{x}\,\D x$ then \[\hat{\mathcal{P}}\ket{\psi} = \int_{-\infty}^{+\infty} \psi(x) \ket{-x}\,\D x.\] We can now make the substitution $x'=-x$, which gives $\D x' = -\D x$, with $x' = -\infty$ when $x = +\infty$ and $x' = +\infty$ when $x = -\infty$. This gives \begin{align*} \hat{\mathcal{P}}\ket{\psi} & = - \int_{+\infty}^{-\infty} \psi(-x') \ket{x'}\,\D x' \\ & = \int_{-\infty}^{+\infty} \psi(-x') \ket{x'}\,\D x' \\ & = \int_{-\infty}^{+\infty} \psi(-x) \ket{x}\,\D x, \end{align*} where the minus sign is removed in the second line because we changed the order of the integration limits, and we can replace $x'$ with $x$ in the third line because $x'$ is a dummy variable. The upshot is that the coefficients $\psi(x)$ get transformed to $\psi(-x)$ by the parity operator. As you may have guessed by now, we will sometimes abuse notation and write \[\hat{\mathcal{P}} \psi(x) = \psi(-x).\]
  • The Laplacian operator $\hat{\nabla}^{2}$: Although we have not really discussed three-dimensional systems yet, the one-dimensional position basis $\ket{x}$ is generalized to a three-dimensional position basis $\ket{\vec{r}}$ where the basis vectors are now labelled by three-dimensional position vectors $\vec{r}$. In this basis, a general vector can be decomposed as \[\ket{\psi} = \int \psi(\vec{r}) \ket{\vec{r}}\,\D V,\] where the components $\psi(\vec{r})$ are now a scalar function of the position vector and $\D V$ is the three-dimensional volume element $\D V = \D x\D y\D z$. The Laplacian operator then acts as \[\hat{\nabla}^2 \ket{\psi} = \int \left ( \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2} \right ) \ket{\vec{r}}\D V.\]
  • Outer Products: In Dirac notation, an outer product is an operator of the form $\ketbra{\phi}{\psi}$, which acts as \[\left ( \ketbra{\phi}{\psi}\right ) \ket{\chi} = \ket{\phi}\braket{\psi}{\chi}.\] In other words, you take the inner product of $\ket{\psi}$ with whatever the input vector is and output the vector $\ket{\phi}$ scaled by this inner product. The direction of the output vector is always the same as that of $\ket{\phi}$, but its length is scaled by the inner product. Note that an inner product $\braket{\phi}{\psi}$ is a scalar, whereas an outer product $\ketbra{\phi}{\psi}$ is an operator.
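In the finite-dimensional matrix picture (an illustration, not part of the notes), the outer product $\ketbra{\phi}{\psi}$ is the matrix `np.outer(phi, psi.conj())`, and its defining action can be verified directly. The vectors are arbitrary random choices.

```python
import numpy as np

rng = np.random.default_rng(2)
phi = rng.standard_normal(3) + 1j * rng.standard_normal(3)   # |phi>
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)   # |psi>
chi = rng.standard_normal(3) + 1j * rng.standard_normal(3)   # |chi>

outer = np.outer(phi, psi.conj())       # |phi><psi| as a matrix
applied = outer @ chi                   # ( |phi><psi| ) |chi>
scaled = phi * (psi.conj() @ chi)       # |phi> <psi|chi>
assert np.allclose(applied, scaled)     # same vector either way
```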

The Dirac Notaty

We will now learn the most useful thing to know in Dirac notation. It is so important that I will call it a theorem, even though I am a physicist and I prefer to just ramble on informally most of the time.

Theorem Let $\ket{e_1}$, $\ket{e_2}$, $\cdots$ be any orthonormal basis. Then, \[\sum_j \proj{e_j} = \hat{I},\] where $\hat{I}$ is the identity operator.
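The proof is left for class, but the theorem is easy to check numerically in a finite-dimensional space: the columns of the unitary $Q$ from a QR decomposition form an orthonormal basis, and the projectors onto them sum to the identity matrix. This sketch is an illustration of the statement, not its proof.

```python
import numpy as np

# Build an orthonormal basis of C^4 from the Q of a QR decomposition,
# then check that the projectors onto its columns sum to the identity.
rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(M)                 # columns of Q are orthonormal

total = sum(np.outer(Q[:, j], Q[:, j].conj()) for j in range(4))
assert np.allclose(total, np.eye(4))   # sum_j |e_j><e_j| = I
```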

Proving this theorem is an in-class activity.

The reason that this theorem is so useful is that we can insert the identity operator in front of any vector without changing anything, but then decomposing it in a basis as $\hat{I} = \sum_j \proj{e_j}$ allows us to determine how things are represented in a particular basis. Conversely, if we recognize $\sum_j \proj{e_j}$ somewhere in an expression we can remove it because it is just the identity. Inserting the identity operator, moving terms around, and removing identity operators is such a useful proof technique that I have given it a name: The Dirac Notaty. It also has a song, which is sung to the tune of the Hokey Cokey (the British version of the Hokey Pokey, which has an additional chorus). Here are the lyrics:

VERSE

You put the identity in.
You take the identity out.
You decompose it in a basis and
you move the terms about.
You do the Dirac Notaty
and you turn your bras to kets
That’s what it’s all about!

CHORUS

Oh, Dirac Notaty-taty!
Oh, Dirac Notaty-taty!
Oh, Dirac Notaty-taty!
Adjoint, Transpose, rah, rah, rah!

You can sing along using this video:

The bit about adjoints and transposes will make more sense in the next section.

We have already seen some examples of the Dirac notaty without realizing it. Suppose we want to know how a vector $\ket{\psi}$ is represented in the basis $\ket{e_j}$. Well, you can just stick in the identity and decompose it in the $\ket{e_j}$ basis. \begin{align*} \ket{\psi} & = \hat{I}\ket{\psi} \\ & = \left ( \sum_j \proj{e_j} \right ) \ket{\psi} \\ & = \sum_j \ket{e_j} \braket{e_j}{\psi}. \end{align*} We see that the components in the $\ket{e_j}$ basis are the inner products $\braket{e_j}{\psi}$, just as we found in the previous section, but now we understand that this is just an example of the Dirac notaty. We can also see why I previously wrote the components $\braket{e_j}{\psi}$ to the right of the basis elements $\ket{e_j}$. It makes it clearer that there is an identity operator involved. Note that the second line of this proof is really unnecessary. It is fine to jump straight to the third line. I just included it to make it clearer that I was decomposing the identity in a basis.

Let's look at one more example. Suppose you know the components of a vector $\ket{\psi}$ in the orthonormal basis $\ket{f_j}$, which are the inner products $\braket{f_j}{\psi}$, and you want to know how these are related to the components in another orthonormal basis $\ket{e_j}$. Well, you can just write \begin{align*} \braket{f_j}{\psi} & = \sand{f_j}{\hat{I}}{\psi} \\ & = \sand{f_j}{\left ( \sum_k \proj{e_k} \right )}{\psi} \\ & = \sum_k \braket{f_j}{e_k}\braket{e_k}{\psi}. \end{align*} Again, this is a formula we have seen before, now derived as an example of the Dirac notaty.
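The change-of-basis formula can also be checked numerically. Below, the columns of two unitary matrices (from QR decompositions of arbitrary random matrices) play the roles of the $\ket{e_k}$ and $\ket{f_j}$ bases; this is an illustrative sketch in a finite-dimensional space.

```python
import numpy as np

rng = np.random.default_rng(4)
Qe, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
Qf, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# Direct components <f_j|psi>
direct = np.array([Qf[:, j].conj() @ psi for j in range(3)])

# Via the inserted identity: <f_j|psi> = sum_k <f_j|e_k><e_k|psi>
via = np.array([
    sum((Qf[:, j].conj() @ Qe[:, k]) * (Qe[:, k].conj() @ psi) for k in range(3))
    for j in range(3)
])
assert np.allclose(direct, via)
```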

We can also perform the Dirac notaty for a continuous basis. For example, in the position basis the identity can be decomposed as \[\hat{I} = \int_{-\infty}^{+\infty} \D x \, \proj{x},\] and in the momentum basis as \[\hat{I} = \int_{-\infty}^{+\infty} \D p \, \proj{p}.\]

The Dirac notaty is useful because the identity operator does nothing, but decomposing it in a basis does everything. If you are stuck on one of the homework problems it is probably because you need to insert one or more identity operators, decompose them in some basis, move the inner product terms around, and recognize the identity decomposed in a basis and remove it.

Products of Operators

The product of two operators $\hat{A}\hat{B}$ is the unique operator such that \[\hat{A}\hat{B}\ket{\psi} = \hat{A} \left ( \hat{B} \ket{\psi}\right ),\] for all vectors $\ket{\psi}$. In other words, products are evaluated right-to-left. We first apply $\hat{B}$ to $\ket{\psi}$ and then apply $\hat{A}$ to the result.

Similarly, we can form products of larger numbers of operators. For example, to apply $\hat{A}\hat{B}\hat{C}$ to a vector, you first apply $\hat{C}$, then apply $\hat{B}$, and finally apply $\hat{A}$.

The operator product is associative: \[\hat{A}\hat{B}\hat{C} = \hat{A}\left ( \hat{B} \hat{C}\right ) = \left ( \hat{A} \hat{B} \right )\hat{C},\] but it is not generally commutative, i.e. $\hat{A}\hat{B} \neq \hat{B}\hat{A}$.
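A quick concrete check of both claims (an illustration using the Pauli matrices, which are not introduced until later in a quantum course, but any non-commuting pair of matrices would do):

```python
import numpy as np

# The Pauli matrices sigma_x and sigma_y do not commute, but matrix
# multiplication is always associative.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

assert not np.allclose(sx @ sy, sy @ sx)            # AB != BA in general
assert np.allclose(sx @ (sy @ sx), (sx @ sy) @ sx)  # A(BC) = (AB)C always
```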

Because of this it is useful to introduce the commutator of two operators \[[\hat{A},\hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}.\] The commutator is zero when the operators do commute, i.e. when $\hat{A}\hat{B} = \hat{B}\hat{A}$, but otherwise it is nonzero.

In an in class activity you will prove that \[[\hat{x},\hat{p}] = i\hbar \hat{I},\] so, in particular, the position and momentum operators do not commute.
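If you want to check your in-class calculation afterwards, here is a symbolic verification in the sloppy position-basis notation, where $\hat{x}$ acts by multiplication and $\hat{p}$ acts as $-i\hbar\,\D/\D x$ (using sympy; an illustration, not the proof itself):

```python
import sympy as sp

# [x, p] psi = x(-i hbar psi') - (-i hbar d/dx (x psi)) = i hbar psi
x, hbar = sp.symbols('x hbar', positive=True)
psi = sp.Function('psi')(x)

x_op = lambda f: x * f                       # x-hat: multiply by x
p_op = lambda f: -sp.I * hbar * sp.diff(f, x)  # p-hat: -i hbar d/dx

commutator = x_op(p_op(psi)) - p_op(x_op(psi))
assert sp.simplify(commutator - sp.I * hbar * psi) == 0
```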

Note that, in addition to the other bad notational habits we have mentioned, we will often write \[[\hat{x},\hat{p}] = i\hbar.\] More generally, whenever we write that an operator equals a scalar, e.g. $\hat{A} = a$, it means that the operator is proportional to the identity with the scalar as the proportionality constant, e.g. $\hat{A} = a$ means the same thing as $\hat{A} = a\hat{I}$.

Expectation Values

An expression of the form $\sand{\phi}{\hat{A}}{\psi}$ is a scalar. It is the inner product of $\hat{A}\ket{\psi}$ with $\ket{\phi}$. If a vector $\ket{\psi}$ is normalized, i.e. $||\psi||^2 = \braket{\psi}{\psi} = 1$, then \[\sand{\psi}{\hat{A}}{\psi}\] is called the expectation value of $\hat{A}$ with respect to $\ket{\psi}$. On the other hand, if $\ket{\psi}$ is not normalized then this is generalized to \[\frac{\sand{\psi}{\hat{A}}{\psi}}{\braket{\psi}{\psi}}.\] However, I will remind you again that it is usually preferable to work with normalized wavefunctions in quantum mechanics and avoid the more complicated formulas.

Note that when the vector is obvious from context, e.g. if we are working on a problem that uses the same vector $\ket{\psi}$ throughout, then the expectation value is often denoted $\Expect{\hat{A}}$.

To see why this is called the expectation value, let's compute it for the position operator $\hat{x}$. \begin{align*} \sand{\psi}{\hat{x}}{\psi} & = \int_{-\infty}^{+\infty} \D x' \, \psi^*(x') \bra{x'} \int_{-\infty}^{+\infty} \D x \, x \psi(x) \ket{x} \\ & = \int_{-\infty}^{+\infty} \D x' \, \int_{-\infty}^{+\infty} \D x \, x \psi^*(x')\psi(x) \braket{x'}{x} \\ & = \int_{-\infty}^{+\infty} \D x' \, \int_{-\infty}^{+\infty} \D x \, x \psi^*(x')\psi(x) \delta(x-x') \\ & = \int_{-\infty}^{+\infty} \D x \, x\psi^*(x)\psi(x) \\ & = \int_{-\infty}^{+\infty} x\Abs{\psi(x)}^2 \, \D x. \end{align*} Since $p(x) = \Abs{\psi(x)}^2$ is the probability density for position, this is the expectation value of $x$ in the sense of probability theory.
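This connection can be checked numerically: for a normalized Gaussian wavefunction centered at $x_0$, the integral $\int x\Abs{\psi(x)}^2\,\D x$ recovers $x_0$. The specific center and width below are arbitrary, and the integrals are approximated by Riemann sums on a fine grid.

```python
import numpy as np

# Normalized Gaussian: |psi(x)|^2 is a Gaussian probability density
# with mean x0 and standard deviation sigma.
x0, sigma = 1.5, 0.7
x = np.linspace(-10, 10, 20001)
dx = x[1] - x[0]
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-(x - x0) ** 2 / (4 * sigma**2))

norm = np.sum(np.abs(psi) ** 2) * dx           # <psi|psi>, should be 1
expect_x = np.sum(x * np.abs(psi) ** 2) * dx   # <psi|x|psi>, should be x0
assert abs(norm - 1.0) < 1e-4
assert abs(expect_x - x0) < 1e-4
```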

We will see later in the course that other physical quantities are also represented by linear operators. If $\hat{A}$ represents such a physical quantity then $\sand{\psi}{\hat{A}}{\psi}$ is its expectation value in the sense of probability theory.

In Class Activities

  1. Let $\ket{e_1}, \ket{e_2}, \cdots$ be an orthonormal basis. Prove that \[\sum_j \proj{e_j}\] is the identity operator.

    HINT: You have to prove that $\left ( \sum_j \proj{e_j} \right ) \ket{\psi} = \ket{\psi}$ for all vectors $\ket{\psi}$. Since $\ket{e_1}, \ket{e_2}, \cdots$ is an orthonormal basis, every vector can be decomposed as $\ket{\psi} = \sum_j b_j \ket{e_j}$ for some coefficients $b_j$.
  2. Show that $[\hat{x},\hat{p}] = i\hbar \hat{I}$, where $\hat{p} = -i\hbar \hat{\frac{\D}{\D x}}$.

    HINT: You need to show that $[\hat{x},\hat{p}]\ket{\psi} = i\hbar \ket{\psi}$ for all vectors $\ket{\psi}$. You may use sloppy notation, i.e. \[\hat{x} \psi (x) = x\psi(x), \qquad \hat{p} \psi(x) = -i\hbar \frac{\D \psi}{\D x}.\]