Ineffective Theory

Links for February 2021

A water treatment plant in Florida is attacked — attacker raises sodium hydroxide level in an apparent attempt to poison people. I’m pretty sure “thanks to a vigilant operator” is one of the ten scariest phrases in the English language. We haven’t really explored the long tail of cybersecurity risk: be unsurprised by unwelcome surprises in the next decade.

Andrew Gelman comments on Alvaro de Menard’s review (previously posted here) of the state of the social science literature.

Here’s a 10 terapixel image of the night sky. I don’t think I can reliably tell galaxies from the dimmer stars.

The New York Times finally gets around to publishing their piece on SSC and the rationalists; the word “rancid” comes to mind, among others. (I include this link because it seems culturally important, not because it’s worth your time to read.) Here’s Scott’s reply; if you read anything on this, read that first and last. Variously irate commentary is provided by Scott Sumner, Jason Crawford, Matt Yglesias, Noah Smith, and Scott Aaronson (with a follow-up), among others. Finally, Tanner Greer comments: “Mortals know that when Olympians feud, it is never Olympians who die.”

Rob Rhinehart on the evil octopus — if you’re irate about what happened, this is the piece that Scott Alexander thinks you should read. The title of this article is “The New York Times”, which I think is how we know it’s not really about the NYT. They’re just one of the arms of the Kraken. As Leonard Cohen points out, the Kraken is everything.

When nothing determines status but status itself — when there are no objective outside measures of the quality of one’s work — social dynamics become much more toxic. Greer’s explanation is based on a link between personal uncertainty and personal insecurity. I think the story goes beyond that. An illegible incentive can drive a group to behave in a certain way without any individual being aware of what’s happening; in “a world where one’s reputation rests on little more than reputation itself”, there’s a pretty strong illegible incentive to fight tooth-and-nail for reputation.

Vitalik Buterin on prediction markets — how modern ones can fail, and lessons for futarchy.

Also by Vitalik, here’s a nice introduction to zk-SNARKs.

The Legendre Transform and the Path Integral

This is yet another post on something I really should have learned long ago, but somehow never quite grasped. Or, in this case, never learned at all.

I have a Hamiltonian (let’s say $H = \frac{p^2}{2} + V(x)$ for simplicity’s sake) and I’d like to get a Lagrangian for the same system. Classically, this is done through the Legendre transform. Focus on the $p$-dependence of $H(p;x)$ for fixed $x$. Define $h(p,\dot x;x) = p \dot x - H(p;x)$, and the Legendre transform of $H$ is given by the maximum value attained by $h$: $$ L(x,\dot x) = \max_{p} h(p,\dot x;x) \text. $$ Of course, that maximum value can be found by requiring that $0 = \frac{\partial h}{\partial p}$. As a result, the Lagrangian can be defined more simply as $L = p \dot x - H$, where $p$ is chosen such that $\dot x = \frac{\partial H}{\partial p}$.
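As a quick numerical sanity check of the definition above, here is a minimal sketch that maximizes $h$ over a grid of momenta and compares the result with $\dot x^2/2 - V(x)$. The harmonic potential, the grid, and all function names are assumptions made for illustration only.

```python
import numpy as np

def V(x):
    # Hypothetical potential, chosen only for illustration.
    return 0.5 * x**2

def H(p, x):
    # Hamiltonian with a quadratic kinetic term, H = p^2/2 + V(x).
    return 0.5 * p**2 + V(x)

def legendre_lagrangian(x, xdot, p_grid):
    # h(p, xdot; x) = p*xdot - H(p; x), maximized over p at fixed x and xdot.
    h = p_grid * xdot - H(p_grid, x)
    k = np.argmax(h)
    return h[k], p_grid[k]

p_grid = np.linspace(-10.0, 10.0, 200001)  # grid spacing 1e-4
x, xdot = 0.7, 1.3
L_num, p_star = legendre_lagrangian(x, xdot, p_grid)
L_exact = 0.5 * xdot**2 - V(x)

print(L_num, L_exact, p_star)
```

The maximizing momentum lands at $p = \dot x$ (to within the grid spacing), which is the condition $\dot x = \frac{\partial H}{\partial p}$ for this Hamiltonian.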

In quantum mechanics, there’s another way to go from the Hamiltonian to a Lagrangian. This is what we do when we derive the path integral. For the same Hamiltonian as above, we can expand the propagator like so: $$ \langle x_f | e^{-i H t} | x_i \rangle = \int d x_n\cdots d x_1\; \langle x_{f} | e^{-i H \Delta t} | x_n\rangle \cdots \langle x_{1} | e^{-i H \Delta t} | x_i\rangle $$ where $\Delta t = t / n$. When $n$ is large, the operator $e^{-i H \Delta t}$ is close to the identity, and easily approximated (dropping an overall normalization factor): $$ \langle x' | e^{-i H \Delta t} | x\rangle = e^{i \big(\frac 1 {2 \Delta t} (x'-x)^2 - V(x) \Delta t\big)} + O(\Delta t^2) \text. $$ Hey, look, there’s a naive discretization of the Lagrangian (times $\Delta t$) up in the exponential! Putting it all together and treating $x(t)$ as a continuous function (at the very least, the discrete values can be interpolated to construct a piecewise smooth function), we obtain the usual path integral for the propagator. The object in the exponent is by definition the action, that is, the integral of the Lagrangian: $$ \langle x_f | e^{-i H t} | x_i \rangle = \int \mathcal D x(t)\; e^{i \int dt\; L(x,\dot x)} $$

These two procedures give the same answer, at least in “reasonable” cases, so they must be related. How? In particular, the second procedure should be hiding a Legendre transform somewhere. (I’m particularly interested in this question because I have no trouble keeping track of the second procedure, whereas the first is a source of perpetual bafflement.)

Well, let’s look at the quantum case more carefully. In order to approximate the matrix element of $e^{-i H \Delta t}$, one usually expands it with the Suzuki-Trotter decomposition: $$ \langle x' | e^{-i H \Delta t} | x\rangle \approx \int d p\; \langle x' | e^{-i p^2 \Delta t/2} | p\rangle \langle p | e^{-i V(x) \Delta t} | x\rangle = \int d p\; e^{i p (x'-x)} e^{-i p^2 \Delta t/2 - i V(x) \Delta t} \text. $$ Rather than integrating out the momentum, let’s keep it around for a bit and see what happens. Plugging this formula into the expression for the full propagator, and again treating $p$ and $x$ as continuous functions, we find $$ \langle x_f | e^{-i H t} | x_i \rangle = \int \mathcal D x(t)\; \mathcal D p(t)\; e^{i \int dt\; (p \dot x - H(p,x))}\text. $$
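The splitting used in the Suzuki-Trotter step can be checked numerically on small matrices. The sketch below is only an illustration: two random Hermitian matrices stand in for the kinetic and potential terms (an assumption, not the actual operators $p^2/2$ and $V(x)$), and the helper names are hypothetical. It shows that the error of a single split step shrinks roughly like $\Delta t^2$.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_hermitian(n):
    # Stand-in operator: a random n x n Hermitian matrix.
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (G + G.conj().T) / 2

def evolve(M, dt):
    # exp(-i M dt) for Hermitian M, computed from its eigendecomposition.
    evals, evecs = np.linalg.eigh(M)
    return (evecs * np.exp(-1j * evals * dt)) @ evecs.conj().T

kinetic = random_hermitian(4)    # plays the role of p^2/2
potential = random_hermitian(4)  # plays the role of V(x)

def trotter_error(dt):
    # Distance between the exact step and the split step.
    exact = evolve(kinetic + potential, dt)
    split = evolve(kinetic, dt) @ evolve(potential, dt)
    return np.linalg.norm(exact - split)

err_coarse = trotter_error(0.01)
err_fine = trotter_error(0.005)
ratio = err_coarse / err_fine

print(err_coarse, err_fine, ratio)
```

Halving $\Delta t$ should cut the single-step error by a factor close to four, which is why $n$ steps of size $t/n$ accumulate an error that vanishes as $n$ grows.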

We’re halfway there! The integrand in the exponent is just the function $h(p,\dot x;x)$ that we use in the definition of the Legendre transform. The last bit is to realize that we want the classical limit, that is, the $\hbar \rightarrow 0$ limit. Rewriting the above expression with $\hbar$ in place: $$ \langle x_f | e^{-i H t / \hbar} | x_i \rangle = \int \mathcal D x(t)\; \mathcal D p(t)\; e^{i \int dt\; (p \dot x - H(p,x))/\hbar}\text. $$

Now we can consider integrating out the momentum. This integral is dominated by the region of stationary phase; that is, where $\frac{\partial h}{\partial p} = 0$. So, the Legendre transform reappears in the classical limit of the path integral (just as Lagrange’s equations of motion do).
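To make the stationary-phase statement concrete, here is the momentum integral for a single time slice, done by hand for a Hamiltonian with kinetic term $p^2/2$ (a sketch with $\hbar = 1$; normalization conventions for $|p\rangle$ are suppressed, as elsewhere in this post). Completing the square in $p$, $$ \int d p\; e^{i p (x'-x) - i p^2 \Delta t/2} = \sqrt{\frac{2\pi}{i \Delta t}}\; e^{\frac{i}{2 \Delta t} (x'-x)^2} \text. $$ The exponent on the right is the integrand evaluated at its stationary point $p_* = (x'-x)/\Delta t$, the discrete version of $p = \dot x$, which is exactly the condition $\frac{\partial h}{\partial p} = 0$. Because the integrand is Gaussian, stationary phase is exact here; for a Hamiltonian that is not quadratic in $p$, it would only give the leading behavior as $\hbar \rightarrow 0$.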

Here is a related StackExchange post for further reading. It doesn’t quite go all the way (at least along the direction I’m interested in), but most of the important bits are there. There’s also a hint of this story in section 9.1 of Peskin.

Quantum Ensembles

In classical statistical mechanics, thermal expectation values are computed by averaging over some probability distribution. In the case of the canonical ensemble, this looks like a sum over all states $s$ weighted by the exponential of the energy $E_s$: $$ \langle\mathcal O\rangle_{\text{classical}} = Z^{-1} \sum_s e^{-E_s / T} \mathcal O_s \text. $$ The normalization of $Z^{-1}$ is irrelevant to the discussion here — in fact, I’ll leave it off all future expressions.

How does this generalize to quantum mechanics? The usual way is to replace the energies with a Hermitian operator (the Hamiltonian), the observable with another Hermitian operator, and the sum with a trace. An expectation value in the quantum canonical ensemble looks like $$ \langle \mathcal O\rangle_{\text{quantum}} = \mathrm{Tr}\;e^{-H / T} \mathcal O \text. $$ To see that this is in fact a generalization of the classical ensemble above, consider the case where the Hamiltonian and observable commute. A basis exists where they are both diagonal, and in that basis the quantum expectation value takes exactly the form of the classical expectation value above.
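The commuting case is easy to check numerically. In this minimal sketch the energies, observable values, temperature, and helper names are all made up for illustration; the Hamiltonian and observable are built to share an eigenbasis, then rotated so that neither matrix is diagonal. As in the text, the normalization $Z^{-1}$ is left off.

```python
import numpy as np

T = 1.5
energies = np.array([0.0, 1.0, 2.5])     # hypothetical classical energies E_s
obs_values = np.array([1.0, -2.0, 0.5])  # hypothetical observable values O_s

# Classical (unnormalized) expectation: sum over states of exp(-E_s/T) * O_s.
classical = np.sum(np.exp(-energies / T) * obs_values)

# Rotate into a random orthonormal basis: H and O stay commuting,
# but neither is diagonal any more.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.normal(size=(3, 3)))
H = U @ np.diag(energies) @ U.T
O = U @ np.diag(obs_values) @ U.T

def boltzmann_operator(H, T):
    # exp(-H/T) for a real symmetric H, via its eigendecomposition.
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-evals / T)) @ evecs.T

# Quantum (unnormalized) expectation: Tr[exp(-H/T) O].
quantum = np.trace(boltzmann_operator(H, T) @ O)

print(classical, quantum)
```

The two numbers agree to floating-point precision, since the trace does not care which basis the matrices are written in.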

So far, so familiar. But is this the only sensible generalization of classical statistical mechanics? In a sense, yes. First let’s go back to the classical case. The exponential of the energy $e^{-E/T}$ is referred to as the Boltzmann factor. If we know the expectation values of a bunch of operators $\mathcal O_1,\ldots,\mathcal O_n$, then as long as these expectation values are physically consistent, a suitable Boltzmann factor can always be constructed to satisfy them. (Actually, that’s the definition of physically consistent.) Furthermore, if the system is finite and we know the expectation values of enough operators, the Boltzmann factor is actually uniquely determined! (Well, up to a measure-zero set, or something like that.)

That should demotivate us from looking for new formulations of classical statistical mechanics. Similar logic holds for the quantum case. In a quantum system, we’re also constrained by linearity, but the structure of the argument is the same. Given enough (physically realizable) expectation values, the density matrix $e^{-H / T}$ is uniquely determined. As a result, there’s not much point looking for different formulations of quantum statistical mechanics, either.

Let’s do it anyway.

In a system with $Q$ qubits, the density matrix is a bulky object with $2^{2Q}$ entries. I don’t want to deal with that, I want to work with states — nice sleek vectors with a measly $2^{Q}$ entries. The classical ensembles give probability distributions on classical states; why can’t a quantum ensemble give a probability distribution on quantum (pure) states?

Well, in fact, the density matrix already does. By diagonalizing $H$, you can see that $e^{-H / T}$ naturally yields a probability distribution on the set of eigenstates of the Hamiltonian. This isn’t very nice, though, because it sort of presupposes that we know the eigenbasis of the Hamiltonian. For any interesting problem, I don’t.

Fortunately, it’s pretty clear that this probability distribution is far from unique. First, note that every probability distribution on states will induce some density matrix. This follows from the discussion above; alternatively, the density matrix is given explicitly by $$ \rho[p] = \int d |\psi\rangle\; p(\psi)\; |\psi\rangle\langle\psi| \text. $$ Now count dimensions. The space of density matrices is $2^{2Q}$ dimensional, give or take. The space of possible probability distributions on the continuous space of states? Ah… many more dimensions there.
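The non-uniqueness already shows up for a single qubit. The sketch below (a hypothetical two-level spectrum, with made-up names, and with the density matrix normalized to unit trace) builds the thermal density matrix once from the energy eigenstates and once from two non-orthogonal superpositions weighted equally.

```python
import numpy as np

T = 1.0
energies = np.array([0.0, 1.0])  # hypothetical two-level spectrum
weights = np.exp(-energies / T)
a, b = weights / weights.sum()   # normalized Boltzmann probabilities

def density_matrix(states, probs):
    # rho = sum_k p_k |psi_k><psi_k|
    return sum(p * np.outer(s, s.conj()) for s, p in zip(states, probs))

ground = np.array([1.0, 0.0])
excited = np.array([0.0, 1.0])

# Ensemble 1: energy eigenstates with Boltzmann probabilities.
rho_eigen = density_matrix([ground, excited], [a, b])

# Ensemble 2: two non-orthogonal states, each with probability 1/2.
psi_plus = np.sqrt(a) * ground + np.sqrt(b) * excited
psi_minus = np.sqrt(a) * ground - np.sqrt(b) * excited
rho_other = density_matrix([psi_plus, psi_minus], [0.5, 0.5])

overlap = np.vdot(psi_plus, psi_minus)
print(rho_eigen)
print(rho_other)
print(overlap)
```

The off-diagonal terms cancel between the two superpositions, so both ensembles give the same density matrix even though the second one never mentions an energy eigenstate.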

So, many more probability distributions compatible with the thermal density matrix should exist. As far as I know, finding natural such probability distributions is largely unexplored territory. (There’s one induced by ergodicity and the time-evolution operator $e^{-i H t}$, although I’m not even sure if it’s unique.) Nevertheless, one such (approximate) construction is given by Sugiura and Shimizu. Start with a uniform (i.e., $SU(2^Q)$-symmetric) distribution on the space of states, corresponding to an infinite temperature ensemble. Now draw one state from that distribution, and apply the operator $e^{-H / (2T)}$ to it. Lastly, if you think of states as being vectors in a Hilbert space, then normalize the result; if you think of states as rays, this is of course unnecessary.

In the thermodynamic limit, expectation values with respect to this distribution match the canonical quantum ensemble described above. In fact, in that limit, a single quantum state sampled from this ensemble yields (almost certainly) all the correct expectation values. A careful proof is given in the paper; a nice physical argument follows just as it would for a classical system. In the infinite-volume limit, we can consider multiple sub-systems each individually in the thermodynamic limit and all uncoupled from each other. Therefore, what appears to be a single thermodynamic “sample” actually contains infinitely many, equally thermodynamic samples.
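Here is a rough numerical sketch of that construction, under loud assumptions: a random real symmetric matrix stands in for the Hamiltonian of $Q = 8$ qubits (it is not a physical, local Hamiltonian), the energy is the only observable checked, and all names are hypothetical. It is meant to illustrate the recipe (uniform random state, apply $e^{-H/(2T)}$, normalize), not to reproduce the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

Q = 8
D = 2 ** Q
T = 1.0

# Stand-in Hamiltonian: random symmetric matrix with spectrum roughly in [-2, 2].
G = rng.normal(size=(D, D))
H = (G + G.T) / np.sqrt(2 * D)

# Canonical-ensemble energy, computed by exact diagonalization for comparison.
E, V = np.linalg.eigh(H)
boltz = np.exp(-E / T)
canonical_energy = np.sum(boltz * E) / np.sum(boltz)

def sample_thermal_state():
    # Uniformly random state: a normalized complex Gaussian vector.
    psi = rng.normal(size=D) + 1j * rng.normal(size=D)
    psi /= np.linalg.norm(psi)
    # Apply exp(-H/(2T)) in the eigenbasis, then normalize.
    # (Diagonalizing H is a shortcut for this toy check only.)
    phi = V @ (np.exp(-E / (2 * T)) * (V.conj().T @ psi))
    return phi / np.linalg.norm(phi)

def energy(phi):
    # <phi|H|phi> for a normalized state phi.
    return np.real(np.vdot(phi, H @ phi))

samples = np.array([energy(sample_thermal_state()) for _ in range(50)])

print(canonical_energy, samples.mean(), samples.std())
```

Even at this small size the sampled energies cluster around the canonical value, and the spread between individual states is expected to shrink further as the Hilbert space grows. In a real application one would apply $e^{-H/(2T)}$ without diagonalizing $H$, since knowing the eigenbasis is exactly what we wanted to avoid.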