What do commutators have to do with causality?
In (all?) QFT textbooks, it is stated that if two events \(x\) and \(y\) are space-like separated, then causality requires that
$$
[A(x),B(y)] = 0
$$
for any local operators \(A\) and \(B\). It was not obvious to me what this requirement had to do with causality, and I ignored the issue for years. This is to briefly record the cleanest answer I know of.
Commutators come up naturally when considering linear response. Suppose at \(t=0\) a system is described by the state \(|\psi\rangle\). We could measure some operator \(A\), or we could wait some time \(T>0\) and measure then:
$$
\langle A(T) \rangle_0 =
\langle \psi | e^{i H T} A e^{-i H T}| \psi \rangle
\text.
$$
Suppose instead that at time \(t=0\) we hit the system with a small hammer, evolving it for a brief time \(\epsilon\) under the Hamiltonian \(B\). (Note that \(B\), being a Hermitian operator, is both an observable and a legal Hamiltonian – the concepts are one and the same.) For any \(\epsilon\), however large, the expectation value of \(A\) measured a time \(T\) later is of course
$$
\langle A(T) \rangle_\epsilon =
\langle \psi | e^{i B \epsilon} e^{i H T} A e^{-i H T} e^{-i B \epsilon}| \psi \rangle
\text.
$$
The expression simplifies somewhat if we consider only linear response; that is, we drop all terms nonlinear in \(\epsilon\). Writing \(A(T) = e^{i H T} A e^{-i H T}\) for the Heisenberg-picture operator and \(B(0) = B\), the first derivative of the above expression at \(\epsilon = 0\) is
$$
\left.\frac{\mathrm d}{\mathrm d\epsilon} \langle A(T) \rangle_\epsilon\right|_{\epsilon=0} =
-i\langle \psi |
[A(T), B(0)]
| \psi \rangle
\text.
$$
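As a sanity check on the sign and the factor of \(i\), here is a minimal numerical sketch. It assumes a small random finite-dimensional system rather than a field theory, and the names `dim`, `random_hermitian`, and `evolve` are my own, chosen only for illustration. It compares a finite-difference derivative of \(\langle A(T) \rangle_\epsilon\) at \(\epsilon=0\) with the commutator expression.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6  # hypothetical Hilbert-space dimension, small enough to diagonalize

def random_hermitian(n):
    """Random Hermitian matrix (illustrative stand-in for H, A, B)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def evolve(h, t):
    """Return exp(-i h t) for Hermitian h, via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

H, A, B = (random_hermitian(dim) for _ in range(3))
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)
T = 0.7

# Heisenberg-picture operator A(T) = e^{iHT} A e^{-iHT}
U = evolve(H, T)
A_T = U.conj().T @ A @ U

def expect_A(eps):
    """<A(T)>_eps: kick the state with e^{-i B eps}, then measure A(T)."""
    kicked = evolve(B, eps) @ psi
    return np.vdot(kicked, A_T @ kicked).real

# Central finite difference for d/d(eps) <A(T)>_eps at eps = 0
eps = 1e-5
finite_difference = (expect_A(eps) - expect_A(-eps)) / (2 * eps)

# Linear-response prediction: -i <psi| [A(T), B] |psi>
commutator = A_T @ B - B @ A_T
linear_response = (-1j * np.vdot(psi, commutator @ psi)).real

print(finite_difference, linear_response)
```

The two printed numbers should agree up to finite-difference error. Since the commutator of two Hermitian operators is anti-Hermitian, its expectation value is purely imaginary, which is why taking the real part after multiplying by \(-i\) loses nothing.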
If \(A(x)\) and \(B(y)\) commute, then a small whack with the operator \(B(y)\) has no first-order effect on the expectation value of \(A(x)\) (no matter the initial state). If this holds for all local operators at \(x\) and \(y\), then this (repeated to all orders in \(\epsilon\)) suffices to show that no message can be sent from \(y\) to \(x\). That’s causality!
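To spell out the "all orders" step, here is a sketch using the standard nested-commutator expansion of a conjugated operator. The shorthand \(\mathrm{ad}_B X = [B, X]\) is my own notation, and \(A(T) = e^{i H T} A e^{-i H T}\) is the Heisenberg-picture operator:
$$
e^{i B \epsilon} A(T) e^{-i B \epsilon}
= \sum_{n=0}^{\infty} \frac{(i\epsilon)^n}{n!} \, \mathrm{ad}_B^n A(T)
= A(T) + i\epsilon\, [B, A(T)] + \frac{(i\epsilon)^2}{2}\, [B, [B, A(T)]] + \cdots
\text.
$$
The term linear in \(\epsilon\) is \(-i\epsilon\,[A(T), B]\), matching the derivative above. If \([A(T), B] = 0\), then every term with \(n \ge 1\) vanishes, so \(\langle A(T) \rangle_\epsilon = \langle A(T) \rangle_0\) exactly, for every \(\epsilon\) and every initial state \(|\psi\rangle\).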