What do commutators have to do with causality?
In (all?) QFT textbooks, it is stated that if two events $x$ and $y$ are space-like separated, then causality requires that
$$
[A(x),B(y)] = 0
$$
for any local operators $A$ and $B$. It was not obvious to me what this requirement had to do with causality, and I ignored the issue for years. This is to briefly record the cleanest answer I know of.
Commutators come up naturally when considering linear response. Suppose at $t=0$ a system is described by the state $|\psi\rangle$. We could measure some observable $A$ right away, or we could wait some time $T>0$ and measure it then, in which case the expectation value is
$$
\langle A(T) \rangle_0 =
\langle \psi | e^{i H T} A e^{-i H T}| \psi \rangle
\text.
$$
Now suppose that at time $t=0$ we first hit the system with a small hammer, evolving for a brief time $\epsilon$ under the Hamiltonian $B$. (Note that $B$, being a Hermitian operator, is both an observable and a legal Hamiltonian: the concepts are one and the same.) This expression is exact for a kick of any size $\epsilon$; the expectation value of $A(T)$ after the kick is
$$
\langle A(T) \rangle_\epsilon =
\langle \psi | e^{i B \epsilon} e^{i H T} A e^{-i H T} e^{-i B \epsilon}| \psi \rangle
\text.
$$
The expression simplifies somewhat if we consider only linear response; that is, we drop all terms nonlinear in $\epsilon$. In particular, the first derivative of the above expression with respect to $\epsilon$, evaluated at $\epsilon=0$, is
$$
\left.\frac{\mathrm d}{\mathrm d\epsilon} \langle A(T) \rangle_\epsilon\right|_{\epsilon=0} =
-i\langle \psi |
[A(T), B(0)]
| \psi \rangle
\text.
$$
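As a sketch of the step behind this formula, write $A(T) = e^{i H T} A e^{-i H T}$ and $B(0) = B$, and expand the kick to first order in $\epsilon$:
$$
e^{i B \epsilon} A(T) e^{-i B \epsilon}
= A(T) + i\epsilon\,[B, A(T)] + O(\epsilon^2)
= A(T) - i\epsilon\,[A(T), B] + O(\epsilon^2)
\text.
$$
Taking the expectation value in $|\psi\rangle$ and differentiating at $\epsilon=0$ gives the commutator. The higher orders have the same structure: the standard nested-commutator expansion reads
$$
e^{i B \epsilon} A(T) e^{-i B \epsilon}
= \sum_{n=0}^{\infty} \frac{(i\epsilon)^n}{n!}\,
\underbrace{[B,[B,\cdots[B}_{n},A(T)]\cdots]]
\text,
$$
so every term beyond $n=0$ contains $[B, A(T)]$ at its innermost level.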
Now let $x$ and $y$ label the spacetime points of the measurement and of the whack, generalizing $A(T)$ and $B(0)$ above. If $A(x)$ and $B(y)$ commute, then a small whack with the operator $B(y)$ has no effect on the expectation value of $A(x)$ (no matter the initial state). If this holds for _all_ operators at $x$ and $y$, then this (repeated to all orders in $\epsilon$) suffices to show that no message can be sent from $y$ to $x$. That’s causality!
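
The argument can be checked numerically in a small, finite-dimensional toy model. The following is only a sketch under stated assumptions, not a field theory: the helper names (`evolve`, `response`, `commutator_prediction`, `two_qubit_setup`), the random Hermitian matrices, and the two-qubit setup are all made up for illustration. It first compares a finite-difference slope of $\langle A(T) \rangle_\epsilon$ with $-i\langle\psi|[A(T),B]|\psi\rangle$, and then looks at two qubits where $A$ acts on one and $B$ on the other. With no coupling between the qubits, $A(T)$ commutes with $B$ and a kick of any size should leave $\langle A(T) \rangle$ unchanged up to rounding; with a coupling, it generally should not.

```python
import numpy as np


def evolve(h, t):
    """Return exp(-i h t) for a Hermitian matrix h via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * t)) @ v.conj().T


def random_hermitian(n, rng):
    """Random n x n Hermitian matrix (illustrative stand-in for H, A, or B)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2


def random_state(n, rng):
    """Random normalized state vector."""
    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    return psi / np.linalg.norm(psi)


def heisenberg(H, A, T):
    """A(T) = e^{iHT} A e^{-iHT}."""
    U = evolve(H, T)
    return U.conj().T @ A @ U


def response(psi, H, A, B, T, eps):
    """<A(T)>_eps: kick with B for a time eps, evolve under H for T, measure A."""
    kicked = evolve(B, eps) @ psi  # e^{-i B eps} |psi>
    return np.vdot(kicked, heisenberg(H, A, T) @ kicked).real


def commutator_prediction(psi, H, A, B, T):
    """-i <psi| [A(T), B] |psi>, the predicted slope at eps = 0."""
    A_T = heisenberg(H, A, T)
    comm = A_T @ B - B @ A_T
    return (-1j * np.vdot(psi, comm @ psi)).real


def two_qubit_setup(rng, coupling):
    """A acts on qubit 1, B on qubit 2; `coupling` multiplies a Z (x) Z term."""
    I2 = np.eye(2)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Z = np.array([[1, 0], [0, -1]], dtype=complex)
    h1, h2 = random_hermitian(2, rng), random_hermitian(2, rng)
    H = np.kron(h1, I2) + np.kron(I2, h2) + coupling * np.kron(Z, Z)
    return H, np.kron(Z, I2), np.kron(I2, X)


def main():
    rng = np.random.default_rng(0)
    T, eps = 1.3, 1e-5

    # Check 1: finite-difference slope versus the commutator formula.
    H, A, B = (random_hermitian(4, rng) for _ in range(3))
    psi = random_state(4, rng)
    slope = (response(psi, H, A, B, T, eps) - response(psi, H, A, B, T, -eps)) / (2 * eps)
    print("finite-difference slope:", slope)
    print("-i <[A(T), B]>         :", commutator_prediction(psi, H, A, B, T))

    # Check 2: a finite (not small) kick on qubit 2, measured on qubit 1.
    for g in (0.0, 0.5):
        H2, A2, B2 = two_qubit_setup(rng, g)
        psi2 = random_state(4, rng)
        shift = response(psi2, H2, A2, B2, 2.0, 0.7) - response(psi2, H2, A2, B2, 2.0, 0.0)
        print(f"coupling {g}: change in <A(T)> after a finite kick = {shift:.3e}")


if __name__ == "__main__":
    main()
```

The uncoupled case is the toy analogue of space-like separation: the kick cannot be detected by any measurement of $A$ at any order in $\epsilon$. Note that a coupled lattice model like this one has no strict light cone, so the coupled case only illustrates what a nonvanishing commutator allows.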