10 Aug 2023

Easily repeated claims are less likely to be true

Briefly: if you hear a claim, you heard it for a reason. Someone told it to you! What caused that to happen? Well, it could be something that people like repeating because it’s true, but it could also be getting repeated for other reasons. If you notice that there are non-truth-related reasons for the claim to be repeated (easy to explain; politically convenient; just plain “catchy”), that should lower your probability (conditional on having heard the claim) that it’s true.

There’s a saying often used in physics: “well known to those who know it well”. (After hearing it repeated too many times, this vacuous tripe is now one of my least favorite phrases.) In a similar spirit, the above is obvious once it becomes obvious.

The rest of this post is just being careful and quantitative with the above, for the times it’s not obvious (and is maybe wrong). Here’s a concrete model. Each freshly made claim starts with probability \(p_0\) of surviving, that is, of being repeated for long enough to reach you. Being true increases the probability to \(p_0 + p_T\), and being catchy increases it to \(p_0 + p_C\). A claim which is both true and catchy has probability \(p_0 + p_T + p_C\) of surviving. As long as all probabilities are small, you can think of this as there being three separate, independent mechanisms for survival.
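To see how good the "three independent mechanisms" picture is, here is a minimal sketch comparing the additive rule with a literal version in which a claim survives if at least one of three independent mechanisms fires. The function names and the numerical values are illustrative assumptions, not part of the model above.

```python
def survival_additive(p0, pT, pC, true, catchy):
    """Additive rule from the text: each property adds its own bonus."""
    return p0 + (pT if true else 0.0) + (pC if catchy else 0.0)


def survival_independent(p0, pT, pC, true, catchy):
    """Survive if at least one of up to three independent mechanisms fires."""
    p_all_fail = 1.0 - p0
    if true:
        p_all_fail *= 1.0 - pT
    if catchy:
        p_all_fail *= 1.0 - pC
    return 1.0 - p_all_fail


# Illustrative small probabilities (assumed values, chosen only for the demo).
p0, pT, pC = 0.01, 0.02, 0.03

additive = survival_additive(p0, pT, pC, true=True, catchy=True)
independent = survival_independent(p0, pT, pC, true=True, catchy=True)
gap = additive - independent  # second order in the small probabilities

print(f"additive    = {additive:.6f}")
print(f"independent = {independent:.6f}")
print(f"gap         = {gap:.6f}")
```

The gap is made of products like \(p_0 p_T\), so it shrinks quadratically as the probabilities get small, which is why the two pictures are interchangeable in that regime.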

Mostly for convenience, let’s continue to assume that \(p_\bullet \ll 1\), and see what happens to a population of claims for which a fraction \(f_T \in [0,1]\) are true, \(f_C\) are catchy, and the two notions are independent. The total fraction that survive, and the fraction of true ones that survive, are \[ F_{\mathrm{total}} = p_0 + f_T p_T + f_C p_C \,\text{ and }\, F_{\mathrm{true}} = p_0 + p_T + f_C p_C \text{.} \] So, the fraction of surviving claims that are true is \[ P(\mathrm{True}|\mathrm{Heard}) = f_T \frac{p_0 + p_T + f_C p_C}{p_0 + f_T p_T + f_C p_C} \text{.} \] That should be the number you think of when you hear a claim and ask “how likely is this to be true?”

For claims that are catchy, the same calculation with \(f_C\) set to 1 gives \(F_{\mathrm{true,catchy}} = p_0 + p_T + p_C\) and \(F_{\mathrm{catchy}} = p_0 + f_T p_T + p_C\), so the probability is instead \[ P(\mathrm{True}|\mathrm{Heard}\land\mathrm{Catchy}) = \frac{f_T F_{\mathrm{true,catchy}}}{F_{\mathrm{catchy}}} = f_T \frac{p_0 + p_T + p_C}{p_0 + f_T p_T + p_C} \text{.} \] The two results differ only in that \(f_C p_C\) has been replaced by the larger \(p_C\). Adding the same positive amount to the numerator and denominator of a fraction greater than 1 pulls it toward 1, so we see that for generic parameters (\(0 \lt f_T \lt 1\), \(f_C \lt 1\), and \(p_T, p_C \gt 0\)), \(P(\mathrm{True}|\mathrm{Heard}\land\mathrm{Catchy}) \lt P(\mathrm{True}|\mathrm{Heard})\). The effect is unsurprisingly largest when being catchy results in a large improvement to survival probability. In a more sophisticated model, this would translate to “be more skeptical of catchier claims”.
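As a sanity check on the formulas above, here is a small sketch that computes both probabilities by brute-force enumeration over the four kinds of claim (true or false, catchy or not), using the additive survival rule. The function name `posterior_true` and the parameter values are assumptions chosen only for illustration.

```python
def posterior_true(p0, pT, pC, fT, fC, catchy_only=False):
    """P(True | Heard) under the additive model.

    With catchy_only=True, this is P(True | Heard and Catchy) instead.
    """
    heard_true = 0.0
    heard_all = 0.0
    for is_true, w_true in ((True, fT), (False, 1.0 - fT)):
        for is_catchy, w_catchy in ((True, fC), (False, 1.0 - fC)):
            if catchy_only and not is_catchy:
                continue
            # Additive survival probability for this kind of claim.
            survive = p0 + (pT if is_true else 0.0) + (pC if is_catchy else 0.0)
            # Share of the surviving population made up of this kind.
            weight = w_true * w_catchy * survive
            heard_all += weight
            if is_true:
                heard_true += weight
    return heard_true / heard_all


# Illustrative parameters (assumed values, not taken from any data).
p0, pT, pC = 0.01, 0.02, 0.05
fT, fC = 0.3, 0.2

p_heard = posterior_true(p0, pT, pC, fT, fC)
p_heard_catchy = posterior_true(p0, pT, pC, fT, fC, catchy_only=True)

print(f"P(True | Heard)            = {p_heard:.4f}")         # about 0.46
print(f"P(True | Heard and Catchy) = {p_heard_catchy:.4f}")  # about 0.36
```

With these made-up numbers, having heard a claim at all raises the probability of truth above the prior \(f_T = 0.3\), but knowing the claim is catchy takes back a good part of that boost.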

If the reproductive advantages stack multiplicatively instead of additively, the above effect no longer holds. I’ll leave it as an exercise for the reader to decide when that’s a better model.
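
For the record, here is a sketch of why the effect disappears, under one particular reading of “multiplicatively”. The multipliers \(m_T\) and \(m_C\) are notation introduced only for this sketch: suppose a claim survives with probability \(p_0\), multiplied by \(m_T \gt 1\) if it is true and by \(m_C \gt 1\) if it is catchy. Because truth and catchiness are independent in the population, the surviving fractions factorize: \[ F_{\mathrm{total}} = p_0 \,(1 - f_T + f_T m_T)\,(1 - f_C + f_C m_C) \,\text{ and }\, F_{\mathrm{catchy}} = p_0 \, m_C \,(1 - f_T + f_T m_T) \text{.} \] The catchiness factor then cancels between numerator and denominator in both cases, leaving \[ P(\mathrm{True}|\mathrm{Heard}) = P(\mathrm{True}|\mathrm{Heard}\land\mathrm{Catchy}) = \frac{f_T m_T}{1 - f_T + f_T m_T} \text{.} \] Under this assumption, learning that a claim is catchy tells you nothing further about whether it is true.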