A Note on Peer Review
Every once in a while, the topic of peer review (and, usually, why it’s Horrible™) comes up on Hacker News, X, and I guess pretty much everywhere else. Here is a not-so-recent example (it was recent when I first started drafting this post). A typical quote begins:
Peer review is completely broken nowadays […]
Obviously, since I’m writing this note, I disagree. I do think that peer review is misunderstood—sometimes (very unfortunately) by reviewers and authors, often by the informed public, and essentially always by the media. Below, I’m going to try to lay out how I think of peer review in the age of arXiv and Twitter.
First, it’s important to understand that science, like everything else, is massively heterogeneous. There are enormous differences in norms, habits, and so on even between physics and astrophysics, both very quantitative, computational, “give me plots not words” fields. There are also rather large differences between different institutions and different generations. I can’t speak with any confidence to how things work outside of my corner (nuclear theory). Take everything below with that grain of salt.
The life of a paper, once drafted, looks like this:
- First it’s sent to “friends” of the authors, or in general people who they think might be interested. This step is optional—not everybody does it—but if you read “we are grateful to so-and-so for many useful comments on an early version of this manuscript”, this is typically the step it’s referring to.
- After getting feedback (or waiting a suitable amount of time), the paper is posted to arXiv. This is the big moment, and the point at which people say “congratulations”.
- Now the authors most likely wait, perhaps a few weeks. People will write to complain about not being cited, or perhaps say something more constructive.
- The authors submit the paper to some journal. In my field this is, more likely than not, Phys Rev D. Likely, at the same time, the arXiv version of the paper is updated to take into account the changes of the last few weeks.
- After some time—perhaps one month, perhaps three—the journal sends back a referee report. Quality of referee reports varies wildly. Sometimes it’s difficult to shake the feeling that the referees didn’t actually read the paper; other times (perhaps less often) referees manage to make comments substantial enough that they might have earned co-authorship in another context. This isn’t the place for “irritating referee” stories, but everybody has them. (I think everybody also has stories of that time they were the irritating referee.)
- There may be one or two rounds of back-and-forth with the referee. After some improvements to the paper, the referee typically declares himself satisfied, and the editors accept the paper.
- A few weeks later, the paper is officially “published”, and will be referred to in the media as “peer-reviewed”. The few-week delay is because the journal employs some underpaid lackeys to edit the paper for typos and grammar, and also to “improve” formatting. Frequently enough, this introduces substantial errors into the paper which the authors do not manage to catch. Partly for this reason:
- Almost anybody who wants to read the paper reads it on arXiv instead of from the journal.
Most of the meaningful “peer review” happens immediately before and after the arXiv posting. The one or two referees are largely incidental, and are less likely to give high-quality feedback than the hand-picked colleagues to whom the paper was originally sent.
There are many papers that just aren’t read that carefully before peer review. Maybe the authors don’t have friends in the right niche. Similarly, there are papers which are read by a few of the authors’ friends, but not by anyone outside of some tight-knit community. Formal peer review provides a pretty strong incentive for both authors and potential readers to transfer information across subfield boundaries. That’s quite valuable.
It’s not clear if this situation is stable. Why not just skip journal submissions altogether? There’s a weak incentive to get your papers into journals: it looks weird, when applying for jobs (and funding), if you don’t. Younger folks care less about this than older faculty, though, so it’s possible that the incentive to shepherd papers through the review process will get weaker over time, diluting one of the few formal mechanisms for accountability. Personally I rather like peer review as an institution (even if I frequently can’t stand dealing with referee reports), and hope that something much like it sticks around; I can’t imagine what would replace it.
Lastly note that, in all of the above, nobody had the job of “check that the
paper is correct”. The authors do their best, of course, but can’t be
meaningfully said to “check” anything. The recipients of an early draft of the
paper are likely best-positioned to comment on correctness, and in my
experience they often do, but they can hardly be said to be an uninterested
third party. The reviewers assigned by the journal are not checking for
correctness, but rather for obvious errors, and evaluating relevance. As a
result papers are “published” without any dedicated check specifically designed
to weed out incorrect results (with the exception of those measures put in
place by the authors themselves).
This is as it should be! Published papers are the primary mechanism for
physicists to communicate with each other. They’re published because open
communication is preferable to just sending private notes between a small group
of friends. To the extent that formal publication (as opposed to just posting
to arXiv) accomplishes anything, it raises the profile of the paper to other
researchers.
This is critical: the publication of a paper is intended as a signal to other
researchers (perhaps including those somewhat outside the original field). It
should not be taken as a signal to the media or to the general public, and the
institution of formal peer-review is completely unsuited to that task.
I think this raises an important question: what institution is responsible for
signaling to the broader public that a result is correct? I don’t think we
really have one. There are practices that appear to attempt to fill this role:
press releases, science journalists, review articles, and white papers. Press
releases are clearly untrustworthy. Science journalism would be the most
trustworthy source if the journalists were good, but this requires both deep
technical knowledge and all the standard journalistic practice, and I think
that’s quite rare. Review articles are extremely valuable, but often written by
interested parties, and incomprehensible to people too far from research. White
papers do better, but I think they’re uncommon, and still often not neutral.
As far as I can tell, creating a trusted and trustworthy institution for
transferring information from researchers to the public remains an open
problem.