Excerpted from a longer article by Michael Nielsen:
“Myth number 1: Scientists have always used peer review
The myth that scientists adopted peer review broadly and early in the history of science is surprisingly widely believed, despite being false. It’s true that peer review has been used for a long time – a process recognizably similar to the modern system was in use as early as 1731, in the Royal Society of Edinburgh’s Medical Essays and Observations (ref). But in most scientific journals, peer review wasn’t routine until the middle of the twentieth century, a fact documented in historical papers by Burnham, Kronick, and Spier.
This was a common practice in the days before peer review became widespread: decisions about what to publish and what to reject were usually made by journal editors, often acting largely on their own. These decisions were often made rapidly, with papers appearing days or weeks after submission, after a cursory review by the editor. Rejection rates at most journals were low, with only obviously inappropriate or unsound material being rejected; indeed, for some Society journals, Society members even asserted a “right” to publication, which occasionally caused friction with unhappy editors (ref).
What caused the change to the modern system of near-ubiquitous peer review? There were three main factors. The first was the increasing specialization of science (ref). As science became more specialized in the early 20th century, editors gradually found it harder to make informed decisions about what was worth publishing, even by the relatively relaxed standards common in many journals at the time.
The second factor in the move to peer review was the enormous increase in the number of scientific papers being published (ref). In the 1800s and early 1900s, journals often had too few submissions. Journal editors would actively round up submissions to make sure their journals remained active. The role of many editorial boards was to make sure enough papers were being submitted; if the journal came up short, members of the editorial board would be asked to submit papers themselves. As late as 1938, the editor-in-chief of the prestigious journal Science relied on personal solicitations for most articles (ref).
The twentieth century saw a massive increase in the number of scientists, a much easier process for writing papers, due to technologies such as typewriters, photocopiers, and computers, and a gradually increasing emphasis on publication in decisions about jobs, tenure, grants and prizes. These factors greatly increased the number of papers being written, and added pressure for filtering mechanisms, such as peer review.
The third factor in the move to peer review (ref) was the introduction of technologies for copying papers. It’s just plain editorially difficult to implement peer review if you can’t easily make copies of papers. The first step along this road was the introduction of typewriters and carbon paper in the 1890s, followed by the commercial introduction of photocopiers in 1959. Both technologies made peer review much easier to implement.
Nowadays, of course, the single biggest factor preserving the peer review system is probably social inertia: in most fields of science, a journal that’s not peer-reviewed isn’t regarded as serious, and so new journals invariably promote the fact that they are peer reviewed. But it wasn’t always that way.
Myth number 2: peer review is reliable
Every scientist has a story (or ten) about how they were poorly treated by peer review – the important paper that was unfairly rejected, or the silly editor who ignored their sage advice as a referee. Despite this, many strongly presume that the system works “pretty well”, overall.
There’s not much systematic evidence for that presumption. In 2002 Jefferson et al (ref) surveyed published studies of biomedical peer review. After an extensive search, they found just 19 studies which made some attempt to eliminate obvious confounding factors. Of those, just two addressed the impact of peer review on quality, and just one addressed the impact of peer review on validity; most of the rest of the studies were concerned with questions like the effect of double-blind reviewing. Furthermore, for the three studies that addressed quality and validity, Jefferson et al concluded that there were other problems with the studies which meant the results were of limited general interest; as they put it, “Editorial peer review, although widely used, is largely untested and its effects are uncertain”.
In short, at least in biomedicine, there’s not much we know for sure about the reliability of peer review. My searches of the literature suggest that we know don’t much more in other areas of science. If anything, biomedicine seems to be unusually well served, in large part because several biomedical journals (perhaps most notably the Journal of the American Medical Association) have over the last 20 years put a lot of effort into building a community of people studying the effects of peer review; Jefferson et al’s study is one of the outcomes from that effort.
In the absence of compelling systematic studies, is there anything we can say about the reliability of peer review?
The question of reliability should, in my opinion, really be broken up into three questions. First, does peer review help verify the validity of scientific studies; second, does peer review help us filter scientific studies, making the higher quality ones easier to find, because they get into the “best” journals, i.e., the ones with the most stringent peer review; third, to what extent does peer review suppress innovation?
As regards validity and quality, you don’t have to look far to find striking examples suggesting that peer review is at best partially reliable as a check of validity and a filter of quality.
What about the suppression of innovation? Every scientist knows of major discoveries that ran into trouble with peer review. David Horrobin has a remarkable paper (ref) where he documents some of the discoveries almost suppressed by peer review; as he points out, he can’t list the discoveries that were in fact suppressed by peer review, because we don’t know what those were. His list makes horrifying reading.
Here’s just a few instances that I find striking, drawn in part from his list. Note that I’m restricting myself to suppression of papers by peer review; I believe peer review of grants and job applications probably has a much greater effect in suppressing innovation.
* George Zweig’s paper announcing the discovery of quarks, one of the fundamental building blocks of matter, was rejected by Physical Review Letters. It was eventually issued as a CERN report.
* Berson and Yalow’s work on radioimmunoassay, which led to a Nobel Prize, was rejected by both Science and the Journal of Clinical Investigation. It was eventually published in the Journal of Clinical Investigation.
* Krebs’ work on the citric acid cycle, which led to a Nobel Prize, was rejected by Nature. It was published in Experientia.
* Wiesner’s paper introducing quantum cryptography was initially rejected, finally appearing well over a decade after it was written.
To sum up: there is very little reliable evidence about the effect of peer review available from systematic studies; peer review is at best an imperfect filter for validity and quality; and peer review sometimes has a chilling effect, suppressing important scientific discoveries.
At this point I expect most readers will have concluded that I don’t much like the current peer review system. Actually, that’s not true, a point that will become evident in my post about the future of peer review. There’s a great deal that’s good about the current peer review system, and that’s worth preserving. However, I do believe that many people, both scientists and non-scientists, have a falsely exalted view of how well the current peer review system functions. What I’m trying to do in this post is to establish a more realistic view, and that means understanding some of the faults of the current system.
Myth number 3: Peer review is the way we determine what’s right and wrong in science
By now, it should be clear that the peer review system must play only a partial role in determing what scientists think of as established science. There’s no sign, for example, that the lack of peer review in the 19th and early 20th century meant that scientists then were more confused than now about what results should be regarded as well established, and what should not. Nor does it appear that the unreliability of the peer review process leaves us in any great danger of collectively coming to believe, over the long run, things that are false.
In practice, of course, nearly all scientists understand that peer review is only part of a much more complex process by which we evaluate and refine scientific knowledge, gradually coming to (provisionally) accept some findings as well established, and discarding the rest. So, in that sense, this third myth isn’t one that’s widely believed within the scientific community. But in many scientists’ shorthand accounts of how science progresses, peer review is given a falsely exaggerated role, and this is reflected in the understanding many people in the general public have of how science works. Many times I’ve had non-scientists mention to me that a paper has been “peer-reviewed!”, as though that somehow establishes that it is correct, or high quality. I’ve encountered this, for example, in some very good journalists, and it’s a concern, for peer review is only a small part of a much more complex and much more reliable system by which we determine what scientific discoveries are worth taking further, and what should be discarded.”
Michael Nieslsen is “writing a book about “The Future of Science”; this post is part of a series where I try out ideas from the book in an open forum. A summary of many of the themes in the book is available in this essay. If you’d like to be notified when the book is available, please send a blank email to the.futur[email protected] with the subject “subscribe book”.”