Age of Invention: Does History have a Replication Crisis?

Aug 29, 2023

You’re reading Age of Invention, my newsletter on the causes of the British Industrial Revolution and the history of innovation. This edition went out to over 22,000 subscribers.

Back in 2011, the field of psychology went into crisis. Some of the most famous and widely-cited experimental results (like the finding that powerfully posing for a few minutes gives you a hormonal boost in confidence, or that priming people with words to do with ageing makes them walk slower) could not be replicated by others. These were findings published in the field’s most prestigious academic journals, going back decades. Many of the original studies contained mistakes, whether through negligence, unintended bias, or simple error. A few, quite simply, had been faked. Whole swathes of research and media coverage, including some globally best-selling books, turned out to be built on foundations of sand. And since then, more and more scientific fields have turned out to have been the victims of replication crises.

Nobody had bothered, for years and years, to go to the trouble of actually checking the more unusual and interesting findings. The Scottish psychologist-turned-science journalist Stuart Ritchie wrote an eye-opening book about the scandals in science called Science Fictions. He and another science journalist, Tom Chivers, have also lately started a podcast to sort the reliable findings from the media froth, diving into the details of what scientific studies actually show — it’s called The Studies Show (har har).

But I’ve become increasingly worried that science’s replication crises might pale in comparison to what happens all the time in history, which is not just a replication crisis but a reproducibility crisis. Replication is when you can repeat an experiment with new data or new materials and get the same result. Reproducibility is when you use exactly the same evidence as another person and still get the same result — so it has a much, much lower bar for success, which is what makes the lack of it in history all the more worrying.

Historical myths, often based on mere misunderstanding, but occasionally on bias or fraud, spread like wildfire. People just love to share unusual and interesting facts, and history is replete with things that are both unusual and true. So much that is surprising or shocking has actually happened that it can take years or decades of familiarity with a particular niche of history to be able to smell a rat. Not only do myths spread rapidly, but they survive far longer, I suspect, than in scientific fields.

Take the oft-repeated idea that more troops were sent to quash the Luddites in 1812 than to fight Napoleon in the Peninsular War in 1808. Utter nonsense, as I set out in 2017, though it has been cited again and again and again as fact ever since Eric Hobsbawm first misled everyone back in 1964. Before me, only a handful of niche military history experts seem to have noticed and were largely ignored. Despite being busted, it continues to spread. Terry Deary (of Horrible Histories fame), to give just one of many recent examples, repeated the myth in a 2020 book. Historical myths are especially zombie-like. Even when disproven, they just. won’t. die.

Or take the case of the 12,000-franc prize instituted by Napoleon for an improved method of preserving food for the use of his armies, which prompted Nicolas Appert to invent canned food. It’s frequently cited to show how prizes can have a significant impact. Except that, despite being repeated hundreds of times, it literally never happened. Appert was given money by the French government, but it was a mere reward in recognition of his achievement, given over a decade after he had invented the method. The myth of the food canning innovation prize is a truly ancient one, which I traced back to a mis-translation of a vaguely-worded French source all the way back in 1869. That’s over 150 years of repeated falsehood, and I see no signs of it slowing.

The persistence of historical falsehoods is easy to explain. Just as in science, there is simply no time to check absolutely every detail in the things you cite. And even if you do, you may have to follow a citation chain that is dozens or hundreds of links long. They will often end up with archival sources that would be too time-consuming to be worth going to the trouble of accessing. History, like any other field, very often relies on trust.

But it’s hard to trust when you are exposed to one of the most frightening of revelations: that hardly anyone ever bothers to check.

I don’t think this is just me being grumpy and pedantic. I come across examples of mistakes being made and then spreading almost daily. It is utterly pervasive. Last week when chatting to my friend Saloni Dattani, who has lately been writing a piece on the story of the malaria vaccine, I shared my mounting ~~paranoia~~ healthy scepticism of secondary sources and suggested she check up on a few of the references she’d cited just to see. A few days later and Saloni was horrified. When she actually looked closely, many of the neat little anecdotes she’d cited in her draft — like Louis Pasteur viewing some samples under a microscope and having his mind changed on the nature of malaria — turned out to have no actual underlying primary source from the time. It may as well have been fiction. And there was inaccuracy after inaccuracy, often inexplicable: one history of the World Health Organisation’s malaria eradication programme said it had been planned to take 5-7 years, but the sources actually said 10-15; a graph showed cholera pandemics as having killed a million people, with no citation, while the main sources on the topic actually suggest that in 1865-1947 it killed some 12 million people in British India alone.

Now, it’s shockingly easy to make these mistakes — something I still do embarrassingly often, despite being constantly worried about it. When you write a lot, you’re bound to make some errors. You have to pray they’re small ones and try to correct them as swiftly as you can. I’m extremely grateful to the handful of subject-matter experts who will go out of their way to point them out to me. But the sheer pervasiveness of errors also allows unintentionally biased narratives to get repeated and become embedded as certainty, and even perhaps gives cover to people who purposefully make stuff up.

If the lack of replication or reproducibility is a problem in science, in history nobody even thinks about it in such terms. I don’t think I’ve ever heard of anyone systematically looking at the same sources as another historian and seeing if they’d reach the same conclusions. Nor can I think of a history paper ever being retracted or corrected, as they can be in science. At the most, a history journal might host a back-and-forth debate, sometimes delightfully acerbic, for the closely interested to follow. In the 1960s you could find an agricultural historian saying of another that he was “of course entitled to express his views, however bizarre.” But many journals will no longer print those kinds of exchanges, they’re hardly easy for the uninitiated to follow, and there is often a strong incentive for historians to shut up and play nice (unless they happen to be peer-reviewers, in which case some will feel empowered by the cover of anonymity to be extraordinarily rude).

This lack of effective institutions or incentives was really brought home to me recently by the publication of a paper in the prestigious journal History & Technology by Jenny Bulstrode of UCL, in which she claimed that the inventor Henry Cort had stolen his famous 1783 iron-rolling process from Reeder’s iron mill in Jamaica, where it had been developed by 76 black metallurgists by passing iron through grooved sugar rollers. It was a widely-publicised paper, receiving 22,756 views (eleven times as many as the journal’s next most read paper, and frankly unheard of for most academic papers) along with a huge amount of press coverage.

Bulstrode argued that Cort had heard of the invention via a relative, the master of the ship Abby, who had been in Jamaica and in November 1781 visited him in Portsmouth; that in March-May 1782 Reeder’s mill was destroyed by the British army on the pretext that it might be used for weapons in a slave revolt during wartime, but that this was really at the behest of Cort to destroy the competition; and that the grooved rolling machines at Reeder’s mill were dismantled and sent to Portsmouth where Cort could use them.

When I wrote about Bulstrode’s paper last month, I noted that she had not actually presented evidence for the claims that received so much press attention — something that ought to have been picked up by the journal’s peer reviewers. She presented no evidence that there was any invention at Reeder’s mill, nor that an invention was derived there in the way she claimed. I pointed out that the grooves on sugar rollers were entirely different to those used in Cort’s iron process. This could be seen from carefully reading Bulstrode’s paper itself, and taking her citations (if not her claims) at face value.

But things go from worrying to worse. After writing it, I was contacted by Oliver Jelf, currently completing a Masters thesis, who had actually bothered to go and check the main sources that Bulstrode cites, and has not only examined her interpretation of the sources in detail, but has transcribed those same sources for anyone to read and judge for themselves. Jelf notes that significant parts of Bulstrode’s story do not follow from the evidence she cites:

The sources instead suggest that ordinary and widespread ironmaking processes were in use at Reeder’s foundry; that no innovation occurred there; that the chain of events by which Cort is supposed to have heard of the foundry’s activities certainly did not occur; that Reeder’s foundry was destroyed because of the threat of a Franco-Spanish invasion force; and that no part of the foundry was removed from the immediate vicinity of the island, let alone taken to Portsmouth

In science terms, if you were to read those same sources before reading Bulstrode’s arguments, there is absolutely no way that you would derive the same conclusions as her. If you simply read the same primary sources that Bulstrode did and tried to create a narrative from them, I can’t see how you’d get anything more interesting than the following: “Reeder was the well-connected owner of a very ordinary iron foundry in Jamaica, which was profitable because it used slave labour and had very little competition on the island. In 1782, at the behest of panicked islanders, the military governor of Jamaica reluctantly razed the mill to the ground because of a rumoured imminent Franco-Spanish invasion, with some of the foundry’s weapons and ammunition temporarily brought onto British warships until the threat had passed. Oh, and in 1781 a distant relative of Henry Cort once sailed from Jamaica to Lancaster, which is nowhere near where Cort was based in Portsmouth.” What I simply cannot fathom, now that I’ve read her sources thanks to Jelf’s transcriptions, is how Bulstrode arrived at her narrative at all.

What Jelf’s initiative reveals, I think, is a potential solution to at least some of the problems that history faces. Just as in the sciences it is considered good practice to make one’s data available, in history it should perhaps be a requirement to upload to some public repository the photographs or transcriptions of any cited archival sources that are not otherwise freely accessible online. This would not only help peer-reviewers assess a historian’s narrative — something that they frankly cannot do without going and reviewing those same sources for themselves — but would also hugely accelerate the rate at which historical work is digitised and made more available. You would not believe the sheer volume of photographs and transcriptions that thousands of historians have already made of various archival sources, which circulate only privately at best, but are usually never shared at all. We’ve already digitised so much, but have nowhere to put it — nor really much incentive to make it public.

Some of the more specialised journals still publish full transcriptions of newly-recognised archival sources, often with expert commentary and interpretation. You occasionally see it in some PhD theses still. It used to be very common in the late nineteenth century. But it really ought to be a matter of course, treated in the same way as a scientist’s data, and perhaps even revered. We need to change the incentives that historians face, so that making new sources available becomes one of the most lauded and rewarded things that they can do.

In the wake of the infamous 2011 replication crisis, a growing number of people have become actively concerned with how to shape the incentives and institutions that scientists face, to reduce the space for fraud, error, bias, and hype, as well as to improve the quality of science more generally — as much a movement as a field of study, allied under the label of “metascience”. I hope that history will see a similar movement. If we’re to maintain trust in historians, it could certainly do with it.


Discussion about this post

Charles C. Mann

Aug 29, 2023

I quite enjoyed this piece. I think it's worth noting that you're talking about two types of error. One is the irresistible anecdote--"Good Queen Bess actually expelled all the Africans in Britain!"--that turns out not to be true. The other is the systematic creation of a narrative for ideological reasons. Both have been around forever. Oddly, the second is easier to correct for than the first, because there are always people of different political flavors who will attack the biased narrative. Instead of being retracted, as in scientific papers, these tend to be relegated to the sidelines and forgotten. The system is not perfect, but things have worked this way many, many times in the past. In colonial US history, for example, Jennings's super-polemical "Invasion of America" was useful as a corrective in 1975, but it is so over-the-top biased that Pekka Hamalainen doesn't even cite it in "Indigenous Continent," his 2022 book covering the same ground.

The first kind, IMO, is much harder to deal with. Let me give an example from my own work. Almost 20 years ago I wrote a book called "1491," about American societies before Columbus. In it I discussed the Maya invention of zero, a major intellectual landmark. The archaeological evidence for this is tricky to understand, as are the various aspects of what zero means mathematically. For obvious reasons, I included a short section about when zero was invented and popularized elsewhere in the world. I came across a fascinating fact, one that had been repeated often in specialized texts--that the Catholic Church had banned the use of zero in medieval Europe, claiming that it was somehow un-Christian not to use Roman numerals. This was long ago, so I may have the details wrong, but I believe the ultimate source was a German history of mathematics by the distinguished scientific historian Otto E. Neugebauer (1899-1990). But the book had never been translated and I don't read German. I contacted a couple historians of science who confirmed Neugebauer's stellar reputation and said that they had heard the story. So I went with it--only to be contacted by a Catholic blogger who was furious at me for perpetuating this myth.

It turned out that Neugebauer was indeed a great historian, but also one with an anti-Catholic bias that he indulged by noting when Catholics (in his opinion) did dumb things. And it was true that Catholic rulers had banned zero, but that was because they were afraid that it would be used to swindle people--which Neugebauer apparently thought was idiotic, but doesn't seem that way to me. Anyway, some early readers had misunderstood Neugebauer's emphasis on the rulers' Catholicism as meaning that they had banned it for religious reasons, and this led naturally to some others thinking that the orders had to have come from on high. So... the blogger was right, my book was wrong, and I corrected the paperback edition.

But because it's a small point, and exactly the kind of fun detail that perks up a text, I am quite confident that this mistake will go on endlessly--while the wild portrait of European colonizers as being a mostly monolithic force of inhuman beasts in Jennings's book has largely disappeared, except for certain sections of the Internet. (Newer accounts like Hamalainen's aren't exactly flattering, but they see colonization as a much more complex, many-sided, and incomplete process.)

Whew! Sorry this is so long, but your essay was interesting!

Arnold Kling

Aug 29, 2023

"This lack of effective institutions or incentives was really brought home to me recently by the publication of a paper in the prestigious journal History & Technology by Jenny Bulstrode of UCL, in which she claimed that the inventor Henry Cort had stolen his famous 1783 iron-rolling process from Reeder’s iron mill in Jamaica, where it had been developed by 76 black metallurgists by passing iron through grooved sugar rollers. It was a widely-publicised paper, receiving 22,756 views (eleven times as many as the journal’s next most read paper, and frankly unheard of for most academic papers) along with a huge amount of press coverage."

There are papers with random mistakes, but this is not one of them. We are in an environment filled with what Bryan Caplan calls "social desirability bias." Get a result that appeals to progressive politics, and your paper will be easily published and the popular press will amplify it. Go in the other direction, and your paper will get denounced and retraction demanded, if it gets published at all.

Age of Invention, by Anton Howes
