(A print version of this interview is available here)
Fifteen years after the launch of the Budapest Open Access Initiative (BOAI) the OA revolution has yet to achieve its objectives. It does not help that legacy publishers are busy appropriating open access, and diluting it in ways that benefit them more than the research community. As things stand we could end up with a half revolution.
But could a new development help recover the situation? More specifically, can the newly reinvigorated preprint movement gain sufficient traction, impetus, and focus to push the revolution the OA movement began in a more desirable direction?
This was the dominant question in my mind after doing the Q&A below with Philip Cohen, founder of the new social sciences preprint server SocArXiv.
Preprint servers are by no means a new phenomenon. The highly successful physics preprint server arXiv (formally referred to as an e-print service) was founded way back in 1991, and today it hosts 1.2 million e-prints in physics, mathematics, computer science, quantitative biology, quantitative finance and statistics. Currently around 9,000-10,000 new papers are submitted to arXiv each month.
Yet arXiv has tended to complement – rather than compete with – the legacy publishing system, with the vast majority of deposited papers subsequently being published in legacy journals. As such, it has not disrupted the status quo in ways that are necessary if the OA movement is to achieve its objectives – a point that has (somewhat bizarrely) at times been celebrated by open access advocates.
In any case, subsequent attempts to propagate the arXiv model have generally proved unsuccessful. In 2000, for instance, Elsevier launched a chemistry preprint server called ChemWeb, but closed it in 2003. In 2007, Nature launched Nature Precedings, but closed it in 2012.
Hope springs eternal
Fortunately, hope springs eternal in academia, and new attempts to build on the success of arXiv are regularly made. Notably, in 2013 Cold Spring Harbor Laboratory (CSHL) launched a preprint server for the biological sciences called bioRxiv. To the joy of preprint enthusiasts, it looks as if this may prove a long-term success. As of March 8th 2017, some 8,850 papers had been posted, and the number of monthly submissions has grown to around 620.
Buoyed up by bioRxiv’s success, and convinced that the widespread posting of preprints on the open Web has great potential for improving scholarly communication, last year life scientists launched the ASAPbio initiative. The initial meeting was deemed so successful that the normally acerbic PLOS co-founder Michael Eisen penned an uncharacteristically upbeat blog post about it (here).
Has something significant changed since Elsevier and Nature unsuccessfully sought to monetise the arXiv model? If so, what? Perhaps the key word here is “monetise”. We can see rising anger at the way in which legacy publishers have come to dominate and control open access (see here, here, and here for instance), anger that has been amplified by a dawning realisation that the entire scholarly communication infrastructure is now in danger of being – in the words of Geoffrey Bilder – enclosed by private interests, both by commercial publishers like Elsevier, and by for-profit upstarts like ResearchGate and Academia.edu (see here, here and here for instance).
CSHL/bioRxiv and arXiv are, by contrast, non-profit initiatives whose primary focus is on research, and facilitating research, not the pursuit of profit. Many feel that this is a more worthy and appropriate mission, and so should be supported. Perhaps, therefore, what has changed is that there is a new awareness that while legacy publishers contribute very little to the scholarly communication process, they nevertheless profit from it, and excessively at that. And for this reason they are a barrier to achieving the objectives of the OA movement.
Reproducibility crisis
But what is the case for making preprints freely available online? After all, the research community has always insisted that, in order to stay up to date in their field, it is far preferable (and safer) for scholars to rely on papers that have been through the peer-review process and been published in respectable scholarly journals, rather than on self-deposited early versions of papers that might or might not go on to be published.
Advocates for open access, however, now argue that making preprints widely available enables research to be shared with colleagues much more quickly. Moreover, they say, it enables papers to potentially be scrutinised by a much greater number of eyeballs than with the traditional peer review system. As such, they add, the published version of a paper is likely to be of higher quality if it has first been made available as a preprint. In addition, they say, posting preprints allows researchers to establish priority in their discoveries and ideas that much earlier. Finally, they argue, the widespread sharing of preprints would benefit the world at large, since it would speed up the entire research process and maximise the use of taxpayer money (which funds the research process).
Many had assumed that OA would provide these kinds of benefits. In addition to making papers freely available, it was assumed that open access would introduce a quicker time-to-publish process. This has not proved the case. For instance, while the peer review “lite” model pioneered by PLOS ONE did initially lead to faster publication times, these have subsequently begun to lengthen again.
Above all, open access has failed to address the so-called reproducibility crisis (also referred to as the replication crisis). By utilising a more transparent publishing process (sometimes including open peer review) it was assumed that open access would increase the quality of published research. Unfortunately, the introduction of pay-to-publish gold OA has undermined this, not least because it has encouraged the emergence of so-called predatory OA publishers (or article brokers), who gull researchers into paying (or sometimes researchers willingly pay) to have their papers published in journals that wave papers past any review process.
The reproducibility crisis is by no means confined to open access publishing (the problem is far bigger), but it could hold out the greatest hope for the budding preprint movement.
Why do I say this? And what is the reproducibility crisis? Stanford Professor of Medicine John Ioannidis neatly summarised the reproducibility crisis in 2005, when he called his seminal paper on the topic “Why most published research findings are false”. In this and subsequent papers Ioannidis has consistently argued that the findings of many published papers are simply wrong.
Shocked at Ioannidis’ findings, other researchers set about trying to size the problem and to develop solutions. In 2011, for instance, social psychologist Brian Nosek launched the Reproducibility Project, whose first assignment consisted of a collaboration of 270 contributing authors who sought to repeat 100 published experimental and correlational psychological studies. Their conclusion: only 36.1% of the studies could be replicated, and where they did replicate, their effects were smaller than the initial studies’ effects, seemingly confirming Ioannidis’ findings.
The Reproducibility Project has subsequently moved on to examine the situation in cancer biology (with similar initial results). Meanwhile, a survey undertaken by Nature last year would appear to confirm that there is a serious problem.
Whatever the cause and extent of the reproducibility crisis, Nosek’s work soon attracted the attention of John Arnold, a former Enron trader who has committed a large chunk of his personal fortune to funding those working to – as Wired puts it – “fix science”. In 2013, Arnold awarded Nosek a $5.25 million grant to allow him and colleague Jeffrey Spies to found the Center for Open Science (COS).
COS is a non-profit organisation based in Charlottesville, Virginia. Its mission is to “increase openness, integrity, and reproducibility of scientific research”. To this end, it has developed a set of tools that enable researchers to make their work open and transparent throughout the research cycle. Researchers can register their initial hypotheses, maintain a public log of all the experiments they run and the methods and workflows they use, and then post their data online. And the whole process can be made open for all to review.