Open and Shut?: The OA interviews: Philip Cohen, founder of SocArXiv

(A print version of this interview is available here)

Fifteen years after the launch of the Budapest Open Access Initiative (BOAI) the OA revolution has yet to achieve its objectives. It does not help that legacy publishers are busy appropriating open access, and diluting it in ways that benefit them more than the research community. As things stand we could end up with a half revolution.

But could a new development help recover the situation? More specifically, can the newly reinvigorated preprint movement gain sufficient traction, impetus, and focus to push the revolution the OA movement began in a more desirable direction?

This was the dominant question in my mind after doing the Q&A below with Philip Cohen, founder of the new social sciences preprint server SocArXiv.

Preprint servers are by no means a new phenomenon. The highly-successful physics preprint server arXiv (formally referred to as an e-print service) was founded way back in 1991, and today it hosts 1.2 million e-prints in physics, mathematics, computer science, quantitative biology, quantitative finance and statistics. Currently around 9,000-10,000 new papers each month are submitted to arXiv.

Yet arXiv has tended to complement – rather than compete with – the legacy publishing system, with the vast majority of deposited papers subsequently being published in legacy journals. As such, it has not disrupted the status quo in ways that are necessary if the OA movement is to achieve its objectives – a point that has (somewhat bizarrely) at times been celebrated by open access advocates.

In any case, subsequent attempts to propagate the arXiv model have generally proved unsuccessful. In 2000, for instance, Elsevier launched a chemistry preprint server called ChemWeb, but closed it in 2003. In 2007, Nature launched Nature Precedings, but closed it in 2012.

Hope springs eternal

Fortunately, hope springs eternal in academia, and new attempts to build on the success of arXiv are regularly made. Notably, in 2013 Cold Spring Harbor Laboratory (CSHL) launched a preprint server for the biological sciences called bioRxiv. To the joy of preprint enthusiasts, it looks as if this may prove a long-term success. As of March 8th 2017, some 8,850 papers had been posted, and the number of monthly submissions has grown to around 620.

Buoyed up by bioRxiv’s success, and convinced that the widespread posting of preprints on the open Web has great potential for improving scholarly communication, last year life scientists launched the ASAPbio initiative. The initial meeting was deemed so successful that the normally acerbic PLOS co-founder Michael Eisen penned an uncharacteristically upbeat blog post about it (here).

Has something significant changed since Elsevier and Nature unsuccessfully sought to monetise the arXiv model? If so, what? Perhaps the key word here is “monetise”. We can see rising anger at the way in which legacy publishers have come to dominate and control open access (see here, here, and here for instance), anger that has been amplified by a dawning realisation that the entire scholarly communication infrastructure is now in danger of being – in the words of Geoffrey Bilder – enclosed by private interests, both by commercial publishers like Elsevier, and by for-profit upstarts like ResearchGate and Academia.edu (see here, here and here for instance).

CSHL/bioRxiv and arXiv are, by contrast, non-profit initiatives whose primary focus is on research, and facilitating research, not the pursuit of profit. Many feel that this is a more worthy and appropriate mission, and so should be supported. Perhaps, therefore, what has changed is that there is a new awareness that while legacy publishers contribute very little to the scholarly communication process, they nevertheless profit from it, and excessively at that. And for this reason they are a barrier to achieving the objectives of the OA movement.

Reproducibility crisis

But what is the case for making preprints freely available online? After all, the research community has always insisted that it is far preferable (and safer) for scholars to rely on papers that have been through the peer-review process, and published in respectable scholarly journals, in order to stay up to date in their field, not on self-deposited early versions of papers that might or might not go on to be published.

Advocates for open access, however, now argue that making preprints widely available enables research to be shared with colleagues much more quickly. Moreover, they say, it enables papers to potentially be scrutinised by a much greater number of eyeballs than with the traditional peer review system. As such, they add, the published version of a paper is likely to be of higher quality if it has first been made available as a preprint. In addition, they say, posting preprints allows researchers to establish priority in their discoveries and ideas that much earlier. Finally, they argue, the widespread sharing of preprints would benefit the world at large, since it would speed up the entire research process and maximise the use of taxpayer money (which funds the research process).

Many had assumed that OA would provide these kinds of benefits. In addition to making papers freely available, it was assumed that open access would introduce a quicker time-to-publish process. This has not proved the case. For instance, while the peer review “lite” model pioneered by PLOS ONE did initially lead to faster publication times, these have subsequently begun to lengthen again.

Above all, open access has failed to address the so-called reproducibility crisis (also referred to as the replication crisis). By utilising a more transparent publishing process (sometimes including open peer review) it was assumed that open access would increase the quality of published research. Unfortunately, the introduction of pay-to-publish gold OA has undermined this, not least because it has encouraged the emergence of so-called predatory OA publishers (or article brokers), who gull researchers into paying (or sometimes researchers willingly pay) to have their papers published in journals that wave papers past any review process.

The reproducibility crisis is by no means confined to open access publishing (the problem is far bigger), but it could hold out the greatest hope for the budding preprint movement.

Why do I say this? And what is the reproducibility crisis? Stanford Professor of Medicine John Ioannidis neatly summarised the reproducibility crisis in 2005, when he called his seminal paper on the topic “Why most published research findings are false”. In this and subsequent papers Ioannidis has consistently argued that the findings of many published papers are simply wrong.

Shocked at Ioannidis’ findings, other researchers set about trying to size the problem and to develop solutions. In 2011, for instance, social psychologist Brian Nosek launched the Reproducibility Project, whose first assignment consisted of a collaboration of 270 contributing authors who sought to repeat 100 published experimental and correlational psychological studies. Their conclusion: only 36.1% of the studies could be replicated, and where they did replicate, their effects were smaller than the initial studies’ effects, seemingly confirming Ioannidis’ findings.

The Reproducibility Project has subsequently moved on to examine the situation in cancer biology (with similar initial results). Meanwhile, a survey undertaken by Nature last year would appear to confirm that there is a serious problem.

Whatever the cause and extent of the reproducibility crisis, Nosek’s work soon attracted the attention of John Arnold, a former Enron trader who has committed a large chunk of his personal fortune to funding those working to – as Wired puts it – “fix science”. In 2013, Arnold awarded Nosek a $5.25 million grant to allow him and colleague Jeffrey Spies to found the Center for Open Science (COS).

COS is a non-profit organisation based in Charlottesville, Virginia. Its mission is to “increase openness, integrity, and reproducibility of scientific research”. To this end, it has developed a set of tools that enable researchers to make their work open and transparent throughout the research cycle. So they can register their initial hypotheses, maintain a public log of all the experiments they run, and the methods and workflows they use, and then post their data online. And the whole process can be made open for all to review.

Open Science Framework

At the heart of the COS project is the Open Science Framework (OSF). This, COS executive director Brian Nosek explained to me last year, consists of two main components – a back-end application framework and a front-end view. “The back-end framework is an open-source, general set of tools and services that can be used to support virtually any service supporting the research lifecycle”, he explained, adding that the front-end is the interface through which researchers interact with the system.

How will this help the preprint movement? If the objective is to make the entire research process open and transparent then posting preprints is clearly an essential part of the OSF vision. And to assist in this the Open Science Framework includes a module called OSF Preprints. Any researcher can post preprints directly into OSF Preprints. Importantly, the service also allows “collections” to be created. These can be collections of, say, journals, meetings, registries, or indeed preprints. And they can be community-based collections with a branded community interface. SocArXiv is one of those community interfaces.

As COS Community Manager Matt Spitzer explained to me last year, “SocArXiv will simply be a branded service built on a generalised OSF pre-print service.”

As preprint fever spreads so a growing number of communities have begun to follow in SocArXiv’s footsteps. In the last few months we have seen the emergence of PsyArXiv, AgriXiv, and engrXiv, all of which piggyback on OSF Preprints. And most recently the Berkeley Initiative for Transparency in the Social Sciences has launched BITSS Preprints. In addition, The Electrochemical Society (ECS) has indicated that it too plans to leverage the Open Science Framework to create a preprint service.

Elsewhere, the Latin American online library and publishing platform SciELO has announced plans to launch a preprint service. And for those in the humanities the Humanities Commons has launched CORE.

True, CORE is described as a repository, but it caters for preprints too. Indeed, it seems likely that we could see repositories and preprint servers start to merge. In the Q&A below Cohen stresses that SocArXiv is not intended exclusively for preprints and, as we shall see, he believes it is important that it should not.

Clearly keen to play in the preprint pond, for-profits are riding the wave too. Both PeerJ and F1000Research now offer preprint services, although these are primarily intended to feed the pay-to-publish services these companies offer. Likewise, OA publisher MDPI has launched preprints.org, presumably for similar reasons.

Finally, we could note that the American Chemical Society (ACS) has announced plans to launch a preprint server (ChemRxiv) too. This is ironic given its response to the launch of ChemWeb 17 years ago, but underlines how attitudes to preprints have changed.

Central Service

As the number of preprint servers increases, however, so concern has grown that the landscape could become overly complex, and inefficient. At ASAPbio’s third meeting, therefore, it was proposed that a central preprint service be created. Explaining the logic for this, ASAPbio commented “an increasing number of intake mechanisms … may lead to confusion and difficulty in finding preprints, heterogenous standards of ethical disclosure, duplication of effort in creation of infrastructure, and uncertainty of long-term preservation.”

ASAPbio has already attracted $1 million in funding for the mooted Central Service, and since OSF Preprints could be said to contain the seeds for creating this – in so far as it is fast becoming the platform of choice for those setting up preprint servers and because, courtesy of its partnership with SHARE, it is already harvesting preprints from third-party servers (i.e. bioRxiv, arXiv, PeerJ and CogPrints) – COS is bidding to build the ASAPbio Central Service.

But the million-dollar question is whether this fledgling preprint movement has the potential to get the OA revolution back on track, and perforce reduce the degree of control publishers now have over scholarly communication. Key to this, of course, will be whether the new services can attract sufficient papers to make them viable, and whether they will prove financially sustainable over time. Above all, however, their success will depend on whether they can play a meaningful role in reinventing scholarly communication for the networked world.

Here we could note that in the Q&A below Cohen voices concern that ASAPbio envisages the Central Service as catering for preprints alone. This, he says, could prove “a gift to the publishers, who retain their dominance by controlling the so-called ‘version of record.’”

He adds: “There is no reason to erect this barrier between systems, where the ‘preprints’ system only publishes non-reviewed work, and the journals only publish reviewed work – except to protect the revenue stream of the publishers.”

Importantly, Cohen warns, fixating on “the idea of the ‘complete’ draft may impede innovation toward more advanced forms of communication.”

As we noted, arXiv has done little to disrupt the legacy publishing system. The danger is that the new generation of preprint servers will achieve little more than arXiv in this regard. That is, they could become no more than repositories of articles jostling for a place in traditional (or pay-to-publish OA) journals. Already we can see journal editors seeking to position them as passive reservoirs of papers waiting to be selected for publishers’ pay-to-publish mills.

Preprint servers have the potential to be far more than that. They should be viewed as nurseries in which new forms of scholarly communication are experimented with and developed. As such, they should be viewed as separate and, to a great extent, independent of the legacy journal system. One hopeful sign here is that we have seen the emergence of new overlay journals like Discrete Analysis and Quantum. Built on top of arXiv, these tend to be scholar-led, community-owned journals created and managed to review, highlight, and disseminate high-quality research papers, not to monetise them. As such, they can be seen as alternatives rather than complements to the traditional system (and its oligopolists).

Complete the revolution?

Evidently, Cohen would like to see SocArXiv play a similar role. When I asked him if he envisaged the service adding comment and post-publication review functionality, or becoming a platform for new overlay journals, he replied, “[I]t’s important to point out that, as an open service, it is possible right now for anyone to develop those functions. Any institution, working group, department, or library could put up a list of papers, automatically or manually generated, and host discussions on them, facilitate peer review, and produce their own overlay journals. A big part of our outreach job in the coming year is to get people who have the knowhow and resources to develop such things to jump on it and bring them to fruition.”

Cohen clearly also has his eye on a world beyond the traditional journal. Writing on the LSE blog last year he said, “I hope that SocArXiv will enable us to save research from the journal system.”

And below he points out that scholarly work involves far more than journal articles, not least data and commentary. “SocArXiv does not require the disruption of the journal system, but if we help make that happen, and help build a better system to replace it, I would be glad.”

The good news is that if the preprint movement flourishes, and manages to maintain an existence independent of traditional publishers, it has the potential to complete the revolution the OA movement began. And if all else fails, it could seek to cut publishers out of the loop altogether and take back ownership of scholarly communication.

Alternatively, of course, it may – like the OA movement more generally – end up captured and exploited by legacy publishers, who will seek to use it in a way that props up the outdated and inefficient model of scholarly communication that currently allows them to make excessive profits from the public purse. Not only would this be a waste of taxpayers’ money, but it would hobble and hold back the global research endeavour.

The interview begins …

RP: What is SocArXiv, who should use it, and why?

PC: SocArXiv is an open archive of the social sciences, a free, noncommercial service for rapid sharing of academic papers. It is built on the Open Science Framework, an open access, open source platform that also allows researchers to upload entire projects (e.g., data and code) and link them to research results.

Anyone who does research in the social sciences should consider using it. Because SocArXiv is a not-for-profit alternative, researchers can be assured that they are sharing their research in an environment where access, inclusivity, and preservation, rather than profit, will remain at the heart of the mission.

All this is in contrast to the for-profit companies that want to monetize your research, including Academia.edu, ResearchGate, the Elsevier products Mendeley and SSRN, and Google Scholar. They may or may not provide people with something useful – access, storage, social networking, metrics – but they exist to make money for their investors, and that’s not our mission.

RP: How is the service managed and by whom?

PC: SocArXiv is administratively housed at the University of Maryland, under my direction, with a steering committee of sociologists and academic librarians. That means that our grant money is administered by UMD, and we receive tax deductible contributions through the university’s foundation.

In our operations we are a partner of the nonprofit Center for Open Science (COS), which built and operates the archive. As a member community of the COS Preprints service, we participate in their Advisory Group, which consults on questions of governance and technology.

RP: As you indicated, the SocArXiv steering committee is heavy on sociologists. Does that tell us anything beyond the fact that you are a sociologist and so presumably reached out to your colleagues in the first instance?

PC: That’s correct. Being a small operation, it helped to start with people in one discipline as a way to organize our discussion of needs and desires – what we want, and how can we make it happen.

Of course, our needs and desires are very similar to those of people in other disciplines, but it helped to think locally. The system is open to all disciplines – anyone who wants their work to appear under the words “social science” (we have a number of papers, for example, from anthropology, geography, and urban planning). It’s also important that the sociologists on the committee include experts in such subjects as the sociology of knowledge, organizations, social movements, and higher education.

Beyond our researchers, by working with leaders of the academic library community as well, we are developing the project on a foundation of good preservation, access, and public service – and lots of experience managing information projects. Additionally, as we gain institutional supporters we are including them on a consultative advisory board.

RP: Are social scientists more or less likely to embrace open access and preprint servers than other disciplines? What are the discipline-specific issues here, and are there any disincentives for social scientists to use a service like SocArXiv?

PC: I can’t generalize to social science in general, but some patterns are clear. For example, economists are used to reading important work online before it’s peer reviewed, and they have high-status outlets for working papers that are recognized outside of academia – as when major news organizations report on NBER Working Papers.

Sociologists, on the other hand, expect to hear about interesting research first at a conference – where they will see slides but not have access to a paper – and then wait months (or years) to read it in a peer reviewed journal. I use that example purposefully, because it also correlates with the massive disparity in social and political influence between economics and sociology.

RP: How receptive are social science journals to accepting papers that have been on a preprint server? Is there an issue here?

PC: I don’t know of any major social science journal that will not accept papers that have been posted in a public repository. The American Sociological Association, for example, although it has a bad track record of operating for-profit journals and discouraging open access, explicitly permits publication in all of its journals of papers that have been posted in non-peer-reviewed repositories.

RP: We last spoke in July 2016. What has changed since then, and is the service proving more or less popular/successful than you anticipated?

PC: We have made great strides since our soft launch last summer as the first community in the OSF Preprints service. In December COS launched a more fully featured web interface for uploading and discovering papers, and several other communities have started up (in agriculture, psychology, engineering, and research transparency). All the papers from these services become part of the same open system.

As COS is leading on the technology, we have been concentrating on the scholar and community side. We have received grants of $50,000 each from the Alfred P. Sloan Foundation, and the Open Society Foundations, and contributions of $10,000 each from two libraries (UCLA and MIT). At the University of Maryland, we have received support from the Department of Sociology, the College of Behavioral and Social Sciences, and the University Libraries. We are using this money for outreach and development, to build the user base and expand the community, and to bring people together to work on next steps.

To that end, this year we will hold a symposium called O3S: Open Scholarship for the Social Sciences, on the UMD campus October 26-27. We hope it will be the first in a series of conferences, and we will feature panels showcasing open scholarship, research on open scholarship, and a workshop on the future of SocArXiv. With keynote addresses by COS co-founder and CTO Jeffrey Spies and sociologist Tressie McMillan Cottom we think this is going to be a great event. And we will have some funding to bring junior scholars to the symposium. (The call for papers and more information is available on our blog site, SocOpen.org.)

Meanwhile, new people are posting papers every day. At this writing we recently passed 800 papers, posting at a rate of several per day. March looks great, starting off at double the rate of the previous two months. Of course, this is an infinitesimal fraction of the social science coming out. I had naively thought we would grow faster.

The users remain concentrated among people who use Twitter and people who are motivated to move their papers over from the corporate paper sites. So there is lots of room for growth, and outreach is the watchword.

New scholarly communication system

RP: Preprint servers seem to be enjoying a new lease of life, particularly in the wake of the launch of bioRxiv and the ASAPbio initiative. Most recently, we have seen announcements for new preprint servers from SciELO and ECS. Do you see SocArXiv as part of a new movement? If so, how would you characterise the nature and the goals of this movement?

PC: Preprints are a good workaround for our highly dysfunctional journal publishing system. With preprints you can get your work out in a timely way, to actual readers, while preserving your ability to publish in regular journals for prestige and promotion. Lots of credit to the big idea from arXiv.org, which started this for math and physics decades ago. They have preserved their journal system while enhancing the efficacy and efficiency of their research.

This is what we want to do for the social sciences in the near term, while participating in the broader interdisciplinary movement to build a new scholarly communication system over time.

RP: We have also seen a recent call for a central preprint service. Some have expressed doubts about this. For instance, quantum physicist Michael Nielsen commented “it creates an effective monopoly, which tends to suppress innovation”. On the other hand, the institutional repository movement has demonstrated that creating an effective distributed system faces its own kind of challenges. What are your views on the need for a central service, and the pros and cons of central vs. distributed services? Would a central service be competitive with subject-specific preprint servers like SocArXiv in your view, or complementary?

PC: I have positive and negative responses to the central preprint service. On the positive side, I reject the fear that a central service will be a monopoly and suppress innovation. This shows a fundamental misunderstanding of open systems. If they are really open, they can’t be monopolies, because they present no obstacles to entry or innovation. You can’t start a petroleum or journal publishing company today because Exxon or Elsevier will crush you in the marketplace – you need to take sales away from them to succeed, and they will sell what you are selling for less, preventing you from getting started. Truly open scholarship is not like that. Anyone can distribute the information however they want without taking it away from anyone else.

Of course there is competition in open scholarship – for attention, for grant money, for legitimacy – but it is not like actual market competition because the products are free and unlimited copies. The idea that ASAPbio or COS is dominant like Exxon or Elsevier is dominant is just very naïve about the power of global capitalism.

Seriously, Elsevier is making billions of dollars off a premodern publishing system that no one in their right mind would have designed this way half a century ago. That’s suppressing innovation. COS is the size of a thumb drive to them; it could be a thousand times bigger without posing the threat to innovation that they do. On the contrary, beyond their own innovation, open platforms like the OSF encourage innovation by others because anyone can build integrations and applications on top of them.

And that brings me to my negative response. ASAPbio intends the Central Service to include only preprints, which they define as, “Complete and public drafts of scientific documents, yet to be certified by peer review.” I believe this definition – which preserves the journal article as the unit of scholarly output – is limiting in two ways.

First, by insisting that preprints are not yet peer reviewed, it is a gift to the publishers, who retain their dominance by controlling the so-called “version of record.” There is no reason to erect this barrier between systems, where the “preprints” system only publishes non-reviewed work, and the journals only publish reviewed work – except to protect the revenue stream of the publishers.

Second, the idea of the “complete” draft may impede innovation toward more advanced forms of communication. Of course that is how most researchers in the journal disciplines work today, but a more innovative future is within our grasp.

In real life, today, scholarly work includes registrations, code, data, comments, and reviews themselves – but we usually only count published papers. Work does not stop when a draft is “complete.” Just yesterday I had the very common, frustrating experience of flipping back and forth between two papers by the same research team, produced in series, with the second building only very slightly off the first. The team was spinning out small bits of “complete” research in rapid succession, to publish them as quickly as possible – and maximize the lines on their CVs.

If scholarly communication were allowed to break out of the journal article mode, they could simply have rolled out sequential analyses along a research path. The peer review system that accompanies such innovation would be more efficient and – if it were conducted according to open scholarship principles – more informative and engaging, with reviews of different components of the research ideally provided as context to readers and researchers alike as the project evolves.

This is just one scenario, used to illustrate the possibilities for genuine innovation outside the relatively ancient and hidebound paper system. Post-publication review may turn out to be great, and I’m worried that a narrow definition of preprints will hinder that potential development.

For what it is worth, although we are on a system called “OSF Preprints,” SocArXiv invites people to post working papers (drafts in progress), preprints (things to be published), and postprints (things already published elsewhere), as long as the author has the right to distribute them. We see no reason to impose limits to one or another of these categories.

Clearly, the norms and practices associated with emerging scholarly communication systems are yet to be established. We want to develop new ideas while also allowing people to get jobs, get promoted, and use peer review to maintain standards of quality – all at higher speed and reduced cost – and we think we’re off to a great start at doing that.

One final point on the Central Service: I’m excited by the proposal from the Center for Open Science, in response to the Request For Applications. In addition to an exemplary model of community governance, great technology, and a demonstrated commitment to open science principles in so many ways, COS offers the prospect of a preprint system that ties in to a wider set of tools and materials, which – while meeting the requirements of the RFA – might allow the system that evolves to be less constraining than I’m afraid it might otherwise be. I don’t know who the other contenders are, but I’d love to see COS build it.

RP: As SocArXiv will be using the Center for Open Science platform it will be linked into SHARE. What does SHARE bring to the party? Presumably its function is as a discovery service only, since its currency is metadata rather than full-text, right? The OAI-PMH harvesting protocol that the IR movement developed was based on metadata, but has not really been that successful. What are your thoughts on these matters?

PC: What SHARE brings to SocArXiv is the same thing it brings to all of the 150 data sources it currently aggregates, from the giant arXiv and PubMed Central to smaller individual institutional repositories and SocArXiv. SHARE is not designed to be a discovery platform in and of itself; it harvests, normalizes, and then distributes a dataset of research events, which include the posting of preprints.

Through SHARE, the Association of Research Libraries and COS provide public infrastructure for disseminating metadata for any purpose. SHARE provides great opportunities for SocArXiv, allowing people to create custom research streams, institutional reports, discovery tools, and anything else you can do with research metadata.

As a rudimentary example, I myself (knowing next to nothing about such things) built a Twitter feed for SocArXiv papers using SHARE (@socarxivpapers), which I described on our blog.

Someone who knew what they were doing could do a lot more, and we’re excited to make that possible. (I am not dodging the question of OAI-PMH, it’s just beyond my expertise to comment on that.)

RP: You mentioned data and software code earlier. SocArXiv acts as a repository for these too?

PC: Yes. SocArXiv and the other services on OSF Preprints run on the Open Science Framework. Preprints may be nested within projects on that platform, and include any research materials.

This is a very powerful and flexible platform, which includes storage, researcher collaboration tools, versioning, analytics, variable public access settings, and the ability to mint DOIs. This is a great benefit of working with COS, which is providing this application framework as a free public good.

Copyright

RP: Last July, The Scholarly Kitchen gave you a hard time over whether uploads to SocArXiv are vetted, and suggested that without moderation the service will have a problem with regard to copyright infringement. What is the current situation, and who is responsible if a paper uploaded to the service infringes someone’s copyright? Likewise, how are nonsense, off-topic and inappropriate papers filtered out (are they)?

PC: Our mission is to provide access, not to police copyright. All SocArXiv users agree to the COS terms of use, which, in accordance with the Digital Millennium Copyright Act, offers a means of complaining if anyone thinks something has been posted in violation of their copyright.

To my knowledge we have yet to receive such a complaint. Maybe Scholarly Kitchen thinks everyone has a moral obligation to play the role of copyright police. This is not our job. Although we will of course comply with the law, as noted, we’re not raising and spending money and recruiting volunteers to devote to the prevention of minor copyright infractions.

In my experience, most authors have no idea what’s in the ridiculous contracts they sign, and they often veer between exaggerated paranoia and reckless egalitarianism when it comes to sharing their work.

Often, we get the worst of both worlds. For example, I learned from your tweet of a new (paywalled) study finding that 40% of papers on ResearchGate were in violation of publisher copyrights. This is a case in which researchers are stealing their own work from Elsevier (and others) and then giving it to ResearchGate to sell, for which the researcher receives nothing. Congratulations, academic freedom! As I wrote about Sci-Hub, “if your entire enterprise can be brought down by the insertion of 11 characters into a URL, your system may in fact not be sustainable.”

On the question of moderation and quality control, at present papers are not vetted before they are posted. We manually take down the very few things that are obviously inappropriate. This works when you’re taking in a few papers a day, but obviously we will need a more robust moderation system as the service grows, including clear guidelines and a routine plagiarism check.

It is our hope that we can persuade researchers to reallocate some of the time they currently donate as reviewers in the service of monopolistic for-profit companies to our public-good project, and volunteer to work as moderators (as arXiv has done). COS is currently developing the moderation dashboard we will need to carry this out.

That said, I personally think it would be good for us to get beyond the fear of having our work contaminated by the proximity to work of lesser quality (or elevated by the esteemed contributions of others, for that matter). It is different when people discover books by browsing shelves; in that case it’s a shame to have bad books getting in your way. But with a free digital archive the downside to accepting bad work is not so great.

We expect people will mostly find specific research on SocArXiv through, for example, published citations, the recommendations of colleagues, through aggregations created by subject experts, from institutional lists, conference programs, and social media.

We also hope to provide tools such as lists of most-read, most-cited, most-favourably reviewed, and so on (or these may be developed by third parties). Most mathematicians don’t read raw feeds from arXiv, and we don’t think that’s how people will use SocArXiv either.

I think we will be able to surface great work without requiring all submissions to be of high quality, with all the energy and expense that would entail. We encourage people to brag not about the existence of their paper on SocArXiv, but rather about its value.

RP: Where does SocArXiv fit with the larger agenda that I think you refer to as “open scholarship”? Where does open begin and end so far as research in social sciences is concerned?

PC: To clarify, when we say “open scholarship,” we are aligning with the open science movement, but including those who don’t consider their work to be “science.” The open approach responds to many of the problems we face in the research community today, including the long run issues in academia generally and the current crisis associated with the Trump presidency.

The SocArXiv steering committee just posted a statement in response to the planned March for Science, titled “Social Science without Walls,” which summarizes our view on this question. In it we argue that SocArXiv will help us realize our collective goals of making our work better, more efficient, more relevant, and less hierarchical.

The social science without walls made possible by open source, open access research infrastructure, we wrote, “allows us to make the best use of our resources, improve the process and products of our work, bring it to more people faster, and dissolve the obstacles to interaction that plague our industry.”

From the research process itself through dissemination of results and – crucially, today especially – engagement with wider publics, open scholarship is foundational to our vision of social science.

RP: I believe you are of the view that the research community needs to take back control of scholarly publishing. What does that mean in practice? Does it mean, for instance, you believe traditional publishers no longer have a valid role in scholarly communication? And how does SocArXiv facilitate the process of taking back control?

PC: Most of what commercial journal publishers do is actually done by academics. We research, write, review, edit, and promote our work – and commercial publishers organize that labour, partly to our benefit and the public’s benefit but largely to their own. Some of what they do is outside of our expertise, including editing and producing publications, but those functions are secondary. And a lot of what they do is only necessary to serve the needs of the system they rely on, such as marketing and policing copyrights and devising means of keeping content from reaching readers.

An open access scholarly publishing system could do more, faster, better, and vastly cheaper, without most of what commercial publishers do. SocArXiv does not require the disruption of the journal system, but if we help make that happen, and help build a better system to replace it, I would be glad.

Funding and the future

RP: You mentioned that SocArXiv has already attracted some funding. Can you say more about funding and how it can be assured over time? How successful have you been to date in your funding efforts? Can you envisage the service ever offering paid-for services in order to be financially sustainable? If so, what kind of services, and whom would you expect to be billed?

PC: The operation of the archive is funded by COS at present. The grant money and institutional contributions we have so far are going to design and outreach and governance efforts. I hope we will be able to continue building the system with money from foundations such as those that support us now, as we develop a model of sustainability that derives support from the voluntary contributions of academic institutions and research funders.

I have been inspired by arXiv’s model (and they have graciously consulted with us, in addition to letting us riff off their name), and I hope that we can follow in their footsteps on sustainability as well.

We are committed to offering a free service for researchers and readers, and open access indefinitely. We might in principle offer ancillary services to institutions for a fee, but we have as yet no such plans. (Note to institutional readers: if you are currently paying SSRN thousands of dollars per year for a paper series or a list of papers, contact us!)

RP: To what extent is the SocArXiv project focused on advocacy as much as service provision? More generally, is there a danger that the preprint movement might end up chasing after buzzwords and trends, rather than sparking fundamental change in scholarly communication (which I think has been a tendency within the OA movement)?

PC: We have to do some of both – advocacy and service provision – but ultimately I hope our service will be our advocacy. I can write polemics all day long (and I often do), but in the absence of a working open archive they won’t mean that much.

Participation in the archive is not conditional on some political or social movement affiliation. At our most ambitious we do want to shift the ground on which social science is built, but that’s going to require offering something new and professionally rewarding beyond a cutting critique of Elsevier.

RP: What future plans are there for SocArXiv? I have seen mention of a comment function, post-publication review, and overlay journals. Are these all on the table? What other features/functionality do you anticipate offering in the future?

PC: Those are all potentially important features, although not as important as a smoothly operating basic archive, with transparent governance, shared norms, and community support – so I’m not rushing.

However, it’s important to point out that, as an open service, it is possible right now for anyone to develop those functions. Any institution, working group, department, or library could put up a list of papers, automatically or manually generated, and host discussions on them, facilitate peer review, and produce their own overlay journals. A big part of our outreach job in the coming year is to get people who have the knowhow and resources to develop such things to jump on it and bring them to fruition.

I especially want to encourage people who are already in the business of aggregating papers – such as conferences and paper competitions – to use the system. Anyone running a paper competition could require the papers be posted on SocArXiv, where they could be juried as they are made public.

Similarly, conference submissions could be done through the archive, with papers tagged according to their panel sessions or subject areas. These are simple examples of how we could do work we are already doing but in an open way, using the tools SocArXiv already has made available to move toward an open scholarship culture.

RP: What are the primary obstacles today to achieving the changes you would like to see to scholarly communication, how can they be overcome, and what long-term opportunities does the open agenda offer the research community?

PC: You may have meant practical obstacles, but all these words later I’m inclined toward a more philosophical answer. To my mind, our biggest obstacles are institutional inertia and risk aversion.

No reasonable person would design an academic publishing system like this if we were building it today. When I was a grad student in the early 1990s, before the web, we had to physically be in the library to read the journals (I did not subscribe to any). Now that we have the capacity to provide them to anyone anywhere at a fraction of the cost, are they any more accessible?

The great innovations in journal publishing technology in the last quarter century seem to have gone to building and maintaining elaborate paywall and authentication systems, and legal protocols to enforce them – and more is spent keeping people out than bringing people in.

The American Sociological Association, in my own discipline, still allocates “pages” to journal editors according to the cost of printing and shipping paper, setting an arbitrary limit to how many “top” articles may exist. Fearing a future in which “the journal world may not be as profitable in the future as it is now,” ASA’s response is to work on inventing new paywall journals.

Inertia is normal for social institutions, of course, but journal publishing seems to have more than most. I’m sure this comes from the slow turnover of generations in academia, and from the constricting job market that compels professors to squeeze harder to make students in their own image, out of fear of joint failure.

That’s probably also why they fight so hard to maintain our arbitrary prestige and ranking system, which bestows success or failure on scholars before anyone beyond a tiny committee of reviewers has laid eyes on their actual work, much less assessed its impact.

There is also big money at stake. But it’s not just executives and managers of the multinational conglomerates that sit atop the system, it’s also the conferences and receptions and awards (and tote bags) they dole out, for which the vast majority of faculty continuously scrap.

We could do so much better for so much less money.

But there are risks. We have to be willing to try new things, to step out from under the current system. We have to evaluate people not based on the pedigree of their journal publications but on the quality of their work. We have to reward career pathways that differ from the ones that got us where we are.

Some attempts will fail. But if we’re guided by sound principles, focus on what’s important, and play to our strengths – doing the things we do well and contracting for the things we don’t – the rewards will be greater down the road. And that’s what the open agenda offers.

RP: Thank you for taking the time to answer my questions.