Wednesday, March 01, 2006

Institutional Repositories, and a little experiment

I have been writing about open access (OA) for at least five years now. During that time I have found myself becoming increasingly interested in the topic, and more and more convinced that it is important that the OA movement should succeed in its objectives.
When I started, many of the publications I wrote for were experimenting with the new online medium, and as a consequence my articles were generally made freely available on the Web. Over time, however, I found more and more editors were becoming reluctant to release their content in this way — a development that increasingly frustrated me.

Of course journalists, unlike the authors of academic papers, write for a fee, and I understood that in order for editors to pay me they had to make money somehow. Nevertheless, it seemed unfortunate that I was writing about open access but my articles were being placed inside a walled garden, with an entrance fee charged to anyone who wanted to read them. Not only did it mean that the number of people who could potentially read what I had written was reduced but, in the new hyperlinked world, it felt counterintuitive.

Last year, therefore, I started this blog, which was a very liberating experience. Finally I could write what I wanted, when I wanted, and I could make it available to anyone who had an Internet connection.

The downside, of course, was that I was not being paid for my articles, except in those rare cases where enlightened publishers like INDICARE were happy to pay me for an article (e.g. on digital rights management and OA) that I also posted on my blog.

While I am keen to continue writing about OA, and in a way that will enable me to maximise the number of people who can read what I write, it would clearly be helpful if I could earn a little money from that writing too!

To this end I have decided to try a little experiment: to self-publish some of my articles about OA via my blog, and then invite readers to pay to read them. No one has to pay to read them, but those who find some value in them, and feel they would like to help, can do so, on a strictly voluntary basis.

I am publishing the first such article today. This looks at the history of the institutional repository, and its relation to the OA movement.

Below are the first 800 words of the article (which is 10,500 words in total). Anyone wishing to read it in its entirety can click on the link at the bottom of this post. If after reading it you believe that doing so was of value to you then you might like to consider making a small contribution to my PayPal account. I have in mind a figure of $8, but whatever anyone felt inspired to contribute would be fine by me. Likewise, if they chose not to contribute, that would be fine too. Payment can be made quite simply by quoting the e-mail account: Please note that it is not necessary to have a PayPal account to make a payment.

What I would ask is that if you point anyone else to the article then you consider directing them to this post, rather than directly to the PDF file itself.


Clear blue water

While the concept of the institutional repository is not new, there has over the past year been a sudden upsurge of interest in the topic. This in turn has led to considerable disagreement about the nature and scope of an institutional repository, and its role within the academic institution.

Indeed, when JISC recently created a mailing list for those wishing to discuss issues related to the topic, the initial flurry of posts suggested that there are as many definitions of an institutional repository as there are those with an opinion on it.

So where did the institutional repository come from, and why has it become the source of so much disagreement — at the very moment when it looks set to enter the mainstream? More importantly, how should the OA movement react to these developments?

Reshaping the scholarly communication process

The seminal text on the subject was a paper, The Case for Institutional Repositories, written by Raym Crow in 2002.

In that paper Crow defined institutional repositories as "digital collections capturing and preserving the intellectual output of a single or multiple-university community."

Their role, he suggested, should be twofold. First: to "serve as tangible indicators of an institution's quality and to demonstrate the scientific, societal, and economic relevance of its research activities, thus increasing the institution's visibility, status, and public value"; second: to provide tools to help universities "re-shape the scholarly communication process".

As Crow put it, "While institutional repositories centralise, preserve, and make accessible an institution's intellectual capital, at the same time they will — ideally — form part of a global system of distributed, interoperable repositories that provides the foundation for a new disaggregated model of scholarly publishing."

Essentially, Crow envisaged that institutional repositories would enable universities to exploit the new digital networked world to regain control of scholarly communication. This, he said, would mean rethinking the relative roles of authors, librarians, and publishers, and "unbundling" the traditional model of publishing.

As a consequence, access to research would expand, and the monopoly power of journal publishers would be broken: monopoly power that publishers acquired, he maintained, as a result of their insistence that researchers hand over copyright in their scholarly papers as a condition of publication. This then allowed publishers to sell the research back to universities in the form of ever more expensive journal subscriptions.

By unbundling the publishing process into its constituent parts ("Registration, Certification, Awareness, and Archiving"), and reasserting ownership of the raw material (the papers), Crow argued, universities could break the chokehold that publishers had acquired over scholarly publishing.

"The purpose of a disaggregated scholarly publishing model," he explained, "is not to destroy the current journal system, but to weaken the monopolistic impact of that system on academic institutions and their libraries."

His perspective was not perhaps surprising: a managing partner at Chain Bridge Group, Crow’s thoughts were published as a position paper for The Scholarly Publishing and Academic Resources Coalition (SPARC) — an organisation created in 1997 by the Association of Research Libraries (ARL) "to be a constructive response to market dysfunctions in the scholarly communication system."

It was clear that something had to be done: With static or falling budgets, libraries were struggling to cope with the escalating cost of journals. Between 1986 and 2000, for instance, serial prices increased by 196%, compared to a rise in the Consumer Price Index of just 57%. The consequent "serials crisis", SPARC complained, had "reduced dissemination of scholarship and crippled libraries".

Crow's thesis was that if researchers retained copyright in their papers (merely granting publishers a non-exclusive licence), and deposited them in institutional repositories, the publisher's role could be restricted to activities like managing the peer review process, creating value-added "overlay" journals based on the content of the repositories, and providing services like "citation linking, controlled vocabularies, and the like". And this would allow universities to restore a more equitable power balance.

In other words, institutional repositories would allow universities to create a more cost-effective model, and force that model on publishers.

The evidence so far, he argued, "suggests that the resources required [to create the necessary infrastructure for the new model] would represent but a fraction of the journal costs that libraries now incur and over which they have little control."

In short, institutional repositories were viewed as a way in which librarians could address the affordability problem posed by the constantly rising, and eventually unsustainable, costs associated with buying serials.


But while the library community may have produced the defining document on institutional repositories, the concept was originally developed by academics.

Crow's ideas, for instance, owed a great debt to the e-print service arXiv, which had been developed in 1991 by Los Alamos physicist Paul Ginsparg. A central subject-based repository where researchers could self-archive preprints of their physics papers, arXiv had by the time Crow wrote his paper in 2002 been widely embraced by the physics community and spawned a number of imitators. Ginsparg had also published a number of papers and talks in which he had developed the idea of overlay journals ...

To read the article in its entirety (as a PDF file) click here.


Anonymous said...

Interesting article and I'm going to send you a Paypal micropayment...but not because I agree with you.

Perhaps I'm missing the point, but it seems to me that the content, its discoverability and its free and long-term availability are what's important--not what sort of system it's stored in (or what other items might be stored along with it).

If a digital archive provides reliable, long-term access to an article does it matter that only other articles are in the system? I believe that as we get ever more networked and interoperable and cross-linked, it doesn't matter to the reader where an item "lives" as long as it can be discovered, retrieved and used. So if a librarian fills one part of an IR with self-archived publications and another part with digital images of campus buildings what's the problem? Few will go to the "repository" or "archive" as their first stop in their search for an item. Instead, I think they'll tend to arrive at the IR following a link for content they discovered via other means (OAI-built indexing services, Google Scholar, citations to DOIs, CV entries, etc.).

And good luck with the university IT staff (a culture that almost by definition thinks an article over 6 months old is already obsolete). My experience suggests they'll require a good bit of education before they get why this is important...much less what to do about it.

Richard Poynder said...

Thanks for the comment, and for the PayPal contribution. Much appreciated!

While I agree that in a networked world it does not matter where items "live", there seem to me to be two problems with the current situation; problems that suggest that the respective interests of the digital library and of open access are currently pulling in different directions.

First, many of the articles currently listed in IRs are not also freely available. If you go today to eScholarship@Amherst and click on the Paper of the Day you get the message "This thesis is not available electronically or by photocopy. Please contact Archives and Special Collections at for more information."

If you then cycle through hitting the NEXT ARTICLE button you seem to get the same message for every item in the IR. Indeed, my understanding is that there are no full-text articles at all in the Amherst IR.

While you could argue that this makes these items discoverable, what has it got to do with open access? As I think you are acknowledging, open access assumes "free, immediate, permanent access to refereed-article full-texts online". I conclude from this that while librarians may be building IRs, they are often not facilitating open access in the process.

Second, there is the issue of timing. If we wait until libraries have solved all their digital library problems before we can provide open access, we will clearly have to wait a long time.

I think Clifford Lynch put it well in the article when he said "If all you want to do is author self-archiving, I suspect that there are likely to be cheaper and more quickly deployed solutions. [After all] if we don't stress too much about long term preservation, a system to support self-archiving at an institutional level should be a pretty inexpensive service to build."

For that reason, he suggested that one option "is to do one of these fast and cheap and then get on with the hard problems, and fold the papers back in to the fully featured institutional repository later."

That is what I am also proposing. However, since I am also proposing that the initial OA job be done by someone other than the librarian, my suggestion is perceived as being controversial.

Evidently I have upset some librarians with this proposal. Others, however, apparently take a different view.

Anonymous said...

Comparing Dorothea Salo's post, which offers a well argued rebuttal to Stevan Harnad's persuasive but tiresome focus on impact factor, with Tom Roper's "an excellent piece" (that's the extent of his review of "Blue Water Main") unfairly trivializes the former. Well done Dorothea (and well done Richard, other than your characterization of pro and con responses from librarians).

Heather Morrison said...

Librarians ARE archivangelists, too!

Richard - thank you for this article, and for all your journalistic work on OA - much appreciated, keep it up!

I would like to point out that enthusiasm for self-archiving varies amongst librarians - just as it does with researchers, some of whom are keen, and others completely indifferent.

Many of us librarians are archivangelists in our own right. I myself am an editor of the E-LIS Open Archive.

A group of us OA Librarian enthusiasts share a group blog - if you check this out at:,
you will see that most of us are very much involved with OA archiving.

You will find copies of my own academic work in the SFU Library Institutional Repository, too.

In Canada, the drive to build - and fill - institutional repositories - with the primary target being OA academic content, such as peer-reviewed postprints and theses - is led by the Canadian Association of Research Libraries.

You mention that new types of material are being included in IRs, such as grey literature and the like, which could transform scholarship.

I think you're on to something, but would broaden the content here. For example, one of the most exciting opportunities to transform scholarship is open data. To see what is possible, have a look at some sessions from the OAI4 conference (by Peter Murray-Rust, Liz Lyons, and Hans Pfeiffenberger), at:

Finally, a comment on your experiment: as a person who spends a fair bit of her spare time writing on a voluntary basis, I very much hope to see ways for the freelancer to make a bit of money.

I must admit that I'm not keen on this particular approach, however. Academic tradition involves giving away one's work, and you are writing on an academic topic. One of the central points of OA is that knowledge should be free - I don't want to see researchers charging for their articles, any more than I care for publishers charging for pay-per-view.

There likely are other approaches for freelancers to make money that I would like better.

Google's ad approach for blogs has some appeal, although personally, I would want to be able to vet the ads. Perhaps someday Google will find a means that I can live with. If they were to set up a fair-trade / environmentally friendly adsense option, I'd be set! For that matter - I'd be checking my own blog for shopping ideas.

I also have heard that some people are transforming their blogs into books and selling them - this is an idea that has some appeal, and may be worthy of your consideration, Richard.

best wishes,

Heather Morrison

Richard Poynder said...

Mark, Heather,

Thank you very much for your comments.

Mark: My point was simply that some librarians have responded to what I had to say in a positive light, and some in a negative light. There was no sleight of hand intended or, I believe, to be inferred from what I said.

The article, by the way, is called Clear Blue Water.

Heather: You are right: there are some librarian archivangelists. Unfortunately, there are far too few, and more and more librarians are becoming distracted by the need to create a digital library. This is both understandable and appropriate. If librarians don't build the necessary digital libraries, who will?

It also seems to be the case that some librarians are seeking to use institutional repositories to "change the publishing paradigm" in ways that might work counter to the objectives of open access.

For these reasons I concluded that it might be best to let librarians get on with the important job of building their digital libraries and, if they want to, experimenting with new publishing platforms, and hand off the task of creating and managing Open Access Archives (or whatever term one wants to use for a postprint archive) to someone else.

With regard to Open Data: yes, I am aware of the development of eScience and I had a very interesting e-mail dialogue with Peter Murray-Rust while researching the article. I concluded, however, that this is a whole subject in itself, and one I plan to write about at some point in the future.

Concerning your comments on my experiment, I fear your logic escapes me. The implication of what you say appears to be that if a journalist decides to write about an academic topic the resulting articles should be given away as if they were written by an academic, rather than by a journalist. Does it not therefore imply that all science journalists should be giving their articles away for free?

Does it not also perhaps imply that when academics write for newspapers (as many do) their articles should not be charged for either? If that is right, then how would it work in practice?

Newspapers, by the way, increasingly ask academics to write for them, often because it is a way of getting free content (since they don't have to pay a journalist to produce it). This freely obtained content is then "monetised" by the newspaper (through cover charges, and by selling advertising and subscriptions etc.).

But perhaps I misunderstand what you are saying. Maybe you could elaborate?

With regard to the idea of turning blogs into books, you will see that I am currently doing the very reverse!

Anonymous said...

Good article, but I just want to comment on your 'little experiment'. I like the idea, but in my opinion, it hits a bit of a problem in maximising the number who pay up.

Those personally interested in an article, who feel that way inclined and have the cash to spare, will pay up.

My worry is institutions. With strict spending rules - especially in the public sector - we cannot just borrow the company credit card to make a donation. Saying "I want you to authorise a payment for something I already have access to legally, but I think it would be a nice thing to do" will be hard to wash, especially with those wondering what the auditors will think.

This may not be such a problem with this article (about IRs) as many reading it have a personal interest. But if you were to write an article on a less sexy topic, but of use to those in organisations, your key audience may well want to pay, but have no method of doing so without using their own personal money on a 'work thing'.

I've thought the same about small open source projects. Often an organisation will save thousands by using an open source tool instead of an expensive equivalent. The open source software is run by a few individuals dedicating a lot of their own time and money, but the org has no easy way to donate a small sum (probably a small % of the cost of competing products) which would be a big boost to the project (at least paying back their webhosting fees).

Richard Poynder said...

Thanks for this Chris. You raise a good point. I am also conscious that people might, for a variety of other reasons, prefer making payments by means other than PayPal.

If anyone can suggest alternative payment mechanisms I can offer in addition to PayPal I would love to hear about them.