Who is requesting what through Cambridge’s Request a Copy service?

In October last year we reported on the first four months of our Request a Copy service. Now, 15 months in, we have had over 3000 requests and this provides us with a rich source of information to mine about the users of our repository.  The dataset underpinning the findings described here is available in the repository.

What are people requesting?

We have had 3240 requests through the system since its inception in June 2016. Of those, the vast majority have been for articles (1878, or 58%) and theses (1276, or 39%). The remaining requests are for book chapters, conference objects, datasets, images and manuscripts. It should be noted that most datasets are available open access, which means there is little need for them to be requested.
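The proportions quoted above follow directly from the reported counts. A minimal Python sketch of the arithmetic, using only the figures given in this post (the variable names are illustrative, not taken from the published dataset):

```python
# Request counts reported in this post (June 2016 onwards).
total_requests = 3240
requests_by_type = {"articles": 1878, "theses": 1276}

# Everything else: book chapters, conference objects, datasets,
# images and manuscripts, taken together.
requests_by_type["other"] = total_requests - sum(requests_by_type.values())

# Share of all requests per item type, rounded to whole percentages.
shares = {
    item_type: round(100 * count / total_requests)
    for item_type, count in requests_by_type.items()
}

for item_type, count in requests_by_type.items():
    print(f"{item_type}: {count} requests ({shares[item_type]}%)")
```

On these figures the remaining item types together account for 86 requests, roughly 3% of the total.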

Of the 23 requests for book chapters, it is perhaps not surprising that the greatest number, 9 (39%), were for chapters held in the collections of the School of Humanities and Social Sciences. It is, however, possibly interesting that the second highest number, 7 (30%), were for chapters held in the School of Technology.

The School of Technology is home to the Department of Engineering, which is the University’s largest department. It is therefore perhaps not surprising that the greatest number of articles requested were from Engineering, with 311 of the 1878 requests (17%). The areas with the next highest numbers of requested articles were, in order, the Department for Public Health and Primary Care, the Department of Psychiatry, the Faculty of Law and the Judge Business School.

What’s hot?

Over this period we have seen a proportional increase in the number of requests for theses compared to articles. When the service started, requests for articles stood at 71% versus 29% for theses. More recently, however, requests for theses have overtaken requests for articles, at a ratio of 54% to 46%.

The most requested thesis, by a considerable amount, over this period was for Professor Stephen Hawking’s thesis with double the number of requests of the following ten most requested theses. The remaining top 10 requested theses are heavily engineering focused, with a nod to history and social research. These theses were:

The top 10 requested articles have a distinctly health and behavioural focus, with the exception of one legal paper authored by Cambridge University’s Pro Vice Chancellor for Education, Professor Graham Virgo.

When are people requesting?

Looking at the day of the week people are requesting items, there is a distinct preference for early in the week. This reflects the observations we have made about the use of our helpdesk and deposits to our service – both of which are heaviest on Tuesdays.

When in the publication cycle are the requests happening?

In our October 2016 blog we noted that of the articles requested in the four months from when the service started in June 2016 to the end of September 2016, 45% were yet to be published, and 55% were published but not yet available to those without a subscription to the journal.  The method we used for working this out involved identifying those articles which had been requested and determining if the publication date was after the request.

Now, 15 months after the service began it is slightly more difficult to establish this number. We can identify items that were deposited on acceptance because we place these items on a very long embargo (until 2100) until we can establish the publication date and set the embargo period. So in theory we could compare the number of articles with this embargo period against those that have a different date.

However, articles that would provide a false positive (that appear to have been requested before publication) would be ones which had been published but where we had not yet identified this. To give an indication of how big an issue this is for us, as of the end of last week there were 1768 articles in our ‘to be checked’ pile. We would also have articles that would provide a false negative (that appear to have been requested after publication) because they had been published between the request and the time of the report and the embargo had been changed as a result. That said, after some analysis of the requests for articles and conference proceedings, 19% were made before publication. This is a slightly fuzzy number but does give an indication.
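The classification described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script we actually used: the function name, the record fields and the exact form of the placeholder embargo date are all hypothetical.

```python
from datetime import date
from typing import Optional

# The post says accepted-but-unconfirmed items are embargoed "until 2100";
# the exact sentinel date stored in the repository is an assumption here.
PLACEHOLDER_EMBARGO_YEAR = 2100


def request_timing(request_date: date,
                   embargo_end: date,
                   publication_date: Optional[date] = None) -> str:
    """Classify one request relative to publication (hypothetical helper)."""
    if publication_date is not None:
        # Known publication date: compare directly, as in the October 2016 analysis.
        if request_date < publication_date:
            return "before publication"
        return "after publication"
    if embargo_end.year >= PLACEHOLDER_EMBARGO_YEAR:
        # Placeholder embargo still in place. The article may be genuinely
        # unpublished, or published but not yet checked (a false positive).
        return "apparently before publication"
    return "unknown"


# Made-up example records, for illustration only.
examples = [
    request_timing(date(2017, 3, 1), date(2100, 1, 1)),
    request_timing(date(2017, 3, 1), date(2018, 1, 15), date(2017, 7, 15)),
    request_timing(date(2017, 9, 1), date(2018, 1, 15), date(2017, 7, 15)),
]
print(examples)
```

Where a publication date has been recorded, comparing it with the request date sidesteps the false negative described above. The false positive remains for anything still waiting in the ‘to be checked’ pile.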

How many requests are fulfilled?

The vast majority of the decisions recorded (35% of the total requests for articles, but 92% of the instances where we had a decision) indicate that the author shared their article with the requestor. The small number (3%) of ‘no’ recordings we have indicate the request was actively rejected.

We do not have a decision recorded from the author in 62% of the requests. We suspect that in the majority of these the request simply expires because the author does nothing. In some cases the author may have been in direct correspondence with the requestor. We note that the email that is sent to authors does look like spam. In our review of this service we need to address this issue.
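The percentages in this section fit together as follows. This is a short Python check using only the rounded figures quoted above, so the result is itself approximate:

```python
# Shares of all article requests, as reported above (percent of the total).
shared = 35       # author recorded that they sent the article
rejected = 3      # author recorded a 'no'
no_decision = 62  # nothing recorded by the author

total = shared + rejected + no_decision

# Restrict to requests where a decision was recorded at all.
decided = shared + rejected
share_of_decisions_yes = round(100 * shared / decided)

print(f"{share_of_decisions_yes}% of recorded decisions were 'yes'")
```

So the 35% of all requests and the 92% of recorded decisions describe the same set of fulfilled requests, measured against different denominators.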

Next steps

As we explained in October, the process for managing the requests is still manual. As the volume of requests is increasing the time taken is becoming problematic. We estimate it is the equivalent of 1 person day per week. We are scoping the technical requirements for automating these processes. A new requirement at Cambridge for the deposit of digital theses means there will be three different processes because requests for these theses will be sent to the author for their decision. These authors will, in most cases, no longer be affiliated with Cambridge. Requests for digitised theses where we do not have the author’s permission are processed within the Library and requests for articles are sent to the Cambridge authors.

Given the challenges with identifying when in the publication process the request has been made, we need to look at automating the system in a manner that allows us to clearly extract this information. The percentage of requests that occur before publication is a telling number because it indicates the value or otherwise of having a policy of collecting articles at the acceptance point rather than at publication.

Published 12 September 2017
Written by Dr Danny Kingsley
Creative Commons License

Could the HEFCE policy be a Trojan Horse for gold OA?

The HEFCE Policy for open access in the post-2014 Research Excellence Framework kicks in 9 weeks from now.

The policy states that, to be eligible for submission to the post-2014 REF, authors’ final peer-reviewed manuscripts of journal articles and conference proceedings with an ISSN must have been deposited in an institutional or subject repository on acceptance for publication. Deposited material should be discoverable, and free to read and download, for anyone with an internet connection.

The goal of the policy is to ensure that publicly funded (by HEFCE) research is publicly available. The means HEFCE have chosen to favour is the green route: putting the author’s accepted manuscript (AAM) into a repository. This does not involve any payment to the publishers. The timing of the policy, at acceptance, is to give us the best chance of obtaining the AAM before it is deleted, forgotten or lost by the author.

Universities across the UK have been preparing. Cambridge has had the ‘Accepted for publication? Send us your manuscript’ campaign running since May 2014 with a very simple and well liked interface allowing researchers to submit their work. The Open Access team then deposits the item, checks for funding and the publisher policies and then organises payment for open access publication if required.

To give an idea of the numbers we are dealing with at Cambridge, during 2015 the Open Access team deposited 2553 articles into our repository Apollo.

Compliance levels

We have been reporting to Wellcome Trust and the RCUK over the past few years to indicate compliance levels with their policies. However the ‘compliance level’ for the HEFCE policy is a slippery concept. For a start, the policy has not yet come into force. Another complicating factor is the long term nature of the ‘reporting’. We will not truly know how compliant we have been until the time comes to submit to REF – whenever that will be (currently it seems 2021).

At Cambridge we have been working on the assumption that, because we do not know which outputs will be the ones that we will claim, we should collect all eligible articles. However, the number of articles the Open Access team received for deposit over the past year represents approximately 30% of the full eligible output of the University. This might seem concerning in some ways, but it must be remembered that each researcher in the University will only be reporting four research outputs for the REF.

There are some articles that are obvious contenders for REF. By concentrating on researchers who are publishing in very high impact journals we have been trying to catch those articles we are extremely likely to claim.

During the course of 2015 we discovered 93 papers published in Nature, Science, Cell, The Lancet and PNAS. 33% of these papers were already HEFCE compliant. Of the remaining non-compliant papers we contacted 47 authors, made them aware of the HEFCE open access policy, and invited them to submit their accepted manuscript to the Open Access Service. Fewer than 40% of those authors who were contacted responded with their accepted manuscript. Therefore, even after direct intervention only 49% of papers were HEFCE compliant, which means that more than half of all eligible papers published in Nature, Science, Cell, The Lancet and PNAS during this period would still not have been HEFCE compliant had the policy been in place.
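As a rough consistency check on these figures, the paper counts can be backed out of the rounded percentages. The counts below are estimates reconstructed in this way, not numbers taken from our records, and the sketch assumes each responding author corresponds to one paper:

```python
papers_found = 93
authors_contacted = 47

# Estimated counts, reconstructed from the rounded percentages quoted above.
compliant_before = round(0.33 * papers_found)  # papers already compliant
compliant_after = round(0.49 * papers_found)   # compliant after intervention

# Manuscripts gained through direct contact (assumes one paper per author).
gained = compliant_after - compliant_before
response_rate = gained / authors_contacted

print(f"about {gained} manuscripts received, "
      f"a response rate of roughly {response_rate:.0%}")
```

On these estimates roughly a third of the contacted authors responded, which is consistent with the "fewer than 40%" stated above.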

The lack of engagement by members of the academic community with this process is a serious concern – and potentially due to four reasons:

  • Lack of awareness of the policy
  • Putting it off until the policy is in place
  • Deliberately choosing not to submit a work because it is not considered important enough or they do not consider their contribution to be significant enough
  • Some form of conscientious objection to the policy

We should note that the third reason is a matter of some concern to the University as it is not the researcher who decides which articles are put forward for REF. In addition, the University is interested in having a high overall level of compliance for REF as it considers making the research output of the institution available to be important.

Temporary reprieve

Cambridge is no island when it comes to facing significant challenges in capturing all outputs in preparation for HEFCE’s policy. While the highly devolved nature of the institution and the sheer volume of publications may be a problem unique to Cambridge and Oxford, other institutions are still developing the technology they intend to use or are facing staffing issues.

In a concession to serious concern across the sector about the ability to meet the deadline, on 24 July 2015 HEFCE announced that there was a temporary modification to the policy. They now allow research outputs to be made open access up to three months after publication until at least April 2017 (and until such time that the systems to support deposit at acceptance are in place).

This means for the first year of the policy we have a small window after publication to locate articles, determine if they are in our repositories, and if not chase the authors for the Author’s Accepted Manuscript.

The trick is knowing that an article has been published. At Cambridge our ‘best bet’ is to use Symplectic, which scrapes various aggregating sources such as Scopus. However, Symplectic is hindered by the efficiency of its sources. There is no guarantee that a given article will appear in Symplectic within three months of publication. And even if it does, we have already discussed the low engagement by the research community with approaches from the Open Access team for AAMs.

Subject based repositories

So far this blog has been talking about using institutional repositories for compliance. But the policy specifically states: “The output must have been deposited in an institutional repository, a repository service shared between multiple institutions, or a subject repository”.

The oldest, most established subject repository is arXiv.org and it makes sense for us to consider using arXiv as part of Cambridge’s compliance strategy. After all, some areas of high energy physics, most of computer science and much of mathematics use arXiv as a means to share their research papers. In 2014, the number of articles that were deposited into arXiv.org and subsequently picked up in Symplectic and approved by researchers was 582, approximately 6.5% of Cambridge’s total eligible articles.

If we are able to claim these articles for HEFCE compliance without any behaviour change requirement from our academic staff then this is an ideal situation. But how do we actually do this? There is a footnote to the HEFCE statement above which says that: “Individuals depositing their outputs in a subject repository are advised to ensure that their chosen repository meets the requirements set out in this policy.” And this is the crunch point. arXiv does not currently identify which version of the work has been deposited, nor does it record the acceptance date of the work. Because of this we are currently not able to simply use the work being uploaded to arXiv.

There is work underway to look at this possibility and what would be required to allow us to use the subject based repositories as a means for compliance. HEFCE themselves have identified under ‘Further areas of work’ that “measures to support compliance in subject repositories” is an area of uncertainty and they will work with the community to address this.

Alternative approach?

It is possibly a good moment to take a step back from the minutiae of the means and the timing of the HEFCE policy and focus on the goal that publicly funded research is publicly available. We are in a complex policy environment. HEFCE affects all researchers but many researchers are also funded through COAF or the RCUK with their respective (gold leaning) Open Access policies.

Of the HEFCE eligible articles submitted to the Open Access team in 2015, after working through all the different funder requirements, there was a split of 44% gold Open Access and 56% green Open Access. Of the gold payments the split is approximately 74% for hybrid journals and 26% for fully open access journals. That said, the three journals with which we have published the most – PLOS ONE, Nature Communications and Scientific Reports – are fully Open Access journals with APCs of $1495, $5200 and $1495 respectively.

A highly relevant question is – outside of the efforts by our Open Access compliance teams, how much Cambridge research is being made open access anyway?

Open access articles

The Web of Science (WoS) allows a filter on ‘Open Access’. It does not appear to list articles that are made open access on a hybrid basis, only picking up fully open access journals. While these are not definitive numbers, they do give us some idea of the scale we are looking at. In 2014 WoS gives us a figure of 981 articles published as open access by a University of Cambridge author in a fully open access journal.

The Springer Compact to which many institutions (including Cambridge) have signed up means that now all articles published by that research community will be made open access. In 2014, the Open Access Service had paid for 21 articles to be made open access. In the same period across the institution we had published 695 articles with Springer. (Note that in 2015 we paid 51 Springer  APCs). This means that for the cost of the Springer subscription and our APC payments for the previous year we will have a good proportion of Cambridge articles published as open access articles.

These two sets of numbers only allow for articles published either in fully open access journals or with Springer. They do not account for the articles where the University (or a Department or individual) pays an APC to make an article available in a hybrid (non Springer) journal. The upshot is that a significant proportion of Cambridge research is published open access.

Skip the AAM on acceptance part?

So what does this published open access research mean for compliance with the HEFCE policy? The updated HEFCE policy has addressed this:

“… we have decided to introduce an exception to the deposit requirements for outputs published via the gold route. This may be used in cases where depositing the output on acceptance is not felt to deliver significant additional benefit. We would strongly encourage these outputs to be deposited as soon as possible after publication, ideally via automated arrangements, but this will not be a requirement of the policy.”

This makes sense from an administrative perspective if the article appears in a journal where there is an embargo period on making the AAM available, forcing the University to pay an APC to make the work Open Access to meet RCUK requirements. It would avoid the palaver of:

  • obtaining the AAM from the author
  • depositing it into the repository
  • having to check to see when the article has been published
  • updating the details and
  • either setting the embargo on the AAM or changing the attachment in the record to the Open Access final published version

However, a journal with an embargo period on making the AAM available, forcing an APC payment, is in fact almost a definition of a hybrid journal. We know there are issues with hybrid: the extra expense, double dipping, and the higher APC charges for hybrid over fully Open Access journals. Putting these aside, what this HEFCE policy change means is that publishers have effectively shifted the HEFCE policy away from a green open access policy to a gold one for a significant proportion of UK research. This is a deliberate tactic, along with the unsubstantiated campaign that green Open Access poses a major threat to scholarly publishing and therefore embargoes should be even longer.

We are already facing the problem that hybrid journals are forcing the move towards green open access being ‘code’ for a 12 month delay. This is the beginning of a very slippery slope. We have been outplayed. It really is time for the RCUK and Wellcome Trust to stop paying for hybrid Open Access.

But I digress.

The cons

The message is confusing enough – three sets of policies and three different requirements in terms of the timing and the means to make work compliant and available. We are trying to make it as simple as possible for researchers – with limited success.

The move to widespread Open Access in the UK is a huge shift for the research community and those that support them. It would be very difficult to debate the ‘against’ argument for the statement that publicly funded research should be publicly available but the devil is very much in the detail.

It would be an incredible shame if the HEFCE policy is hijacked into a partial gold OA policy, but as administrators we are drowning in compliance. There needs to be a broad discussion across the funders to try and address the conflicting compliance requirements and the potentially negative effect these policies are having on the future of open scholarly publishing. 

We welcome the opportunity to discuss these issues with HEFCE, Wellcome Trust and the RCUK. There’s plenty to talk about.

Published 25 January 2016
Written by Dr Danny Kingsley

Where to from here? Open Access in Five Years

As part of the Office of Scholarly Communication Open Access Week celebrations, we are uploading a blog a day written by members of the team. Thursday is a piece by Dr Arthur Smith looking to the future.

Introduction

Academic publishing is not what it used to be. Open access has exploded on the scene and challenged the established publishing model that has remained largely unchanged for 350 years. However, for those of us working in scholarly communications, the pace of change feels at times frustratingly slow, with constant roadblocks along the way. Navigating the policy landscape provided by universities, funders and publishers can be maddening, yet we need to remain mindful of how far we have come in a relatively short time. There is no sign that open access is losing momentum, so it’s perhaps instructive to consider the direction we want open access to take over the next five years, based upon the experiences of the past.

So how much is the University of Cambridge publishing and is it open access? Since 1980, according to Web of Science, the University’s publications have increased from 3000 articles per year to more than 11,000 in 2014 (Fig. 1). The proportion of gold open access articles has risen steadily since they first appeared on the scene in the late 1990s. Thus far in 2015 nearly one in ten articles is available gold open access, although this ignores the many articles available via green routes.

Fig. 1. Publications at the University of Cambridge since 1980 according to WoS (accessed 14/10/2015).

The HEFCE policy

By far the most important development for open access in the UK has been the introduction of HEFCE’s open access policy. As the policy applies to all higher education institutions it affects every university researcher in the UK. While the policy doesn’t formally start until April 2016, so far progress has been slow (Fig. 2). We believe that less than a third of all the University’s articles that are published today are currently compliant with the HEFCE policy, and despite a strong information campaign, our article submission rate has stagnated at around 250 articles per month, well off the monthly target of 930.

Fig. 2. Publications received to the University of Cambridge open access service. The target number of articles per month is 930.

It’s understandable that some papers will fall through the cracks, but even for high impact journals many papers still don’t comply with the policy. But let’s be clear, aside from any policy compliance issues and future REF eligibility, these numbers reveal that fully two thirds of research papers produced at the University cannot be read without a journal subscription. And if readers can’t afford to pay for access then they’ll happily find other means of obtaining research papers.

What about inviting authors to make their research papers open access? Since June I have tracked five high impact journals and monitored the papers published by University of Cambridge authors (Fig. 3). Upon first discovery of a published paper, only 29% of articles were compliant with the HEFCE policy, which is consistent with our overall experience in receiving AAMs. But even after inviting authors to submit their accepted manuscripts to the University’s open access repository, the number of compliant articles rose to only 42%. Less than a third of authors who were directly contacted and asked to make their work open access eventually submitted their manuscripts. Clearly, the merits of open access are not enough to convince authors to act and distribute their manuscripts.

Fig. 3. Compliant articles published in five high impact journals. Even after direct intervention less than half of all articles are HEFCE compliant.

SCOAP³

The SCOAP3 initiative is a publishing partnership that makes journals in the field of particle physics open access. This innovative scheme brings together multiple universities, funders and publishers and turns traditional journals, that are already widely respected by the physics community, into purely open access journals. No intervention is required by either authors or university administrators, making the process of publishing open access as simple as possible. The great advantage of this scheme is that authors don’t need to worry about choosing an open access option from the publisher, nor deal with messy invoices or copyright issues. All of these problems have been swept away.

Jisc Springer Compact

Like SCOAP3 the recently announced Jisc Springer Compact is a coalition of universities in the UK that have agreed a publishing model with Springer that makes ~1600 journals open access. Following a similar Dutch agreement, this publishing model means that any authors with qualifying institutional affiliations will have their publications made open access automatically. We’ve already started receiving our first requests under this scheme. However, unlike the SCOAP3 initiative which ‘flips’ entire journals to gold OA, the journals under the UK Jisc Springer Compact are still hybrid and only content produced by qualifying authors is open access. While this is great for those universities signed up to the deal, it still leaves a great many papers languishing under the subscription model.

Affiliation vs. Community

So which of these strategies will prove to be the most successful? Will universities take ownership of open access publishing or will subject based communities come together in publishing coalitions?

The advantage of subject based initiatives is they flip entire journals for the benefit of a whole research community, making all the work within a specific discipline open access. However, without sufficient cohesion and drive within an academic community it’s likely that adoption will be fragmented across the myriad of disciplines. It’s no surprise that SCOAP3 emerged out of the particle physics community, given this scholarly community’s involvement in the development of arXiv, but it’s unrealistic to expect this will be the case everywhere.

Publishing agreements based around institutional affiliations will undoubtedly become more common, but until all universities have agreements in place with all the major publishers (Elsevier, Wiley, Springer, etc.) a large fraction of scholarly outputs will remain locked down.

What does the future hold?

Ultimately I want to do myself out of a job. As odd as that sounds, the current system of paying publishers for individual papers to be made open access is a laborious and time consuming process for authors, publishers and universities. Similarly the process of making accepted manuscripts available under the green model is equally ridiculous. Publishers should be automatically depositing AAMs on behalf of authors. There is no evidence that making AAMs available has ever killed a journal, and besides, the sooner we can reach agreements with all the major publishers and research funders that result in change on a global scale the better it will be for everyone.

Published 22 October 2015
Written by Dr Arthur Smith

It’s time for open access to leave the fringe

The Repository Fringe was held in Edinburgh on 3-4 August. With the theme of “Integrating repositories in the wider context of university, funder and external services”, the event brought together repository managers across the UK to discuss practice and policy. Dr Arthur Smith, Open Access Research Advisor at the University of Cambridge, attended the event and came away with the impression that more needs to be done to embed open access in scholarly processes.

In his keynote speech to Repository Fringe 2015, titled ‘Fulfilling their potential: is it time for institutional repositories to take centre stage?’  David Prosser, Executive Director of Research Libraries UK (RLUK) gave a concise overview of the history surrounding open access and the situation we currently find ourselves in, especially in the UK.

What’s become clear is that ‘we’ is a problematic term for the scholarly communications community. A lack of cohesion and vision between librarians, repository managers and administrators means ‘we’ have failed to engage with researchers to make the case for open access.

I feel this is due in part to the fragmented nature of repositories stemming from an institutional need for control. If national (and international) open access subject repositories had been created and exploited perhaps researcher uptake of open access in the UK and around the world would have been faster. For example, arXiv continues to be the one stop shop for physicists to publish their manuscripts precisely because it’s the repository for the entire physics community. That’s where you go if you’ve got a physics paper. To be fair, physics had a culture of sharing research papers that predates the internet.

Repositories are only as good as the content they hold, and without support from the academic community to fill repositories with content, there is a risk of side-lining green open access*. This will in turn increase the pressure to justify the cost of ineffective institutional repositories.

As David correctly identified, scholars will happily take the time to do things they feel are important. But for many researchers open access remains a low priority and something not worth investing their time in. Repositories are only capturing a fraction of their institution’s total publication output. At Cambridge we estimate that only 25-30% of articles are regularly deposited.

Providing value

The value of open access, whether it’s green or gold**, isn’t obvious to the authors producing the content. Yet juxtaposed with this is a report prepared by Nature Publishing Group on 13 August: Perceptions of open access publishing are changing for the better. This examined the changing perceptions of researchers to open access. While many researchers are still unaware of their funders’ open access requirements, the general perception of open access journals in the sciences has changed significantly, from 40% who were concerned about the quality of OA publication in 2014, to just 27% in 2015.

Clearly the trend is towards greater acceptance of open access within the academic community, but actual engagement remains low. If we don’t want to end up in a world of expensive gold open access journals, green repositories must be competitive with slick journal websites. Appearances matter. We need to attract the attention of the academics so that open access repositories are seen as viable places for disseminating research.

The scholarly communications community must find new ways of making open access (particularly green open access) appealing to researchers. One way forward is to augment the reward structure in academic publishing. Until open access is adopted more widely, academics should be rewarded for the effort involved in making their work openly available.

In the UK, failure to comply with the Higher Education Funding Council for England (HEFCE) and other funders’ policies could seriously affect future funding outcomes. It is the ever-present threat of funding cuts which drives authors to choose open access options, but this has changed open access into a policy compliance debacle.

Open access as a side effect of policy compliance is not enough; we need real support from academics to propel open access forward.

Measuring openness

As a researcher, the main things I look for when assessing other researchers and their publications are h-index, total and article level citations, and journal prestige (impact factor). I am not aware of any other methods which so simply define an author’s research.

While these types of metrics have their problems, they are nonetheless widely used within the academic community. An annual openness index, which is simply the ratio of open access articles to the total number of publications, would quickly reveal how open an academic’s research publications are. This index could be applied equally to established professors and early career researchers, as unlike the h-index, there is no historical weighting. It only depends on how you’re publishing now.
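As a minimal sketch, the openness index described above is just a ratio computed over one year’s output. The function name and the example figures below are hypothetical, and how an article is judged to be ‘open access’ is left as an assumption:

```python
def openness_index(open_access_articles: int, total_publications: int) -> float:
    """Annual openness index: open access articles / total publications.

    Both counts are assumed to cover the same single year, so earlier
    publishing history carries no weight.
    """
    if total_publications <= 0:
        raise ValueError("total_publications must be positive")
    if not 0 <= open_access_articles <= total_publications:
        raise ValueError("open_access_articles must lie between 0 and the total")
    return open_access_articles / total_publications


# Hypothetical researcher: 8 publications this year, 6 of them open access.
index = openness_index(6, 8)
print(f"openness index: {index:.2f}")
```

An index of 1.0 would mean everything published that year is open access, and 0.0 that none of it is.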

Developing such a metric would spur on open access from within academic circles by making open access publishing a competition between researchers. Perhaps the openness index could also be linked to university progression and grant reward processes. The more open access your work is, the better it is for you, and as a consequence, the community.

Open access needs to stop being a ‘fringe’ activity and become part of the mainstream. It shouldn’t be an afterthought to the publication process. Whether the solution to academic inaction is better systems or, as I believe, greater engagement and reward, I feel that the scholarly communications and repository community can look forward to many interesting developments over the coming months and years.

However, we must not be distracted from our main goal of engaging with researchers and academics to gather content for the open access repositories we have so lovingly built.

Glossary

*Green open access refers to making a copy of a published work available by placing it in a repository. This can be thought of as ‘secondary’ open access.

**Gold open access is where the research is published either in a fully open access journal – which sometimes incurs an article processing charge, or in a hybrid journal – which imposes an article processing charge to make that particular article available and also charges a subscription for the remainder of the articles in the journal. This can be thought of as ‘born’ open access.

Published 27 August 2015
Written by Dr Arthur Smith