Plan S – links, commentary and news items

The discussions around Plan S are voluminous. On 8 February 2019, the opportunity to provide feedback on Plan S closed.

We have been attempting to maintain a list of commentary and news stories on Plan S at the end of one of our blogs: Most Plan S principles are not contentious. This has now grown so large we have moved the list into this dedicated blog. We will continue to try and keep it up to date – please let us know if we have missed anything that should be added.

Please note that there is a list on the Open Access Tracking Project using the tag “oa.plan_s”, which is crowdsourced and updated in real time, so is more comprehensive than this effort. There is also a comprehensive Reddit list curated by Jon Tennant. A smaller list (with different links) is also available.

Relevant documents from Science Europe

Commentary, news stories & press releases

These are presented here in reverse order of publication (most recent first).

Commentary in 2018

Published  10 February 2019
Compiled by Dr Danny Kingsley
Creative Commons License

2018 That Was The Year That Was

In what has now become a tradition, we are sending out our annual summary of the activities of the Office of Scholarly Communication. Our first year, in 2015, the summary was a stock take of where we were at. By the following year, 2016, we were implementing a strategy. What followed in 2017 was a year of numbers. Last year was really a year of consolidation, allowing us, for the first time in four years, to take a step back and breathe.

REF, what REF?

It is impossible to be in this space in the UK and not be highly focused on the Research Excellence Framework. While our team has been working towards the REF for four years, suddenly during 2018 the community really took notice. This is fantastic from an advocacy perspective but it has posed some other issues.

We are facing a tsunami: deposits to the repository are more than double (and in some months quadruple) what we were receiving a couple of years ago, with an almost stable staff number during that time. It is now not uncommon for us to receive well over 1000 deposits in a given month. This has meant that for the first half of 2018 the ever-present backlog of articles to be uploaded to the repository rose to 4,000, of which 60% were potentially claimable for the REF. We were also holding over 1000 records that needed to be updated with publication details.

To address this we have worked hard to streamline our processing. In addition to stretching our helpdesk system’s ability to classify and sort to the limit, we have developed two systems to almost automate our deposit process.

The first, called Orpheus, automates the lookup of journal policies. It was released as an Open Source resource in January. The second is a web application we are calling ‘Fast-Track’, which dramatically reduces the processing time of deposits by presenting the decisions that need to be made on each Open Access deposit in a simple interface. We launched the application internally (for the Open Access Service) late last year. Processing time has dropped from an average of 20 minutes per record to one to two minutes. As a result, since September we have processed more than 3000 items from the backlog, reducing the number pending processing at the time of this blog going live to 600, and falling fast.

Question time

We plan to engage the wider library and administrative community with Fast-Track in early 2019, with the double benefit of exposing others to at least one aspect of open access in a simple and accessible manner, and allowing the staff from the Open Access Service to focus on supporting researchers with REF and Open Access related queries. These queries are relentless by the way, with 400 to 600 per month on the Open Access queue alone. Even with the automation and system management of repeat queries, we spend close to 100 person hours a week managing the Open Access helpdesk alone. A project for 2019 will be an analysis of the queries to see if there is a solution to reducing the number of queries before they come to us.

However, reflecting the advice that came from our repository community late last year on ‘What would you have liked to know when you started in scholarly communication?’, we also hold tight to the philosophy that “It’s not all about the REF”.

Wide readership

The downloads from our repository, Apollo, point to the whole reason why we are trying to make Cambridge research available in the first place.  According to Jisc’s IRUS-UK service, our readers come from all over the world, with the US the biggest user. We experienced over 2.2 million downloads of material from the repository in 2018, of which close to 1 million (927,114) were of articles and nearly 40,000 were of datasets.

We are still feeling something of the ‘Hawking effect’ – in 2018, Professor Stephen Hawking’s PhD thesis received 424,141 downloads, representing more than half of our total theses downloads for 2018 of 740,441. Of those theses, our most downloaded, in order, were:

The OSC’s Request a Copy service continues to provide access to embargoed content in Apollo. In 2018, the service managed more than 4600 requests, with nearly twice as many requests for theses as for journal articles. All of these have been processed by hand, which takes approximately 27 person hours per week. In 2019 we are planning to enhance our systems so that we can continue to provide access to Cambridge’s research outputs in a less manual manner. Find out more about the requests we receive in our blog post, What do you want, and why do you want it? An update on Request a Copy.

Policy changes

In a space that is already well known as fast moving, 2018 broke all records in relation to the pace of policy change.

On 1 April 2018, UK Research and Innovation (UKRI – not to be shortened to the phonetic acronym ‘you cry’) came into existence, subsuming RCUK and transforming part of HEFCE into Research England. UKRI is undertaking a review of their Open Access policy.

In 2018 Cambridge was an invited contributor to the Wellcome Trust Open Access Policy Review Consultation and Expert Evidence Gathering Session, which resulted in a new Open Access policy for Wellcome Trust released in November.

The issue of allowing articles uploaded to arXiv and other subject repositories to be compliant with REF rules remained a problem at the beginning of 2018. After several years of consultation with arXiv and across the sector about technological solutions, the paper ‘arXiv and the REF open access policy’ was published in April. In July, there was a change to the UKRI policy on the acceptance of works deposited to preprint servers and subject repositories. This is good news for a significant section of our research community but does require careful handling of workflows.

In August, Cambridge brought in new funding guidelines which support publishers showing progression towards open access. This addresses both Cambridge’s positive moves towards an open future and the need to manage the UKRI block grant fund allocation sensibly.

Obviously, Plan S, announced in September, remains a hot topic. We have published a few blogs on the topic (see below) and continue to hold a watching brief on it.

Thesis news

This past year has seen a huge amount of work by the OSC to implement and consolidate the policy and processes around the agreement by the Board of Graduate Studies on the new access levels to theses, cementing in the policy for deposit of digital theses across the University, and agreement from UKRI that this is compliant with their Training Grant policies.

Our advocacy continues on the digitisation and release of our vast store of physical-only theses, which is only possible with the permission of the author. To that end, in 2018 we released a series of short films, My thesis, open access and me, to demonstrate the benefit of open access to those considering it. We also released a brilliant comic strip drawn by one of our librarian colleagues and Data Champions, Clare Trowell – it is available online. These activities were dubbed (in possibly the best pun of 2018) “The theses formerly known as prints” project.

We have continued our digitisation of alumni theses, through the support of the Arcadia Foundation and have now digitised 200 of these theses and made them open access. This includes the work we have been doing with the Scott Polar Research Institute to collect and digitise their full corpus of theses. We also finally went live with an online eSales system for the 1,400 theses we had digitised from the British Library’s expansive microfilm archive.

All this activity has resulted in a huge increase in the number of theses in the repository. When the Office of Scholarly Communication first began in 2015 there were 700 theses in the repository. We have now exceeded 6,000.

Data news

The biggest news for the Research Data Management Facility in 2018 was the consolidation of the staff onto ongoing permanent contracts. After a process that has lasted several years we are delighted (and relieved) that Dr Lauren Cadwallader is in post as the RDMF Manager, and Dr Sacha Jones has joined us as the RDMF Coordinator.

We extend a huge thanks to Clair Castle who joined us for much of 2018 to keep the RDMF rolling while the staffing was being resolved, in particular for her work with the Data Champions. During 2018 we ran a successful second call for applications to the Data Champions programme, resulting in a cohort of 50 people across the University participating. We also worked on a series of postcards and cartoons (again with our talented colleague Clare Trowell) to promote the Data Champion programme. There is a full description of these resources in the Cartooning the Data Champions blog post.

The numbers around our datasets continue to be impressive. According to Institutional Repositories Usage Statistics UK (IRUS UK), Apollo contains ~30% of all datasets (across 140 repositories) in the UK. That amounts to more than 1,500 research datasets, of which more than 70% are linked to RCUK funding.

Systems

We continued to integrate our Apollo repository with other systems to allow further automation of processes. The repository is now minting DOIs, displaying Altmetric information when available, linking to ORCID and our metadata is harvested into CORE and listed on IRUS-UK.

While these linkages have improved our offerings, they do mean we cannot upgrade the repository until we upgrade both Elements and Repository Tools, the repository-Elements connector.

Upgrades are also constrained by REF timeframes given the need for stability of our systems in the run up to REF2021, so we need to make a decision early in 2019 about whether we push further ahead or put everything on hold until our REF return is secured.

Training

Training continues to be a big focus for the OSC, with a strong move towards online training in 2018. This is explained in some detail in this related blog. We are also invested in a group that is looking at the question of competencies and associated training in scholarly communication, which was loosely titled the Scholarly Communication Professional Development Group.

Now newly named the “Scholarly Communication Competencies Coalition” (SC3), we will be launching an online presence early in 2019. Some of the activities of the group in 2018 included developing an online resource to try and showcase what this area is like as a place of work, see In their own words: working in scholarly communication. We also investigated the skills required to work in scholarly communication.

Outreach activities

The OSC puts a great deal of effort into sharing our work with our library, University and wider communities. This year we ran 57 events for researchers, librarians and the wider Cambridge community, welcoming over 800 attendees, plus more who have watched recordings of events and webinars.

Open Access Week, as always, was a very big week for everyone in the team, during which four very engaging speakers joined us for a lively event, Is Open Research really changing the world?, to question if research outputs really are available to everyone when they are made open access.

We continue to improve and update our online resources, relaunching the open access website as a one-stop-shop for our research community to help demystify the process of meeting funder OA requirements and making a manuscript REF eligible. We also released a compilation of the best copyright resources on the web, featuring everything from training session slides to videos. As a bit of fun we published our second annual Advent calendar in December.

We have also considered the effort we put into our outreach activities, which has meant we are going to approach the decisions about which events we livestream and film differently.

And you, our readers

We continued to blog enthusiastically through our Unlocking Research and Open Research: Adventures from the frontline blogs, managing 35 blog posts this year. There was a glitch with the analytics for the first four months of 2018, with the system rebooting on 7 April this year. Even with a third of the year missing, this blog enjoyed 25,000 visits over the past nine months.

Looking at where visitors to the blog have originated, it is interesting to note that the number of ‘organic search’ readers remains consistent throughout the year, whereas the direct links are clearly affected by our own promotions or through discussions elsewhere.

Our most popular blog, with 1741 visits since 7 April, was “What does a researcher do all day?”  This is a perennial favourite – published on 1 February 2016, it was also the second most popular blog last year.

In order, our other popular blogs with over 1000 visits each were

Projects and plans

There have also been other interesting side projects we have undertaken in our ‘spare time’. We started a research project to understand what we contribute to the scholarly literature, what we pay and what we get out of it, to assist decision making about subscriptions and other expenditure across the University. We hope to write up and release some findings from this project soon. We have also been conducting a Text and Data Mining Test Kitchen Project to help define what a TDM service might look like within the library, and work will continue in 2019.

As always this remains an interesting and dynamic area to work in and we are looking forward to another exciting year!

Published  25 January 2019
Written by Dr Danny Kingsley
Creative Commons License

Orpheus, an Open Source solution for journal policies

As anyone who administers an institutional repository can tell you, repeatedly looking up journals’ policies and attributes is a pain in the neck. We have discussed this problem a few times, noting in 2017 the complex embargo situation and the confusion about publication dates. Indeed it has been clear since 2013 that this is so complicated it is unrealistic to expect researchers to navigate this situation. This means considerable amounts of repository staff time are typically spent traversing a confusing landscape of complex, inconsistent and fluid policies.

To stop or at least mitigate this pain, wouldn’t it be great if those policies and attributes were available in a structured, machine-readable format, so that the burden of retrieving and using such information could be transferred from people to repository software?

(Granted, an even better solution would be, of course, for publishers to have simpler and standardised policies across their journals, but there is little indication that this will happen any time soon – see links above.)

Our solution – Orpheus

JISC are currently working on and will shortly release version 2 of their SHERPA services, which have enormous potential for providing machine-readable data on embargo periods and at least some of the other attributes we need. However, about two years ago we decided that, in the face of increasing demand for our services, we could not afford to wait to automate our workflows. Besides, we reasoned that any external solution would be unlikely to cover all the journal attributes we rely on beyond embargo periods, or to be updated at the frequency we require.

So, in the last trimester of 2017, I set out to develop a database that could store in a strictly structured and machine-readable format all bits of information from journals, publishers and conferences that we repeatedly look up. This would replace the time the team behind Cambridge’s DSpace repository Apollo was spending retrieving and manually applying those data to each deposited item.

Orpheus (named after the son of Apollo in Greek mythology) was thus born in January 2018. To mark its first birthday, we have just turned Orpheus into an Open Source project and released the code at https://github.com/osc-cam/orpheus.

In this blog post, I will provide an overview of Orpheus’ main features and of how we have been using it to increase the efficiency of our repository and services.

Supported attributes and available interfaces

On the web interface for editors and users, attributes are listed, for each journal, publisher and conference, in a detailed view that looks like this:

Orpheus currently supports the following attributes of journals/publishers:

  • name, synonyms, URL and, for journals, ISSNs and publisher
  • revenue model (subscription, hybrid, fully Open Access)
  • gold OA policy (article processing charges, licence choices, etc)
  • green OA policy (allowed versions and outlets, embargo period, licence, etc)
  • Europe PMC participation (whether or not the journal deposits papers in EPMC)
  • deals/discounts (whether the journal is included in an institutional deal such as Springer Compact or offers any discounts)
  • contacts (e-mail addresses for queries)

Orpheus’ RESTful API exposes journal attributes in JSON format and its response can be tailored to facilitate integration with repository platforms and other systems. For instance, the screenshot below shows only the attributes that we feed into Apollo and/or our helpdesk system (on the left below).
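As a rough illustration of how a client might consume a response of this kind, the sketch below parses a JSON record and keeps only the fields an integration needs. The field names (`name`, `issn`, `green_policy`, `embargo_months` and so on), the sample payload and the `tailor` helper are all assumptions made for this example; they are not Orpheus’ actual schema, which is documented in the project repository.

```python
import json

# Hypothetical payload: field names and values are illustrative only,
# not Orpheus' real response format.
SAMPLE_RESPONSE = """
{
  "name": "Journal of Examples",
  "synonyms": ["J. Examples"],
  "issn": "1111-1111",
  "eissn": "2222-2222",
  "publisher": "Example Press",
  "revenue_model": "hybrid",
  "green_policy": {"version": "accepted manuscript", "embargo_months": 12},
  "gold_policy": {"apc_gbp": 1500, "licences": ["CC BY"]}
}
"""

# Fields a repository integration might care about (an assumption).
WANTED = ("name", "issn", "eissn", "green_policy")


def tailor(record: dict, wanted=WANTED) -> dict:
    """Return a copy of the record restricted to the wanted keys."""
    return {key: record[key] for key in wanted if key in record}


journal = tailor(json.loads(SAMPLE_RESPONSE))
print(journal["green_policy"]["embargo_months"])
```

Filtering on the client side, as here, is only one option; the same narrowing could equally be done by the server before the response is sent.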

  

Like every project written in Django, Orpheus includes an additional web interface for administrators to manage users and permissions, and to perform bulk operations such as updating or deleting multiple entries at once. It looks pretty too (as seen on the right above).

Current coverage

Orpheus includes parsers that allowed substantial datasets of journals and their attributes to be imported into the system, saving the Cambridge team the effort of populating the database from scratch. Data was imported from:

Orpheus currently has almost 40,000 journal entries belonging to more than 8,000 publishers (“preferred names”; the larger number of “total entries” includes synonyms).

While we may derive some satisfaction in achieving comprehensive coverage and including journals such as هیدروژیولوژی and Демографија, what really matters to us in terms of maximising the efficiency of our services is databasing those journals and conferences that Cambridge academics most often publish in.

A quick analysis of journal names contained in all Apollo submissions received since 2014 (29,598 submissions) reveals that we are now able to match 83% of those to a record in Orpheus and retrieve embargo periods, APC value and licensing information for, respectively, 72, 48 and 37% of past submissions. These results are encouraging, especially considering that (1) ‘journal name’ in this dataset of past submissions includes conference names and strings that do not correspond to true journals, such as 13 entries for ‘TBC’ (to be completed); and (2) for new submissions, our system tries to find matches by ISSN and eISSN before attempting matches by name, so we have a better chance of matching “Hepatology (Baltimore, Md.)” to the right journal than this analysis would suggest.
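The identifier-first matching order described above can be sketched as follows. This is a minimal illustration, not the code used in Orpheus or Apollo: the `find_journal` and `normalise` helpers, the record layout and the placeholder ISSNs are all invented for the example.

```python
from typing import Optional

# Hypothetical in-memory records with placeholder ISSNs; a real system
# would query a database instead.
RECORDS = [
    {"id": 1, "name": "Hepatology", "issn": "1111-1111", "eissn": "2222-2222"},
    {"id": 2, "name": "Journal of Examples", "issn": "3333-3333", "eissn": None},
]


def normalise(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def find_journal(records, issn=None, eissn=None, name=None) -> Optional[dict]:
    """Match by ISSN, then eISSN, and only then by normalised name."""
    for key, value in (("issn", issn), ("eissn", eissn)):
        if value:
            for record in records:
                if record.get(key) == value:
                    return record
    if name:
        target = normalise(name)
        for record in records:
            if normalise(record["name"]) == target:
                return record
    return None


# A name with a qualifier fails on its own but succeeds once an ISSN is supplied.
by_name = find_journal(RECORDS, name="Hepatology (Baltimore, Md.)")
by_issn = find_journal(RECORDS, issn="1111-1111", name="Hepatology (Baltimore, Md.)")
print(by_name is None, by_issn["name"])
```

Because identifiers are checked first, a noisy or qualified journal name only matters when no ISSN or eISSN accompanies the submission.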

Integrations with Apollo and Zendesk

Without digging into the technical details of the integration of Orpheus with Apollo (to be honest, I would not be able to go into detail here, for the integration with Apollo was fully implemented by my colleague Agustina Martinez-Garcia), it suffices to say that Apollo has been querying Orpheus and successfully applying embargoes to many of the c. 900 submissions we receive per month (we received, on average, 892 monthly submissions in 2018).
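To give a flavour of what applying an embargo involves once the embargo period is known, here is a small sketch that computes a release date from a publication date and a number of months. It is not the Apollo integration: the function names are hypothetical, and counting the embargo from the publication date is an assumption, since policies differ on which date starts the clock.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`,
    clamping the day to the last day of the target month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def embargo_end(publication_date: date, embargo_months: int) -> date:
    """Date on which the deposited version may be released (assumed rule)."""
    return add_months(publication_date, embargo_months)


def is_under_embargo(publication_date: date, embargo_months: int, today: date) -> bool:
    """True while the release date is still in the future."""
    return today < embargo_end(publication_date, embargo_months)


release = embargo_end(date(2018, 6, 15), 12)
print(release.isoformat())
```

Clamping to the end of the month avoids invalid dates such as 31 February when the publication date falls late in a month.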

Orpheus has also been integrated with our helpdesk system (powered by Zendesk) via “Orpheus Lookup”, a small Open Source application available here. This enables relevant information about journals to be embedded in our helpdesk interface (see right hand side pane of screenshot below), facilitating the job of advising researchers on how to comply with their funders’ Open Access policies. The app also allows us to populate the relevant helpdesk ticket fields (see left hand side pane of screenshot) with one click. Information in these fields may then be processed by a Zendesk macro (also Open Source), to produce tailored auto-reply messages that can be further customised by the staff member.

In summary, our experience indicates that the benefits of integration of an institutional repository with an auxiliary database providing machine-readable representations of frequently required attributes of journals, conferences and publishers outweigh the costs of development and maintenance of the system. Other institutions or consortia interested in automating the processes of looking up and applying those attributes to repository records may benefit from hosting an instance of Orpheus.

If you are interested in more detail about the Orpheus integration, please email us on info@repository.cam.ac.uk and we will be happy to help.

Published  22 January 2019
Written by Dr Andre Sartori
Creative Commons License