All posts by Arthur Smith

Clearing the final hurdle – automating embargo setting

One of the biggest issues facing the Open Access Team has been keeping up with the constant stream of accepted manuscripts that need to be processed. In many cases we receive notification of an accepted manuscript well before formal publication. This has presented a significant challenge over the last five years because although we know there is a publication forthcoming (or at least we trust that there is), we have no idea when an article may actually be published.

This means that we have many thousands of publication records in Apollo which have ‘placeholder’ embargoes because we simply did not know the publication date at the point of archiving and therefore could not set an accurate embargo. After archiving, many of the records in Apollo may have been supplemented with a publication date thanks to metadata supplied via Symplectic Elements, but we still need to set an accurate embargo.

In other cases we might be waiting for an article to be published gold open access so that we can update Apollo with the published version of record.

While we are now very adept at archiving manuscripts in Apollo (thanks in large part to Fast Track and Orpheus) it remains a challenge to properly and accurately update Apollo records with either correct embargoes for accepted manuscripts, or the open access version of record. It is a futile task to be constantly checking whether a manuscript has been published. While the Open Access Team keeps a list of every publication that requires updating, this is a thankless job that should be highly automatable.

To that end, we have recently leveraged Orpheus to do a lot of the heavy lifting for us. By interrogating every journal article in Apollo and comparing its metadata against Orpheus we can now quickly determine which items can be updated and take the necessary next steps, changing embargoes where appropriate or identifying opportunities to archive the published version of record.

To do this we created a DSpace curation task to check every “Article” type in Apollo that had at least one file that was currently under embargo. We then compared the publication metadata against the information held in Orpheus to determine what steps needed to be taken. In total we found 9,164 items in need of some attention. The results are displayed below in a Tableau Public visual and summarised in Table 1.

Of these items, 3,864 had a published open access version archived alongside the embargoed manuscript, so we skipped any further updating of these records. This is actually a very good sign, and indicates that the Open Access Team has been going back to records and supplementing them with the open access version of record.

Amongst the remaining items, 2,794 were successfully matched against Orpheus and had their embargoes verified: 1,862 records were updated with shorter embargoes and 412 had longer embargoes applied, leaving 520 items which were unchanged because they already had the correct embargo period.

The final 2,506 items were primarily composed of records with no publication date (1,132 items), publications that could potentially be supplemented by the open access version of record (537 items) or had no embargo information in Orpheus (434 items).
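The triage described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual DSpace curation task (which is not shown in this post): the `Record` fields, the `classify` function and the order in which the checks run are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    """Hypothetical, simplified view of an Apollo article with an embargoed file."""
    has_open_vor: bool                    # open access version of record already archived
    publication_date: Optional[date]      # None if no publication date is known
    orpheus_vor_embargo: Optional[int]    # months; None if Orpheus holds no value
    orpheus_aam_embargo: Optional[int]    # months; None if Orpheus holds no value


def classify(record: Record) -> str:
    """Return an outcome label loosely mirroring the categories in Table 1."""
    if record.has_open_vor:
        return "open VoR already archived"      # skipped, nothing to update
    if record.publication_date is None:
        return "no publication date"            # cannot calculate an embargo yet
    if record.orpheus_vor_embargo == 0:
        return "VoR could be archived"          # candidate for the published version
    if record.orpheus_aam_embargo is None:
        return "no AAM embargo information"     # Orpheus cannot help here
    return "embargo verified"                   # shorten, lengthen or leave unchanged


example = Record(False, date(2019, 6, 1), None, 12)
print(classify(example))
```

Anything that does not fit one of these branches would fall into the ‘Other outcome’ row of the table.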

Table 1. Summary of outcomes after comparing Apollo records against Orpheus.

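| Outcome, by date archived in Apollo | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | Total |
|---|---|---|---|---|---|---|---|---|
| The item has an open VoR version | 7 | 105 | 1,200 | 1,019 | 1,300 | 226 | 7 | 3,864 |
| Accepted version – embargo updated |  | 2 | 145 | 76 | 132 | 2,305 | 134 | 2,794 |
| No publication date available |  |  | 10 | 159 | 32 | 714 | 217 | 1,132 |
| Orpheus VoR embargo: 0 | 1 | 4 | 51 | 18 | 5 | 451 | 7 | 537 |
| No AAM embargo information available | 3 | 6 | 64 | 39 | 33 | 264 | 25 | 434 |
| Other outcome | 8 | 37 | 114 | 47 | 23 | 162 | 12 | 403 |
| Total | 19 | 154 | 1,584 | 1,358 | 1,525 | 4,122 | 402 | 9,164 |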

We plan to run this curation task on a regular basis and periodically check the outcomes. Any items that continually fail to update will be processed manually by the Open Access Team, but our intention and desire is to move away from manual processing wherever possible.

Published 3 April 2020

Written by Dr Arthur Smith

This blog post is under a CC-BY licence.

A Fast-Track Route to Open Access

In the last two years, since the REF 2021 open access policy came into force, the Open Access Team has received an ever-increasing number of manuscript submissions for archiving in Apollo, Cambridge’s institutional open access repository.

We have been thinking long and hard about ways to cope with the workload, by scrutinising existing practices and streamlining workflows, because we want to provide the best possible service to our researchers, commensurate with the University’s world-leading research.

This blog introduces what is perhaps the greatest overhaul of our workflows since the service began: a new ‘Fast Track’ deposit system.

Work it harder

Before the start of the REF OA policy (2014-2016), the Open Access Team would process and manually curate every manuscript submission we received. Authors could expect an initial response within 1-2 working days, after which (usually within a month) we would archive their manuscript in Apollo.

A simplified workflow for a typical manuscript was:

  1. Manuscript uploaded by submitter in Symplectic Elements.
  2. Item created in Apollo (DSpace) workflow.
  3. Helpdesk ticket created (Zendesk).
  4. Open Access Team reviews manuscript, advises submitter and makes a decision.
  5. Open Access Team archives the manuscript in Apollo and informs submitter.

Both the decision (4) and archive (5) steps take time. For each manuscript we would need to decide whether the files we received could be archived, what funder open access policies were at play and the open access options available from the publisher. We could then advise authors about their open access choices.

To archive a manuscript the process was broadly the following:

  1. Review the helpdesk ticket (Zendesk) for the open access decision.
  2. Enter as many publication details as possible in Symplectic Elements.
  3. Retrieve the submission from the Apollo (DSpace) deposit workflow.
  4. Add licence and metadata to the record.
  5. Review the submission and approve for archiving.
  6. Move the item to the relevant departmental collection and apply an appropriate embargo (if required).
  7. Finally, update the helpdesk ticket and send the original submitter a link to their Apollo record.

Each manuscript took on average 18 minutes to archive, which, besides being manually tedious and prone to error, was extremely time-consuming. Add to this the time required to make the initial decision and each manuscript submission could easily take 30 minutes for the Open Access Team to fully process from start to finish, especially if an open access fee had to be paid.

Fast-forward two years and with the rate of new manuscript submissions now peaking at over 1,300 per month, simply processing manuscripts for the REF would require more than four full-time staff members. Whilst these manual processes were viable for a handful of submissions a day, they became unwieldy at scale.
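As a rough check on that staffing estimate, the arithmetic can be written out. The submission rate and the 30 minutes per manuscript come from the text above; the figure of 150 working hours per full-time staff member per month is purely an illustrative assumption.

```python
SUBMISSIONS_PER_MONTH = 1300    # peak rate quoted above
MINUTES_PER_SUBMISSION = 30     # full processing time quoted above
HOURS_PER_FTE_PER_MONTH = 150   # ASSUMPTION: roughly 37.5 hours a week for 4 weeks


def staff_needed(submissions: int, minutes_each: float, hours_per_fte: float) -> float:
    """Full-time staff required to process a month's submissions."""
    total_hours = submissions * minutes_each / 60
    return total_hours / hours_per_fte


fte = staff_needed(SUBMISSIONS_PER_MONTH, MINUTES_PER_SUBMISSION, HOURS_PER_FTE_PER_MONTH)
print(f"{fte:.1f} full-time staff")  # 650 hours of work / 150 hours each
```

Under these assumptions the workload comes to 650 hours a month, or a little over four full-time staff, which is consistent with the estimate above.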

Make it better

Our first attempt at speeding up our open access system began in August 2017. To start we made a number of operational changes to reduce the time spent processing manuscript submissions:

  • We would rely entirely on the metadata present in Symplectic Elements to populate the Apollo records (i.e. we would not curate manual records).
  • The Open Access Team would no longer update the helpdesk records; instead, internal record keeping would be automated as much as possible.

Unfortunately, the number of steps in the Apollo workflow was still roughly the same as the previous process, but with one key difference: a new field to record what we call the ‘Fast Track’ decision. There were seven Fast Track options:

  • Submitted
  • Proof
  • Published (not open access)
  • Published (open access)
  • Accepted (published)
  • Accepted (not published)
  • Other

The first six options represent the vast bulk of all manuscripts received by the Open Access Team, and the ‘Other’ option simply acts as a catch-all for anything else. Just by knowing what sort of manuscript has been uploaded, much of the decision and archiving process can be automated. However, the agent still needed to retrieve the item from the Apollo workflow, check the version of the file and publication status of the paper, add some metadata fields, approve the item, and move it to an appropriate collection.

Figure 1. The Apollo workflow page of a typical manuscript submission, with the addition of the new ‘Fast Track’ field.

The choice of Fast Track decision leads to four possible outcomes which would ‘trigger’ actions in our Zendesk helpdesk:

  • Submitted, proof, published (not open access)
    • Email submitter, ask for accepted manuscript
  • Published (open access)
    • Archive in Apollo (no embargo) ⇒ Email submitter Apollo link
  • Accepted (published), accepted (not published)
    • Archive in Apollo (embargoed) ⇒ Email submitter Apollo link
  • Other
    • Refer to Open Access Team
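The mapping from the seven Fast Track decisions to the four outcomes listed above is essentially a lookup table. The sketch below shows one way to express it in Python; the enum and function names are hypothetical, and the real triggers are configured in Zendesk rather than in code like this.

```python
from enum import Enum


class Decision(Enum):
    SUBMITTED = "Submitted"
    PROOF = "Proof"
    PUBLISHED_NOT_OA = "Published (not open access)"
    PUBLISHED_OA = "Published (open access)"
    ACCEPTED_PUBLISHED = "Accepted (published)"
    ACCEPTED_NOT_PUBLISHED = "Accepted (not published)"
    OTHER = "Other"


class Outcome(Enum):
    REQUEST_ACCEPTED_MANUSCRIPT = "Email submitter, ask for accepted manuscript"
    ARCHIVE_NO_EMBARGO = "Archive in Apollo (no embargo), email Apollo link"
    ARCHIVE_EMBARGOED = "Archive in Apollo (embargoed), email Apollo link"
    REFER_TO_TEAM = "Refer to Open Access Team"


# Seven decisions collapse into four outcomes.
OUTCOMES = {
    Decision.SUBMITTED: Outcome.REQUEST_ACCEPTED_MANUSCRIPT,
    Decision.PROOF: Outcome.REQUEST_ACCEPTED_MANUSCRIPT,
    Decision.PUBLISHED_NOT_OA: Outcome.REQUEST_ACCEPTED_MANUSCRIPT,
    Decision.PUBLISHED_OA: Outcome.ARCHIVE_NO_EMBARGO,
    Decision.ACCEPTED_PUBLISHED: Outcome.ARCHIVE_EMBARGOED,
    Decision.ACCEPTED_NOT_PUBLISHED: Outcome.ARCHIVE_EMBARGOED,
    Decision.OTHER: Outcome.REFER_TO_TEAM,
}


def outcome_for(decision: Decision) -> Outcome:
    """Look up the helpdesk action triggered by a Fast Track decision."""
    return OUTCOMES[decision]


print(outcome_for(Decision.PROOF).value)
```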

Despite being much faster, the process was still tedious: it could require up to 33 actions from agents (29 mouse clicks) and 14 web pages to be loaded, which was still not very user-friendly. However, the time to archive had decreased from 18 to 9 minutes, a 50% reduction compared with the previous fully manual system.

Do it faster

So what if all the steps involved in processing a manuscript submission could be reduced to the absolute minimum, and be actionable within a single webpage? After a short development sprint, the Open Access Team launched the ‘Fast Track Deposits’ interface in September 2018. A snapshot of the user interface is shown below.

Figure 2. The Fast Track interface. Choosing one of the options in blue is enough to fully archive a manuscript, or process it for further action by the submitter or the Open Access Team.

At the top of the page, the agent can see a ‘publication summary’ including the item title, the journal title, and publisher DOI if available. Both the item title and publisher DOI are hyperlinked, so that the agent can Google-search the item or land on the publisher’s webpage with a single mouse click.

The agent must first inspect the file and check that it is a suitable version (i.e. either the accepted version or the open access published version). If wrongly labelled, they must relabel the file via a dropdown menu, and add/delete files as appropriate. The agent then ‘describes’ the manuscript (i.e. decides whether it is the accepted, published, submitted or proof version) and submits their decision. The decision determines the trigger behaviour in the automatically populated helpdesk ticket. The agent is then free to move on to the next item.

If the decision is ‘accepted’ or ‘published open access’, the item is deposited and the submitter is automatically notified via email. For submitted, proof, and non-OA published versions, the author receives an automatic email asking for the accepted manuscript. Items are archived in the repository under a generic collection, and any forthcoming publication details are added to the record via external source information in Elements.

To see just how efficient Fast Track is we’ve prepared a short demonstration video which captures some of the key features:

Video 1. Real-time demonstration of the Fast Track system.

Makes us stronger

Agents therefore need only make one decision: identify the file version. But the real ingenuity of the Fast Track system is that embargoes can be set automatically by:

  1. Taking into account the decision made by the agent (e.g. no embargo if published open access);
  2. Detecting publication status and publication dates from Elements; and
  3. Retrieving journals’ embargo policies via Orpheus (you can learn more about Orpheus in our previous blog post).

In some cases, usually because we don’t know the publication date, we can’t determine the embargo length of an accepted manuscript. In such cases we apply a 36-month embargo from the date of the Fast Track decision. We know that this embargo won’t always be correct, but we routinely check manuscripts in Apollo and update embargoes accordingly.
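The embargo rules above, including the 36-month fallback, can be sketched as a small function. This is an illustration under stated assumptions, not the Fast Track code itself: the function and parameter names are hypothetical, the decision is reduced to a plain string, and we assume the journal’s embargo period is counted from the publication date.

```python
import calendar
from datetime import date
from typing import Optional

FALLBACK_EMBARGO_MONTHS = 36  # applied when the real embargo cannot be determined


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def embargo_end(
    decision: str,
    decision_date: date,
    publication_date: Optional[date] = None,    # e.g. as reported by Elements
    journal_embargo_months: Optional[int] = None,  # e.g. as reported by Orpheus
) -> Optional[date]:
    """Return the embargo end date, or None if no embargo is needed."""
    if decision == "published_open_access":
        return None  # open access version of record: no embargo
    if publication_date is not None and journal_embargo_months is not None:
        # ASSUMPTION: the embargo runs from the publication date.
        return add_months(publication_date, journal_embargo_months)
    # Publication date or journal policy unknown: use the placeholder embargo.
    return add_months(decision_date, FALLBACK_EMBARGO_MONTHS)


print(embargo_end("accepted", date(2019, 4, 23)))
print(embargo_end("accepted", date(2019, 4, 23), date(2019, 3, 1), 12))
```

The first call has no publication date, so it falls back to 36 months from the decision date; the second uses the publication date and a 12-month journal embargo.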

Figure 3. Simplified overview of the Fast Track process. The key decision is to determine the type of manuscript that has been submitted. Everything else is handled automatically.

Since launching Fast Track the average time to process a manuscript has been 1-2 minutes. More than 8,000 items have been processed since launching the phase two Fast Track interface. If items processed under the phase one effort are included, the number goes up to just over 14,000. And since a picture speaks a thousand words, Figure 4 below shows the effect produced by the new interface launched in September on our backlog of unprocessed submissions.

Figure 4. Historical change in the number of unprocessed open access manuscript submissions. The total number of outstanding manuscript submissions peaked at nearly 2,400 in September 2018. Immediately after launching the Fast Track website the backlog dropped dramatically and was completely eliminated by March 2019.

We will continue to develop Fast Track to further streamline our processing of manuscripts. We have already started to partner with librarians and administrators across the University to leverage the collective knowledge about open access which now exists within the University’s professional academic services.

Get in contact: If you are running a DSpace repository and would like to implement Fast Track to work alongside your existing workflows, email us at support@repository.cam.ac.uk.

Published 23 April 2019
Written by Dr Mélodie Garnier and Dr Arthur Smith
Creative Commons License

Blood: in short supply?

Two years ago (almost to the day) we called out Blood for their misleading open access options that they offered to Research Council and Charity Open Access Fund (COAF) authors. Unfortunately, little has changed since then:

Neither of these routes is sufficient to comply with either Research Councils’ or COAF’s open access policies which require that the accepted text be made available in PMC within 6 months of publication, or that the published paper is available immediately under a CC BY licence.

At the time, we called on Blood to change their offerings or we would advise Research Councils and COAF funded authors to publish elsewhere. And that’s exactly what’s happened:

Figure 1. All articles published in Blood since 2007 which acknowledge MRC, Wellcome, CRUK or BHF funding. Data obtained from Web of Science.

Over the last two years we’ve seen a dramatic decline in the number of papers being published in Blood by Medical Research Council (MRC), Wellcome Trust, Cancer Research UK (CRUK) and British Heart Foundation (BHF) researchers. The number of papers published in Blood that acknowledge these funders is now at its lowest point in over a decade.

It’s important to remember that the 23 papers published in Blood in 2017 are all non-compliant with the open access policies of Research Councils and COAF, and if these papers acknowledge Wellcome Trust funding then those researchers may also be at risk of losing 10% of their total grant. If you are funded by Research Councils or one of the COAF members, please consider publishing elsewhere. SHERPA/FACT confirms our assessment:

Sign the open letter

We’re still collecting signatures for our open letter to the editor of Blood in the hope that they’ll reconsider their open access options. Please join us by adding your name.