Clearing the final hurdle – automating embargo setting

One of the biggest issues facing the Open Access Team has been keeping up with the constant stream of accepted manuscripts that need to be processed. In many cases we receive notification of an accepted manuscript well before formal publication. This has presented a significant challenge over the last five years because, although we know there is a publication forthcoming (or at least we trust that there is), we have no idea when an article may actually be published.

This means that we have many thousands of publication records in Apollo which have ‘placeholder’ embargoes because we simply did not know the publication date at the point of archiving and therefore could not set an accurate embargo. After archiving, many of the records in Apollo may have been supplemented with a publication date thanks to metadata supplied via Symplectic Elements, but we still need to set an accurate embargo.

In other cases we might be waiting for an article to be published gold open access so that we can update Apollo with the published version of record.

While we are now very adept at archiving manuscripts in Apollo (thanks in large part to Fast Track and Orpheus) it remains a challenge to properly and accurately update Apollo records with either correct embargoes for accepted manuscripts, or the open access version of record. It is a futile task to be constantly checking whether a manuscript has been published. While the Open Access Team keeps a list of every publication that requires updating, this is a thankless job that should be highly automatable.

To that end, we have recently leveraged Orpheus to do a lot of the heavy lifting for us. By interrogating every journal article in Apollo and comparing its metadata against Orpheus we can now quickly determine which items can be updated and take the necessary next steps, changing embargoes where appropriate or identifying opportunities to archive the published version of record.

To do this we created a DSpace curation task to check every item of type “Article” in Apollo that had at least one file currently under embargo. We then compared the publication metadata against the information held in Orpheus to determine what steps needed to be taken. In total we found 9,164 items in need of some attention. The results are displayed below in a Tableau Public visual and summarised in Table 1.
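As a rough illustration of the selection step only, the filter might look something like the Python sketch below. The real task runs inside DSpace and is not reproduced here; the `Item` and `Bitstream` classes, their field names and the `select_candidates` function are all hypothetical stand-ins, not the DSpace API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Bitstream:
    """A file attached to a repository item (hypothetical model)."""
    name: str
    embargo_end: Optional[date] = None  # None means the file is already open


@dataclass
class Item:
    """A repository record (hypothetical model, not the DSpace API)."""
    handle: str
    item_type: str
    files: List[Bitstream] = field(default_factory=list)


def has_embargoed_file(item: Item, today: date) -> bool:
    """True if at least one file has an embargo that ends after today."""
    return any(
        f.embargo_end is not None and f.embargo_end > today
        for f in item.files
    )


def select_candidates(items: List[Item], today: date) -> List[Item]:
    """Keep only Article-type items with at least one embargoed file."""
    return [
        item for item in items
        if item.item_type == "Article" and has_embargoed_file(item, today)
    ]
```

Anything that passes this filter would then be compared against Orpheus; anything that fails it (other item types, or articles whose files are all open) is simply left alone.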

Of these items, 3,864 had a published open access version archived alongside the embargoed manuscript, so we skipped any further updating of these records. This is actually a very good sign, and indicates that the Open Access Team has been going back to records and supplementing them with the open access version of record.

Amongst the remaining items, 2,794 were successfully matched against Orpheus and had their embargoes verified: 1,862 records were updated with shorter embargoes and 412 had longer embargoes applied, leaving 520 items which were unchanged because they already had the correct embargo period.
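To make the shorter/longer/unchanged distinction concrete, here is a minimal Python sketch. It assumes the correct embargo end date is the publication date plus a whole number of months taken from Orpheus; that rule, and the function names `add_months` and `compare_embargo`, are illustrative assumptions rather than a description of the actual curation task.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` after `start`, clamping to the month's last day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def compare_embargo(current_end: date, publication_date: date,
                    embargo_months: int) -> str:
    """Compare the placeholder embargo with the one implied by the policy.

    Assumes (for illustration) that the embargo runs from the publication date.
    """
    correct_end = add_months(publication_date, embargo_months)
    if correct_end < current_end:
        return "shorter"    # placeholder was too cautious; release earlier
    if correct_end > current_end:
        return "longer"     # placeholder would have released the file too soon
    return "unchanged"      # record already carries the correct embargo
```

For example, a record with a placeholder embargo ending on 1 January 2022, a publication date of 3 April 2020 and a 12-month embargo would come back as "shorter", because the recalculated end date is 3 April 2021.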

The final 2,506 items were primarily composed of records with no publication date (1,132 items), publications that could potentially be supplemented by the open access version of record (537 items) and records with no accepted manuscript embargo information in Orpheus (434 items). The remaining 403 items had other outcomes.
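The outcome categories described above can be thought of as a simple chain of checks. The Python sketch below is one possible ordering, assumed purely for illustration; the `Record` fields and the `classify` function are hypothetical, and the catch-all "other outcome" category is not modelled.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    """The few facts needed to pick an outcome (hypothetical model)."""
    has_open_vor: bool                  # open version of record already archived
    publication_date: Optional[date]    # None if not yet known
    vor_embargo_months: Optional[int]   # policy for the version of record
    aam_embargo_months: Optional[int]   # policy for the accepted manuscript


def classify(record: Record) -> str:
    """Assign one outcome per record; the order of checks is an assumption."""
    if record.has_open_vor:
        return "open VoR already archived"
    if record.publication_date is None:
        return "no publication date"
    if record.vor_embargo_months == 0:
        return "VoR could be archived"
    if record.aam_embargo_months is None:
        return "no AAM embargo information"
    return "verify AAM embargo"


def summarise(records) -> Counter:
    """Tally outcomes across a batch of records."""
    return Counter(classify(r) for r in records)
```

Tallying the outcomes in this way gives the kind of per-category counts reported in the summary, and makes it easy to see which categories still need manual attention.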

Table 1. Summary of outcomes after comparing Apollo records against Orpheus.

Date archived in Apollo                 2014  2015  2016  2017  2018  2019  2020 Total
The item has an open VoR version           7   105  1200  1019  1300   226     7  3864
Accepted version – embargo updated         0     2   145    76   132  2305   134  2794
No publication date available              0     0    10   159    32   714   217  1132
Orpheus VoR embargo: 0                     1     4    51    18     5   451     7   537
No AAM embargo information available       3     6    64    39    33   264    25   434
Other outcome                              8    37   114    47    23   162    12   403
Total                                     19   154  1584  1358  1525  4122   402  9164

We plan to run this curation task on a regular basis and periodically check the outcomes. Any items that continually fail to update will be processed manually by the Open Access Team, but our intention and desire is to move away from manual processing wherever possible.

Published 3 April 2020

Written by Dr Arthur Smith

This blog post is under a CC-BY licence.
