
Data sharing and reuse case study: the Mammographic Image Analysis Society database

26 August 2020 | Uncategorized | data reuse, data sharing, open data, Open Research, open science, research data | Dominic Dixon

The Mammographic Image Analysis Society (MIAS) database is a set of mammograms put together in 1992 by a consortium of UK academic institutions and archived on 8mm DAT tape, copies of which were made openly available and posted to applicants for a small administration fee. The mammograms themselves were curated from the UK National Breast Screening Programme, a major screening programme that was established in the late 1980s, offering routine screening every three years to women aged 50 to 64.

The motivations for creating the database were to make a practical contribution to computer vision research – which sought to improve the ability of computers to interpret images – and to encourage the creation of more extensive datasets. In the peer-reviewed paper bundled with the dataset, the researchers note that “a common database is a positive step towards achieving consistency in performance comparison and testing of algorithms”.

Due to increased demand, the MIAS database was made available online via third parties, albeit in a lower resolution than the original. Despite no longer working in this area of research, the lead author, John Suckling – now Director of Research in the Department of Psychiatry, part of Cambridge Neuroscience – started receiving emails asking for access to the images at the original resolution. This led him to dig out the original 8mm DAT tapes with the intention of making the images available openly in a higher resolution. The tapes were sent to the University Information Service (UIS), who were able to access the original 8mm tape and download higher resolution versions of the images. The images were subsequently deposited in Apollo and made available under a CC BY license, meaning researchers are permitted to reuse them for further research as long as appropriate credit is given. This is the most commonly used license for open datasets and is recommended by the majority of research funding agencies.

Motivations for sharing the MIAS database openly

The MIAS database was created with open access in mind from the outset. When asked whether he had any reservations about sharing the database openly, the lead author John Suckling noted:

“There are two broad categories of data sharing; data acquired for an original purpose that is later shared for secondary use; data acquired primarily for sharing. This dataset is an example of the latter. Sharing data for secondary use is potentially more problematic especially in consortia where there are a number of continuing interests in using the data locally. However, most datasets are (or should be) superseded, and then value can only be extracted if they are combined to create something greater than the sum of the parts. Here, careful drafting of acknowledgement text can be helpful in ensuring proper credit is given to all contributors.”

This distinction – between data acquired for an original purpose that is later shared for secondary use and data acquired primarily for sharing – is one that is important and often overlooked. The true value of some data can only be fully realised if openly shared. In such cases, as Suckling notes, sufficient documentation can help ensure the original researchers are given credit where it is due, as well as ensuring it can be reused effectively. This is also made possible by depositing the data on an institutional repository such as Apollo, where it will be given a DOI and its reuse will be easier to track.

Impact of the MIAS database

As of August 2020, the MIAS database has received over 5500 downloads across 27 different countries, including some developing countries where breast cancer survival rates are lower. Google Scholar currently reports over 1500 citations for the accompanying article as well as 23 citations for the dataset itself. A review of a sample of the 1500 citations revealed that many were examples of the data being reused rather than simply citations of the article. Additionally, a systematic review published in 2018 cited the MIAS database as one of the most widely used for applying breast cancer classification methods in computer aided diagnosis using machine learning, and a benchmarking review of databases used in mammogram research identified it as the most easily accessible mammographic image database. The reasons cited for this included the quality of the images, the wide coverage of types of abnormalities, and the supporting data which provides the specific locations of the abnormalities in each image.

The high impact of the MIAS database is something Suckling credits to the open, unrestricted access to the database, which has been the case since it was first created. When asked whether he has benefited from this personally, Suckling stated: “Direct benefits have only been the citations of the primary article (on which I am first author). However, considerable efforts were made by a large number of early-career researchers using complex technologies and digital infrastructure that was in its infancy, and it is extremely gratifying to know that this work has had such an impact for such a large number of scientists.” Given that the database continues to be widely cited and has been downloaded from Apollo 1358 times since January 2020, the MIAS database is clearly still having a wide impact.

The MIAS Database Reused

As mentioned above, the MIAS database has been widely reused by researchers working in the field of medical image analysis. While originally intended for use in computer vision research, one of the main ways in which the dataset has been used is in the area of computer aided diagnosis (CAD), for which researchers have used the mammographic images to experiment with and train deep learning algorithms. CAD aims to augment manual inspection of medical images by medical professionals in order to increase the probability of making an accurate diagnosis.

A 2019 review of recent developments in medical image analysis identified a lack of good quality data as one of the main barriers researchers in this area face. Good quality data is not only a necessity; it must also be well documented, as the same review identified inappropriately annotated datasets as a core challenge in CAD. The MIAS database is accompanied by a peer-reviewed paper explaining its creation and content, as well as a readme PDF which explains the file naming convention used for the images and the annotations used to indicate the presence of any abnormalities and classify them by severity. This extensive documentation, combined with the database having been openly available from the outset, could explain why it continues to be so widely used.

Reuse example: Applying Deep Learning for the Detection of Abnormalities in Mammograms

This research, published in 2019 in Information Science and Applications, looked at improving some of the current methods used in CAD. It attempted to address some inherent shortcomings and to make deep learning models better at minimising false positives when CAD is applied to mammographic imaging. The researchers used the MIAS database alongside another larger dataset in order to evaluate the performance of two existing convolutional neural networks (CNNs), which are deep learning models used specifically for classifying images. Using these datasets, they were able to demonstrate that versions of two prominent CNNs could detect and classify the severity of abnormalities on the mammographic images with a high degree of accuracy.

While the researchers were able to make good use of the MIAS database to carry out their experiments, due to the inclusion of appropriate documentation and labelling, they do note that since it is a relatively small dataset it is not possible to rule out “overfitting”, where a deep learning model is highly accurate on the data used to train the model, but may not generalise well to other datasets. This highlights the importance of making such data openly available as it is only possible to improve the accuracy of CAD if sufficient data is available for researchers to carry out further experiments and improve the accuracy of their models.

Reuse example: Computer aided diagnosis system for automatic two stages classification of breast mass in digital mammogram images

This research, published in 2019 in Biomedical Engineering: Applications, Basis and Communications, used the MIAS database along with the Breast Cancer Digital Repository to test a CAD system based on a probabilistic neural network (a machine learning model that predicts the probability distribution of a given outcome) developed to automate the classification of breast masses on mammographic images. Unlike previously developed models, their model was able to segment breast masses and then carry out a two-stage classification. This meant that rather than classifying masses simply as benign or malignant, the system carried out a more fine-grained classification consisting of seven different categories. Combining the two databases allowed for increased confidence in the results gained from their model, again underlining the importance of the open sharing of mammographic image datasets. After testing their model on images from these databases, they were able to demonstrate a significantly higher level of accuracy at detecting abnormalities than had been demonstrated by two similar models used for evaluation. On images from the MIAS database and the Breast Cancer Digital Repository, their model detected abnormalities with an accuracy of 99.8% and 97.08%, respectively. This was accompanied by increased sensitivity (the ability to correctly identify true positives) and specificity (the ability to correctly identify true negatives).
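Accuracy, sensitivity and specificity are all derived from the same four counts: true positives, false negatives, true negatives and false positives. As a minimal sketch, assuming binary labels where 1 marks an image with an abnormality and 0 a normal image, the measures could be computed as follows. The labels are made up purely for illustration and are not taken from the MIAS database or from the study described above; the function and class names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionCounts:
    true_positive: int
    false_negative: int
    true_negative: int
    false_positive: int


def count_outcomes(actual: Sequence[int], predicted: Sequence[int]) -> ConfusionCounts:
    """Tally the four confusion-matrix cells for binary labels (1 = abnormal, 0 = normal)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = ConfusionCounts(0, 0, 0, 0)
    for truth, guess in zip(actual, predicted):
        if truth == 1 and guess == 1:
            counts.true_positive += 1
        elif truth == 1 and guess == 0:
            counts.false_negative += 1
        elif truth == 0 and guess == 0:
            counts.true_negative += 1
        else:
            counts.false_positive += 1
    return counts


def sensitivity(c: ConfusionCounts) -> float:
    """Share of truly abnormal images that the model flags as abnormal."""
    return c.true_positive / (c.true_positive + c.false_negative)


def specificity(c: ConfusionCounts) -> float:
    """Share of truly normal images that the model labels as normal."""
    return c.true_negative / (c.true_negative + c.false_positive)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all images that the model labels correctly."""
    total = c.true_positive + c.false_negative + c.true_negative + c.false_positive
    return (c.true_positive + c.true_negative) / total


# Illustrative labels only: four abnormal and six normal images.
actual_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

example = count_outcomes(actual_labels, predicted_labels)
print(f"sensitivity: {sensitivity(example):.2f}")  # 0.75
print(f"specificity: {specificity(example):.2f}")  # 0.83
print(f"accuracy:    {accuracy(example):.2f}")     # 0.80
```

Reporting all three matters because a model can reach high accuracy on a dataset dominated by normal images while still missing many abnormalities; sensitivity and specificity separate those two kinds of error.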

Conclusion

Many areas of research can only move forward if sufficient data is available and if it is shared openly. This, as we have seen, is particularly true in medical imaging where despite datasets such as the MIAS database being openly available, there is a data deficiency which needs to be addressed in order to improve the accuracy of the models used in computer-aided diagnosis. The MIAS database is a clear example of a dataset that has enabled an important area of research to move forward by enabling researchers to carry out experiments and improve the accuracy of deep learning models developed for computer-aided diagnosis in medical imaging. The sharing and reuse of the MIAS database provides an excellent model for how and why future researchers should make their data openly available.

Published 20th August 2020
Written by Dominic Dixon

The Role of Open Data in Science Communication

19 June 2020 | Uncategorized | data champions, open data, Open Research, scholarly communication | Maria Angelaki

Itamar Shatz has written a guest blog post for the Office of Scholarly Communication about how public trust in the scientific community increases when researchers make their data openly available to all. He also emphasizes that science communicators (e.g. press offices, journalists, publishers) have a responsibility to point attention directly at the primary source of the data. Itamar is a PhD candidate in the Department of Theoretical and Applied Linguistics at the University of Cambridge. He is also a member of the Cambridge Data Champion programme, having joined at the start of this year. He writes about science and philosophy that have practical applications at Effectiviology.com.

It’s no secret that the public’s view of the scientific community is far from ideal.

For example, a global survey published by the Wellcome Trust in 2019 showed that, on average, only 18% of people indicate that they have a high level of trust in scientists. Furthermore, the survey showed that there are stark differences between people living in different areas of the world; for instance, this rate was more than twice as high in Northern Europe (33%) and Central Asia (32%) than in Eastern Europe (15%), South America (13%), and Central Africa (12%).

Things do appear to be improving, to some degree, especially in light of the recent pandemic. For example, a recent survey in the UK, conducted by the Open Knowledge Foundation, has found that, following the COVID-19 pandemic, 64% of people are now “more likely to listen to expert advice from qualified scientists and researchers”. Similar increases in public confidence have been found in other countries, such as Germany and the USA. However, despite these recent increases, there is still much room for improvement.

Open data can help increase the public’s confidence in scientists

The public’s lack of confidence in scientists is a complex, multifaceted issue that is unlikely to be resolved by a single, neat solution. Nevertheless, one thing that can help alleviate this issue to some degree is open data, which is the practice of making data from scientific studies publicly accessible.

Research on the topic shows just how powerful this tool can be. For example, the recent survey by the Open Knowledge Foundation, conducted in the UK in response to the COVID-19 pandemic, found that 97% of those polled believed that it’s important for COVID-19 data to be openly available for people to check, and 67% believed that all COVID-19 related research and data should be openly available for anyone to use freely. Similarly, a 2019 US survey conducted before the pandemic found that 57% of Americans say that they trust the outcomes of scientific studies more if the data from the studies is openly available to the public.

Overall, such surveys strongly suggest that open data can help increase the public’s trust in scientists. However, it’s not enough for studies to just have open data for it to increase the public’s trust; if people don’t know about the open data, or if they don’t fully understand what it means, then open data is unlikely to be as beneficial as it could be. As such, in the following section we will see some guidelines on how to properly incorporate open data into science communication, in order to utilize this tool as effectively as possible.

How to incorporate open data into science communication

To properly incorporate open data into science communication, there are several key things that people who engage in science communication—such as journalists and scientists—should generally do:

Say that the study has open data. That is, you should explicitly mention that the researchers have made the data from their research openly available. Do not assume that people will go to the original study and then learn there about the data being open.
Explain what open data is. That is, you should briefly explain what it means for the data to be openly available, and potentially also mention the benefits of making the data available, for example in terms of making research more transparent, and in terms of helping other researchers reproduce the results.
Describe what sort of data has been made openly available. For example, you can include descriptions of the type of data involved (surveys, clinical reports, brain scans, etc.), together with some concrete examples that help the audience understand the data.
Explain where the data can be found. For example, this can be in the article’s “supplementary information” section, though data should preferably be available in a repository where the dataset has its own persistent identifier, such as a DOI. This ensures that the audience can find and access the data, which may otherwise be hidden behind a paywall, and offers other benefits, such as allowing researchers to directly access and cite the dataset, without navigating through the article.

These practices can help people better understand the concept of open data, particularly as it pertains to the study in question, and can help increase their trust in the openness of the data, especially if it is placed somewhere that they can access themselves.

For one example of how open data might be communicated effectively in a press release, consider the following:

“The researchers have made all the data from this study openly available; this means that all the results from their experiments can be freely accessed by anyone through a repository available at: https://www.doi.org/10.xxxxx/xxxxxxx. This can help other scientists verify and reproduce their results, and will aid future research on the topic.”

Open data in different types of scientific communications

It’s important to note that there’s no single right way to incorporate open data into scientific communications. This can be attributed to various factors, such as:

Differences between fields (e.g. biology, economics, or psychology)
Differences between types of studies (e.g. computational or experimental)
Differences between media (e.g. press release or social media post).

Nevertheless, the guidelines outlined earlier can be beneficial as initial considerations to take into account when deciding how to incorporate open data into science communication. It is up to communicators to make the final modifications, in order to use open data as effectively as possible in their particular situation.

Summarizing what we’ve learned

Though the public’s trust in science is currently growing, there is much room for improvement. One powerful tool that can aid the academic community is open data—the practice of making data from research studies openly available. However, to benefit as much as possible from the presence of open data, it’s not sufficient for a study to merely make its data open. Rather, the accessibility of the data needs to be promoted and explained in scientific communication, and the dataset needs to be cited appropriately (see the Joint Declaration of Data Citation Principles for guidelines regarding this latter point).

What is currently being done

It is important to note that much work is already being done to promote the concept of open data. For example, organizations such as the Research Data Alliance promote discussion of the topic and publish relevant material, as in the case of their recent guidelines and recommendations regarding COVID-19 data.

In addition, at the University of Cambridge, in particular, we can already see a substantial push for open data practices, where appropriate, and from many angles as outlined in the University’s Open Research position statement. Many funding bodies mandate that data be made available, and the University facilitates the process of sharing the data via Apollo, the institutional repository. Furthermore, there are the various training courses and publications—including this very blog—led by bodies such as the Office of Scholarly Communication (OSC), which help to promote Open Research practices at the University. Most notably, there is the OSC’s Data Champion programme, which deals, among other things, with supporting researchers with open data practices.

Moving forward

Promoting the use of open data in scientific communication is something that different stakeholders can do in different ways.

For example, those engaging in science communication—such as journalists and universities’ communication offices—can mention and explain open data when covering studies. Similarly, scientists can ask relevant communicators to cite their open data, and can also mention this information themselves when they engage in science communication directly. In addition, consumers of scientific communication and other relevant stakeholders—such as the general public, politicians, regulators, and funding bodies—can ask, whenever they hear about new research findings, whether the data was made openly available, and if not, then why.

Overall, such actions will lead to increased and more effective use of open data over time, which will help increase the trust people have in scientists. Furthermore, this will help promote the adoption of open data practices in the scientific community, by making more scientists aware of the concept, and by increasing their incentives for engaging in it.

Published 19 June 2020

Written by Itamar Shatz

Clearing the final hurdle – automating embargo setting

3 April 2020 | Uncategorized | open access, Research Excellence Framework | Arthur Smith

One of the biggest issues facing the Open Access Team has been keeping up with the constant stream of accepted manuscripts that need to be processed. In many cases we receive notification of an accepted manuscript well before formal publication. This has presented a significant challenge over the last five years because although we know there is a publication forthcoming (or at least we trust that there is), we have no idea when an article may actually be published.

This means that we have many thousands of publication records in Apollo which have ‘placeholder’ embargoes because we simply did not know the publication date at the point of archiving and therefore could not set an accurate embargo. After archiving, many of the records in Apollo may have been supplemented with a publication date thanks to metadata supplied via Symplectic Elements, but we still need to set an accurate embargo.

In other cases we might be waiting for an article to be published gold open access so that we can update Apollo with the published version of record.

While we are now very adept at archiving manuscripts in Apollo (thanks in large part to Fast Track and Orpheus) it remains a challenge to properly and accurately update Apollo records with either correct embargoes for accepted manuscripts, or the open access version of record. It is a futile task to be constantly checking whether a manuscript has been published. While the Open Access Team keeps a list of every publication that requires updating, this is a thankless job that should be highly automatable.

To that end, we have recently leveraged Orpheus to do a lot of the heavy lifting for us. By interrogating every journal article in Apollo and comparing its metadata against Orpheus we can now quickly determine which items can be updated and take the necessary next steps, changing embargoes where appropriate or identifying opportunities to archive the published version of record.

To do this we created a DSpace curation task to check every “Article” type in Apollo that had at least one file that was currently under embargo. We then compared the publication metadata against the information held in Orpheus to determine what steps needed to be taken. In total we found 9,164 items in need of some attention. The results are displayed below in a Tableau Public visual and summarised in Table 1.

Of these items, 3,864 had a published open access version archived alongside the embargoed manuscript, so we skipped any further updating of these records. This is actually a very good sign, and indicates that the Open Access Team has been going back to records and supplementing them with the open access version of record.

Amongst the remaining items, 2,794 were successfully matched against Orpheus and had their embargoes verified: 1,862 records were updated with shorter embargoes and 412 had longer embargoes applied, leaving 520 items which were unchanged because they already had the correct embargo period.

The final 2,506 items were primarily composed of records with no publication date (1,132 items), publications that could potentially be supplemented by the open access version of record (537 items) or had no embargo information in Orpheus (434 items).

Table 1. Summary of outcomes after comparing Apollo records against Orpheus.

Date archived in Apollo                 2014  2015  2016  2017  2018  2019  2020  Total
The item has an open VoR version           7   105  1200  1019  1300   226     7   3864
Accepted version – embargo updated         -     2   145    76   132  2305   134   2794
No publication date available              -     -    10   159    32   714   217   1132
Orpheus VoR embargo: 0                     1     4    51    18     5   451     7    537
No AAM embargo information available       3     6    64    39    33   264    25    434
Other outcome                              8    37   114    47    23   162    12    403
Total                                     19   154  1584  1358  1525  4122   402   9164
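The decision logic described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual DSpace curation task or the Orpheus interface: the record and policy fields, the outcome names and the order in which the checks are applied are all hypothetical, chosen to mirror the outcome categories in Table 1.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Outcome(Enum):
    OPEN_VOR_ALREADY_ARCHIVED = "open version of record already archived"
    NO_PUBLICATION_DATE = "no publication date available"
    VOR_CAN_BE_ARCHIVED = "version of record could be archived (VoR embargo: 0)"
    NO_EMBARGO_INFORMATION = "no accepted manuscript embargo information"
    EMBARGO_SHORTENED = "embargo shortened"
    EMBARGO_LENGTHENED = "embargo lengthened"
    EMBARGO_UNCHANGED = "embargo already correct"


@dataclass
class RepositoryRecord:
    """Hypothetical view of one embargoed article record in the repository."""
    has_open_vor: bool
    publication_date: Optional[date]
    current_embargo_end: date


@dataclass
class JournalPolicy:
    """Hypothetical view of the policy held for the journal (None = not known)."""
    aam_embargo_months: Optional[int]
    vor_embargo_months: Optional[int]


def add_months(start: date, months: int) -> date:
    """Add whole months to a date, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def classify(record: RepositoryRecord, policy: JournalPolicy) -> Outcome:
    """Decide what, if anything, should happen to an embargoed record."""
    if record.has_open_vor:
        return Outcome.OPEN_VOR_ALREADY_ARCHIVED
    if record.publication_date is None:
        return Outcome.NO_PUBLICATION_DATE
    if policy.vor_embargo_months == 0:
        return Outcome.VOR_CAN_BE_ARCHIVED
    if policy.aam_embargo_months is None:
        return Outcome.NO_EMBARGO_INFORMATION
    # The accurate embargo runs from the publication date, not the archiving date.
    correct_end = add_months(record.publication_date, policy.aam_embargo_months)
    if correct_end < record.current_embargo_end:
        return Outcome.EMBARGO_SHORTENED
    if correct_end > record.current_embargo_end:
        return Outcome.EMBARGO_LENGTHENED
    return Outcome.EMBARGO_UNCHANGED


# Illustrative record: a placeholder embargo that turns out to be too long.
placeholder = RepositoryRecord(
    has_open_vor=False,
    publication_date=date(2019, 3, 1),
    current_embargo_end=date(2020, 3, 1),
)
print(classify(placeholder, JournalPolicy(aam_embargo_months=6, vor_embargo_months=None)).value)
```

Returning a named outcome for every record, rather than silently updating it, makes it straightforward to tally results by category in the way Table 1 does and to spot records that repeatedly fail to update.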

We plan to run this curation task on a regular basis and periodically check the outcomes. Any items that continually fail to update will be processed manually by the Open Access Team, but our intention and desire is to move away from manual processing wherever possible.

Published 3 April 2020

Written by Dr Arthur Smith

2019 That Was The Year That Was 

26 February 2020 | Uncategorized | Maria Angelaki

This is our traditional yearly blog about what we have been doing at the OSC in Cambridge. We are publishing it a little later than intended, but this is an indication of how busy the beginning of 2020 has been here in the Office of Scholarly Communication.

2019 saw us more in a ‘business as usual’ phase as we knuckled down and got on with supporting researchers in Cambridge. That aside, we still had some major developments in Open Research and this work will continue into 2020 and beyond. 

Policy changes 

2019 saw a number of happenings in the policy space at Cambridge. Most excitingly, the University’s Position Statement on Open Research was announced in February, making it one of the first UK universities to have such a statement. This demonstrates the University’s commitment to making open research a reality at Cambridge.

Following on from this, in July 2019, the University together with Cambridge University Press announced that they had signed up to the San Francisco Declaration on Research Assessment (DORA). The newly created Open Research Steering Committee, headed by the University’s Pro-Vice-Chancellor for Research, will have oversight over the open research direction and the implementation of DORA. The Steering Committee and its working groups are currently looking into open research training, open research infrastructure (such as electronic research notebooks), Plan S and DORA.

In December, an updated version of the Research Data Management Policy Framework was released. This update brings the policy framework in alignment with funder requirements and acknowledges the important roles that Principal Investigators, research staff and students, and University support staff play in good data management practices. It sits beneath the Position Statement on Open Research, with the documents being closely aligned. 

Open access news

The Open Access Service made great strides towards automating many of its processes this year, headlined by the introduction of Orpheus and Fast Track. Orpheus is a custom database of publisher open access policies, and when combined with Fast Track for manuscript processing, it allows the Open Access Service to reduce the number of steps required to archive a manuscript in Apollo. In 2019, 8,325 manuscript submissions were processed through Fast Track. In total, the Open Access Service responded to 13,609 submissions or enquiries in 2019, equivalent to roughly 37 requests per day.

Our Request a Copy service received 7,626 requests in 2019. One of the most requested items was “HIV-1 remission following CCR5Δ32/Δ32 haematopoietic stem-cell transplantation” (DOI: 10.1038/s41586-019-1027-4), which received 77 requests. The authors of the paper responded to and fulfilled each request, enabling readers to obtain free access to the publication well ahead of the end of Nature’s six-month embargo. Now that the accepted manuscript is out of embargo, it has received a further 326 downloads to date in Apollo. The success of the Request a Copy service once again demonstrates the need for access to scholarly research at the earliest opportunity. Embargoes, even ‘short’ six-month embargoes, are a needless barrier to the University’s research outputs.

Data news 

Aside from the update to the Research Data Management Policy Framework (see above), the most significant development from 2019 has been the continued evolution of the Data Champion Programme.

We welcomed 40 new Data Champions (DCs) from across several Schools increasing the size of our network to 86. With such a large cohort of Champions a new idea of creating departmental hubs was initiated to increase collaboration and the sharing of practices by Data Champions from the same areas. This has proved really successful in both Chemistry and Engineering, with a more coordinated approach having the effect of greater productivity from the Champions in those areas in engaging others with data management.

In 2019, the Data Champions also tried out a mentoring scheme for the first time whereby established Champions support new Champions in finding their feet and give them ideas about how to provide support to their own community. This has proved to be a great success and the scheme is being run for a second year for the new cohort of Champions joining in early 2020.

Finally, a new paper on the Data Champion community was published, Establishing, Developing and Sustaining a Community of Data Champions, by DC alumnus James Savage and our colleague Lauren Cadwallader in Data Science Journal. 

Thesis news

The requirement to deposit an electronic copy of a PhD thesis in order to graduate is now business as usual. In 2019, 1,197 theses were deposited, with 47% being made fully open access. In addition, around 100 requests to digitise historical theses were received from their authors and 1,015 requests for scans of historical theses were received from requesters.

Training

In 2019 we took a broad perspective and examined how training was contributing to promoting and supporting Open Research at Cambridge. The Task Group on Open Research Training, made up of representatives of several libraries and colleagues from other areas of the University, conducted a number of projects to understand where we are at the moment and plan a strategy for the future. The details of that work will be presented at the RLUK 2020 conference in March but, as a ‘sneak peek’, here are some of the conclusions we drew:

We’re stronger together: researchers will benefit if we build stronger communication between training providers. 
Open Research training should not be seen in isolation to the rest of research, rather it should be a key component of the way students learn to do research. 
Postdocs and senior researchers want to learn independently; we can support them with better-presented information online and by facilitating events and dialogue.
We want to be able to constantly improve our training and demonstrate impact by exploring ways to evaluate ourselves, while also being aware of the lurking danger of irresponsible metrics in our own evaluation.

Alongside the strategy work, we continued to expand the training we offer on Open Access, Research Data Management, publishing, copyright and more. A growing number of departments have requested sessions and we have partnered with PLOS and the Office for Postdoctoral Affairs to deliver a regular session on peer review. We delivered 56 sessions, reaching over 800 researchers and librarians. In addition, we have offered a session about complying with the REF Open Access requirements to departments; the Open Access team outdid themselves by delivering 20 sessions to individual departments in just over three months. 

Outreach activities

In 2019 we hosted several events, from workshops to a one-day symposium dealing with open access monographs, FAIR data, preprints, reproducibility in social sciences, Plan S developments in the USA and open research in STEMM. 

Of notable interest is the Symposium on Open Monographs held in October at St Catharine’s College. This one-day event brought together researchers, funders, publishers and learned societies to discuss the benefits and challenges of an open landscape for academic books. The recordings are featured on the OSC YouTube channel and most of the presentations are available in our institutional repository, Apollo. A summary of the key themes that emerged from this symposium was later presented in Unlocking Research.

October would not have been complete without celebrating Open Access Week. During the week we shared various blogs and online resources and we were delighted to announce the launch of our popular Research Support Ambassador Programme as an open educational resource designed to give learners either an introduction or refresher on key elements of research support. 

Systems 

Apollo has participated in a joint pilot study with Jisc, Symplectic and Sheffield Hallam University to look at the best approaches to integrating the Jisc Publications Router and the research information system Symplectic Elements via institutional repositories. This pilot has involved working together to look at how well Elements could capture details of articles that Router had sent to our repositories. Router currently works with EPrints and DSpace repositories, the platforms used by Sheffield Hallam and Cambridge respectively.

Symplectic’s Repository Tools 2 (RT2) integration module was used to harvest Apollo records and de-duplicate them against any existing Elements records. We tested how well this worked for repository records deposited automatically by Router, looking in particular at the volume of duplicate publications and how early after acceptance notifications were received from Router. The study demonstrated that Router and Elements are technically compatible when used in this way. As a result of this pilot, Jisc and Symplectic are now happy to offer this solution to institutions more widely. 

Some excellent work behind the scenes has resulted in Jisc publishing a series of blogs last November. Their third blog showcases the ORCID IDs in Research Data Management workflows at the University of Cambridge and how a workflow has been implemented in order to create seamless links between researchers and their works using identifiers and different services. Such solutions improve visibility and discoverability across systems, reduce duplication of effort in entering information and avoid identification errors.

This work was made possible by Agustina Martínez García of the Office of Scholarly Communication, Owen Roberson of the Research Office, and Dean Johnson of University Information Services (UIS) who were amongst the winners of the professional services recognition scheme two years ago for their effective collaborative work on the integration of Symplectic Elements and Apollo.

According to the blog, as of September 2019, 25,550 articles, 1,329 conference proceedings and 1,100 datasets in Apollo have ORCID IDs. 

Saying a big thank you

2019 saw the departure of the University’s first Head of Scholarly Communication, Dr Danny Kingsley. Many of the achievements of 2019 were due to the hard work Danny put in before her departure, and we’d like to thank her for all she contributed. 

Published 26 February

Compiled by: Maria Angelaki

Contributions from Agustina Martínez-García, Bea Gini, Maria Angelaki, Lauren Cadwallader, Sacha Jones and Arthur Smith.

Embarking on a career in open access

25 October 2019 | Uncategorized | open access, scholarly communication | Maria Angelaki

Lorraine and Olivia started working as Scholarly Communication Support in the Open Access team at the Office of Scholarly Communication (OSC) in the University Library this summer. In this interview, they share their experience of starting a new role in the field of open access, from the perspective of their respective backgrounds in academia and publishing.

What does working in Scholarly Communication Support entail and what are your responsibilities in this role?

For the first few months after joining the Open Access team we both started looking at “Fast Track deposits”, the simplest route for depositing authors’ manuscripts into Apollo, the University of Cambridge institutional repository. This system allows the team to process items more quickly than the manual Apollo deposit. Since its launch in September 2018, it has considerably helped to reduce the workload as manuscript submissions for archiving in Apollo continue to increase in view of the upcoming REF2021. On a daily basis, we also deal with queries from tickets created on the Open Access Helpdesk, contacting authors and publishers when further information is required and manually depositing manuscripts on Apollo while also updating their records on Symplectic Elements, the University’s research information management system.

Olivia and I are now being trained to respond to researchers’ funding queries and to process invoices for journals’ open access fees from the RCUK and COAF block grants. In order to do this we have had to learn more in depth about open access requirements and Research Councils’ funder requirements.

More recently, we have been working with Units of Assessment to support them with the open access component for REF (Research Excellence Framework) compliance, attending training sessions and reviewing Unit of Assessment outputs for eligibility. This has involved researching and interpreting the REF 2021 requirements for open access in order to communicate them effectively to academics and administrators. It has been illuminating to gain the perspective of different faculties, the way that they have to engage with REF, and how they grapple with open access compliance.

What are your respective backgrounds and how did you decide to start working in OA?

Lorraine: Prior to working in open access, I completed a PhD in History of Art in Cambridge, looking at specific intersections between early modern artworks, medicine, and theories of the imagination. I also worked as a postdoctoral researcher at CRASSH (Centre for Research in the Arts, Social Sciences and Humanities) for one year.

I first became interested in OA and Scholarly Communication during my studies, as a PhD representative for my peers in History of Art between 2017 and 2018, the year that electronic deposits of PhD theses via Apollo became a requirement. There were anxieties from my peers around this new requirement, especially in relation to the open access feature: what would this mean for publishing their first monographs from their PhD thesis as Early Career Researchers? Would publishers still be interested in their work after it had been made OA? And, especially, what about the hundreds of copyrighted images present in their theses? It would have taken months to obtain permission to reproduce all of those images. During this time, I liaised with the OSC and the head of the AHRC Doctoral Training Partnership programme (as part of the RCUK, the AHRC also has its own open access requirements that apply to PhDs), communicated with faculty staff during meetings, and reported the advice I had gathered to my peers. I see this new position in the OSC Open Access team as an excellent opportunity to understand better what happens behind the scenes of an institutional repository and gain more knowledge about the broader picture of open access in academic research.

Olivia: I left academic publishing with a sense that the model was broken. Expensive paywalls restrict access for those seeking information, and academics were becoming increasingly disenchanted with the publishing model. These issues particularly hit home following two separate instances. The first was a letter sent to the publisher by a prisoner seeking further information on a criminology text, one which was prohibitively expensive and inaccessible to such an individual. The second was a cuttingly written foreword by an academic around monograph publishing and the ivory towers in which university elites and academic publishers co-exist.

Academic publishing very much feels like the other side of what I am doing with open access, making research as freely and widely available as possible.

How do you think your past experiences have helped you to have the necessary skills for working in OA?

Lorraine: As a Cambridge student, I acquired a good knowledge of Cambridge’s unique research and teaching landscape (Schools, Faculties, Departments, Colleges, Research centres, etc.). My academic background also meant that I had a hands-on understanding of the process of research, publishing in a peer-reviewed journal, and even submitting my outputs through Symplectic Elements. These were really helpful when starting my new role: understanding how researchers work is crucial in scholarly communication and definitely helps me to advise and communicate with researchers better. I am, for instance, particularly interested in the relationship between open access and third-party copyright (especially images from cultural heritage institutions, i.e. galleries, libraries, archives and museums) and the challenges it brings to researchers in the Arts and Humanities.

Olivia: I have found my previous work in publishing an asset when working in open access because of my knowledge of the editorial and production process as well as publishing revenue models. I am familiar with the time scales for journal article and book production as well as publishers’ copyright requirements, knowledge which I find I am using on a regular basis. Having worked extensively with academics in a production role, I am aware of the competing pressures placed on them and their need for clear and accessible information on fulfilling publishing commitments or REF compliance.

Now that you have started your new roles, what are the tips you would give to someone interested in starting a career in OA?

Picking up from last year’s blogpost, and from our own experience: keeping up to date with developments, attention to detail, supporting academics and seeking support from the open access community are four key areas when starting a career in OA.

Keeping up to date with developments and attention to detail

Publishers’ and funders’ open access policies change very quickly, as do the methods we adopt within the team to cope with the workflow and with the challenges brought by REF 2021. Anyone starting a career in OA needs to keep up to date with changes, be capable of doing in-depth research about those, and be comfortable admitting not knowing everything! The landscape is constantly changing and having an awareness of new proposals and initiatives makes the big picture much clearer.

Supporting academics

Give academics a break. It will take you a while to feel confident with policy and guidance and for you, it is your whole job. For the academics submitting their papers and contacting the repository, this is one small part of their role; you need to guide them through it as painlessly as possible.

Seek support

You cannot and do not know everything about open access. Luckily, there are plenty of wonderful expert colleagues who can help, so it is really important to know how to work within a team and keep building the necessary knowledge as a group.

Published 25 October 2019

Written by Lorraine de la Verpilliere, Olivia Marsh

The content of this blog is licensed under CC BY 4.0.

Image Copyright and Open Access in the Arts and Humanities

25 October 2019 | Uncategorized | art history, copyright education, heritage institutions, Humanities, image copyright, monographs publishing, open access, third-party copyright | Maria Angelaki

Copyright is a crucial topic in the Humanities because researchers in several disciplines (especially history of art, my field of study) rely on images for their work and because publishers usually require authors to pay copyright holders for permission to reproduce those images – failure to do so would make the author and the publisher liable for copyright infringement.

At the OSC Symposium on 2 October 2019 (Open Access Monographs: From Policy to Reality), Dr Nicola Kozicharow’s presentation on ‘Open Research Publishing in the Humanities’ made quite an impact on the discussions of the day. This early career historian of art, specialised in 19th- and 20th-century European and Russian art, talked about the challenge of publishing when third-party image copyright is involved. She detailed the difficult and sometimes grotesque situations that she and her contributor faced when publishing her first co-edited book Open Access, tracking down image copyright holders and paying exorbitant reproduction fees (1).

Not many academics outside the Arts and Humanities know about the invisible labour and material cost involved when working with images. Researchers struggle to find images on various heritage institutions’ websites (or GLAMs, as we call them – i.e. galleries, libraries, archives and museums), and pay to obtain digital images ‘for private use’ when the original work is unavailable or located too far away. They often end up paying again in order to re-use those images when publishing their research. Even more frustrating is the lack of consistency between different institutions with regard to the amount of the fees and to the exemptions granted. If you beg the museum repeatedly and reach out to curators, you may have a small chance of having your permission fees waived (but still often in return for providing a free copy of your book/article). However, when sales departments or companies act as intermediaries between researchers and museums, this kind of trick is most likely to fail, and the chances of opening the discussion about the absurdities of the fees get even slimmer. In 2018, Bridgeman Images, one of those ‘Image companies’, obtained the exclusive right to sell and license all images from Italian national museums, which was catastrophic news for art history (see their statement here).

The situation feels even more unacceptable when it concerns out-of-copyright works of art. In this case, heritage institutions in fact do not own copyright over the work as it has fallen into the public domain. Most GLAMs, however, manage to keep control of these works’ images by banning photography (the famous ‘no photo’ policies in permanent collections or temporary exhibitions) and by creating copyright through making their own photograph of the work, which they subsequently sell to researchers.

An article by medieval art historian Kathryn M. Rudy published in Times Higher Education (also quoted by Kozicharow at the symposium) is a good example (2). There, Rudy detailed specific examples she encountered in her career and broke down the (shockingly high) real cost of working with images – she claims that the fees to publish images for her academic work since 2011 total £24,000 from her own pocket.* “The more successful I am, the poorer I get”, she says. The article went viral on academic Twitter networks and retweets and comments shine a light on the fact that many scholars face similar problems – one user ironically pointed out that it would be much cheaper to include with each book sold a packet of postcards from the museum than paying their prohibitive reproduction fees! (@winchester_books).

This thorny issue of image copyright permissions in research publications is sadly not new. In the last couple of years, however, historians of art in the UK have succeeded in keeping the issue at the front of the public debate. Back in 2017, an ‘End-fees-for-images’ campaign was started by Dr Bendor Grosvenor and Dr Richard Stephen. Along with 28 leading British art historians, they openly called for UK national museums to abolish image fees for out-of-copyright works of art in a letter published in The Times (3). Many other researchers in the field quickly added their names to this call through a petition on change.org. This campaign was supported in parallel by Grosvenor’s blog, Art History News – his strong presence in the media as a BBC4 presenter and on social media (@arthistorynews) also helped to promote the campaign.

This campaign revealed that there are in fact tools in the UK’s legal arsenal that art historians could use to limit fees. One example is the 2015 Re-Use of Public Sector Information Regulations (RPSI), which “prevents publicly funded bodies from commercialising public assets” including publicly owned pieces of art. These regulations “do allow image fees to be charged, but only to cover the actual costs involved, and a very small ‘profit’” (4). They remain, however, very little used and barely known – both by researchers and museums. Interestingly, during the OSC’s Open Access Monographs symposium, it was also brought up that ‘fair dealing’ exceptions to copyright by way of quotation for the purpose of ‘criticism or review’ have not often been used by researchers and applied to visual material (5). Both RPSI and ‘fair dealing’ by quotation are in the end quite complex legal tools and, understandably, neither art historians nor their publishers want to take the risk of a court case. We also have to take into account the wish of scholars to preserve good relationships with national heritage institutions in the UK – as images are their primary materials, their academic work depends on it entirely!

During the Open Access Monograph symposium, the comment was made that this issue of high image reproduction fees as a barrier to Open Access publication was a misconception – that the real problem was instead about wider ‘digital’ and ‘online’ issues. However, the fact remains that permission fees are much higher if the image falls into the following categories (often used in image permission fee forms): ‘worldwide’, ‘online’, and ‘freely available’. How is this supposed to encourage researchers in the arts and humanities to publish their research Open Access? We could, however, also frame the issue in a more positive way – what if Open Access itself could help humanities researchers deal with images better? Dr Kozicharow acknowledged the great support she received from Open Book Publishers (OBP) in allowing her to reproduce as many colour images as she needed for her book. Kathryn M. Rudy, in her recent book also published with OBP, was able to display images in an innovative way (6). In order to contain costs, when images were already widely available, she instead added links to stable GLAM websites – even QR codes in the case of the printed edition! Perhaps art historians should see open access publishing as a good opportunity to find innovative ways to think about solutions for images. Of course, there remains the problem of how Open Access is perceived in the Humanities, with open access books not being sufficiently reviewed and often not deemed legitimate enough in the process of securing permanent positions and promotions – but this is a separate issue.

What would be needed to help with image permission costs in arts and humanities publishing?

In light of the growing requirements for open access publications, universities and funding bodies should make better financial provision to support researchers. A recent report on Open Access from the Universities UK Open Access and Monographs Group, however, shows that there is a growing acknowledgement of the impossible situation faced by specific disciplines that rely on third-party material when publishing – such as history of art or archaeology. The UUK OA Monographs group notably recommended that “Given the already complex nature and expense of re-use clearance for illustrations and other third-party rights material in books, and the additional complexity and expense introduced by OA, an exception should be considered in any OA policy for books that require significant use of third-party rights materials” (7).

Most of all, cultural heritage institutions have to do better. It does not seem unreasonable to be able to reproduce an image for free with the appropriate credit to the institution when a work of art is in the public domain. Some institutions worldwide have already started making their image collections open access or at least free of copyright fees for researchers’ publications. For example, Gallica, the Bibliothèque Nationale de France’s digital library, just changed its policy in favour of the latter. Positive changes such as these, that benefit the public and research, are being recorded and supported by the excellent Open GLAM initiative, funded by the European Commission. The new EU copyright directive (provided it can apply after Brexit?) should give the final push to get there, as it will allow free re-use of images of works of art in the public domain, even for commercial purposes.

Published 25 October 2019

Written by Dr Lorraine de la Verpillière

*Correction: The £24,000 figure in fact corresponds to fees Rudy paid to obtain the high-res image files for her academic work since 2011. The figure gets even higher when including the said images copyright fees – in the same article, she mentions for instance a £5,683 invoice from the Bodleian for the reproduction cost of her next book.

If you are a researcher at Cambridge University and need more information about third-party copyright, the following resources are for you:

Libguides

Architecture & History of Art: Copyright and plagiarism

Copyright for Researchers

Face-to-face training sessions [available to Cambridge University only]

Copyright: a survival guide (for PhD students in Humanities, Arts and Social Sciences)

Do You Really Own Your Research? Copyright, Collaboration, and Creative Commons

Your faculty or department may also run bespoke sessions; asking your librarian is the best way to find out.

References

(1) Louise Hardiman and Nicola Kozicharow, Modernism and the Spiritual in Russian Art: New Perspectives. Cambridge, UK: Open Book Publishers, 2017, https://doi.org/10.11647/OBP.0115

(2) Kathryn M. Rudy, ‘The true costs of research and publishing’, Times Higher Education, August 29 2019 (Url: https://www.timeshighereducation.com/features/true-costs-research-and-publishing#survey-answer)

(3) Matthew Moore, ‘Museum fees are killing art history, say academics’, The Times, November 6 2017 (Url: https://www.thetimes.co.uk/edition/news/museum-fees-are-killing-art-history-say-academics-qhfwmdws6 accessed: 10/10/2019)

(4) Bendor Grosvenor, ‘Why museums should abolish image fees (ctd.)’, Art History News blog, August 20 2018 (Url: https://www.arthistorynews.com/articles/5241_Why_museums_should_abolish_image_fees_(ctd.) accessed: 10/10/2019)

(5) Amendments to the Copyright, Designs and Patents Act 1988 in UK law since 2014, http://www.legislation.gov.uk/uksi/2014/2356/regulation/3/made

(6) Kathryn M. Rudy. Image, Knife, and Gluepot: Early Assemblage in Manuscript and Print. Cambridge, UK: Open Book Publishers, 2019, https://doi.org/10.11647/OBP.0145

(7) Universities UK Open Access and Monographs Group, ‘Third-party rights’, in Open access and monographs evidence review, October 2019, pp. 10-12 (PDF: https://www.universitiesuk.ac.uk/policy-and-analysis/reports/Documents/2019/UUK-Open-Access-Evidence-Review.pdf accessed 13/10/2019).

Chasing cash cows in a swamp? Perspectives on Plan S from Australia and the USA

24 October 2019 | Uncategorized | metrics, open access, Plan S | Maria Angelaki

Plan S was born in Europe, yet from the very start it aspired to accelerate conversations around open access on a global scale. After all, if free access to research outputs is good in one place, it will be good everywhere, right? Well, it turns out that things may not be that simple.

In this Open Access Week, we look East and West to find out how Plan S is being received across the globe. Dr Danny Kingsley explores how reliance on foreign students has trapped Australian universities in a ‘Faustian bargain’ with publishers and reduced the scope for change. Micah Vandegrift reports on the type of conversations that Plan S has inspired in the USA, as well as the potential political barriers, sounding a note of cautious optimism.

The uptake of Plan S or equivalent principles in countries beyond Europe is crucial to the overall success of the movement. Publishers are using the fact that uptake currently has limited geographic scope to stall change, arguing that they cannot alter their model to suit the requirements of a relatively small percentage of authors. The number of supporting funders is still small and concentrated in Europe, with a few US players. China initially looked set to join in and thus change the game, but since the end of 2018 we have seen little progress on that front. Has Plan S been successful in shaping conversations around the world?

Hearing from our colleagues in other countries highlights some of the promises and challenges Plan S is facing in making an impact outside Europe. Learning about those raises a number of interesting points for how we advocate for open access at home too.

Dr Danny Kingsley: Australia

Sydney Opera House. ‘Plan S has not really caused much of a ripple Down Under’.

Rankings are a natural enemy of openness

When first approached by the Office of Scholarly Communication to write a piece about Plan S in Australia, my initial response was it would be very short. That is, Plan S has not really caused much of a ripple Down Under. Those in the know – people working in scholarly communication and some senior members of research institutions – are aware and watching closely. But as far as opening up a general discussion amongst the academic community, this simply hasn’t happened.

Over the past six months I have been trying to understand where some of the problems lie when it comes to openness in Australia. It is more fundamental than the usual concerns researchers have about Open Access, and goes to the heart of how universities work here.

Where the money flows

First, a quick run-down on how research funding to universities works in Australia. There are only two government funders – the National Health and Medical Research Council (NHMRC) and the Australian Research Council (ARC). The amount of funding these granted in 2017-2018 was about $943 million and $758 million respectively to all research organisations. As a comparison, the Wellcome Trust endowed in the range of £10m – £50m in Australia in 2017-18. For those interested there is a full breakdown of sources of research funding.

The funder policies on Open Access and Research Data Management are pretty weak overall. The NHMRC policy requires that any peer reviewed publication be available in a repository 12 months after publication and “strongly encourages researchers to consider the reuse value of their data and to take reasonable steps to share research data and associated metadata arising from NHMRC supported research”. The ARC policy requires the metadata of research outputs to be available in a repository 3 months after publication and the work to be OA 12 months after publication. But the policy specifically states: “For the purposes of this policy, Research Outputs do not include research data and research data outputs.”

Resourcing limitations mean these policies are not monitored, and there are no sanctions for non-compliance. This means they are basically ineffective, given the findings of a study last year that identified what policies need to ensure compliance.

But these policies simply reflect a lack of policy generally in Australia, partly due to the revolving door that has been the Prime Ministership over the past five years. So, on face value, the lack of engagement with discussions around Plan S just reflects this lassitude.

But I am wondering if there might be something deeper at play here.

Cash cow

Australian universities are heavily financially reliant on overseas students, with the numbers of international students several multiples greater than at any comparable university worldwide. Numbers of overseas students have doubled since 2008, with 398,563 students enrolled in 2018. In one instance, the University of Sydney, fees from Chinese students make up one fifth of its annual revenue, at $500 million in 2017. Taken across the country, these figures outweigh public research funding significantly.

While this dependence has been labelled as highly risky from a financial perspective, it is also causing serious issues elsewhere in the sector including concerns about eroding educational standards. But it is also causing a perversion in the way research is managed.

The role of the ranking

University rankings are extremely important in the recruitment of overseas students. The vast majority of Australian university websites list some interpretation of their rankings. Monash University and the University of Western Australia both note they are in the “top 100 universities in the world”. Other universities are more specific, naming their place, like UNSW at 43rd in the world and University of Queensland listing no fewer than five rankings, trumped by Queensland University of Technology with six rankings listed.

Chasing rankings comes at a price. In some instances, increasing a University’s position in the rankings is a specific strategy, with the University of Canberra a recent success story.

There is incredible pressure on researchers in Australia to perform. This can take the form of reward, with many universities offering financial incentives for publication in ‘top’ journals. This is fairly widespread, with some universities having this position on the public record. For example, Griffith University’s Research and Innovation Plan 2017-2020 includes: “Maintain a Nature and Science publication incentive scheme”. Publication in these two journals comprises 20% of the score in the Academic Ranking of World Universities.

Other institutions take a more draconian position. Murdoch University’s proposed ‘academic career framework’ identifies specific numbers of articles researchers are expected to publish in top journals per year. Not surprisingly this approach has been highly criticised for its “extremely narrow view of academic career success”.

Australia’s Chief Scientist has recently been arguing the need for a different way of assessing our researchers, with concern that the current system is fuelling bad science. With the exception of some groundswell activity, this is as close as anyone is getting to using the ‘reproducibility’ word here in Australia, possibly from nervousness in the sector about government interference in the allocation of research grants in 2018. There is certainly nothing comparable to the UK or the US on this issue.

The Open Access challenge

But what has all of this to do with Open Access or Plan S? Well, everything actually.

For a start, signing up to the Declaration on Research Assessment (DORA), or the Leiden Manifesto, is one of the principles of Plan S, with the Wellcome Trust stating that it will not fund research at institutions that have not signed up. Only a handful of Australian research organisations have signed DORA, none of which are universities. Given many Australian institutions are not only judging researchers on their publication record, but in some cases prescribing the journals in which they are allowed to publish, it would be extremely difficult for these institutions to become signatories to DORA or the Leiden Manifesto.

But the main problem for the open agenda is the total reliance on specific metrics that deliver ranking numbers – metrics which enfold Australian universities into a Faustian bargain with the large commercial publishers.

Australian universities are not engaging with Plan S because they cannot afford to. And while the Australian funders remain silent on the topic (literally – a search for Plan S on each website comes up empty), there is little incentive to worry about it.

If anything, this situation further underlines the need to shift the academic reward system away from the single measure of publication of novel results in high impact journals. Given how deeply ingrained that measure is in Australia it will be interesting to see where we are at this time next year.

Micah Vandegrift: USA

A meandering river in the USA. Plan S has sparked conversations in the USA, but progress is slow.

A shot heard around the world

A little more than a year ago, open access had its “shot heard around the world” moment. Plan S expanded out from Europe, encompassing angst and excitement, requiring think-pieces from thought leaders, policy briefs from the wonks, and general malaise from lots of stakeholders. The European open agenda is, by design or by accident, shaping the horizon and Plan S continues to be a marker of that progression. I had the unique opportunity to be on the ground in Europe for most of the fallout last fall, and now with the benefit of time and geographic remove, I am observing the after effects, especially in how U.S.-based research communities are responding in kind.

Ripples and tides

The greatest surprise is that Plan S seems to be the thing that is getting people from all corners out to debate the issues. The tidal wave of Plan S seems to have crashed on our shores with something for everyone – publishers, libraries, researchers, and funders. Librarianship tends to pivot around shifts in the publishing landscape, finding crevices to leverage our expertise and chances to show off that knowledge to researchers, and I expected Plan S to offer that as well. The weird thing, though, is that the responses have been uneven, distributed, and displaced. For example, I was invited along with Rick Anderson of Scholarly Kitchen fame to debate the Plan in front of 200+ managing and technical editors as the plenary session at their conference. On the flipside, Dr. Kelvin Droegemeier, announced as Director of the White House Office of Science and Technology Policy in January 2019 (after a vacancy since something happened in November 2016), flippantly addressed Plan S in an interview, simply saying “we won’t ever tell people where to publish.” Bizarrely, a research policy affecting labs and scholars from Norway to Portugal is giving me a chance to meet and chat with publisher colleagues more than ever before, but has not opened any new doors for communicating the finer points of licensing with faculty on my campus.

A slow-flowing river

Following the current into the near future, I believe that there are three tributaries that will come together. Funders will continue to exert their influence, supplanting publishers as drivers of the conversation, disciplines will adapt discipline-specific means of scholarly sharing (see the rise of pre-prints [PDF]), and policy makers will attempt to legislate cautious action toward a global research marketplace. However… in the U.S. context there are two barriers that could dam the flow. First, uncertainty in our political climate, and an America-first foreign policy agenda, are boiling up concern about “undue foreign influence,” and I fear that isolationism will compel a counter narrative to the open and public sharing of research worldwide. Secondly, America is a god-damn huge country and developing a coherent national framework for openness seems to be a fool’s errand. However, what sometimes appears to be a bog can actually be a river barely inching along. If Plan S was a splash, Plan Open U.S. will be a steady drip, creating geologic formations of systemic change toward a more open research ecosystem.

Conclusions

We read Danny and Micah’s contributions with great interest. They raised several questions about Plan S, which we hope to discuss with Micah after today’s talk.

What can we do to increase engagement of our local academic communities with the open access agenda?
Is it possible to uncouple decisions about research practice from financial or political/ideological considerations?
How can government funders find a balance between dictating open research mandates and respecting the academic freedom of researchers?
Can institutions measure research accurately without creating perverse incentives?
Is there any country in the world where the mention of politicians does not trigger an immediate eye-roll?

Published 24 October 2019

Written by Dr Danny Kingsley (Scholarly Communication Consultant) and Micah Vandegrift (Open Knowledge Librarian at NC State University Libraries).

Compiled by Dr Beatrice Gini

Open Access monographs: Reflections from our recent symposium

23 October 2019 | Uncategorized | Book Processing Charges, BPCs, humanities and social sciences, Learned Societies, monographs, monographs publishing, open access, open access monographs, open access publishing, Plan S, REF | Maria Angelaki

Open access book formats have been under discussion for several years and have attracted interest – and concern – from researchers in Humanities, Arts and Social Sciences as well as amongst institutions, publishers, and funders. Earlier this month the Office of Scholarly Communication organised a one-day symposium on ‘Open Access monographs: from policy to reality’ which took place at St Catharine’s College, Cambridge. It aimed to enable discussion about the open access monographs agenda and its future challenges with the Cambridge community and beyond, to bring together researchers with publishers, funders, experts and innovators in the field of open monograph publishing, and to share experiences about the opportunities and realities of publishing an open access book.

In this blog post we summarise the key themes that emerged from this symposium. For the sake of simplicity we present each issue under a single theme, while accepting that many of the issues discussed do not belong to one theme only and are interlinked with each other.

Picture: front cover of the symposium programme.

‘What would it take to implement open access books for REF?’

This was one of the first questions in the keynote speech by Prof Martin Paul Eve (Birkbeck, University of London), and it highlights an uncomfortable truth in current discussions about open access monograph policy in the UK.

‘To publish 75% of anticipated monographic submission output for the next REF would require approximately £96m investment over the census period. This is equivalent to £19.2m per year. Academic library budgets as they are currently apportioned would not support this cost.’ [1]

The figures are staggering and immediately show that money is the number one challenge in any discussions about monographs in this context. Which brings us swiftly to our first theme: The economics of open access.

The economics of open access

The distribution of the economics is the most important factor in the puzzle of open access monograph publishing. The overall consensus from both publishers and academics is that BPCs (Book Processing Charges) for monographs do not work well in the humanities. They scale badly and concentrate costs. However, it is clear that one business model does not fit all in this sphere. A diversity of business models and ecosystems in which monographs can be published as open access would give authors choice and avoid monopolies. It was thought-provoking to hear Rupert Gatti say that Open Book Publishers couldn’t scale up on their current business model to publish 250-300 books (10 times the number they do now) but they shouldn’t have to. Instead, they can envisage a system where numerous small publishers like themselves exist next to large publishers, like Cambridge University Press (CUP). The idea of avoiding monopolies is key not only for authors but also for readers, as having a few publishers controlling the methods of distribution of this literature could end up restricting the way we access and use content.

Questions were also raised about how BPCs (or their replacement) should be set. Monographs vary in length and complexity, usually determined by their subject matter, and so have vastly different production costs. Should there then be a pricing structure that better reflects this? And in a culture of openness, can we ask publishers to be transparent about their costs and services so researchers can make more informed choices about where to spend their grant money?

Publishers are very aware of the impact that open access is having on their business models and the need to maintain quality in production and the peer review process. CUP stated that digital sales are becoming an important part of monograph publishing and that the timing of open access is also an important factor in the economics. Exploring models of delayed open access might provide one solution to protecting publisher incomes whilst still opening up access to content.

‘Students cannot learn without images’ (Dr Nicky Kozicharow, University of Cambridge)

Another important piece of the puzzle is who pays for the costs of publishing an open access book. The current model used for STEMM (Science, Technology, Engineering, Mathematics and Medicine) journal article APCs (Article Processing Charges), where funders usually pay the costs, was referred to as an epistemic injustice that should not be replicated, as researchers in less economically developed countries are disadvantaged.

The problem around costs of reproducing third-party images was also widely discussed, especially by Dr Nicky Kozicharow. Not enough is being done to support researchers, such as art historians, who rely on images for teaching and research activities. There is a (perceived) lack of training in copyright, which was a useful message, if not an uncomfortable one, to our librarians who routinely deliver training in this and are now revisiting their communications about this. But also image holders should consider how they support researchers – whilst some big holders, such as The Metropolitan Museum of Art, the Wellcome Trust, the Getty Collection and Wikimedia, do provide images free of charge, there was the suggestion that other collection holders should consider opening up access for researchers at affiliated institutions. This access would need to continue for a number of years beyond that affiliation so the images are accessible during the period in which a book will often be written up.

Ethics, Equality, Diversity and Inclusion

“Who and what is OA for? We need to start with the right question” (Prof Margot Finn, President of the Royal Historical Society)

It is also important to view the economic question of who can afford to pay to publish an open access monograph through the lens of equality and inclusion. How can we ensure everyone has equal access to the opportunity to publish open access? Is open access a human right? The role of politics here is critical if we want to make open access work for everyone. Policymakers need to consider issues such as access, gender and nationality when making decisions that institutions and publishers have to interpret and adhere to.

Another group that suffers from the current set-up for open access monographs is early career researchers (ECRs). They often work in a precarious situation, moving between institutions on short-term contracts. This restricts their ability to publish a monograph, which takes considerable time and effort. It is important that when institutions look to sign up to and implement statements such as DORA (San Francisco Declaration on Research Assessment), both monographs and journal articles are considered when looking at academic career progression.

Of course, as we strive to make open access work for everyone, we need to be mindful of impinging on academic freedoms. As noted by Dr Steven Hill (Director of Research, Research England), researchers should still be free to choose the questions they study within the constraints of the system. Academics should also be free to choose the licence that they publish their work under. This has been a major sticking point for many academics (as we have previously written) with CC BY licences seen as an ethical issue in the humanities. Instead, what is needed is a softening of licence choices with options such as non-commercial and non-derivative available – a point also highlighted in the recent Universities UK report on ‘Open Access and Monographs’. Finally, we should not make assumptions that the ethical issues around licences are the same worldwide, because they are not.

Scalability and sustainability

‘We need to understand fully the obstacles that underpin academic research in order to have a sustainable, scalable, global open access model – but we are not there’ (Prof Margot Finn, President of the Royal Historical Society)

The issues around open access monographs are, at times, inextricably linked. Problems to do with economics are inseparable from issues of fairness (to sum the above section up badly) but also from issues of scalability and sustainability. The academic monograph has its own distinct ecosystem in scholarly research. Open access monographs have a global readership, but production of open access books is not necessarily global; it is concentrated at local or national levels. We need to consider the far-reaching consequences of this, including the relationship between the ‘global’ academic researcher and the ‘local’ publisher. We must also consider the role of the policymakers, often European, who set the rules in one country or part of the world and those academics who are not part of this system. Do the levels of academics in these countries and their outputs justify this dominance? We should also obtain more information on how open access books are used in order to justify the expenditure in publishing them, yet Hannah Hope, speaking for the Wellcome Trust, commented that the impact of open access books is hard to measure. Download statistics are often available and provide one measure. For example, UCL Press has had 2.5 million book downloads since its launch four years ago and Open Book Publishers report that their books are being freely accessed worldwide by over 20,000 readers each month.

The ability of publishers to innovate is seen as a key factor in the sustainability and scalability of open access monograph publishing. The size of the publisher will most probably determine their ability to innovate, with smaller publishers being in a better position to ‘take risks’ and try out new models, even if such models end up failing or not being appropriate to implement in bigger organisations due to scalability and financial constraints. Radical OA are experimenting with various business models and bringing down BPCs. The RHS’s New Historical Perspectives (NHP) book series is designed to provide high-quality publishing support to ECR historians (ECRs defined as researchers within 10 years of finishing their PhD) whilst absorbing the costs of BPCs and relying on a generous donation to cover some of the image costs. Indeed, ‘Learned societies have a long history of innovation and experimentation in publishing’ according to Prof Margot Finn, but how they take their experimentation to the next level is yet to be figured out.

Understanding how open access monographs can scale and be sustainable is key to figuring out the type of open access that will prevail. Green open access is not considered to be the future goal for CUP, who are experimenting with a number of different business models such as consortium models, crowdfunding and freemium models. They have also been engaging with authors, researchers and librarians worldwide to understand the monograph landscape better and to demystify issues concerning the publishing process. Likewise, Springer Nature voiced concerns about green open access and, although open access humanities publications account for a very small proportion of their overall open access publications (~10%), they feel that they are well positioned for a more open future.

Collaboration, relationships, communication

‘Publishing is not the end point. Academy as a whole needs to engage with that’ (Dr Rupert Gatti, Trinity College/Faculty of Economics/Open Book Publishers)

What is clear is that there are many players in the field of open access monograph publishing and continued and open communication between all parties is key. Within institutions, academics and research support providers (such as librarians) need to have conversations to ensure the help required is accessible, so authors are not battling with copyright issues alone or remain unaware of the full spectrum of publishing options available to them. Senior leaders at universities must engage with their academic communities to understand the issues they face. They can then in turn engage with policymakers to encourage realistic rules and guidance that would lead to meaningful, measurable outcomes. It was reassuring to hear that consultative approaches are being taken and we welcome the continuation of these.

Publishers also have an important role in re-defining their relationship with academics. As Prof Martin Paul Eve questioned, are publishers solely service providers and academics content creators or are both parties co-producers and academic collaborators in the research process? Prof Roger Kain emphasised ‘the relationship of an author with their publisher with a journal is a very different relationship to the relationship of an author and their publisher with a monograph’ and as such this may lead to the chance to experiment further with business models, if publishers offer more added value through intellectual support for their authors. Of course, Learned Societies and holders of large collections of images have to be involved as well so that their position within the research process is well understood. These conversations should also happen in and between different geographical places because academic research will always have international collaborations.

We also have to be mindful of the messages that come out of these conversations. Time and again we heard the notion that “open access means bad peer review” is still alive in the academic community. This is a myth that all publishers, as well as librarians and other research support staff, are keen to debunk. Another myth is the misconception that “open access is the end of print”. As Prof Roger Kain put it ‘open access does not mean wholly replacing the physical copies of a book but help creators of content to reach wider audiences…OA and print will co-exist’. The term “open science” was noted as appearing to exclude the humanities and, therefore, disengaging researchers before they’ve even got started, even though open science includes all disciplines (this is the reason why in Cambridge we prefer to use “open research”). The language used in Plan S communication was seen as being too opaque, especially for non-native English speakers. If we are encouraging open research, we should be using language that is open and transparent, especially when open research is an international endeavour, as already mentioned. It is important that messages are correct and clear so humanities scholars and other stakeholders can engage fully in debates about the future of open access monograph publishing.

Summary

‘If we are going to take open access for monographs forward in a timely fashion it has to be taken forward as a shared enterprise…an enterprise involving academics as content creators, their funders, their universities…but above all their publishers’ (Prof Roger Kain, School of Advanced Study, University of London & Chairman of the UUK OA Monographs Group)

The symposium saw common themes emerging around issues with open access for monographs as the system currently stands, but also the potential benefits and possibilities that open access could open up in the future. There was consensus that open access needs to go forward as a shared enterprise with all stakeholders being equal players. There was also concern about the future visibility of humanities research when compared to the natural sciences, and a view that humanities authors should strive to demonstrate the impact of their publishing activities.

Many of the themes discussed in this symposium echoed the recommendations as well as concerns outlined in the Universities UK Open Access Monograph report which was published a few days after this symposium took place. The report emphasised that complex questions still remained around issues such as costs, scalability and business models, but it was positive to read statements that the ‘academic book occupies a very distinct space in scholarly research’ reinforcing the fact that monographs are fundamentally different in intention and in kind when compared with journals or fields of research, and that ‘academic book publishing is an international activity’, with whatever implications this entails, as discussed earlier.

Perhaps it is fitting to conclude with a dose of pragmatism by quoting one of Dr Steven Hill’s remarks at the end of the symposium:

‘...a really strong dose of pragmatism has entered this debate; that we all recognise that there are different visions of utopia that different actors in the system might have, but we can see that some of our visions of utopia have to be compromised in order to achieve something that is better than we have now and enable the kind of innovative scholarship that more openness will drive’.

and a note of optimism by Prof Martin Paul Eve, who said the following when he was asked if there are lessons to be learnt from how open access has been applied to journals so far:

‘…we can learn a lot from how the open access debate has played out. I think we also learn a lot in seeing how compromises were reached within that to get to a point that is far better than a decade ago in terms of open access for journals…Momentum is growing, and acceptance is growing. And the idea that we don’t lose quality when things are available openly is growing. All these things are positive and I think we need to take those positives, articulate them from the start and see where that takes us rather than re-inventing the wheel, having the same argument, the same debates, and ending up in the same place, probably, but 20 years from now rather than in the next decade’.

Recordings and most of the presentations are available in the University of Cambridge institutional repository, Apollo, as well as on the OSC YouTube channel. We would like to acknowledge that this symposium was supported by the Arcadia Fund, a charitable fund of Lisbet Rausing and Peter Baldwin.

References

[1] Source: Eve, M.P. et al. (2017). Cost estimates of an open access mandate for monographs in the UK’s third Research Excellence Framework. Insights. https://doi.org/10.1629/uksg.392

Published 23 October 2019

Written by Dr Lauren Cadwallader and Maria Angelaki

Open Research at the University of Cambridge: What have we done so far?

22 October 2019 | Uncategorized | Open Research, open research training, Research Support | Maria Angelaki

At the start of 2019 the University of Cambridge announced its Position Statement on Open Research. This blog looks at what has been happening since then and the current plans for making research at Cambridge more open.

Our Position

In February 2019, the University of Cambridge set out its position on open research to support and encourage open practices throughout the research lifecycle for all research outputs. The Position Statement made clear that both the University and researchers have responsibility in this space and that there would be no one-size-fits-all approach to how to be open. As part of forming a position on open research, the University also created the Open Research Steering Committee to oversee the open research agenda of the University. This Committee is currently looking at three key areas – training, infrastructure and Plan S.

Training

In 2018, we ran a survey on open research [available to Cambridge University only] which highlighted our research community’s desire for more training on open research practices and tools. In order to delve into this further, a pilot was run with the Faculty of Education, which submitted a disproportionately high number of responses to the survey, suggesting a strong interest in open research. The pilot, run earlier this year, encompassed six face-to-face training sessions on topics around open research, such as managing digital information, copyright, and publishing. These sessions were well received by both PhD students and postdocs.

In tandem with this, work is also being carried out to make the provision of open research related training more strategic, sustainable and efficient. For example, some of the courses the Office of Scholarly Communication runs have already been embedded into existing PhD programmes, such as Doctoral Training Centres or the centrally run Researcher Development Programme, but we could still increase the opportunities to work more closely with other parts of the University. With so many other pressures on time, it is essential we work together with all stakeholders involved to ensure we get the balance of training offered correct, so that we make the best use of both the trainer’s and the student’s time.

Finally, the question of sustainability for open research training is also being investigated. How can we ensure open research training reaches the 9,000 or so academics and postgraduate students we have at Cambridge? One answer to this question is online training. We are currently developing a digital course which will introduce the basics of open research, complementary to the soon-to-be-launched online research integrity training. However, we know that researchers value face-to-face sessions too, and intend to continue to develop our face-to-face offer, where we can provide deeper knowledge and discuss issues in more detail. Within the libraries at Cambridge we are also starting to work more closely with research support librarians and others in department libraries who can offer expertise and guidance that is tailored to the discipline.

Infrastructure

The University Position Statement on Open Research says “University support is important to make Open Research simple, effective and appropriate” and a key part of that support is in the form of infrastructure. This is a complicated area because it involves a number of service providers at the University who all have different priorities as well as the large body of researchers, who have a huge variety of needs and technical abilities. Finding common solutions or tools will always be difficult in a large, research intensive institution like Cambridge, which has Schools spread across the spectrum of arts, humanities, social sciences and STEMM subjects.

The Open Research Steering Committee is made up of representatives from across the University, both from different academic Schools and from University services. This is key to ensuring that the drive towards open research infrastructure is holistic and proportional in the context of other University agendas. A landscape review of the services already provided has been carried out, and a ‘wish list’ of IT infrastructure that researchers would like has been compiled. Whilst the ‘wish list’ was compiled in a context wider than open research, it is really heartening to see that many ‘wishes’ relate to systems that would improve open research practices.

There is also work underway to look at how research notebooks (or electronic lab notebooks if you prefer) are being used across the University. A trial of notebooks run in 2017 resulted in the decision not to provide an institution-wide research notebook platform, but guidance instead. This new work under the auspices of the Open Research Steering Committee aims to build on this work by extending the guidelines to include principles around data security, data export and procurement.

Plan S

Plan S looms large on our horizon and will present a challenge when it comes into force in 2021. Whilst we are waiting to see to what extent UKRI’s updated open access policy will reflect Plan S principles, we are busy contributing to the Transparent Pricing Working Group. This group was convened by the Wellcome Trust in partnership with UKRI and on behalf of cOAlition S to bring together publishers, funders and universities to develop a framework to guide publishers on how to communicate about the price of the services in a practical and transparent manner. The University is also looking into how we can implement the principles of DORA, which are supported by cOAlition S. This work is being led by Professor Steve Russell, an academic advocate for open research, and the work will very much be done in consultation with our academic community.

Summary

Cambridge is showing its commitment to enabling open research by taking seriously its role in providing infrastructure, training and the right culture for our academics. These areas need to be tackled holistically and the oversight of the Open Research Steering Committee should allow this to happen. It is important that we are collaborative with our research community and we hope that we have got that balance right with the inclusion of academics in the main Committee and working groups. Ensuring open research is embedded in everyday practice at the University will, of course, take time but we think we are making a good start.

Published 22 October 2019

Written by Dr Lauren Cadwallader

Searching Open Access: steps towards improving discovery of OA in a less than 100% OA world

21 October 2019 | Library and training matters, Uncategorized | Apollo, discovery tools, lean library access plugin, Libraries, open access, scholarly communication, subscriptions | Maria Angelaki

At the heart of the University of Cambridge’s Open Access Policy is the commitment “to disseminating its research and scholarship as widely as possible to contribute to society”.

Behind this aim is the benefit to researchers worldwide, as the OA2020 vision has it, to “gain immediate, free and unrestricted access to all of the latest, peer-reviewed research”. It’s some irony indeed that the growth of the availability of research as open access does not automatically result, without further community investment, in a corresponding improvement in discoverability.

Key stakeholders met at the British Library to discuss the issue at the end of 2018 and produced an Open Access Discovery Roadmap, to identify areas of work in this space and encourage collaboration in the scholarly communications community.[1] A major theme included the dependence on reliable article licence metadata, but the main message was finding the open infrastructure/interoperability solutions for long-term sustainability “ensuring that the content remains accessible for future generations”.

New web pages on Open Access discovery

Recognizing where we are now, and responding to the present, (probably) partial awareness of the insufficiencies in the OA discovery landscape, Cambridge University Library has added pages to its e-resources website to highlight OA discovery tools and important websites indexing OA content. The motivations for highlighting the options for OA discovery on the new pages are described in this blog post. Our main aim is to bring to light search and discovery of OA as a live topic and prevent it “languishing in undiscoverable places rather than being in plain sight for everyone to find.”[2]

Recently, data from Unpaywall for July 2019 has been used to forecast growth in the availability of articles published as OA by 2025, predicting on the basis of current trends, but conservatively – without even taking full account of the impact of Plan S, for example. This forecast for 2025 predicts:

44% of all journal articles will be available as OA
70% of article views will be to OA articles.[3]

Unpaywall’s estimate of OA availability right now is 31%. A third (growing soon to a half) is a significant proportion for anyone’s money, and wanting to signal the shift we have used that statistic as our headline on the page summarizing the most well-known and commonly-used Open Access browser plugins.

Screenshot of Open Access browser plugins webpage, containing the following text: 'Open Access Browser Plugins. A third to a half of articles have an OA version, but finding them can be a challenge. Save time with these easy-to-install OA discovery tools that search repositories, preprint servers, etc. for you'

We want Cambridge researchers to know about these plugins and to be using them, and we aim to give minimal but salient information so that they can select one or several. Our recommendation is the Lean Library extension “Library Access”, but we have been in touch with Kopernio and QxMD and ensured that members of the University registering to use these plugins will also pick up the connection to our proxy server for seamless off campus access to subscription content where it exists, before the plugin offers an alternative OA version.

Once installed in the user’s browser, the plugin uses the DOI and/or a combination of article metadata elements to search the plugin’s database and multiple other data sources. On finding an OA article, a discreet, clickable pop-up icon becomes live (changes colour) and delivers the link or the PDF direct to the user’s desktop. Most plugins are compatible with most browsers; Lean’s Library Access added compatibility with Safari last month.
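To make the DOI lookup step concrete, here is a minimal Python sketch of the kind of query a plugin might make. It is not how any particular plugin is implemented: it assumes Unpaywall's public REST API (a `v2` endpoint keyed on the DOI, with a contact email as a query parameter) and the `is_oa` and `best_oa_location` fields of its response. The function names and the example DOI are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Unpaywall's public v2 endpoint, which takes a DOI in the path
# and a contact email as a query parameter.
UNPAYWALL_API = "https://api.unpaywall.org/v2/"


def build_lookup_url(doi, email):
    """Build the lookup URL for one DOI (hypothetical helper)."""
    query = urllib.parse.urlencode({"email": email})
    return UNPAYWALL_API + urllib.parse.quote(doi, safe="/") + "?" + query


def pick_oa_link(record):
    """Return the best OA link and its licence from a response, or None."""
    if not record.get("is_oa"):
        return None
    location = record.get("best_oa_location") or {}
    # Prefer a direct PDF link, falling back to the landing page.
    link = location.get("url_for_pdf") or location.get("url")
    if link is None:
        return None
    return {
        "link": link,
        "licence": location.get("license"),
        "host_type": location.get("host_type"),
    }


def find_oa_version(doi, email):
    """Query the API over the network and return the OA link, if any."""
    with urllib.request.urlopen(build_lookup_url(doi, email), timeout=10) as response:
        return pick_oa_link(json.load(response))
```

A call such as `find_oa_version("10.1234/example", "you@example.org")` would return `None` when no OA copy is known, which corresponds to the case where the plugin icon stays inactive.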

Each plugin has a different history of development and certain features that distinguish it from others, and we’ve attempted to bring these out on the page. For example, we note Unpaywall’s trustworthiness in the library space thanks to its exclusion of ResearchGate and Academia.edu, its harvesting and display of licence metadata, and its reach through the integration of its data into library discovery systems. We also mention features we think are relevant for potential users looking for a quick overview of what’s out there, such as Kopernio’s Dropbox file storage benefits and integration with Web of Science, and QxMD’s special applications for medical researchers and professionals.

In an adjacent page, Search Open Access, there is coverage of search engines focused on discovering OA content (Google Scholar; 1findr; Dimensions; CORE), a range of sites indexing OA content in different disciplines, both publisher- and community-based, and a selection of repositories and preprint servers, including OpenDOAR.

A screenshot containing the following text: 'Search Open Access. Our selection of the leading and trusted sources to find OA content' — Screenshot of Search Open Access webpage

We hope the site design, based on the very cool Judge Business School Toolbox pages, gets across the basics about the OA plugins available and encourages their take-up. The plugins will definitely bring to the researcher OA alternative versions when subscription access puts the article behind a paywall and, regardless, will expose OA articles in search results that will otherwise be hard to find. The pages’ positioning top-left on the e-resources site is deliberately intended to grab attention, at least for reading left-to-right. It is interesting to see the approach other Universities have taken, using the LibGuide format for example at Queen’s University Belfast and at the University of Southampton.

Experiences with Lean Library’s Library Access plugin

Cambridge has had just over a year of experience implementing Lean Library’s Library Access plugin, and it’s been positive. The impetus for the institutional subscription to this product was partly to take action on the problem searchers face of landing on publisher websites and struggling with Shibboleth federated sign-on. This problem is well documented (“spending hours of time to retrieve a minimal number of sources”) and is most recently being addressed by the RA21 project.[4] Equally, though, we wanted to promote OA content in the discovery process, and Lean Library’s latest development of its plugin, which favours delivery of the OA alternative before the default of the subscription version, is aligned with our values (considerations of versioning aside).

So we’re aiming to bring Lean to Cambridge researchers’ attention by recommending it as the plugin of choice while we are in the transition to “immediate, free and unrestricted access” for all. Only Lean provides context-sensitive linking, updated every 24 hours, to our EZproxy server for off campus delivery of subscription content while also promoting OA alternative versions via the deployment of the Unpaywall database. The feedback from the Office of Scholarly Communication is favourable and the statistics support the positivity that we hear from our users (for the last year, 66,731 Google Scholar enhanced links and 49,556 article alternative views; a rough estimate against our EZproxy logs shows that probably two-fifths of off campus users are accessing the proxy via Lean).

One area of concern is the ownership of Lean by SAGE Publications, in contrast to the ownership say of Unpaywall as a project of the open-source ImpactStory, and what this means for users’ privacy. The concerns are shared by other libraries implementing Lean.[5] Our approach has been to make the extension’s privacy policy as prominent as possible on our page dedicated to promoting Lean, and to engage with Lean in depth over users’ concerns. We are satisfied with the answers to our questions from Lean and that our users’ data is adequately protected. Even in a rapidly changing arena for OA discovery tools the balance is not so fine when it comes to recommending installation of the Library Access plugin over a preference for the illegitimate and risk-prone SciHub.

Libraries’ discovery services are geared for subscription content

Allowing for the influence of searchers’ discipline on choice of discovery service, it’s little surprise that the traditional library catalogue, even when upgraded to a web scale discovery service, favours the inclusion of subscription over OA content. Of course it does, because this is the content the libraries pay for in the traditional subscription model and the discovery system is pretty much built around that. iDiscover is Cambridge’s discovery space for institutional subscriptions and print holdings of the University’s libraries, and Open Access repository content has been enabled for search within it. Further, the pipe for the institutional repository content (Apollo) is established.

Nonetheless, Cambridge will be looking to take advantage of the forthcoming link resolver service for Unpaywall. This is due for release in November 2019 and will surface a link to search Unpaywall from iDiscover when subscription content is unavailable. The link should usually kick in when the search in iDiscover is expanded beyond subscription content. A form of it has already been enabled by at least one university by including the oadoi.org lookup in the Alma configuration.

A listing ship: the reefer ship Ivory Tirupati arriving in Brest with a heavy list. Picture by Hervé Cozanet, licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.

The righting moment in the angle of list is that point a ship must find to keep it from capsizing, and Library discovery system providers’ integration with OA feels a bit like that: the OA indication was included in the May 2018 iDiscover release and suppliers have been working with CORE for inclusion of CORE content since 2017. That righting moment may be just over the horizon as integration with Unpaywall arrives and the “competition” element dissipates. As the consultancy that JISC used to review the OA discovery tools commented: “As the OA discovery landscape is crowded, OA discovery products compete for space and efficacy against established public infrastructure, library discovery services and commercial services”.[6]

A diffuse but developing landscape

Easy to install and effective to use, the OA discovery tools we are promoting are still widely thought of as at best providing a patch, a sticking-plaster, to the problem. A plethora of plugins is not necessarily what the researcher wants, or is attracted by, however necessary the plugin may be to saving time and exposing content in discovery. Possibly the really telling use case has yet to be tried: the plugin coming into its own in a big deal cancellation scenario.

Usage statistics for the Lean Library Access plugin probably reflect the fact that most of the article content required by the University is available on subscription via IP access, so the need for the plugin is almost entirely limited to the off campus user. The Lean plugin’s relatively modest totals are, though, consistent with reports of plugin adoption by institutions that have cancelled big deals. The poll of the Bibsam Consortium members revealed that 75% of researchers did not have any plugin installed; the percentage for the University of Vienna in particular was 71%; and the KTH Royal Institute of Technology authors “rarely used” a plugin.[7]

Another conjecture is that there is an antipathy to any plugin that could be collecting browsing history data: however “dumb” and programmatically erased that data may be, the concern over privacy is such that the universal adoption libraries may hope for is unachievable. The likeliest explanation possibly lies in the tipping point from subscription to OA. Despite the Apollo repository’s usage being one of the highest in the country (1.1 million article downloads from July 2018 to July 2019), Cambridge’s reading of Gold OA is c. 13% of total subscription content, including journal archives. A comparison with the proportions of views by OA type in Unpaywall’s recently published data (cited above) suggests this is on the low side in terms of worldwide trends, but it must be emphasized that this is a subset of OA reading and excludes green, hybrid, and bronze. Just consider, for instance, the 1.5 billion downloads from arXiv globally to date.[8] Similarly, the stats from Unpaywall are overwhelmingly persuasive of the success of the plugin: as of February 2019 it delivered a million papers a day, or roughly 10 papers a second.

Graph is showing a steady growth in the total number of open access items from less than 475,000 in January 2016 to nearly 1,700,000. Likewise, the number of institutional repositories increased from 96 to 180 during the same period. — IRUS-UK growth of open access items since January 2016 (The red bars indicate total items, orange bars number of articles and green bars number of articles with DOIs. The blue line indicates the number of institutional repositories)

The inspirational statistician and “data artist” Edward Tufte wrote:

We thrive in information-thick worlds because of our marvellous and everyday capacities to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorise, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff, and separate the sheep from the goats.[9]

There’s thriving and there’s too much effort already. Any self-respecting OA plugin user will want to winnow, and make their own decisions on the plugin(s). In a less than 100% OA world, that combination of subscription and OA connection separated from physical location (on/off campus) is a critical advantage of the Lean Library offering, combined as it is with the Unpaywall database. Libraries will find much to critique in the institutional dashboards or analytics tools now built on top of some plugins (for instance, the distinction of physical location when accessing the alternative access version in the Kopernio usage data).

From the OA plugin user’s perspective, the emerging cutting edge is currently with the CORE Discovery plugin, as reported at the Open Repositories 2019 conference, in the “first large scale quantitative comparison” of Unpaywall, OA Button, CORE OA Discovery and Kopernio. This report reveals important truths for OA plugin critical adopters, for instance showing less than expected overlap in comparison of the plugins’ returned results from the test sample of DOIs, and the assertion “we can improve hit rate by combining the outputs from multiple discovery tools”.[10]
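The point about combining outputs can be illustrated with a small Python sketch. It assumes a "hit" means a tool returned an OA copy for a DOI in the test sample, and that the combined hit rate counts DOIs found by at least one tool. The tool names and DOIs below are invented for illustration and are not figures from the cited comparison.

```python
def hit_rate(found, sample):
    """Fraction of the sample DOIs for which an OA copy was found."""
    return len(found & sample) / len(sample)


def combined_hit_rate(results, sample):
    """Hit rate when a DOI counts as found if any tool returned it."""
    found_by_any = set().union(*results.values())
    return hit_rate(found_by_any, sample)


# Hypothetical test sample of ten DOIs and two hypothetical tools whose
# results only partly overlap.
sample = {f"10.1234/doi{i}" for i in range(1, 11)}
results = {
    "tool_a": {f"10.1234/doi{i}" for i in range(1, 5)},  # finds DOIs 1-4
    "tool_b": {f"10.1234/doi{i}" for i in range(3, 8)},  # finds DOIs 3-7
}

for name, found in results.items():
    print(name, hit_rate(found, sample))
print("combined", combined_hit_rate(results, sample))
```

With this made-up data the individual hit rates are 0.4 and 0.5, while the combined rate is 0.7. The smaller the overlap between tools, the more a combined lookup gains over any single one.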

It’s become popular for our present day Johnson to quote his namesake, so in that vogue we should expect the take-up of Lean Library and CORE Discovery to bring closer that “resistless Day” when researchers the world over get “immediate, free and unrestricted access to all of the latest, peer-reviewed research” and the “misty Doubt” over the OA discovery landscape will be lifted.[11]

[1] Flanagan, D. (2018). Open Access Discovery Workshop at the British Library, Living Knowledge blog 18 December 2018. DOI: https://dx.doi.org/10.22020/v652-2876

[2] Fahmy, S. (2019). Perspectives on the open access discovery landscape, JISC scholarly communications blog. https://scholarlycommunications.jiscinvolve.org/wp/2019/04/24/perspectives-on-the-open-access-discovery-landscape/

[3] Piwowar, H., Priem, J. & Orr, R. (2019). The future of OA: a large-scale analysis projecting Open Access publication and readership. bioRxiv preprint. https://www.biorxiv.org/content/10.1101/795310v1

[4] Hinchliffe, L. Janicke. (2018). What will you do when they come for your proxy server?, Scholarly Kitchen blog. https://scholarlykitchen.sspnet.org/2018/01/16/what-will-you-do-when-they-come-for-your-proxy-server-ra21/

[5] Ferguson, C. (2019). Leaning into browser extensions, Serials Review, v. 45, issue 1-2, p. 48-53.

[6] Fahmy, S. (2019). Perspectives on the open access discovery landscape, JISC scholarly communications blog. https://scholarlycommunications.jiscinvolve.org/wp/2019/04/24/perspectives-on-the-open-access-discovery-landscape/

[7] See the presentations from the LIBER 2019 conference on zenodo here https://zenodo.org/record/3259809#.XaA0Qr57lhF and here https://zenodo.org/record/3260301#.XaAz6757lhF

[8] arXiv monthly download rates, https://arxiv.org/stats/monthly_downloads

[9] Tufte, E. (1990). Envisioning information, Cheshire, Connecticut, Graphics Press, p. 50.

[10] Knoth, P. (2019). Analysing the performance of open access discovery tools, OR 2019, Hamburg, Germany. https://www.slideshare.net/petrknoth/analysing-the-performance-of-open-access-papers-discovery-tools

[11] Johnson, S., In Eliot, T. S., Etchells, F., Macdonald, H., Johnson, S., & Chiswick Press,. (1930). London: a poem: And The vanity of human wishes. London: Frederick Etchells & Hugh Macdonald. l. 146.

Published Monday 21 October 2019

Written by James Caudwell (Deputy Head of Periodicals & Electronic Subscriptions Manager, Cambridge University Library)