Informing the Elsevier negotiations: Dominic Dixon on the work of the Data Analysis Working Group

As part of our series of posts on the Elsevier negotiations, Dominic Dixon, Research Librarian at Cambridge University Libraries, explains the work of the library’s Data Analysis Working Group to access, understand and analyse the data relating to how researchers at Cambridge use Elsevier publications. These findings are also presented as a series of data visualisations on the recently launched Elsevier Data Dashboard [Cambridge University Raven account required].

Having a strong underpinning of data is critical to strengthening the University and sector position in negotiations with Elsevier. This post outlines the Data Analysis Working Group’s approach to gathering and presenting the data underpinning the negotiations, looks at some of the questions we have sought to answer, and shares some high-level findings from our analysis.

As with many data science projects, a large majority of the time has been spent on data cleaning. This is in part due to the way the exports from the platforms we used are structured but also to allow us to carry out a more fine-grained analysis than would have been possible with the data in its default state. Some of this work involved disambiguating publisher names, splitting and pivoting fields with multiple entries (e.g., funders, disciplines, and subjects), and enriching the records with metadata not included in the original files.
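To make the cleaning steps above more concrete, here is a minimal sketch of two of them: mapping publisher name variants to one canonical name, and splitting a multi-valued field into one row per value. It is not the working group's actual code. The field names, the ";" separator, the alias table and the sample rows are all assumptions made for illustration.

```python
# Hypothetical export rows: field names and the ";" separator are assumptions.
rows = [
    {"id": "pub.1", "publisher": "Elsevier BV", "funders": "Wellcome Trust; UKRI"},
    {"id": "pub.2", "publisher": "ELSEVIER", "funders": "UKRI"},
    {"id": "pub.3", "publisher": "Springer Nature", "funders": ""},
]

# Hypothetical lookup from lower-cased name variants to one canonical name.
PUBLISHER_ALIASES = {
    "elsevier bv": "Elsevier",
    "elsevier": "Elsevier",
}


def canonical_publisher(name):
    """Normalise case and whitespace, then map known variants to one name."""
    key = " ".join(name.lower().split())
    return PUBLISHER_ALIASES.get(key, name.strip())


def explode(records, field, sep=";"):
    """Yield one record per value of a multi-valued field (long format).

    Records whose field is empty are kept, with the field set to None,
    so that no article silently disappears from later counts.
    """
    for record in records:
        values = [v.strip() for v in record[field].split(sep) if v.strip()]
        for value in values or [None]:
            yield {**record, field: value}


cleaned = [{**r, "publisher": canonical_publisher(r["publisher"])} for r in rows]
long_rows = list(explode(cleaned, "funders"))
```

Once the data is in this long format, counting articles per funder is a simple group-and-count, although an article with several funders then appears once per funder, so totals across funders should not be read as article totals.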

Publishing

To build a profile of research published by Cambridge researchers in Elsevier journals, we experimented with three platforms: Dimensions, Scopus, and Web of Science (WoS). Each of these platforms is commercial and each has varying levels of coverage and richness of metadata. A recent comparative analysis between WoS, Scopus and Dimensions found that Dimensions indexed 82.22% more journals than WoS and 48.17% more journals than Scopus. We decided to compare the coverage in each of these platforms for articles published between 2015 and 2020 by a Cambridge-affiliated author. In this case, WoS (n=59,587) returned the most results, with Dimensions (n=58,908) returning 1% fewer and Scopus (n=40,385) returning 32% fewer.* However, filtering to Elsevier gave a different picture. Dimensions (n=11,431) returned the most articles, with WoS (n=9,504) returning 16% fewer and Scopus (n=6,345) returning 44% fewer. Given this and considering that our primary focus was research published by Elsevier, we opted to use Dimensions.
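Percentage comparisons like these depend on which count is taken as the base. The sketch below recomputes the differences from the counts quoted above using both conventions. The observation that the quoted figures are consistent with using the larger count as the base is an inference from the arithmetic, not something stated in the analysis itself.

```python
# Record counts quoted in the text above.
counts_all = {"WoS": 59_587, "Dimensions": 58_908, "Scopus": 40_385}
counts_elsevier = {"Dimensions": 11_431, "WoS": 9_504, "Scopus": 6_345}


def pct_more(larger, smaller):
    """How much bigger `larger` is, as a percentage of `smaller`."""
    return (larger - smaller) / smaller * 100


def pct_fewer(larger, smaller):
    """How much smaller `smaller` is, as a percentage of `larger`."""
    return (larger - smaller) / larger * 100


def compare(counts):
    """Compare every platform against the one with the highest count."""
    top = max(counts, key=counts.get)
    return {
        name: {
            "fewer_than_top": pct_fewer(counts[top], n),
            "top_is_more_by": pct_more(counts[top], n),
        }
        for name, n in counts.items()
        if name != top
    }


all_articles = compare(counts_all)
elsevier_only = compare(counts_elsevier)
```

With these counts, Scopus returns roughly 32% fewer records than WoS overall, which is the same gap as WoS returning roughly 48% more than Scopus. Stating the base explicitly avoids the two being confused.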

Of the 58,908 records exported from Dimensions, we found that 19% were published in Elsevier journals, making Elsevier the single most chosen publishing venue for Cambridge authors. Filtering to only articles with a Cambridge corresponding author, we again found that Elsevier was the most chosen publishing venue, with over 34% (n=4,564) of the articles published in Elsevier journals. Having looked at publishing levels more broadly, we then broke down the Elsevier articles published with a Cambridge corresponding author by Open Access category. We found that 25% (n=1,137) of the articles were categorised as closed and therefore behind a paywall, 35% (n=1,585) were paid for via different routes including funder block grants administered by the University, 32% (n=1,467) were self-archived (Green OA), and 8% (n=375) were published in journals that do not charge APCs. Thus, the percentage of articles that are either behind a paywall, or are only available openly because an APC has been paid, is significantly higher than the percentage published open access without any associated fees.
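As a quick check on the breakdown above, the shares can be recomputed from the quoted counts. This is a minimal sketch in which the category labels are shorthand chosen for illustration, not the names used in the Dimensions export. The four counts sum to 4,564, and on that base the closed share works out to roughly 25%.

```python
# Counts quoted above for Elsevier articles with a Cambridge corresponding
# author; the dictionary keys are shorthand labels (an assumption).
oa_counts = {
    "closed": 1_137,
    "paid_apc": 1_585,
    "green": 1_467,
    "no_apc_journal": 375,
}

total = sum(oa_counts.values())  # 4564

# Share of each category, rounded to the nearest whole percent.
shares = {name: round(n / total * 100) for name, n in oa_counts.items()}

# Articles that are paywalled, or open only because an APC was paid.
paywalled_or_paid = oa_counts["closed"] + oa_counts["paid_apc"]
paywalled_or_paid_share = round(paywalled_or_paid / total * 100)

# Articles published open access in journals that charge no APC.
fee_free_share = shares["no_apc_journal"]
```

On these numbers about 60% of the articles are either paywalled or open because an APC was paid, against about 8% published in journals that charge no APC.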

Another aspect of publishing we decided to focus on is funding, asking specifically “Who is funding the Cambridge research published with Elsevier?”. Given the inclusion of funder data in the Dimensions export, we were able to break down the articles by both funder and funder group. Looking at articles with a Cambridge-affiliated author, articles with a Cambridge corresponding author, and articles resulting from grants, we found that in each category over 70% were linked to at least one cOAlition S funder. The wider implication of this – specifically for the corresponding author articles – is that in the absence of a read and publish agreement, many of the funders would not pay the APCs associated with publishing in Elsevier journals.

Reading

To provide a picture of the extent to which articles published in Elsevier subscription journals are read at Cambridge, we gathered usage data from COUNTER and the Alma library management system. This allowed us to consider reading over the six-year period between 2015 and 2020 both overall and at a disciplinary level. We found that reading of Elsevier journals was consistently higher in each year than for any other publisher. Reading of Elsevier in 2020 represented 20% of all reading and was at its highest level in physical sciences and engineering. The single highest total of reading in the sub-categories within each discipline was in biochemistry, genetics, and molecular biology within the life sciences, with over 400,000 article downloads in 2020 alone.

Another question we considered is how frequently articles published in Elsevier journals are cited by researchers at Cambridge. To answer this, we took advantage of the Dimensions API to gather a dataset of the cited publications from articles published with a Cambridge-affiliated author between 2015 and 2020. The resulting dataset consisted of over 1.2m bibliographic records and revealed that 22% (n=269,917) of the cited articles were published by Elsevier. Interestingly, this percentage closely matches the percentage of articles published in Elsevier journals by Cambridge-affiliated authors (19%), the percentage of articles read at Cambridge in 2015-20 (21.78%), and the percentage of publishing with Elsevier at the national level (20%). Using the Dimensions API to enrich the citation data with the open access category, we were able to see that 66% (over 174,000 publications) of the cited Elsevier content is currently paywalled. Elsevier is both the most cited and most paywalled publisher. This observation has wider implications for open research given that many of these articles would be inaccessible to those who are not affiliated with an institution that subscribes to the journals in which the articles appear.
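The shape of the citation analysis above can be sketched offline, without calling the Dimensions API. The records below are invented, and their field names ("reference_ids", "publisher", "oa") are assumptions rather than the exact fields we retrieved. The post also does not say whether each cited article was counted once or once per citation, so this sketch simply counts each cited article once.

```python
from collections import Counter

# Hypothetical citing articles, each with the IDs of the works it cites.
citing = [
    {"id": "pub.A", "reference_ids": ["pub.1", "pub.2"]},
    {"id": "pub.B", "reference_ids": ["pub.2", "pub.3"]},
    {"id": "pub.C", "reference_ids": []},
]

# Hypothetical metadata for the cited works, as it might look after enrichment.
cited_meta = {
    "pub.1": {"publisher": "Elsevier", "oa": "closed"},
    "pub.2": {"publisher": "Elsevier", "oa": "green"},
    "pub.3": {"publisher": "Wiley", "oa": "closed"},
}

# Flatten the reference lists and keep each cited work once.
cited_ids = sorted({ref for article in citing for ref in article["reference_ids"]})

# Count cited works per publisher.
by_publisher = Counter(cited_meta[i]["publisher"] for i in cited_ids)

# Share of cited works published by Elsevier, and share of those that are closed.
elsevier_ids = [i for i in cited_ids if cited_meta[i]["publisher"] == "Elsevier"]
elsevier_share = len(elsevier_ids) / len(cited_ids)
closed_share = sum(cited_meta[i]["oa"] == "closed" for i in elsevier_ids) / len(elsevier_ids)
```

Counting once per citation instead would mean dropping the set and keeping a plain list, which weights heavily cited articles more and can shift the publisher shares.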

Paying

One of the main questions we considered when looking at data relating to expenditure on Elsevier was how much we pay to publish in Elsevier journals. Our source for this data was OpenAPC – an initiative that aggregates data on open access expenditure and makes it openly available – combined with data from our internal compliance reports. Looking at the overall spend across all institutions that have contributed to the OpenAPC dataset, we can see that over €49,000,000 has been paid to Elsevier. This represents 19% of the total reported spend on article processing charges (APCs). Looking at the data on Cambridge expenditure, we found that between 2015 and 2020, 30% (over £3,000,000) of our total spend on APCs from block grants was paid to Elsevier (the highest spend on any single publisher), with a single payment averaging £3,302 and ranging between £450 and £7,320.

Final notes

This post has covered just some of the questions we have been able to answer with the data. Overall, we think we have demonstrated that Elsevier’s journals are among the most read and most published in, but are also consistently the most paywalled and the most expensive to publish in, of any publisher. This serves to highlight the importance of the ongoing negotiations and of considering other options such as those explored in previous posts. Our complete findings are presented on a dashboard that is accessible to members of the University. Unfortunately, legal restrictions mean we are not able to share the dashboard or underlying datasets externally; however, we have made the Python code we used to gather the citation data available as a Jupyter notebook on Google Colab. This can be used to retrieve the dataset we used to carry out the citation analysis and is easily modifiable for other purposes (see the notebook for full details). We refer the interested reader to the Dimensions API Lab, and the ESAC guide to uncovering the publishing profile of your institution. The former was helpful for learning how to take advantage of the Dimensions API (as were the staff at Dimensions), and the latter has been useful in formulating our approach to the whole project. We are also happy to answer questions about any aspect of our work.

* The original percentage quoted here was 18%. This was incorrect and has now been corrected to 32%.

How might we reduce our dependency on legacy publishers such as Elsevier?

To coincide with our first townhall event on the Elsevier negotiations, Professor Stephen Eglen offers his perspective on the University’s future relationship with the publishing industry. Prof. Eglen is Professor of Computational Neuroscience in the Department of Applied Mathematics and Theoretical Physics at the University of Cambridge.

I’m often asked why I single out Elsevier when discussing spurious publishing practices*. The simple reason is that they are the single largest publisher that most institutions deal with. Other legacy publishers adopt similar practices, outlined below, that I disagree with. However, given that Elsevier tends to take about 40% of our journal subscription costs, it is worth focusing on. Even finding out these costs required an extensive set of FOI requests over several years, revealing a large disparity in costs between UK Universities. However, I do not blame Elsevier for the current situation – they are a successful business with shareholders to satisfy. Their consistently high operating margins (~30%) indicate that they are very capable. However, this comes at a price, e.g. their median gender pay gap in 2020/21 was 36%, compared to 11.1% at the University of Cambridge, and 7.3% at Springer Nature.

Big Deals

We currently ‘rent’ the collection of Elsevier-published articles via ScienceDirect; this is analogous to a cable TV subscription where you pay a monthly amount to access all of the TV channels from a particular company. Just like a cable TV subscription, it is significantly cheaper and more convenient to buy everything than it is to buy just what you really need.

In the case of Elsevier, I would argue that they publish a few, very popular, journals, such as The Lancet, Cell, and many others within Cell Press. In my corresponding slides, I analyse our University’s download statistics of articles for 2019. I am unable to publicly share these findings, as the download statistics data are not freely available. However, it was no surprise to see that some journals are much more popular than others. Our costs for the Big Deal have steadily increased in recent years, on the notion that we are getting access to more content with more journals being added. However, these would seem to be journals that are rarely read.

Transformative deal elements

Jisc are now negotiating with Elsevier for a ‘transformative deal’ which will allow us to both read and publish in these journals. Although attractive on the surface, such deals simply maintain the ‘lock in’ to the publisher, and come with their own problems. For example, often the number of articles that can be published per year is capped – what happens when the cap is exceeded? They also unwittingly introduce problems for scholars in poorly funded institutions who can afford neither such deals nor high article processing charges (APCs). Why should we continue to support a system that negatively impacts scholars in the Global South?

Embargoes and Rights Retention

UKRI currently have clear requirements for dates by which research articles should be open access: 6 months for STEM, and 12 months elsewhere. Journals therefore had a choice: allow authors to meet UK open access policy by paying an APC, or reduce embargo periods on author accepted manuscripts so that authors could meet UK requirements without paying an APC. Unsurprisingly, Elsevier saw no reason to reduce their embargo periods. From 2022, however, UKRI have reduced these embargo periods to zero months. While it is highly unlikely that Elsevier will reduce their embargoes to comply with this new policy, we can look to alternative approaches to facilitate immediate open access.

For example, the Rights Retention Strategy is a recent innovation to allow authors to maintain rights over their work, rather than signing over copyright to the publishers, which enables the immediate release of author accepted manuscripts upon publication. This is seen as a valuable tool for promoting green open access. However, many publishers, including Elsevier, have voiced their opposition to the Rights Retention Strategy. Further, in July 2021, Elsevier wrote to editorial board members noting that it had been lobbying UKRI and Government regarding its opposition to planned changes to UKRI policy.

What next?

The publishing industry is moving towards deals combining both read access to journals and the ability to publish in those journals. I dislike such deals, as I think they continue a ‘lock in’ of funds to one publisher, having negative consequences for those that cannot afford them. I believe all scientific articles need to be free to access, and ideally free for authors to publish. So, how might we reduce our dependency on legacy publishers such as Elsevier?

  1. Decline a big deal and instead subscribe to individual journals from the ScienceDirect catalogue that are tailored to our local needs. US institutions have done this recently, saving significant sums (Thornton and Brundy, 2021).
  2. Use the savings to invest in more ethical approaches to scholarly publishing. This would include Diamond OA journals, such as Discrete Analysis, Volcanica and Beilstein Journal of Organic Chemistry. Such journals are supported by grants, and are free of author-fees and free to read. We could also invest further in infrastructure to support scholarly publishing. For example, UK has been a long-term supporter of the arXiv preprint server – in 2020, we contributed $55,000 to its running costs. Finally, libraries are now choosing to directly fund open access journals in models such as Subscribe to Open and The Open Library of Humanities.
  3. Support researchers to adopt the Rights Retention Strategy as a way of maintaining rights over their works, and being able to publish their work as green open access.
  4. Work together with other institutions to ensure legal, low-cost, alternative routes to accessing scientific literature (e.g. rapid interlibrary loans).

I hope the discussions that we are having in Cambridge around the Elsevier negotiations are just one part of a larger move to consider our future vision of an ethical and inclusive scholarly publishing system. Legacy publishers will not provide this; we in academia need to build this vision, working with forward-looking partners.

*(Disclaimer: I am writing this from the point of view of a researcher in Computational Neuroscience, where I routinely read journal articles in both biological and mathematical domains. I am aware that views on open access vary across disciplines.)

Staff introduction: Dr. Samuel Moore, Scholarly Communication Specialist

I am delighted to have joined the Office for Scholarly Communication here at Cambridge and wanted to post a brief introduction about my previous work in scholarly communication and the vision I have for my role as Scholarly Communication Specialist.

I have been involved in open research and scholarly communication for the past fifteen years, having both worked for a number of open access publishers and completed a PhD on the transition to open access in humanities disciplines. I am an information studies researcher by training and a strong advocate for openness in scholarly research. I therefore hope to help Cambridge continue to steer towards an open future for scholarly communication, but importantly one that does not leave any discipline or researcher behind. Open research needs to be sensitively embedded in our disciplinary cultures so that it is a natural and easy thing to practice.

My doctoral research in the Department of Digital Humanities at King’s College London explored the contrasting approaches to open access publishing of policy-based and grassroots initiatives. From studying the UK funder policies, I identified a tendency to frame open access in terms of compliance rather than something good for its own sake. This meant that many researchers found open access an institutional burden or something not relevant to their own discipline or working practices, while others assumed that OA is just another way for commercial publishers to make increasing amounts of money. Though I think this reputation has improved, my position at Cambridge will be based on helping to show the exciting potential of open research, particularly its ability to contribute to a healthier publishing culture across all scholarly disciplines. For example, a focus of my initial work will be around monographs and the various ways of supporting researchers to explore the diverse ecosystem of long-form open publishers that exists in the humanities, especially those presses that do not charge a fee to publish.

Yet in order to move away from the culture of compliance across all disciplines, we not only have to show the full range of open access publishing opportunities available to researchers, we also have to build upon the work of educating our colleagues about publishing not just as a practice but as an industry that shapes this practice. Fairly or unfairly, the publishing industry receives a great deal of bad press within higher education and this has led to a continual reappraisal of academia’s relationship with publishing, specifically with respect to open access. Using this blog and other channels, I hope to inform the university of the debates around the future of scholarly publishing so that researchers can better understand how their publishing decisions are situated in this changing environment. This will involve showcasing a range of views on publishing and the changing ways in which researchers communicate and distribute their work.

One way of increasing academic engagement in scholarly publishing is through community consultation on forthcoming developments. The libraries have recently announced a renegotiation of Cambridge’s contract with Elsevier, which is due to expire at the end of this year, in order to seek an affordable Read & Publish deal with the publisher if possible. We are hoping to hear from as many voices at Cambridge as possible about what the university’s future relationship with Elsevier should look like. Alongside showcasing views on this blog, we encourage academics to get in touch via this form to share their views and to stay informed about future activities in this area. Please do also contact me if you are interested in writing a blogpost on the topic or in learning more.

Related to the future of open access, I am also interested in providing support for academic governance of scholarly communication. I am a scholar of digital commons and community governance and I hope to impart some of this knowledge to ensure greater accountability of publishing by research communities themselves. Currently, academics have a great deal of editorial oversight over the publications they edit, but less oversight of price, ownership and other policy-related matters, despite the free labour and content we give to publishing houses. I will be discussing with academics and publishers how we can work together to return accountability of publishing to research communities from the market at large.

Finally, I hope to support and showcase all the excellent work going on in scholarly communication at Cambridge. There are pockets of activity across the university that would benefit from wider recognition and greater support, and I have already been contacted by colleagues looking to start or reinvigorate their small-scale publishing projects. I will be exploring the ways that libraries can help here, ideally through resources and software, but also through sharing expertise with one another. Again, do get in touch if you have a publishing project that I should know about or can help with.

Scholarly communication is changing rapidly, driven not only by the pandemic’s demand for openness, collaboration and immediacy of dissemination but also by policies like Plan S and the soon-to-be-announced revised UKRI Open Access Policy. As we move in the direction of openness, it is important that all voices in the academic community are heard and that researchers feel confident that open research works for them. I look forward to working with colleagues to help shape Cambridge’s strategy for the future of scholarly communication.