
The Data Picture

I was recently named one of “the next generation of [library] leaders” as part of the CILIP 125, having been recognised as an individual who contributes energy and knowledge to improving and impacting their organisation. My area of expertise, and thus recognition, lies in the use of data within libraries. As a data analyst for the Office of Scholarly Communication at Cambridge University Library, my role focuses on empowering decisions with data-driven understanding – such as supporting the Springer Nature negotiations. To further develop my understanding of data and its role within a wider organisation, I engage with data beyond the library, for example through the Big Data London conference and the Carruthers and Jackson Data Leaders’ Summer School. Reflecting on the use of data in the wider world, what can be expected of the library and data?


The summer school provided practical advice, proven methodologies, and guidance that could apply across a variety of businesses. The course is designed to provide insight on the workflow of data officers, and their role within an organisation – no matter its stage of data maturity and literacy. Over the course of the ten weeks, leading experts discussed the role of a chief data officer (CDO), both as a business development opportunity, and as a career path for individuals. It explored the risk and governance of data within an organisation, and the final weeks focused strongly on the role of people and teams associated with data.

Peter Jackson and Caroline Carruthers addressed the differing types of CDO and described a pendulum between ‘risk aversion’ and ‘value added’: the balance between secure and proper data governance (GDPR, for example) and providing value through data (such as setting up automation). The pendulum of risk to reward is relevant to many roles, including those within the library. The need to divide time and energy between creating policies and delivering results for decision making is just as relevant to my role as it is to that of a chief data officer. In my role I have supported decision-making staff through data production but, equally, to instil a culture of data, time and energy must be dedicated to risk aversion, through tasks such as researching data management, preparing training sessions on data storage, and supporting staff in data preparation.

Another important concept introduced was the DIKW pyramid – Data, Information, Knowledge, Wisdom – for understanding the value created from data. The base of the pyramid is (raw) Data, which can be processed into (useful) Information. Information is data with meaning and a purpose, and can be organised into (insightful) Knowledge. Knowledge combines experiences, values, insights, and contextual information, and can in turn be elevated to (integral) Wisdom. Wisdom is considered a deeper understanding with ethical implications and the ability to define ‘why’. The DIKW pyramid provided a frame of thought for presenting and approaching future data projects: understanding whether data, information or knowledge is required helps to better support a decision-making team.

To develop communication skills, expert Scott Taylor, known as The Data Whisperer, spoke about the three Vs of data storytelling: Vocabulary, Voice and Vision. Combining an accessible vocabulary with a common voice will illuminate the business vision and why it is important. This overarching concept for an organisation’s data approach can be scaled down to support individual data workers in providing value, which should either grow, improve or protect the business. Understanding how to communicate the data is a key skill, as “Hardware comes and goes, software comes and goes, but data remains”. And the data that remains should serve that same purpose, which means that the data gathered should be usable data!

At Big Data London, the organisation Women in Data hosted conversations about nurturing a culture of learning within data teams. Drawing on their experiences as people from minority backgrounds, the speakers highlighted the power of upskilling, sharing skills across teams and being an advocate for oneself and one’s skills. As for what to upskill, data literacy was a hot topic across the conference. Data literacy, also called data fluency or data confidence, is the combination of ability, skills and confidence surrounding data and its uses. Data literacy enables more efficient work, and raises the question: what is the base level of data literacy and confidence across the library? Librarians use data daily, whether checking material in and out, answering students’ queries, or tracking the use of space, but are all librarians confident in using that data? This is an area I hope to explore further at CUL, to ensure staff can use the data they have to support decisions.


Engaging with the world of data provides a big picture of the possibilities within the library. Conversations about AI (Artificial Intelligence), data policies and maturity, and shiny new databases, software, and services demonstrate the growing adoption of data, and libraries should follow suit. Actively taking snippets of larger conversations, developing ideas within the library space, and exploring the possibilities with data will help libraries thrive in this world of technological growth.


Should the UK make a deal with Springer Nature?

This is a guest post by Prof. Stephen J. Eglen on the current negotiations between the UK academic sector and the publisher Springer Nature. Prof. Eglen is a Fellow of Magdalene College and Professor of Computational Neuroscience in the Department of Applied Mathematics and Theoretical Physics at the University of Cambridge. This post does not necessarily reflect the view of Cambridge University Libraries.

The UK academic sector is currently in discussion with Springer Nature around a renewed ‘read and publish’ deal for journal content. I understand that most institutions are likely to reject the current deal, but wish to continue negotiations. My position is that further discussions with Springer Nature are futile; we should stop accepting ‘transformative deals’. The likely effect of this deal would be that more of Springer Nature’s content may be openly available to read, but with the ‘paywall’ shifted to the publish side. Here I list my key objections:

  1. There is still no justification for the high APCs (9500 EUR + taxes) for Nature tier journals. Accepting a deal, regardless of the level of discounts that could be achieved, is implicitly accepting their business model. Springer Nature declined to engage with the Journal Comparison Service run by cOAlition S that aims to help understand how costs are determined.
  2. Springer Nature’s view is that ‘gold OA’ is the only viable way to open access. Other models for open access are available, and show promise, including diamond OA journals and Subscribe to Open. However, Springer Nature assert that “they haven’t found a way of making them financially sustainable”. If we accept a gold-only view of open access, how can we objectively assess the sustainability of alternative models?
  3. A move to a ‘gold only’ OA world would shift the barrier from reading to publishing content. Springer Nature recently announced a waiver policy for researchers from about 70 lower income countries. This still excludes many researchers worldwide e.g. from Brazil and South Africa, perpetuating neo-colonial attitudes towards the creation of scholarly content and reinforcing existing institutional inequalities within countries. Any waiver programme for APCs should be “no-questions-asked” regardless of where researchers are based. This would need to be properly costed and part of the justification of the APC (point 1).
  4. As of January 2023, several UK institutions have rights retention policies in place, with more expected to follow in the coming months. Individual researchers can also apply a rights retention strategy themselves. Rights retention statements allow researchers to meet UK funders’ requirements by depositing their author-accepted manuscript without embargo. I believe Springer Nature should publicly state that they will allow any author worldwide to maintain their rights on their own author-accepted manuscripts.
  5. Over half of Springer Nature’s hybrid journals failed to meet their 2021 targets for open access articles. Those hybrid journals that fail again this year to meet their targets will be removed from cOAlition S’s transformative journal programme. Having some journals ineligible for cOAlition S funding but part of a UK read-and-publish deal would further complicate an already confusing system. It would also call into question Springer Nature’s commitment to open access.

A detailed public critique of the deal is not possible because of the confidential nature of the negotiations. Finances aside, I feel there was one element that was simply unworkable and unethical, because it would require scholars to keep one aspect confidential if the deal were accepted.

The UK is one of only a few countries with a heavy reliance on transformative agreements. Sweden has already decided that transformative agreements are not sustainable and the transition period should finish at the end of 2024. cOAlition S has also confirmed it will end its support of hybrid journals by the end of 2024. I would like to see the UK move away from transformative agreements. We could instead work internationally to promote more ethical and sustainable alternatives that put scholars at the heart of scholarly communication. In particular, the APC model has been tried, and has introduced as many headaches as it has tried to solve.

It is time instead to try new approaches. There are several interesting models being developed by forward-looking organizations that the UK could endorse. For example, MIT Press recently launched shift+OPEN as a way to flip subscription-based journals to a diamond open access model. Another interesting approach is Subscribe to Open, where journals drop their paywall if a threshold number of subscriptions is received. Money saved on dealing with legacy publishers like Springer Nature is better spent investing in our own infrastructure and new approaches.

Rights retention: publisher responses to the University’s pilot

The University’s one-year rights retention pilot has been running for six months now, during which time many papers containing the rights retention declaration have been submitted by Cambridge authors. As expected, the Office of Scholarly Communication is receiving more queries about rights retention from Cambridge academics, many of which relate to how publishers are responding to submissions containing the rights retention declaration. This post covers some of these queries to offer a picture of how rights retention is being received.   

It is worth reminding ourselves what the rights retention pilot entails. All researchers at Cambridge can sign up to participate in the pilot here. In doing so, the researcher enters into a non-exclusive agreement with the university to make all their papers immediately open access under a Creative Commons Attribution (CC BY) licence. When a researcher submits an article to a publisher, they include the following statement in the acknowledgements or funding section of the article file:

‘For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission’

Upon editorial acceptance, the researcher uploads a copy of the accepted manuscript to Symplectic Elements. The Open Access team will deposit the manuscript into Apollo and will release it publicly at the appropriate time. 

Publisher responses 

One of the primary fears researchers have regarding rights retention is that a publisher may editorially reject their article at the point of submission. While we are still dealing with small numbers of submissions and queries associated with the pilot, we have heard from at least two researchers whose articles were rejected at the point of submission because of the rights retention language in their manuscript. In these cases, journals from the Seismological Society of America and the American Society of Hematology informed the respective authors that rights retention is not permitted because copyright transfer and an embargo period are required for publication in their journals. As a consequence, the authors in each case decided to submit to an alternative journal so that they could comply with their funder requirements. We are also aware of authors who received different answers from the American Society of Hematology, including being asked to pay a fee and being told that rights retention would be accepted. We hope rights retention will be approved in due course by the publisher as an acceptable route for all authors.

A second group of publishers have asked for the rights retention language to be removed, either because they deemed it unnecessary for compliance or because another compliant route was available to the authors. For example, a journal published by Springer Nature asked for the rights retention language to be removed because it was not required for compliance purposes (the article was submitted prior to the relevant policy coming into effect). Journals published by Elsevier, the American Chemical Society and Optica all asked for the rights retention language to be removed because of pre-existing publishing agreements that allow Cambridge researchers to publish open access free of charge. In these instances, authors were willing to remove the language from the final published version, so it is not clear what would have happened if they had not done so. We have received advice that removing this wording does not negate the fact that the publisher has been informed of the prior licence, and so rights retention is still permissible here. We are recommending that researchers include the rights retention declaration where possible even when publishers ask for it to be removed.

Despite the queries reported here, we have also seen a notable uptick in the number of submissions in the repository containing rights retention language, including within journals published by Elsevier, Wiley, Sage, Springer Nature, the Royal Society of Chemistry, Company of Biologists and JMIR Publications (to name a few). One journal published by the American Psychological Association was willing to accept immediate CC BY for UKRI-funded authors, although this was still subject to a copyright transfer agreement. In the case of Springer Nature, acceptance of the rights retention language also entailed payment of colour charges – something the authors had not anticipated and which we detailed further in this Twitter thread. We urge publishers to be as clear as possible about whether they accept rights retention and upon what conditions.  

I am sharing this data because it offers a snapshot of some of the responses we have seen from publishers so far. While we encourage our researchers to report any publisher pushback, we cannot be sure of all publisher responses, simply because researchers are under no obligation to report them. It is interesting, though, that some publishers are asking researchers to remove the rights retention declaration when there is a publishing agreement in place. We can hypothesise that publishers want to keep as many articles as possible from using this language because it would set a precedent for other researchers, who lack access to such agreements, to use rights retention too. Given this, the Office of Scholarly Communication is continuing to advise that the declaration be included in all manuscripts where possible, although this will depend on how persistent an author wants to be in requesting that the language be retained.