The Data Picture

I was recently named one of “the next generation of [library] leaders” as part of the CILIP 125, having been recognised as an individual who contributes energy and knowledge to improving and impacting their organisation. My area of expertise, and thus recognition, lies in the use of data within libraries. As a data analyst for the Office of Scholarly Communication at Cambridge University Library, my role focuses on empowering decisions with data-driven understanding – such as supporting the Springer Nature negotiations. To further develop my understanding of data and its role within a wider organisation, I engage with data beyond the library – for example, through the Big Data London conference and the Carruthers and Jackson Data Leaders’ Summer School. Reflecting on the use of data in the wider world, I ask: what can be expected of the library and data?


The summer school provided practical advice, proven methodologies, and guidance that could apply across a variety of businesses. The course is designed to provide insight into the workflow of data officers and their role within an organisation – no matter its stage of data maturity and literacy. Over the course of the ten weeks, leading experts discussed the role of a chief data officer (CDO), both as a business development opportunity and as a career path for individuals. The course explored the risk and governance of data within an organisation, and the final weeks focused strongly on the role of people and teams associated with data.

Peter Jackson and Caroline Carruthers addressed the differing types of CDO and described a pendulum between ‘risk aversion’ and ‘value added’: the balance between secure and proper data governance (GDPR, for example) and providing value through data (such as setting up automation). The pendulum of risk to reward is relevant to many roles, including those within the library. Understanding the need to divide time and energy between creating policies and delivering decision-making results is just as relevant to my role as it is to that of a chief data officer. In my role I have supported decision-making staff through data production, but equally, to instil a culture of data, time and energy must be dedicated to risk aversion, through tasks such as researching data management, preparing training sessions on data storage, and supporting staff in data preparation.

Another important concept introduced was the DIKW pyramid – Data, Information, Knowledge, Wisdom – for understanding the value created from data. The base of the pyramid is (raw) Data, which can be processed into (useful) Information. This Information is data with meaning and a purpose, and can be organised into (insightful) Knowledge. Knowledge combines experiences, values, insights, and contextual information, which can then transcend to (integral) Wisdom. Wisdom is considered a deeper understanding with ethical implications and the ability to define ‘why’. The DIKW pyramid provided a frame of thought for presenting and approaching future data projects: understanding whether a decision-making team needs data, information or knowledge makes it possible to support that team better.

To develop communication skills, expert Scott Taylor, known as The Data Whisperer, spoke about the three Vs of data storytelling: Vocabulary, Voice and Vision. Combining an accessible vocabulary with a common voice will illuminate the business vision and why it is important. This overarching concept for an organisation’s data approach can be scaled down to support individual data workers in providing value, which should either grow, improve or protect the business. Understanding how to communicate the data is a key skill, as “Hardware comes and goes, software comes and goes, but data remains”. The data that remains should serve one of those three purposes, which means that the data gathered should be usable data!

At Big Data London, the organisation Women in Data hosted conversations about nurturing a culture of learning within data teams. Drawing on their experiences as people from minority backgrounds, the speakers highlighted the power of upskilling, sharing skills across teams, and being an advocate for oneself and one’s skills. As for what to upskill, data literacy was a hot topic across the conference. Data literacy, also called data fluency or data confidence, is the combination of ability, skills and confidence surrounding data and its uses. Data literacy enables more efficient work, and raises the question: what is the base level of data literacy and confidence across the library? Librarians use data daily, whether checking material in and out, answering students’ queries, or tracking the use of space, but are all librarians confident in using that data? This is an area I hope to explore further at the CUL, to ensure staff can use the data they have to support decisions.


Engaging with the world of data provides a big picture of the possibilities within the library. Conversations about AI (Artificial Intelligence), data policies and maturity, and shiny new databases, software, and services demonstrate the growing adoption of data, and libraries should follow suit. Actively taking snippets of larger conversations, developing ideas within the library space, and exploring the possibilities with data will help libraries thrive in this world of technological growth.


The September 2023 Data Champion Forum

The Cambridge Data Champions had a fantastic September Forum at the West Hub. The forum started with an introduction to the West Hub by Library Manager Daniele Campello, and we welcomed Clair Castle as the new interim Research Data Manager with the Office of Scholarly Communication (University Library).

Dr Mandy Wigdorowitz kicked off the presentations by sharing with the Data Champions what she aims to achieve as the University’s Open Research Community Manager. This includes raising the profile of Open Research at the University and ensuring that scholarly and research outputs that are deemed to be open are indeed accessible and interoperable in accordance with FAIR principles. As Open Research Community Manager, Mandy advocates for Open Research among University researchers from both the STEMM and AHSS (Arts, Humanities and Social Sciences) disciplines. The latter proves to be more challenging, as researchers in AHSS may often have valid reasons for refraining from making their research data open, such as working with sensitive data or working with interlocutors who object to their data being shared. Such issues will be addressed at the Cambridge Open Research Conference that she is organising, which takes place on 17th November 2023 at Downing College, Cambridge, as well as online. To end, Mandy invited the Data Champions to join her Open Research initiative, a community of advocates for Open Research across the University.

Before lunch, Madeleine Taylor (Information Security Risk and Governance Manager with University Information Services, UIS) presented a follow-up to a webinar session on monitoring the Information and Cybersecurity (ICS) risks for research data across the University, which she had conducted with the Data Champions a couple of weeks prior. After a brief introduction to what she has done so far to protect Cambridge’s research communities against ICS threats, she asked the Data Champions for help in her task of securing research data against ICS risks. They can do so by providing her with a sense of what data their own research communities are working with and how they are storing it. As the Data Champions ate the delicious lunch of sandwiches and cakes provided by the West Hub caterers, they provided feedback to Madeleine on two forms that she proposed as methods of gathering the information she needed: a 3-minute research data impact assessment form and a research data cyber security risk form. Maddy will continue to work with the Research Data Team and the Data Champions to refine these forms and to gather information through them.

Thank you to the West Hub and Daniele Campello for hosting the Data Champions Forum in your welcoming building!

If you are a member of the University of Cambridge and are interested in attending the Data Champions Forum, please join us as a Data Champion. If you are passionate about research data management and data sharing or you would like to find out more about what being a Data Champion entails, please visit the Data Champions webpage. We welcome applications from those working in all academic subjects across AHSS and STEMM disciplines. If you are unsure about how being a Data Champion would impact your research, please get in touch with the Research Data Team!

Cartoon by Clare Trowell CC-BY-NC-ND



Reflections from the Edinburgh Open Research Conference

Dr Mandy Wigdorowitz holds the position of Open Research Community Manager for Cambridge University Libraries where she is developing an open research community across Cambridge. She has a PhD in Theoretical and Applied Linguistics from the University of Cambridge and is a registered Research Psychologist with the Health Professions Council of South Africa. She also holds the position of Associate Editor for the Journal of Open Humanities Data.

The Edinburgh Open Research Conference 2023, offered by the University of Edinburgh Library Research Support Team and grassroots group Edinburgh ReproducibiliTea, provided a platform for the exchange of ideas and discussions about open research under the theme ‘Open Research as a Tool for Addressing Global Challenges’. Living up to its theme, the conference held numerous presentations focussing on the ways in which open research practices and open initiatives can support efforts to address these challenges. The conference provided an opportunity for people from across the world to come together in a hybrid format to discuss how adopting the open research principles of open access, participation in research, transparency, and open data can ensure that the efforts of research are set up to help address global challenges, including in education, climate action, and global pandemics.

As a presenter and attendee, I reflect on the main take-homes from this event.

As with any conscientious and inclusive movement, clarification of terminology is important, and the open research movement is no exception. Throughout the conference, many speakers acknowledged ‘open science’ as being an inclusive term, encompassing all areas of ‘openness’ or ‘open scholarship’, and one which extends beyond the ‘sciences’ to include all disciplines where knowledge synthesis and open research are considered. It was proposed that the phrase ‘open science’ is about intent and the larger goal of open research, and it should not be reduced to disciplines that fall under the ‘sciences’ umbrella per se. While the sentiment of this stance is reassuring and inclusive in intent, it is undeniable that there is weight behind the words we use. I would therefore argue that it would be more inclusive to replace ‘science’ with ‘research’ when referring to the broad ‘open research’ movement. Doing so would safeguard against unintended misinterpretations about who may partake in and benefit from this movement.

A highlight of the conference was its celebration and acknowledgement of the growing impact of public engagement and citizen-led research. Case studies offered insight into how involving the public in data collection, analysis, and decision-making processes can enhance the relevance and societal impact of open research endeavours. For instance, UCL’s Institute for Global Prosperity aims to understand what prosperity means for people, as informed by members of their respective communities. In addition, the Extreme Citizen Science: Analysis and Visualisation project uses culturally appropriate geographical analyses and visualisation tools that communities with differing degrees of literacy can use to formulate research questions and collect relevant data. Attendees were encouraged to explore innovative ways of collaborating with non-academic communities to foster a culture of inclusivity, knowledge sharing, and insights that are driven by the communities under investigation, and to think about the value of smaller, local-scale projects in addition to large-scale projects.

Much attention was afforded to the dissemination avenues that prioritise FAIR principles (Findable, Accessible, Interoperable, and Reusable) and open practices, as well as who the contributors and accessors of such research outputs are. These efforts have largely been attributed to the increased availability of digital collections, the development of new data-intensive methods, increased pressure from funders, the requirement of data management plans for preservation purposes, the involvement and collaboration of research libraries, and the rollout of rights retention policies. Discussions centred around digital objects and data, including how these are produced, how and where they are openly and transparently shared, how they can be accessed and preserved, and what the potential of their reuse is. Such questions lead to the need for reputable sharing outlets that service people from all parts of the world and across all disciplines. Significant outlets that were mentioned included repositories, data dashboards, and data papers.

Data dashboards provide an overview of the various aspects associated with a research project, which allows for clear access to data insights when conducting large projects. An effective use of a data dashboard comes from DecodeME, the world’s largest study of ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome). Data papers are peer-reviewed publications that describe curated datasets. Data papers can be shared in traditional research journals as one subtype of article publication, or in data journals, which are dedicated to the publication of data papers. This avenue of dissemination has been active in the STEM and Health disciplines, but it is being increasingly recognised and promoted within the Humanities and Social Sciences, largely driven by data journals in these areas, such as the Journal of Open Humanities Data. Overall, these discussions shed light on the challenges and potential solutions to ensure the quality and accessibility of open outputs derived from various research projects.

In addition to the many discussions about open software, which are ubiquitous in open research, open hardware was recognised as an emerging area in this arena. Open hardware can include, for instance, computing devices, scientific instrumentation, and remote sensing satellites that contribute to the conduct of research and the discovery of knowledge. Typically, legal restrictions prohibit the investigation and modification of closed-source hardware, resulting in a lack of reproducibility, duplication of effort, obsolescence, and financial burdens, which ultimately reinforce global inequities. There have been recent efforts, however, to develop open-source hardware tools and devices to address global challenges, particularly in under-resourced communities. Real-world case studies were presented that explored where and how open hardware has been used to address global challenges (e.g., in microscopy, space exploration, environmental monitoring) and make a difference in the lives of everyday people. The Gathering for Open Science Hardware was identified as a community whose mission is to promote open hardware and the practices ensuring its success. Open hardware presents an exciting opportunity for progress, as its potential for solving global problems is far-reaching and scalable.

Education also emerged as key to the open research movement. The conference presented best practices in research data management and open educational resources for postgraduate students and educators from the perspective of a university lecturer. Training and mentoring programmes about open practices were mentioned, where people interested in applying open principles in their work and becoming ambassadors in their communities could sign up to Open Life Science to participate in an open research training course, and to Open Hardware Makers to support open hardware projects.

In sum, the Edinburgh Open Research Conference was successful in showcasing the advancement of open research with a focus on addressing global challenges. Open research is a fundamentally iterative process where we can all learn from and build upon the work and knowledge that have accumulated before us. In this way, the event illustrated the remarkable progress that has been made in various domains and throughout the research lifecycle. By bringing together individuals from diverse backgrounds and contexts, this conference provided a platform for knowledge sharing and community-building at the forefront of open research.

You can find all the talks and slides from the conference here.