Tag Archives: Open Research

Reflections from the Edinburgh Open Research Conference

Dr Mandy Wigdorowitz holds the position of Open Research Community Manager for Cambridge University Libraries where she is developing an open research community across Cambridge. She has a PhD in Theoretical and Applied Linguistics from the University of Cambridge and is a registered Research Psychologist with the Health Professions Council of South Africa. She also holds the position of Associate Editor for the Journal of Open Humanities Data.

The Edinburgh Open Research Conference 2023, offered by the University of Edinburgh Library Research Support Team and grassroots group Edinburgh ReproducibiliTea, provided a platform for the exchange of ideas and discussions about open research under the theme ‘Open Research as a Tool for Addressing Global Challenges’. Living up to its theme, the conference featured numerous presentations focussing on the ways in which open research practices can support efforts to address a range of challenges centring around open initiatives. The conference provided an opportunity for people from across the world to come together in a hybrid format to discuss how adopting the open research principles of open access, participation in research, transparency, and open data can ensure that the efforts of research are set up to help address global challenges, including in education, climate action, and global pandemics.

As a presenter and attendee, I reflect on the main take-homes from this event.  

With any conscientious and inclusive movement, clarification of terminology is important. The open research movement is no exception. Throughout the conference, many speakers acknowledged ‘open science’ as being an inclusive term, encompassing all areas of ‘openness’ or ‘open scholarship’, and one which extends beyond the ‘sciences’ to include all disciplines where knowledge synthesis and open research is considered. It was proposed that the phrase ‘open science’ is about intent and the larger goal of open research, and it should not be reduced to disciplines that fall under the ‘sciences’ umbrella per se. While the sentiment of this stance is reassuring and inclusive in intent, it is undeniable that there is weight behind the words we use. Instead, I would argue that it would be more inclusive to replace ‘science’ with ‘research’ when referring to the broad ‘open research’ movement. Doing so would safeguard against unintended misinterpretations about who may partake in and benefit from this movement.

A highlight of the conference was its celebration and acknowledgement of the growing impact of public engagement and citizen-led research. Case studies offered insight into how involving the public in data collection, analysis, and decision-making processes can enhance the relevance and societal impact of open research endeavours. For instance, UCL’s Institute for Global Prosperity aims to understand what prosperity means for people as informed by members of their respective communities. In addition, the Extreme Citizen Science: Analysis and Visualisation project uses culturally appropriate geographical analyses and visualisation tools that can be used by varying communities with differing degrees of literacy to formulate research questions and collect relevant data. Attendees were encouraged to explore innovative ways of collaborating with non-academic communities to foster a culture of inclusivity, knowledge sharing, and insights that are driven by the communities under investigation, and to think about the value of smaller, local-scale projects in addition to large-scale projects.

Much attention was afforded to the dissemination avenues that prioritise FAIR principles (Findable, Accessible, Interoperable, and Reusable) and open practices, as well as who the contributors and accessors of such research outputs are. These efforts have largely been attributed to the increased availability of digital collections, the development of new data-intensive methods, increased pressure from funders, the requirement of data management plans for preservation purposes, the involvement and collaboration of research libraries, and the rollout of rights retention policies. Discussions centred around digital objects and data, including how these are produced, how and where they are openly and transparently shared, how they can be accessed and preserved, and what the potential of their reuse is. Such questions lead to the need for reputable sharing outlets that service people from all parts of the world and across all disciplines. Significant outlets that were mentioned included repositories, data dashboards, and data papers.

Data dashboards provide an overview of the various aspects associated with a research project, which allows for clear access to data insights when conducting large projects. An effective use of a data dashboard comes from DecodeME, the world’s largest study of ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome). Data papers are peer-reviewed publications that describe curated datasets. Data papers can be shared in traditional research journals as one subtype of article publication, or in data journals which are dedicated to the publication of data papers. This avenue of dissemination has been active in the STEM and Health disciplines, but it is being increasingly recognised and promoted within the Humanities and Social Sciences, largely driven by data journals in these areas, such as the Journal of Open Humanities Data. Overall, these discussions shed light on the challenges and potential solutions to ensure the quality and accessibility of open outputs derived from various research projects.

In addition to the many discussions about open software, which are ubiquitous in open research, open hardware was recognised as an emerging area in this arena. Open hardware can include, for instance, computing devices, scientific instrumentation, and remote sensing satellites that contribute to the conduct of research and the discovery of knowledge. Typically, legal restrictions prohibit the investigation and modification of closed-source hardware, resulting in a lack of reproducibility, duplication of effort, obsolescence, and financial burdens which ultimately reinforce global inequities. There have been recent efforts, however, to develop open-source hardware tools and devices to address global challenges, particularly in under-resourced communities. Real-world case studies were presented that explore where and how open hardware has been used to address global challenges (e.g., in microscopy, space exploration, environmental monitoring) and make a difference in the lives of everyday people. The Gathering for Open Science Hardware was identified as a community whose mission is to promote open hardware and the practices ensuring its success. Open hardware presents an exciting opportunity for progress as its potential for solving global problems is far-reaching and scalable.

Education also emerged as key to the open research movement. The conference presented best practices in research data management and open educational resources for postgraduate students and educators from the perspective of a university lecturer. Training and mentoring programmes about open practices were mentioned, where people interested in applying open principles in their work and becoming ambassadors in their communities could sign up to Open Life Science to participate in an open research training course, and to Open Hardware Makers to support open hardware projects.

In sum, the Edinburgh Open Research Conference was successful in showcasing the advancement of open research with a focus on addressing global challenges. Open research is a fundamentally iterative process where we can all learn and build upon the accumulated work and knowledge that has been done before us. In this way, the event illustrated the remarkable progress that has been made in various domains and throughout the research lifecycle. By bringing together individuals from diverse backgrounds and contexts, this conference provided a platform for knowledge sharing and community-building at the forefront of open research.

You can find all the talks and slides from the conference here.

Open access: where next? – event round-up

Dr. Samuel Moore, Scholarly Communication Specialist, Cambridge University Libraries

On Friday 18th November, participants from across Cambridge and beyond gathered for a hybrid meeting on the future of open access publishing. Hosted by Homerton College, ‘Open Access: Where Next?’ explored issues relating to article-processing charges, research assessment and innovation in scientific publishing. In total, 65 in-person attendees and 78 online attendees participated in the day-long event consisting of four panels and a keynote from Professor Gina Neff of the Minderoo Centre for Technology and Democracy.

Prof. Neff kicked off the event with a timely and insightful talk titled ‘Further than the academy: the stakes for open research’. Covering themes such as misinformation, preservation and widening participation in knowledge, Prof. Neff explored the importance of democratic and responsible approaches to our digital present and future, looking especially to libraries as key to supporting these issues.

The first panel of the day, ‘Further than privileged universities’, was introduced by Dr. Matthias Ammon and featured Dr. Juliet Vickery, Chief Executive of the British Trust for Ornithology, Dr. Tabitha Mwangi, Cambridge-Africa Programme Manager, and Dr. Stuart Pracy, Lecturer in Medieval History at the University of Exeter. Each panellist spoke on the challenges of open access that arise from either being outside privileged university spaces or without secure employment within them. Despite representing quite different communities, there were a number of commonalities between the experiences of each speaker, most notably the fact that moving from paying to access scholarly material to paying to publish it added a new exclusionary dimension to their ability to communicate research.

In the second panel, we heard from three speakers who are working against the move toward paying to publish. ‘Further than APCs and BPCs’ featured speakers working on publishing projects that do not require authors to pay processing charges to publish their work – so-called Diamond open access. Cambridge librarians Dr Meg Westbury (Academic Services Librarian, Human and Social Sciences) and Dr Yvonne Nobis (Head of Physical Sciences libraries) described their respective publishing projects, The Journal of Information Literacy and Discrete Analysis. The audience learned about both the challenges of running a journal on a shoestring and the advantages of a DIY approach to publishing without recourse to expensive publishing networks. In addition, Dr. Joe Deville of Lancaster University explained the work of the soon-to-launch Open Book Collective to collaboratively fund the publication of open access books in the humanities and social sciences.

After lunch, Niamh Tumelty chaired a roundtable with Cambridge researchers on research assessment and its relationship with publishing. Prof. Steve Russell, Head of Department of Genetics, described his work as Chair of DORA (the Declaration on Research Assessment) alongside the work needed for the university to fulfil its commitment to ensuring researchers are no longer judged by the venues in which they publish. Following this, Liz Simmonds – the University’s Head of Research Culture – described the pros and cons of alternative approaches to assessment such as narrative CVs. Finally, Prof. Emma Gilby of the Faculty of Modern and Medieval Languages and Linguistics explained the view from the humanities, particularly how declarations such as DORA are designed and implemented with the sciences in mind.

The final panel of the day was on innovations in scholarly publishing. Chaired by Dr. Samuel Moore, three panellists described their publishing approaches to moving beyond the traditional journal article. Dr. Mónica Moniz of Cambridge University Press & Assessment presented Research Directions – the press’ approach to publishing the research lifecycle across a variety of disciplinary questions. Following this, F1000’s Head of Data and Software Publishing, Dr. Beck Grant, described the publisher’s approach to automated data publishing in partnership with the Wellcome Sanger Institute. Finally, Dr. Damian Pattinson discussed eLife’s new approach to removing accept/reject decisions from its publishing process – and an invigorating discussion ensued!

At the end of the day, Niamh Tumelty summarised the event and reminded participants to fill out the postcards they were given at the start of the day to document what actions they will take in response to the issues covered in the conference. We will be posting these postcards to participants in January as a reminder of what they planned to do (with vouchers for three lucky recipients). Special thanks to all participants, attendees and organisers, but especially to Bea Gini for all her help with this, her last, event as part of the Office of Scholarly Communication. Thanks also to Clare Trowell for designing our postcards.

Open Research in the Humanities

Authors: Emma Gilby, Matthias Ammon, Rachel Leow and Sam Moore

This is the first in a series of blog posts presenting the reflections of the Working Group on Open Research in the Humanities. The working group aimed to reframe open research in a way that was more meaningful to humanities disciplines, and its work will inform the University of Cambridge approach to open research. This post introduces the working group and provides a top-level overview of the issues the group discussed between July and December 2021.

The Working Group on Open Research in the Humanities was chaired by Prof. Emma Gilby (MMLL) with Dr. Rachel Leow (History), Dr. Amelie Roper (UL), Dr. Matthias Ammon (MMLL and OSC), Dr. Sam Moore (UL), Prof. Alexander Bird (Philosophy), and Prof. Ingo Gildenhard (Classics). We met for four meetings in July, September, October and December 2021, with a view to steering and developing services in support of Open Research in the Humanities. We aimed notably to offer input on how to define Open Research in the Humanities, how to communicate effectively with colleagues in the Arts and Humanities (A&H), and how to reinforce the prestige around Open Research. We hope to add our perspective to the debate on Open Science by providing a view ‘from the ground’ and from the perspective of a select group of humanities researchers. These disciplinary considerations inevitably overlap, in some measure, with the social sciences and indeed some aspects of STEM, and we hope that they will therefore have a broad audience and applicability.

Academics in A&H are, in the main, deeply committed to sharing their research. They consider their main professional contribution to be the instigation and furthering of diverse cultural conversations. They also consider open public access to their work to be a valuable goal, alongside other equally prominent ambitions: aiming at research quality and diversity, and offering support to early career scholars in a challenging and often precarious employment landscape.  

Although A&H cover a diverse range of disciplines, it is possible to discern certain common elements which guide their profile and impact. These common elements also guide the discussion that follows.  

  • A&H colleagues tend to produce longer and more intensively edited books and articles. The in-depth study of 80,000 words+ is still considered to be a particularly useful and therefore prestigious research output. This work is deeply reliant upon the additional work of librarians, translators, copy editors, managing editors, general editors, etc., all of whom are highly skilled professionals in their own right. 
  • A&H scholars would often go further than our STEM colleagues in wanting the open access version of our work to correspond to the final version of record, as opposed to an unformatted (and therefore unfinished) ‘accepted manuscript’ or ‘preprint’. This is because, as just mentioned, editorial activity (the work as process) is a vital part of the end result (the work as product). Moreover, in A&H, citations often refer to individual pages rather than to an article as a whole, so having access to versions with differing pagination is unhelpful for authors and readers. 
  • A&H work can be vastly commercially profitable, especially in the entertainment industries, but often has an indirect commercial use value, and one does not get the sense that profiteering is a discipline-wide issue. Far fewer A&H journals would be owned by for-profit multinational businesses. They tend instead to be closely connected to scholarly societies, who themselves plough their profits back into running conferences and supporting communities and early career scholars, while maintaining a diverse set of publishing arrangements with university or smaller scholarly presses. The complaint from colleagues in STEM that profit-oriented journals ‘take our work and then sell it back to us’ is less frequently heard in A&H contexts; A&H researchers would perhaps tend to have a less antagonistic relationship to publishers than in STEM.  
  • A&H scholars do not tend to produce data from scratch via experiment. The material that we work with would often be available in the form of printed texts or images, or generated via discussion in the case of, say, oral histories or interview pieces. However, we also often deal with data that we do not own. In these cases, we pay to publish from private archives or collections or from other resources that are under copyright.   
  • A much smaller percentage of A&H research is funded by the research councils than is the case in the STEM subjects.  To an extent, this follows from the fact that (notwithstanding the copyright payments mentioned above) A&H research is often less expensive to carry out than STEM research, requiring less equipment, space etc. Even so, there is a significant funding gap in the A&H, often partially filled by registered charities such as the Leverhulme Trust, the British Academy, etc. Department and faculty research budgets are vanishingly small.  
  • Many A&H researchers (often in fields such as music, art history, drama and so on) are located outside the higher education system altogether, working for instance in museums, galleries, private houses or collections, theatres, or charities.  
  • It is less the case in the A&H than in the sciences that English is the international language of communication. Indeed, publication in foreign-language journals or the translation of one’s books into languages other than English would be a particular mark of prestige in the A&H, demonstrating international reach, irrespective of the size of the publics reached.  

The Five Pillars of Open Research in the Arts and Humanities: Opportunities for Cultural Change 

The Working Group set itself the task of revisiting a document produced in 2018 by the League of European Research Universities (LERU): Open Science and its Role in Universities: A Roadmap for Cultural Change. LERU’s ‘eight dimensions of open science’, often referred to as the ‘eight pillars’, are as follows: 

  1. The Future of Scholarly Publishing
  2. FAIR data (findable, accessible, interoperable and reusable)
  3. The European Open Science Cloud (EOSC)
  4. Education and Skills
  5. Rewards and Incentives
  6. Next-generation Metrics
  7. Research Integrity
  8. Citizen Science

The outline and detailed descriptions of the ‘eight pillars’ are often explicitly or implicitly science-based, and reflect assumptions about knowledge production in the STEM disciplines. We have now rewritten these to give the ‘five pillars of open research in the arts and humanities’. A more detailed examination of each pillar follows, as a way to structure our recommendations for the ways in which our institution, and HE institutions in general, can support open research in the A&H. In each of the five sections, detailed in the next five blog posts, opportunities are noted and recommendations for institutional support, development and training are given.

  1. The Future of Scholarly Communication
  2. CORE Data
  3. Research Integrity and Care 
  4. Public Engagement
  5. Research Evaluation

The full, citable report is available in Apollo: https://doi.org/10.17863/CAM.86734

Open Research at Cambridge Conference – Opening session

The Open Research at Cambridge conference took place between 22–26 November 2021. In a series of talks, panel discussions and interactive Q&A sessions, researchers, publishers, and other stakeholders explored how Cambridge can make the most of the opportunities offered by open research. This blog is part of a series summarising each event. 

The opening session, chaired by Dr Jessica Gardner (University Librarian and Director of Library Services), included talks by Professor Anne Ferguson-Smith (Pro-Vice-Chancellor for Research), Professor Steve Russell (Acting Head of Department of Genetics and Chair of Open Research Steering Committee), Mandy Hill (Managing Director of Academic Publishing at Cambridge University Press) and Dr Neal Spencer (Deputy Director for Collections and Research at the Fitzwilliam Museum). All four speakers foresee an increasingly open future, with benefits for both institutions and researchers. They also considered some of the challenges that still need to be worked through to avoid potential problems.

What is working well?

In recent years, we have made great progress in the proportion of publications that are open access. Over three quarters of publications with Cambridge authors last year were openly available in some form.

The trend is continuing and it is not unique to our institution. CUP have set an ambitious goal for the vast majority of research articles they publish to be open access by 2025.

Other forms of publication are becoming common, meeting different dissemination needs. Preprints have been the star of the show during the pandemic, allowing rapid dissemination while formal peer review follows down the line.

Diagram from Mandy Hill’s slide: ‘Increasingly open platforms and formal publishing will meet different dissemination needs’

In the scholarly communication arena, open access articles benefit from more downloads and citations. Museum-based projects involving artisans, schools and artists all found enthusiastic responses.

What can we look forward to?

Research culture is coming under the spotlight across the sector, and Cambridge has committed to an ambitious action plan to create a thriving environment in which to do research. Key principles include openness, collaboration, inclusivity, and fair recognition of all contributions.

Diagram from Prof Steve Russell: ‘Going Forward’

Implementing the San Francisco Declaration on Research Assessment (DORA) is part of this progress. We want to assess research on its own merits rather than on the basis of journal or publisher metrics. This also means recognising all research outputs and a broad range of impacts.

Reproducibility is increasingly recognised as critical in a number of disciplines. A developing UKRN group within the University aims to ‘take nobody’s word for it’ – but rather support reproducible workflows that underpin confidence in the conclusions of research. By sharing and rewarding best practice we can become world leaders in this area, and in open research more widely.

In the past, museum collections have tended to be documented in limited ways, with poor accessibility and interoperability, which made it hard to discover and use materials. Several exciting projects at the Fitzwilliam Museum and more broadly have started to change that. There are opportunities for a single discovery portal, tying together different collections. The Fitzwilliam Museum is also making its collection discovery process richer, by providing opportunities for deeper dives, and more connected, by linking with other collections and resources.

Deep zoom access to an image in the Fitzwilliam collection. Adapted from Dr Neal Spencer’s slide ‘Fitzwilliam Museum Collections Search’.

What problems should we be mindful of?

There are still barriers that hinder some open research aspirations. Historical constraints on the ways we find materials, conduct research, and publish results remain. Some systems may need to be reimagined, while not scrapping structures that are still serving us well.

Cambridge is a large and complex institution, where change takes time. Nevertheless, there are established governance structures and an evolving set of policies that support open research.

Most importantly, researchers should be at the centre of the move towards open research. It is important that they benefit from open practices, rather than finding themselves torn between competing priorities. Conversations continued throughout the week to explore possible approaches in different disciplines, drawing from the rich diversity of experiences to shape the future of open research at Cambridge.

Practical steps toward more reproducible research

On 26 November 2021 the University’s Reproducibility Working Group hosted a workshop for researchers from across Cambridge to explore approaches to supporting more reproducible research. Talks were given by Professor Alexander Bird (Faculty of Philosophy), Dr Florian Markowetz (Cancer Research UK Cambridge Institute) and Dr Maria Tsapali (Faculty of Education), exploring approaches to reproducible research and reasons to work reproducibly across qualitative and quantitative research.

Talks were followed by interdisciplinary discussion sessions designed to identify the obstacles to reproducible research across Cambridge and how these might be tackled.  The key findings from the discussions included:

  • Training on reproducibility, including statistical training, reproducible methods and use of key tools, exists in departments across the University, but more needs to be done to share provision and create synergies and central provision where possible.
  • Training should begin at undergraduate or Masters level to build key skills early.
  • Awareness of training, and of the importance of reproducibility training, needs to be enhanced.
  • University guidance is needed on how to make research reproducible, particularly to overcome key challenges such as balancing reproducibility with the need to protect sensitive or confidential data.
  • The University can help by making the production of open and reproducible research as painless as possible, for example by facilitating peer review of code and providing easy access to data storage and expertise in best practice.
  • Reproducibility looks very different across the disciplines, and in some areas transparency and methods reproducibility will be the focus, rather than reproducible outcomes.

The Reproducibility Working Group will draw on the ideas raised at this workshop to help shape proposals for future University approaches to supporting reproducible research. The group plans to host a number of further events to map, consolidate, and extend existing resources for reproducibility across Cambridge with the aim of boosting grassroots activities and magnifying their impact across all levels of the institution.

For more information and resources on reproducible research see: UK Reproducibility Network: https://www.ukrn.org/

Open Research 101

Dr. Sacha Jones and Dr. Samuel Moore, Office of Scholarly Communication, Cambridge University Libraries

As part of the Cambridge Open Research conference, the Office of Scholarly Communication hosted a ‘101’ session on open research, covering the basics and answering queries for the audience on all aspects of open access publication and open data. With over 80 participants, we were thrilled with the response and wanted to recap some of the topics we covered in this post.

Firstly, as we discussed in the session, it is easy to assume that open research is simply an issue for the sciences rather than all academic disciplines. Practices such as open access and open data have been taken up widely in the sciences, although in different ways, and there is a common association between science and openness. This is compounded by the fact that in many European countries Open Science is inclusive of arts and humanities scholarship and so is functionally equivalent to open research. At the OSC, we are keen to support open practices across all disciplines while being sensitive to different ways of working. We are guided by the university’s Open Research Position Statement that requires work to be ‘as open as possible, as closed as necessary’.

After an introduction to open research, Sam then outlined the key issues in open access, including the different licences for making your research open access, the differences between green and gold open access, and the many and various reasons for making your work open access. Open access allows us to reach new audiences, improve the economics of research access, and reassess knowledge production and dissemination in a digital world. We also learned about open access monographs, the complex policy landscape and the various ways in which you can make your research open access through repositories and journals. The OSC’s Open Access webpages are an excellent set of resources for learning more.

We then moved on to open data – research data shared publicly – and how this fits into open research (see the University’s policy framework on research data). After highlighting that all research regardless of discipline generates or uses data of one kind or another (e.g. text, audio-visual, numerical, etc.), Sacha posed a series of questions with answers, anticipating what the audience might want to know more about. Do I have to share my data? What data do I share – is it meant to be everything from my research? My data contains sensitive information so I can’t share my data, or can I? How do I share my data? I don’t want to be criticised after making my data open, so how can I prevent this? How can I stop someone else from taking my data, using it, and getting all the credit? The OSC’s Research Data website contains information about data management and data sharing; you can also check our list of Cambridge Data Champion experts to see if anyone has volunteered to be a local source of data-related advice in your department or discipline.

We are always available as a source of support and guidance in all matters relating to open research and encourage you to contact us if you have any questions. The OSC has webpages on open research and sites dedicated to both open access and research data. For general open research enquiries, we can be emailed at info@osc.cam.ac.uk, for open access at info@openaccess.cam.ac.uk and for data at info@data.cam.ac.uk. There are also a number of training sessions provided throughout the year and online that relate to the topics covered in this session. If you think that those in your department or institute at Cambridge would like to know more about the topics covered here then please do get in touch, as we’d be happy to speak to them and answer any questions you may have.

Cambridge Data Week 2020 day 1: Who are the winners and losers of good data practices?

Cambridge Data Week 2020 was an event run by the Office of Scholarly Communication at Cambridge University Libraries from 23–27 November 2020. In a series of talks, panel discussions and interactive Q&A sessions, researchers, funders, publishers and other stakeholders explored and debated different approaches to research data management. This blog is part of a series summarising each event.  

The rest of the blogs comprising this series are as follows:
Cambridge Data Week day 2 blog
Cambridge Data Week day 3 blog
Cambridge Data Week day 4 blog
Cambridge Data Week day 5 blog

Introduction

The first day of Cambridge Data Week 2020 kicked off with a tantalisingly open question: who are the winners and losers of good data practices? This question was addressed via two different perspectives: those of a funder, provided by Dr Georgie Humphreys (Wellcome), and of a publisher, provided by Dr Catriona MacCallum (Hindawi). Discussion of this topic during presentations and the Q&A session looked through various (but not mutually exclusive) lenses, including those of data sharing, quality, ethics, and research culture. Funder mandates for data sharing and what these have achieved (e.g. saving research funds related to data reuse) were reflected upon, as were disciplinary differences between STEMM, social sciences, arts and humanities. There was also a discussion of evidence relating to shifts in research culture and if this is pointing to better data practices. As a whole, the webinar explored a broader view of good data practices, the consequences of these, and the progress being made in embedding good data management in research. 

Topical for this year, both speakers discussed data sharing related to Covid-19 research. Catriona stated that Covid has exposed systemic flaws in the existing system (in relation to data sharing), and Georgie highlighted some surprising results regarding data availability statements in Covid-related articles. The CARE Principles for Indigenous Data Governance were also brought to the fore by Catriona, who argued for attention to be placed on potential power issues surrounding data sharing. These are a set of principles, complementary to the FAIR principles, which encourage the open research movement to fully engage with Indigenous Peoples’ rights and interests. A pervasive undercurrent ran throughout the webinar – research culture and some problems therein. These were addressed explicitly by both speakers, with both stating that more needs to be done by institutions to implement DORA and reward researchers for their achievements and good research practices and not just according to where (i.e. in what journals) their research is published. Catriona highlighted results from a 2019 EUA report that shows that institutions have some way to go in this regard, that the value of data is not fully recognised, and that responsible research assessment is at the heart of cultural change in the right direction.

We had some great questions from the audience that were answered in the Q&A session, such as “In countries without the REF, is data sharing better?”, and “How do you get qualitative researchers on board with this?”, and “What is the role of universities in the so-called data-driven economy?”. Our audience also responded to the poll we held at the end of the webinar, where we asked participants to select one from seven given options that they regard as most likely to prevent good data practices among researchers. Resource indicators (knowledge, time, money for RDM) amounted to 46% of responses (blue in the chart below) and cultural indicators amounted to 53% (orange in the chart). Overall, the results were rather surprising but optimistic, revealing that a dominant perception among the participants is that a shift in cultural practices is one of the leading factors necessary to drive forward good data practices in research.

Figure 1. Results of the poll held during the webinar, where participants were asked to choose one of seven factors that they consider most likely to prevent or inhibit good data practices.

Audience composition

We had 274 registrations for this webinar, with just over 70% originating from the Higher Education sector. Researchers and PhD students accounted for 40% of registrations and research support staff for an additional 30%. On the day, we were thrilled to see that 164 people attended the webinar, participating from a wide range of countries.

Recording, transcript and presentations

The video recording of the webinar can be found below and the recording, transcript and presentations are present in Apollo, the University of Cambridge repository.

Bonus material

There were a few questions we did not have time to address during the live session, so we put them to the speakers afterwards. Here are their answers:

What are the ethics of using secondary data, particularly in relation to primary versus secondary researchers’ objectives, meaning of data/methods, consent of participants, and in the case of qualitative data, the personal relationships built between researcher and participants?

Georgie Humphreys This question seems to allude to informed consent which is still a topic of active discussion in terms of what one tries to build into the original informed consent to allow subsequent secondary use down the line. There is this idea of broad consent now where a participant would consent to that particular project but they’re also consenting to their data being kept and maybe reused for other purposes related to different scientific questions, but maybe with clauses such as ‘not for commercial benefits’. There are potential concerns about re-identification but there are mechanisms for dealing with that – mechanisms which reduce risk whilst retaining value, such as anonymisation or synthetic data creation. But there are other datasets where that’s just not going to be possible, where you lose all value of the original dataset. The UKDS have a nice page on informed consent, providing information on what you put in your consent forms to enable secondary use. This needs to be thought about at the very start of the study prior to collection of the primary data.

Catriona MacCallum This question is really focusing on data privacy issues. The primary researcher collects the data, the secondary researcher reuses the data. There are ways that researchers can be given access to the data while maintaining privacy. The primary researcher is creating the relationships with participants in order to obtain data, so what does this mean ethically for those wishing to reuse the data? Safety nets do need to be put into place. Here, it’s important to raise the CARE principles again. These were the result of a working group that came about as a result of concerns about how data from indigenous people are being treated. The slogan is now ‘Be FAIR and CARE’. The CARE principles are emerging in the UN’s agenda, and UNESCO’s, and I’m sure they will come up with the Research Councils too.

What are the best practices to ensure data quality? 

Catriona MacCallum It depends what is meant by ‘quality’ as there are various ways of looking at this. The European Commission came up with the economic loss of not publishing failed experiments; in other words, the publication bias that results. We need to redefine what we mean by quality and integrity, and again this speaks to the research culture, as no one gets rewarded for publishing a failed result and in fact the researchers end up feeling embarrassed and tend not to do it. Publication bias is huge! It also applies to the humanities and social sciences, potentially in a different way, and there are huge biases in terms of what gets published and what is allowed to get published.

Georgie Humphreys This issue is probably a plug for the open peer review model where the filter is not at the beginning but later on. [In open peer review, authors and reviewers are aware of each other’s identity and encouraged to engage in open discussion. This makes the process more transparent, removing bias or conflicts of interest. Manuscripts are made publicly available pre-review, and reviews are published alongside the article].

Conclusion

So, who are the winners and losers of good data practices? Georgie believes that everyone, in the long term, will be a winner. If time is spent ensuring data is well-documented, well-organised, has dictionaries, is stored somewhere for the long term, then it will benefit the data creators just as much as anyone else. In the short term, she acknowledges that there may be people that find being a champion in this field a challenge for them individually, but it’s just about continuing along this journey to get to the point where everything is in place to truly reward and recognise those that have good open practices and good data management practices. Catriona says that there are so many winners: the economy, society, and science, the social sciences and humanities – all will benefit from data sharing. Taking society as an example, sharing data and sharing it well (through good research data management) will increase public trust in science, benefit public health and even help toward achieving multiple sustainable development goals.

Resources

A Covid-19 press release by Wellcome in January 2020 called on researchers, publishers and funders to share or facilitate the sharing of interim and final data as rapidly as possible. Wellcome have been exploring the impact of this statement on data sharing.

‘The FAIR Guiding Principles for Scientific Data Management and Stewardship’ by Wilkinson et al. in Scientific Data (March 2016).

CARE Principles for Indigenous Data Governance. The full CARE principles are outlined here.

UKDS information on informed consent, including a downloadable model consent form with suggested wording to allow secondary data reuse.

An April 2020 publication by Colavizza et al. on ‘The citation advantage of linking publications to research data’ showing that article citations are greater when they have data availability statements that include a link (e.g. DOI) to data archived in a repository.

A European University Association (EUA) report published in October 2019 by Saenen et al. on ‘Research assessment in the transition to Open Science: 2019 EUA Open Science and Access Survey Results’.

Published 25 January 2021

Written by Dr Sacha Jones with contributions from Dr Georgie Humphreys, Dr Catriona MacCallum and Maria Angelaki.  

CCBY icon

Cambridge Data Week 2020 day 2: Who is reusing data? Successes and future trends?

Cambridge Data Week 2020 was an event run by the Office of Scholarly Communication at Cambridge University Libraries from 23–27 November 2020. In a series of talks, panel discussions and interactive Q&A sessions, researchers, funders, publishers and other stakeholders explored and debated different approaches to research data management. This blog is part of a series summarising each event.  

The rest of the blogs comprising this series are as follows:
Cambridge Data Week day 1 blog
Cambridge Data Week day 3 blog
Cambridge Data Week day 4 blog
Cambridge Data Week day 5 blog

Introduction

Reuse of data is the final element of the FAIR principles and has long been argued as a central benefit of data sharing, allowing others access to a wealth of research and making research funding more efficient by removing the need to duplicate work. Yet we are still in the process of evaluating success in this area. This webinar brought together speakers to discuss what we know about the current state of play around data reuse, what researchers can do to increase the reuse potential of their data, and possible future developments in data reuse.

Our speakers – Louise Corti (UK Data Archive) and Tiberius Ignat (Scientific Knowledge Services) – looked at data reuse from two different perspectives. Louise focused on the reuse of UK Data Service collections, sharing some examples of their most widely used data sets, discussing what makes them popular and sharing some principles that can be used both to make data more reusable and to promote it for reuse. Tiberius discussed the prevalence of data reuse by machines and the possibility of granting machines data reuse rights.

Louise’s presentation gave an overview of the portfolio of data sets hosted by the UK Data Service, looked at their top 20 most downloaded datasets and discussed the underlying principles that have led to them being widely reused. As well as demonstrating some commonalities between these datasets, Louise also outlined the principles used by the UK Data Service to promote their collections for reuse.

Tiberius’ presentation looked at data reuse from a different perspective, serving as a call to action to share research data responsibly and protect it against reuse by machines designed to persuade humans. One of Tiberius’ main arguments was that no research data from public projects should be made available to feed and develop persuasive algorithms.

The presentations motivated an interesting discussion covering a broad range of topics. These included the reuse of qualitative data, how we can implement ethical safeguards for data reuse, the idea of data ethics as a continuum, whether we can accept positive cases of algorithmic persuasion such as to promote equality and diversity, and the possibility of creating specific licences prohibiting data reuse by persuasive algorithms. See below for a video and transcript of the session.

Audience composition

We had 341 registrations with just over 65% originating from the Higher Education sector. Researchers and PhD students accounted for nearly 37% of the registrations whilst research support staff accounted for an additional 33%. We also had registrations from at least 30 countries outside of the UK including significant attendance from Denmark, Holland, Germany and Canada. We were thrilled to see that on the actual day 187 people attended the webinar.

We held five online webinars during Cambridge Data Week and were pleased to see that nearly 25% of the participants attended more than one webinar. A total of 1364 people registered and more than 700 attended altogether, with the rest possibly watching the recordings at a later date. Most of all we were pleased to welcome participants from all over the world and see how important research data management topics are globally.

Where data was available, we identified the following countries apart from the UK: Australia, Austria, Bangladesh, Brazil, Canada, Colombia, Croatia, Czech Republic, Denmark, France, Germany, Greece, Holland, Hungary, Iran, Luxembourg, Moldova, Norway, Poland, Romania, Singapore, Spain, Sweden, Switzerland, Turkey, Ukraine and the USA.

Recording, transcript and presentations

The video recording of the webinar can be found below and the recording, transcript and presentations are present in Apollo, the University of Cambridge repository.

Bonus material

After the session ended, we continued the discussion with Louise and Tiberius looking in particular at one question posed by an audience member:

AI can always be used either for good or bad. Instead of locking-in, how can we enhance technology through data and regulation? 

Tiberius Ignat I think at this point we need regulation. I’m not a big fan of using regulations, to be honest. I think it’s much better to motivate people but, in this case, it’s quite a bit of control that has been lost, so I think we should have a regulation on how research data can be reused by others. This is how the internet has been made profitable during the last decade — through non-human persuasion. All these companies that are giving so much away for free are making billions of dollars when you look at the stock market. We were not clear how they were making this profit until recently when we realised that they are doing it by changing our behaviour and I think the rest of society – including research organisations – are behind them, so we need some regulation.

A good example is GDPR. It was introduced to protect our data, our digital footprint. On ResearchGate or Eurosport, or any other website, we used to be asked to agree to cookies or not. Recently, a new option called “Legitimate interest” has been slipped in and our digital data is again collected – less noticeably – by invoking questionable legitimate rights. The organisations whose model is based on persuading need cookie data, so they have moved the discussion away from remaining GDPR compliant to defending their legitimate interests. They are fighting to take data away from us. We can tackle this faster with regulation but in the long term we need to educate people to be more aware. We do have licences such as Creative Commons but I’m not sure we have the right ones to protect us.

Louise Corti There are a variety of licences, but they are abused and it’s very hard to track along the way what has gone wrong. I quite like the UK Government’s approach with some of their statistical data that has to go through a legal gateway. Some data can be made available for research, but it has to be done for the public good. We also have the Ethics Self-Assessment Tool, which is a grid you go through provided by the Statistics Authority and it asks you to think along lots of different dimensions of ethics. This helps researchers get a better sense of what they are trying to do, but whether the people we are talking about would care about it is a very different matter. Having been in research ethics for a very long time, that is by far the best tool I’ve seen and I recommend everyone uses it. The UK Data Archive uses it to evaluate some of the projects we deal with because you often find that university ethics approvals are not good enough for the Statistics Authority: often they don’t understand quantitative secondary analysis, so the ethics scrutiny is not good enough. Self-Assessment involves much more nuanced thinking about the different dimensions of ethics and it helps researchers to be a bit more reflective about what’s good and what’s not.

Conclusion

Overall, the session provided a compelling blend of both the practical and conceptual elements of data reuse, each raising questions which could have easily been entire sessions in themselves. Louise’s presentation gave an excellent overview of the UK Data Service’s approach to making their datasets more reusable and promoting them to maximise their chances of being reused. Tiberius’ session raised some interesting questions surrounding data reuse and the ethics of using algorithms to persuade humans, as well as looking at some practical options for protecting research data from reuse for nefarious ends. At the end of the session, the audience were asked to participate in a poll on “What future developments are needed to increase the prevalence of data reuse?”.

Audience responses to poll held at the end of the event

The results were unsurprising to either speaker, with each touching on the idea that a change in research culture is necessary to ensure data reuse projects are seen as equal to data-generating projects. The need for cultural change is a theme that ran throughout each of the sessions in Data Week and is perhaps one of the current major challenges in scholarly communication.

Resources

Data Access and Research Transparency (DA-RT): A Joint Statement by Political Science Journal Editors

Robots appear more persuasive when pretending to be human

Behavioural evidence for a transparency–efficiency tradeoff in human–machine cooperation

The next-generation bots interfering with the US election

IBM’s AI Machine Makes A Convincing Case That It’s Mastering The Human Art Of Persuasion

AI Learns the Art of Debate

CSI-COP

Published on 25 January 2021

Written by Dominic Dixon


Cambridge Data Week 2020 day 3: Is data management just a footnote to reproducibility?

Cambridge Data Week 2020 was an event run by the Office of Scholarly Communication at Cambridge University Libraries from 23–27 November 2020. In a series of talks, panel discussions and interactive Q&A sessions, researchers, funders, publishers and other stakeholders explored and debated different approaches to research data management. This blog is part of a series summarising each event.

The rest of the blogs comprising this series are as follows:
Cambridge Data Week day 1 blog
Cambridge Data Week day 2 blog
Cambridge Data Week day 4 blog
Cambridge Data Week day 5 blog

Introduction

The third day of Cambridge Data Week consisted of a panel discussion about the relationship between reproducibility and Research Data Management (RDM), looking for ways to advocate effectively to reach positive outcomes in both areas. Alexia Cardona (University of Cambridge), Lennart Stoy (European University Association), Florian Markowetz (University of Cambridge & UK Reproducibility Network), and René Schneider (Geneva School of Business Administration) offered their perspectives on whether RDM really is just a ‘footnote’ to the more popular concept of reproducibility.

The speakers agreed that we are still in need of cultural change towards better data management and reproducibility. The word ‘reproducibility’ is more likely to excite researchers and it is important to craft messages that work for each group, hence the emphasis on this term. In contrast to the Cambridge Data Week event on data peer review, the discussion here focused on engaging senior researchers, from PIs to Heads of Institutions, motivating them to be not just good data managers, but great data leaders.

Among the key elements needed to drive best practice in this area, two stood out. The first is communities. Whether these are reproducibility circles of peers, or networks like the Cambridge Data Champions, communities are key to creating and implementing guidelines for data management. The second element is a solid technological infrastructure. For instance, blockchains could be used to enable reproducibility in citations in the humanities, or Persistent Identifiers, used at a very granular level, could lead to better data reuse.

Recording, transcript and presentations

The video recording of the webinar can be found below and the recording, transcript and presentations are present in Apollo, the University of Cambridge repository.

Bonus material

There were a few questions we did not have time to address during the live session, so we put them to the speakers afterwards. Here are their answers:

What are good practices regarding data deletion?

Florian Markowetz It very much depends on what kind of data you have, it’s hard to give general directions. However, drives and other hardware are becoming cheaper and cheaper, so I would say ‘save everything’.

René Schneider I would agree. I have spoken to researchers who keep all their data, because it would create too much work to sort what to keep and what to delete.

Alexia Cardona We tend to talk more about data archiving than data deletion. I often hear about data deletion where it has created problems, for example an account has been deleted in bulk when a researcher left an institution, so unpublished data and scripts are lost due to lack of communication. There are also cases on the internet of PhD students losing their entire thesis when the laptop crashed, so this issue goes hand in hand with data storage and backup. Let’s focus on good practices and archiving of data; deletion is the very last thing to worry about.

Lennart Stoy It’s worth mentioning that there is often a compulsory period that data should be kept for, perhaps 3 years or 5 years according to funders’ mandates, so data should be stored for some time. I suppose the expense could become an issue in the coming years; some universities are already concerned about the cost of having to buy large amounts of cloud storage space. There are also discussions in the Open Science Cloud teams about what to preserve in the long term. We want to make sure we preserve the higher value datasets, but of course it’s hard to define which ones those are.

Couldn’t scholarly communities of practice or learned societies create guidelines for reproducibility and good data management?

Lennart Stoy Absolutely, they must be involved as they are the ones with the specific knowledge. This is the idea behind the Research Data Alliance (RDA) and the National Research Data Infrastructure (NFDI) in Germany. In those cases, you have to prove a link to the community in that field to establish a consortium. It is great when communities structure their areas of infrastructure from the bottom up.

What roles could Early Career Researchers (ECRs) have? Could they act as code-checkers to assist reproducibility, or are we asking too much of them given their busy schedules? Would they receive credit for this?

Florian Markowetz Senior academics have no excuses for not getting more involved in this once they have stable positions. It’s easy for people in my position to point to students, or to funders, saying they are not doing enough, but we should not be pointing away from ourselves, we should do the work. It could be coupled to pay rises: if you hold any role above grade 12 it’s your job now to sort this all out.

René Schneider I have been thinking about the role of data custodians or similar. If we ask researchers to spend a lot of time just checking data, like ‘warehouse workers’, we could be undervaluing their role. I don’t think it’s necessarily the researchers who should do the work, especially not ECRs, there should be other roles dedicated to this.

Alexia Cardona I second that, researchers are supposed to focus on the research, not necessarily the data checking and curation. But the unfortunate truth is that with short contracts and lack of resources the work is left to them. Another problem is the lack of rewards. For instance in my area, training, there’s no reward for people who take the time to make their training FAIR. We should embrace more openness and fairness, including rewarding those who do the work.

Lennart Stoy This is something we’ve been working on but it’s a challenging system to change because there are so many elements to disentangle. It relates to intense competition for jobs, the culture in different disciplines, and the pressure to publish in certain journals. Some Universities are very serious about implementing DORA and I hope that in a few years these will be able to show high levels of satisfaction among PhD students and ECRs. A lot depends on the leadership at the institutional level to initiate change, for instance the rector at Ghent University in Belgium has been driving DORA-inspired reward mechanisms and the Netherlands is also moving ahead and moving away from journal-based factors. The University of Bath is an example in the UK that I’ve heard mentioned a lot. We’re following progress in all these examples and will write up DORA good practice case studies to inspire other organisations. But it is a hard problem, ECRs have a lot on the line, it’s important not to jeopardise their careers.

Conclusion

This compelling discussion left us feeling that it does not matter too much which words we emphasise: reproducibility, data management, data leadership, or something else entirely. What matters is that we spark interest and commitment in the right groups of researchers to drive progress. Creating a culture where great research practices are routine will take effective advocacy, but also rewards that align with our aims and the right technical infrastructure to underpin them.

Resources

The UK Data Service is a data repository funded by the Economic and Social Research Council (ESRC), which also provides extensive resources on data practices.

The journal PLOS Computational Biology introduced a pilot in 2019 where all papers are checked for the reproducibility of models.

Is there a reproducibility crisis? Baker’s 2016 paper in Nature reporting the results of a survey that exposed the extent of the reproducibility crisis.

San Francisco Declaration on Research Assessment (DORA), a set of recommendations for institutions, funders, publishers, metrics companies and researchers, aiming for a fairer and more varied system of research quality assessment.

Published on 25 January 2021

Written by Beatrice Gini


Cambridge Data Week 2020 day 4: Supporting researchers on data management – do we need a fairy godmother?

Cambridge Data Week 2020 was an event run by the Office of Scholarly Communication at Cambridge University Libraries from 23–27 November 2020. In a series of talks, panel discussions and interactive Q&A sessions, researchers, funders, publishers and other stakeholders explored and debated different approaches to research data management. This blog is part of a series summarising each event.

The rest of the blogs comprising this series are as follows:
Cambridge Data Week day 1 blog
Cambridge Data Week day 2 blog
Cambridge Data Week day 3 blog
Cambridge Data Week day 5 blog

Introduction 

How should researchers’ data management activities and skills be supported? What are the data management responsibilities of the funder, the institution, the research group and the individual researcher? Should we focus on training researchers so they can carry out good data management themselves or should we be funding specialist teams who can work with research groups, allowing the researchers to concentrate on research instead of data management? These were the questions addressed on day 4 of Cambridge Data Week 2020. This session benefitted from the perspectives of three speakers deriving from three different components of the research ecosystem: national funder, institutional research support and department/institute. Respectively, these were provided by Tao-Tao Chang (Arts and Humanities Research Council [AHRC]), Marta Teperek (TU Delft) and Alastair Downie (The Gurdon Institute, Cambridge). 

From a funder’s perspective, and following UKRI community consultation, Tao-Tao specifies that digital research infrastructure is recognised as an area for urgent investment, particularly in the arts and humanities, where both software and data loss are acute. Going forwards, AHRC’s key priorities will be to prevent further data loss, invest in skills, build capability, and work with the community to effect a sustained change in research culture. At an institutional level, Marta argues that it is unfair for researchers to be left unsupported to manage their data. The TU Delft model addresses this via three methods: central data support, disciplinary support by data stewards as permanent faculty staff, and hands-on support for research groups via data managers and research software engineers. Regarding the latter, an important take-home message for all researchers, regardless of institutional affiliation, is to build data management costs into grant proposals. Alastair takes up the discussion at the level of the department, research group and even individual, highlighting how researchers are locked into infrastructure silos, and locked into an unhelpful, competitive culture where altruism is a risky proposition and the career benefits of sharing seem intangible or insufficient. Alastair proposes that the climate is right and the community is ready for change, and goes on to discuss some positive changes afoot in the School of Biological Sciences to counteract these.  

Audience composition  

We had 291 registrations for the webinar, with just over 70% originating from the Higher Education sector. Researchers and PhD students accounted for 30% of the registrations whilst research support staff from various organisations accounted for an impressive 46%. On the day, we were thrilled to see that 136 people attended the webinar, participating from a wide range of countries. 

Recording, transcript and presentations 

The video recording of the webinar can be found below, and the recording, transcript and presentations are available in Apollo, the University of Cambridge repository.

Bonus material 

There were a few questions we did not have time to address during the live session, so we put them to the speakers afterwards. Here are their answers: 

Talking about the technical side, have you yet come across anyone using a machine-implementable DMP? In setting up a data management infrastructure for a large project, it’s become apparent that checking compliance with a DMP is a huge job and, of course, there is minimal resource for doing this.

Marta Teperek: Work is being done in this area by the Research Data Alliance, where there are several groups working on machine-actionable DMPs. Basically, the idea is that instead of asking researchers to write long essays about how they are planning to manage their data, they are asked to provide structured answers. These can be multiple choice options, for example, where the researcher specifies that they will be depositing large amounts of data in the repository, and the repository is then notified of data coming its way. In other words, actions are triggered depending on what the researcher says they will do. The University of Queensland is doing a lot on this already [see the blog post listed under ‘Resources’ further below].
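To make the idea concrete, here is a minimal sketch in Python of how structured DMP answers could trigger follow-up actions. It is an illustration only, not how the Research Data Alliance groups or the University of Queensland implement this: the `DmpAnswers` fields, the `planned_actions` function and the 100 GB threshold are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical threshold above which a repository wants advance notice.
LARGE_DEPOSIT_GB = 100


@dataclass
class DmpAnswers:
    """Structured answers a researcher gives instead of a free-text essay."""
    project: str
    will_deposit: bool
    repository: str
    expected_size_gb: int


def planned_actions(answers: DmpAnswers) -> list[str]:
    """Turn structured DMP answers into a list of follow-up actions."""
    actions = []
    if answers.will_deposit:
        actions.append(f"register planned deposit with {answers.repository}")
        # Large deposits trigger an advance notification to the repository.
        if answers.expected_size_gb >= LARGE_DEPOSIT_GB:
            actions.append(
                f"notify {answers.repository} of ~{answers.expected_size_gb} GB incoming"
            )
    else:
        actions.append("flag plan for follow-up: no deposit planned")
    return actions


# Illustrative values only.
example = DmpAnswers("Example project", True, "Example Repository", 250)
for action in planned_actions(example):
    print(action)
```

Because the answers are data rather than prose, the same record could in principle also be checked automatically later on, which speaks to the compliance concern raised in the question.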

What are the best cross-platform, mobile and desktop tools for data management?

Alastair Downie: RDM encompasses far too broad a range of activities. It’s a concept rather than a single activity that you can build into a neat little app. In the context of electronic lab notebooks, for example, there are hundreds of apps that serve that function and some of them cross over into lab management as well. Those products that try to do too much become very bloated and complex, which makes them unattractive, and so we don’t see uptake of those kinds of products. I think a suite approach is better than a single solution.

Institutions audit spending on research grants; they should do the same for research data, and this should be a requirement of holding a grant.

Alastair Downie: The Wellcome Trust are now challenging researchers to demonstrate that they have complied with their DMPs. It’s not particularly empirical but the fact that they are demonstrating their determination to make sure that everyone’s doing things properly is very helpful.

Are there any specific infrastructure projects that the AHRC is sponsoring? I’m curious about what infrastructure/services would be useful for Arts and Humanities researchers.

Tao-Tao Chang: Not at this juncture. But we are hoping that this will change. AHRC recognises the importance of good data management practice and the need to support it. We also recognise that there is a skills gap and that all researchers at every level need support.

Is there a 2020 edition of the State of Open Data report?

Yes, this was published five days after this webinar! See the Digital Science website and further below under ‘Resources’.

Conclusion 

There are two outcomes of the webinar to draw upon here. The first raises again the question: do researchers need, or even want, a fairy godmother to support their research data management? We held a poll at the end of the webinar, asking participants to choose which one of the following statements they believe most strongly: (1) ‘Individual researchers should learn how to manage their own data well’ or (2) ‘Researchers’ data should be managed by funded RDM specialists so that researchers can focus on research’. Of the 78 respondents, 67% chose the first option and 33% chose the second. There was no intermediate option combining the two, simply because we wanted to force a choice in the direction of strongest belief when the two options are considered relative to one another.

The results of the poll and the discussions during the webinar (between the speakers and within the chat) indicate that while individual researchers are responsible for managing their research data, support does need to be made available and promoted actively (we provide in the ‘Resources’ section some links to University of Cambridge research data management support). The second outcome is that support needs to be provided under several different guises. On the one hand, there is support that comes via the provision of funding, research data services and individually tailored expertise. On the other hand, there is support that will derive, albeit in a less tangible sense, from positive changes in research culture, specifically in terms of how the research of individual researchers is assessed and rewarded.

Resources  

Some links to University of Cambridge research data management support include: the Research Data Management Policy Framework that outlines, for example, the data management responsibilities of research students and staff; our data management guide; and a list of Cambridge Data Champions, searchable by areas of expertise.

A recent Postdoc Academy podcast on ‘How can we improve the research culture at Cambridge?’ 

A description of the different data management support roles at TU Delft, by Alastair Dunning and Marta Teperek: data steward, data manager, research software engineer, data scientist and data champion.

A Gurdon Computing blog post by Alastair Downie on ‘Research data management as a national service’; in other words, providing it nationally rather than duplicating infrastructure and services across the research landscape.

An article by Florian Markowetz, discussed in the webinar, on ‘Five selfish reasons to work reproducibly’ (in Genome Biology)

TU Delft Open Working blog post by Marta Teperek on machine actionable Data Management Plans (DMPs) at the University of Queensland. For more information, see this article by Miksa and colleagues on the ‘Ten principles for machine-actionable data management plans’ (in PLOS Computational Biology).  

The State of Open Data 2020 report, published on 1 December 2020. 

Published on 25 January 2021

Written by Dr Sacha Jones with contributions from Tao-Tao Chang, Dr Marta Teperek, Alastair Downie and Maria Angelaki. 

CCBY icon