Tag Archives: reproducibility

Open Research at Cambridge Conference – Opening session

The Open Research at Cambridge conference took place between 22–26 November 2021. In a series of talks, panel discussions and interactive Q&A sessions, researchers, publishers, and other stakeholders explored how Cambridge can make the most of the opportunities offered by open research. This blog is part of a series summarising each event. 

The opening session, chaired by Dr Jessica Gardner (University Librarian and Director of Library Services), included talks by Professor Anne Ferguson-Smith (Pro-Vice-Chancellor for Research), Professor Steve Russell (Acting Head of Department of Genetics and Chair of Open Research Steering Committee), Mandy Hill (Managing Director of Academic Publishing at Cambridge University Press) and Dr Neal Spencer (Deputy Director for Collections and Research at the Fitzwilliam Museum). All four speakers foresee an increasingly open future, with benefits for both institutions and researchers. They also considered some of the challenges that still need to be worked through to avoid potential problems.

What is working well?

In recent years, we have made great progress in the proportion of publications that are open access. Over three quarters of publications with Cambridge authors last year were openly available in some form.

The trend is continuing and it is not unique to our institution. CUP have set an ambitious goal for the vast majority of research articles they publish to be open access by 2025.

Other forms of publication are becoming common, meeting different dissemination needs. Preprints have been the star of the show during the pandemic, allowing rapid dissemination while formal peer review follows down the line.

Diagram from Mandy Hill’s slide: ‘Increasingly open platforms and formal publishing will meet different dissemination needs’

In the scholarly communication arena, open access articles benefit from more downloads and citations. Museum-based projects involving artisans, schools and artists all found enthusiastic responses.

What can we look forward to?

Research culture is coming under the spotlight across the sector, and Cambridge has committed to an ambitious action plan to create a thriving environment to do research. Key principles include openness, collaboration, inclusivity, and fair recognition of all contributions.

Diagram from Prof Steve Russell: ‘Going Forward’

Implementing the San Francisco Declaration on Research Assessment (DORA) is part of this progress. We want to assess research on its own merits rather than on the basis of journal or publisher metrics. This also means recognising all research outputs and a broad range of impacts.

Reproducibility is increasingly recognised as critical in a number of disciplines. A developing UKRN group within the University aims to ‘take nobody’s word for it’ – but rather support reproducible workflows that underpin confidence in the conclusions of research. By sharing and rewarding best practice we can become world leaders in this area, and in open research more widely.

In the past, museum collections have tended to be documented in limited ways, with poor accessibility and interoperability, which made it hard to discover and use materials. Several exciting projects at the Fitzwilliam Museum and more broadly have started to change that. There are opportunities for a single discovery portal, tying together different collections. The Fitzwilliam Museum is also making its collection discovery process richer, by providing opportunities for deeper dives, and more connected, by linking with other collections and resources.

Deep zoom access to an image in the Fitzwilliam collection. Adapted from Dr Neal Spencer’s slide ‘Fitzwilliam Museum Collections Search’.

What problems should we be mindful of?

There are still barriers that hinder some open research aspirations. Historical constraints on the ways we find materials, conduct research, and publish results remain. Some systems may need to be reimagined, while not scrapping structures that are still serving us well.

Cambridge is a large and complex institution, where change takes time. Nevertheless, there is an established governance structure and an evolving set of policies that support open research.

Most importantly, researchers should be at the centre of the move towards open research. It is important that they benefit from open practices, rather than finding themselves torn between competing priorities. Conversations continued throughout the week to explore possible approaches in different disciplines, drawing from the rich diversity of experiences to shape the future of open research at Cambridge.

Practical steps toward more reproducible research

On 26 November 2021 the University’s Reproducibility Working Group hosted a workshop for researchers from across Cambridge to explore approaches to supporting more reproducible research. Professor Alexander Bird (Faculty of Philosophy), Dr Florian Markowetz (Cancer Research UK Cambridge Institute) and Dr Maria Tsapali (Faculty of Education) gave talks exploring approaches to reproducible research and reasons to work reproducibly across qualitative and quantitative research.

The recording of the session can be found below:

Talks were followed by interdisciplinary discussion sessions designed to identify the obstacles to reproducible research across Cambridge and how these might be tackled. The key findings from the discussions included:

  • Training on reproducibility, including statistical training, reproducible methods and use of key tools, exists in departments across the University, but more needs to be done to share provision and to create synergies and central provision where possible.
  • Training should begin at undergraduate or Masters level to build key skills early.
  • Awareness of training, and of the importance of reproducibility training, needs to be enhanced.
  • University guidance is needed on how to make research reproducible, particularly to overcome key challenges such as balancing reproducibility with the need to protect sensitive or confidential data.
  • The University can help by making the production of open and reproducible research as painless as possible, for example by facilitating peer review of code and providing easy access to data storage and expertise in best practice.
  • Reproducibility looks very different across the disciplines, and in some areas transparency and methods reproducibility will be the focus, rather than reproducible outcomes.
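
For computational work, one small illustration of "methods reproducibility" is to record exactly which data, random seed and software environment produced a result. The sketch below is not something presented at the workshop; it is a minimal example under our own assumptions, and the function names (`run_analysis`, `provenance_record`) and the toy data are hypothetical.

```python
import hashlib
import json
import platform
import random
import sys


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest identifying this exact input data."""
    return hashlib.sha256(data).hexdigest()


def run_analysis(values, seed):
    """Hypothetical analysis: a bootstrap estimate of the mean.

    All randomness comes from a generator created with an explicit seed,
    so the same data and seed always give the same estimate.
    """
    rng = random.Random(seed)
    resampled_means = [
        sum(rng.choices(values, k=len(values))) / len(values)
        for _ in range(200)
    ]
    return sum(resampled_means) / len(resampled_means)


def provenance_record(data: bytes, seed: int) -> dict:
    """Collect the details someone else would need to repeat the run."""
    return {
        "data_sha256": sha256_of_bytes(data),
        "seed": seed,
        "python_version": platform.python_version(),
        "platform": sys.platform,
    }


# Toy input standing in for a real data file (an assumption for illustration).
raw = b"4.1,3.9,5.0,4.4,4.7"
values = [float(x) for x in raw.decode().split(",")]
seed = 2021

result_a = run_analysis(values, seed)
result_b = run_analysis(values, seed)
record = provenance_record(raw, seed)

print(json.dumps(record, indent=2))
print(result_a == result_b)  # True: same data and seed give the same result
```

Saving a record like this alongside the outputs does not make the conclusions correct, but it does let a colleague check that they are starting from the same inputs. In disciplines where outcomes cannot be reproduced exactly, a written equivalent of the same record (what was used, and how) serves the transparency goal mentioned above.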

The Reproducibility Working Group will draw on the ideas raised at this workshop to help shape proposals for future University approaches to supporting reproducible research. The group plans to host a number of further events to map, consolidate, and extend existing resources for reproducibility across Cambridge with the aim of boosting grassroots activities and magnifying their impact across all levels of the institution.

For more information and resources on reproducible research see: UK Reproducibility Network: https://www.ukrn.org/

Cambridge Data Week 2020 day 3: Is data management just a footnote to reproducibility?

Cambridge Data Week 2020 was an event run by the Office of Scholarly Communication at Cambridge University Libraries from 23–27 November 2020. In a series of talks, panel discussions and interactive Q&A sessions, researchers, funders, publishers and other stakeholders explored and debated different approaches to research data management. This blog is part of a series summarising each event.

The rest of the blogs comprising this series are as follows:
Cambridge Data Week day 1 blog
Cambridge Data Week day 2 blog
Cambridge Data Week day 4 blog
Cambridge Data Week day 5 blog

Introduction

The third day of Cambridge Data Week consisted of a panel discussion about the relationship between reproducibility and Research Data Management (RDM), looking for ways to advocate effectively to reach positive outcomes in both areas. Alexia Cardona (University of Cambridge), Lennart Stoy (European University Association), Florian Markowetz (University of Cambridge & UK Reproducibility Network), and René Schneider (Geneva School of Business Administration) offered their perspectives on whether RDM really is just a ‘footnote’ to the more popular concept of reproducibility.

The speakers agreed that we are still in need of cultural change towards better data management and reproducibility. The word ‘reproducibility’ is more likely to excite researchers and it is important to craft messages that work for each group, hence the emphasis on this term. In contrast to the Cambridge Data Week event on data peer review, the discussion here focused on engaging senior researchers, from PIs to Heads of Institutions, motivating them to be not just good data managers, but great data leaders.

Among the key elements needed to drive best practice in this area, two stood out. The first is communities. Whether these are reproducibility circles of peers, or networks like the Cambridge Data Champions, communities are key to creating and implementing guidelines for data management. The second element is a solid technological infrastructure. For instance, blockchains could be used to enable reproducibility in citations in the humanities, or Persistent Identifiers, used at a very granular level, could lead to better data reuse.

Recording, transcript and presentations

The video recording of the webinar can be found below, and the recording, transcript and presentations are available in Apollo, the University of Cambridge repository.

Bonus material

There were a few questions we did not have time to address during the live session, so we put them to the speakers afterwards. Here are their answers:

What are good practices regarding data deletion?

Florian Markowetz: It very much depends on what kind of data you have, it’s hard to give general directions. However, drives and other hardware are becoming cheaper and cheaper, so I would say ‘save everything’.

René Schneider: I would agree. I have spoken to researchers who keep all their data, because it would create too much work to sort what to keep and what to delete.

Alexia Cardona: We tend to talk more about data archiving than data deletion. I often hear about data deletion where it has created problems, for example an account has been deleted in bulk when a researcher left an institution, so unpublished data and scripts are lost due to lack of communication. There are also cases on the internet of PhD students losing their entire thesis when a laptop crashed, so this issue goes hand in hand with data storage and backup. Let’s focus on good practices and archiving of data, deletion is the very last thing to worry about.

Lennart Stoy: It’s worth mentioning that there is often a compulsory period that data should be kept for, perhaps 3 years or 5 years according to funders’ mandates, so data should be stored for some time. I suppose the expense could become an issue in the coming years, some Universities are already concerned about the cost of having to buy large amounts of cloud storage space. There are also discussions in the Open Science Cloud teams about what to preserve in the long term. We want to make sure we preserve the higher value datasets, but of course it’s hard to define which ones those are.
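
As a small worked example of the retention point, the sketch below computes the earliest date on which a dataset could even be considered for deletion, given a retention period in years. It is only an illustration: the function name `earliest_deletion_date` is hypothetical, the 3-year and 5-year figures are simply the ones mentioned in the answer, and actual periods depend on each funder's and institution's policy.

```python
from datetime import date


def earliest_deletion_date(retention_start: date, retention_years: int) -> date:
    """Return the first date after which the retention period has elapsed.

    `retention_start` is whatever date the relevant policy counts from
    (for example project end or publication); that choice is an assumption
    the caller has to check against their own funder's rules.
    """
    target_year = retention_start.year + retention_years
    try:
        return retention_start.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year,
        # so fall back to 28 February of that year.
        return retention_start.replace(year=target_year, day=28)


start = date(2021, 1, 25)
print(earliest_deletion_date(start, 3))  # 2024-01-25
print(earliest_deletion_date(start, 5))  # 2026-01-25
```

Recording a date like this in a dataset's metadata makes it easier to follow the speakers' advice: keep everything by default, and only revisit the question once the mandated period has passed.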

Couldn’t scholarly communities of practice or learned societies create guidelines for reproducibility and good data management?

Lennart Stoy: Absolutely, they must be involved as they are the ones with the specific knowledge. This is the idea behind the Research Data Alliance (RDA) and the National Research Data Infrastructure (NFDI) in Germany. In those cases, you have to prove a link to the community in that field to establish a consortium. It is great when communities structure their areas of infrastructure from the bottom up.

What roles could Early Career Researchers (ECRs) have? Could they act as code-checkers to assist reproducibility, or are we asking too much of them given their busy schedules? Would they receive credit for this?

Florian Markowetz: Senior academics have no excuses for not getting more involved in this once they have stable positions. It’s easy for people in my position to point to students, or to funders, saying they are not doing enough, but we should not be pointing away from ourselves, we should do the work. It could be coupled to pay rises: if you hold any role above grade 12 it’s your job now to sort this all out.

René Schneider: I have been thinking about the role of data custodians or similar. If we ask researchers to spend a lot of time just checking data, like ‘warehouse workers’, we could be undervaluing their role. I don’t think it’s necessarily the researchers who should do the work, especially not ECRs, there should be other roles dedicated to this.

Alexia Cardona: I second that, researchers are supposed to focus on the research, not necessarily the data checking and curation. But the unfortunate truth is that with short contracts and lack of resources the work is left to them. Another problem is the lack of rewards. For instance in my area, training, there’s no reward for people who take the time to make their training FAIR. We should embrace more openness and fairness, including rewarding those who do the work.

Lennart Stoy: This is something we’ve been working on but it’s a challenging system to change because there are so many elements to disentangle. It relates to intense competition for jobs, the culture in different disciplines, and the pressure to publish in certain journals. Some Universities are very serious about implementing DORA and I hope that in a few years these will be able to show high levels of satisfaction among PhD students and ECRs. A lot depends on the leadership at the institutional level to initiate change, for instance the rector at Ghent University in Belgium has been driving DORA-inspired reward mechanisms and the Netherlands is also moving ahead and moving away from journal-based factors. The University of Bath is an example in the UK that I’ve heard mentioned a lot. We’re following progress in all these examples and will write up DORA good practice case studies to inspire other organisations. But it is a hard problem, ECRs have a lot on the line, it’s important not to jeopardise their careers.

Conclusion

This compelling discussion left us feeling that it does not matter too much which words we emphasise: reproducibility, data management, data leadership, or something else entirely. What matters is that we spark interest and commitment in the right groups of researchers to drive progress. Creating a culture where great research practices are routine will take effective advocacy, but also rewards that align with our aims and the right technical infrastructure to underpin them.

Resources

The UK Data Service is a data repository funded by the Economic and Social Research Council (ESRC), which also provides extensive resources on data practices.

The journal PLOS Computational Biology introduced a pilot in 2019 where all papers are checked for the reproducibility of models.

Is there a reproducibility crisis? Baker’s 2016 paper in Nature reports the results of a survey that exposed the extent of the reproducibility crisis.

San Francisco Declaration on Research Assessment (DORA), a set of recommendations for institutions, funders, publishers, metrics companies and researchers, aiming for a fairer and more varied system of research quality assessment.

Published on 25 January 2021

Written by Beatrice Gini

Licence: CC BY