In Conversation with the Wellcome Trust – sharing & managing research outputs

In July 2017, the Wellcome Trust updated their policy on the management and sharing of research outputs.  This policy helps deliver Wellcome’s mission – to improve health for everyone by enabling great ideas to thrive.  The University of Cambridge’s Research Data Management Facility invited the Wellcome Trust to Cambridge to talk with their funded research community (and potential researchers) about what this updated policy means for them.  On 5th December in the Gurdon Institute Tea Room, the Deputy Head of Scholarly Communication, Dr Lauren Cadwallader, welcomed Robert Kiley, Head of Open Research, and David Carr, Open Research Programme Manager, from Wellcome’s Open Research Team.

This blog post summarises the presentations from David and Robert about the research outputs policy and how it has been working, along with the questions raised by the audience.

Maximising the value of research outputs: Wellcome’s approach

David Carr outlined key points about the new policy, which now covers not only the open sharing of publications and data but also the sharing of software and materials as other valued outputs of research.

An outputs management plan is required to show how the outputs of the project will be managed and the value of the outputs maximised (whilst taking into consideration that not all outputs can be shared openly).  Updated guidance on outputs management plans has been published and can be found on Wellcome’s website.

Researchers are also to note that:

  • Outputs should be made available with as few restrictions as possible.
  • Data and software underlying publications must be made available at the time of publication at the latest.
  • Data relevant to a public health emergency should be shared as soon as it has been quality assured regardless of publication timelines.
  • Outputs should be placed in community repositories, have persistent identifiers and be discoverable.
  • A check at the final report stage, to ensure outputs have been shared according to the policy, has been introduced (recognising that parameters change during the research and management plans can change accordingly).
  • Managing and sharing research outputs comes with a cost, and the Wellcome Trust commit to reviewing and supporting associated costs as part of the grant.

Wellcome have periodically reviewed take-up and implementation of their research outputs sharing and management policy and have observed some key responses:

  • Researchers are producing better quality plans; however, the formats and level of detail included in the plans do remain variable.
  • There is uncertainty amongst stakeholders (researchers, reviewers and institutions) about how to fulfil the policy.
  • Resources required to deliver plans are often not fully considered or requested.
  • Follow-up and reporting on compliance have been patchy.

In response to these findings, Wellcome will continue to update their guidance and work with their communities to advise, educate and define best practice.  They will encourage researchers to work more closely with their institutions, particularly over resource planning.  They will also develop a proportionate mechanism to monitor compliance.

Developing Open Research

Robert Kiley then described the three areas which the dedicated Open Research Team at Wellcome lead and coordinate: funder-led activities, community-led activities and policy leadership.

Funder-led activities include:

  • Wellcome Open Research, the publishing platform launched in partnership with F1000 around a year ago; here Wellcome-funded researchers can rapidly and transparently publish any results they think are worth sharing. Average submission to publication time for the first 100 papers published was 72 days – much faster than other publication venues.
  • Wellcome Trust is working with ASAPbio and other funders to support pre-prints and continues to support eLife as an innovative Open Access journal.
  • Wellcome Trust will review their Open Access policy during 2018 and will consult their funded researchers and institutions as part of this process.
  • Wellcome provides the secretariat for the independent review panel for the ClinicalStudyDataRequest.com (CSDR) platform, which provides access to anonymised clinical trial data from 13 pharmaceutical companies. From January 2018, they will extend the resource to allow listing of academic clinical trials supported by Wellcome, MRC, CRUK and the Gates Foundation.  Note that CSDR is not a repository but provides a common discoverability and access portal.

Community-led activities

Wellcome are inviting the community to develop and test innovative ideas in Open Research.  Some exciting initiatives include:

  • The Open Science Prize: this initiative was run last year in partnership with the US National Institutes of Health and the Howard Hughes Medical Institute. It supported prototyping and development of tools and services to build on data and content.  New prizes and challenges currently being developed will build on this model.
  • Research Enrichment – Open Research: this was launched in November 2017. Awards of up to £50K are available for Wellcome grant-holders to develop Open Research ideas that increase the impact of their funded research.
  • Forthcoming: more awards and themed challenges aimed at Open Research – including a funding competition for pioneering experiments in open research, and a prize for innovative data re-use.
  • The Open Research Pilot Project: whereby four Wellcome-funded groups are being supported at the University of Cambridge to make their research open.

Policy Leadership

In this area, Wellcome Trust engage in policy discussions in key policy groups at the national, European and international level.  They also convene international Open Research funders’ webinars.  They are working towards reform on rewards and incentives for researchers, by:

  • Policy development and declarations
  • Reviewing grant assessment procedures: for example, providing guidance to staff, reviewers and panel members so that there is a more holistic approach on the value and impact of research outputs.
  • Engagement: for example, by being clear on how grant applicants are being evaluated and committing to celebrate grantees who are practising Open Research.

Questions & Answers

Policy questions

I am an administrator of two Wellcome Trust programmes; how is this information about the new policy being disseminated to students? Has it been done?

When the Wellcome Open Research platform was announced last year, there was a lot of communication, for example, in grants newsletters and working with the centres.

Further dissemination of information about the updated policy on outputs management could be realised through attending events, putting questions to our teams, or inviting us to present to specific groups.  In general, we are available and want to help.

Following this, the Office of Scholarly Communication added that they usually put information about things like funder policy changes in the Research Operations Office Bulletin.

Regarding your new updated policy, have you been in communication with the Government?

We work closely with HEFCE and RCUK. They are all very aware of what we aim to do.

One of the big challenges is to answer the question from researchers: “If we are not using a particular ‘big journal’ name, what are we using to help us show the quality of the research?”.

We have been working with other funders (including Research Councils) to look at issues around this.  Once we have other funders on board, we need to work with institutions on staff promotion and tenure criteria.  We are working with others to support a dedicated person charged with implementing the San Francisco Declaration on Research Assessment (DORA) and identify best practice.

How do you see Open Outputs going forward?

There is a growing consensus over the importance of making research outputs available, and a strong commitment from funders to overcome the challenges. Our policy is geared to openness in ways that maximise the health benefits that flow from the research we fund.

Is there a licence that you encourage researchers to use?

No. We encourage researchers to utilise existing sources of expertise (e.g. The Software Sustainability Institute) and select the licence most appropriate for them.

Some researchers could just do data collection instead of publishing papers. Will we have a future where people are just generating data and publishing it on its own and not doing the analysis?

It could happen. Encouraging adoption of CRediT Taxonomy roles in publication authorship is one thing that can help.

Outputs Management Plans

How will you approach checking outputs against the outputs management plan?

We will check the information submitted at the end of grants – what outputs were reported and how these were shared – and refer back to the plan submitted at application. We will not rule out sanctions in the future once things are in place. At the moment there are no sanctions as it is premature to do this.  We need to get the data first, monitor the situation and make any changes later in the process.

What are your thoughts on providing training for reviewers regarding the data management plans as well as for the people who will do the final checks? Are you going to provide any training and identify gaps on research for this?

We have provided guidance on what plans should contain; this is something we can look at going forwards.

One of the key elements to the outputs management plan is commenting on how outputs will be preserved. Does the Wellcome Trust define what it means by long term preservation anywhere?

Long term preservation is tricky. We have common best practice guidelines for data retention – 10 years for regular data and 20 years for clinical research. We encourage people to use community repositories where these exist.

What happens to the output if 10 years have passed since the last time of access?

This is a huge problem. There need to be criteria to determine what outputs are worth keeping which take into account whether the data can be regenerated. Software availability is also a consideration.

Research enrichment awards

You said that there will be prizes for data re-use, and dialogue on infrastructure is still in the early stages. What is the timeline? It would be good to push to get the timeline going worldwide.

Research enrichment awards are already live and Wellcome will assess them on an ongoing basis. Please apply if you have a Wellcome grant. Other funding opportunities will be launched in 2018. The Pioneer awards will be open to everyone in the spring and are aimed at those who have worked out ways to make their work more FAIR.  The same applies to our themed challenges for innovative data re-use, which will also launch in the spring – we will identify a data set and get people to look at it.  For illustration, a similar example is the NEJM SPRINT Data Analysis Challenge.

Publishing Open Access

What proportion of people are updating their articles on Wellcome Open Research?

Many people, around 15%, are editing their articles to Version 2 following review. We have one article at Version 3.

Has the Wellcome Trust any plans for overlay journals, and if so, in which repository will they be based?

Not at the moment. There will be a lot of content being published on platforms such as Open Research, the Gates platform and others. In the future, one could imagine a model where content is openly published on these platforms, and the role of journals is to identify particular articles with interesting content or high impact (rather than to manage peer review).  Learned societies have the expertise in their subjects; they potentially have a role here, for example in identifying lead publications in their field from a review of the research.

Can you give us any hints about the outcome of your review of the Wellcome Trust Open Access policy? Are you going to consider not paying for hybrid journals when you review your policy?

We are about to start this review of the policy. Hybrid journals are on the agenda. We will try to simplify the process for the researcher.  We are nervous about banning hybrid journals.  Data from the last analysis showed that 70% of papers from Wellcome Trust grants, for which Wellcome Trust paid an article processing charge, were in hybrid journals.  So if we banned hybrid journals it would not be popular.  Researchers would need to know which are hybrid journals.  Possibly with public health emergencies we could consider a different approach.  So there is a lot to consider and a balance to keep.  We will consult both researchers and institutions as part of the exercise.  There is also another problem in that there is a big gap in choice between hybrid and other journals.

If researchers can publish in hybrid journals, would Wellcome Trust consider making rules regarding offsetting?

That would be interesting. However, more rules could complicate things as researchers would then also need to check both the journal’s Open Access policy and find out if they have an approved offset deal in place.

Open Data & other research outputs

What is your opinion on medical data? For example, when we write an article, we can’t publish the genetics data as there is a risk that a person could be identified.

Wellcome Trust recognise that some data cannot be made available. Our approach is to support managed access. Once the data access committee has considered that the requirement is valid, then access can be provided. The author will need to be clear on how the researcher can get hold of the data.  Wellcome Trust has done work around best practice in this area.

Does Open Access mean free access? There is a cost for processing.

Yes, there is usually a cost. For some resources, those requesting data do have to pay a fee. For example, major cohort studies such as ALSPAC and UK Biobank have a fee which covers the cost of considering the request and preparing the data.

ALSPAC is developing a pilot with Wellcome Open Research to encourage those who access data and return derived datasets to the resource to publish data papers describing the data they are depositing.  Because the cost of access has already been met, such data will be available at no cost.

Does the software that is used in the analysis need to be included?

Yes, the policy is that if the data is published, the software should be made available too. It is a requirement, so that everybody can reproduce the study.

Is there a limit to volume of data that can be uploaded?

Wellcome Open Research uses third party data resources (e.g. Figshare). The normal dataset limit there is 5GB, but both Figshare and subject repositories can store much higher volumes of data where required.

What can be done about misuse of data?

In the survey that we did, researchers expressed fears of data misuse. How do we address such a fear? Demonstrating the value of data will play a great role in this.  It is also hard to know the extent to which these fears play out in reality – only a very small proportion of respondents indicated that they had actually experienced data being used inappropriately.  We need to gather more evidence of the relative benefits and risks, and it could be argued that publishing via preprints and getting a DOI are your proofs that you got there first.

Published 26 January 2018
Written by Dr Debbie Hansen
Creative Commons License

How do you know if you’re achieving cultural change?

On 15th November 2017, the University of Cambridge held its first research data management (RDM) conference, Engaging Researchers in Good Data Management. The Office of Scholarly Communication, in collaboration with SPARC Europe and Jisc, hosted the one-day event at St Catharine’s College. In attendance were researchers, administrators, and librarians, all sharing their experiences with promoting good RDM. Having a mixture of people from various disciplines and backgrounds allowed many different points of view on engaging researchers to be discussed. In the afternoon, the attendees split off into focus groups to concentrate on a number of nagging questions.

Our group’s topic of discussion: How do we effectively measure cultural change in attitudes towards data management? Leading the discussion was Marta Teperek from Delft University of Technology. There was a mixture of around 30 librarians and researchers from all over the world discussing strategies for engaging with researchers.

How do we set about achieving ‘cultural change’?

Marta started the conversation off by asking what everyone present was already doing at their institutions to engage researchers. Many shared their experiences and some frustrations at pushing good data management habits. One person shared that at his university the initial push toward better data management was achieved by creating and delivering RDM workshops for PhD students and young researchers in the Digital Humanities. These students were already interested in digital preservation, so they were a keen audience. Targeting PhD students and early career researchers may be a more effective strategy because they could develop good data management habits early in their careers. The earlier the intervention, the easier it would (hopefully) be.

Overall, most agreed that directly speaking to researchers is more effective than having initiatives relayed from the top-down. Attendees perceived compliance as a driver rather than a useful stick to persuade researchers to take data management seriously. Even if only a few researchers turned up to data management events, it was still increasing exposure.

Some argued for a multi-pronged strategy. Initiatives like the Data Stewards at TU Delft and the Data Champions at the University of Cambridge were perceived as good ways to reach out to researchers in their departments and provide more customized advice. At the same time, having expectations of good data management relayed from on high could help create greater impetus.

What do we mean by ‘cultural change’?

Naturally, the conversation progressed to what the phrase ‘cultural change’ actually means. It was difficult to determine in 45 minutes what kind of ‘cultural change’ we wanted to see within our different institutions. We started by asking some questions. What were our goals? What would need to happen before we said yes, the culture is changing? That really meant: what do we measure to find evidence of cultural change? Is it better metadata, more awareness of copyright, researchers reaching out to us for help, or an increase in the number of grants awarded that would signal an actual change? It would seem that there could be many definitions of ‘cultural change’, but the crucial takeaway is that it is essential to define what your parameters of cultural change will be in the planning stages of any RDM programme.

Where is the evidence?

The conversation progressed to how we find and gather evidence. With all of the work being done by researchers, librarians, and administrators, how do we know what is actually effective? We cannot state that engaging with researchers (which can be time-consuming) is working without having actual evidence to confirm it. A number of different ideas were discussed, with the point at which feedback is gathered being a particular area of variance.

Quantifiable information such as number of datasets deposited, number of datasets downloaded and re-used, and number of grants with a Data Management Plan could be collected. For example, the University of Illinois conducted a detailed analysis of 1,260 data management plans using a controlled vocabulary list and looked at possible correlations between solutions for data management listed in funded and unfunded proposals.

Another method of benchmarking included asking researchers to periodically complete short surveys on data management practice in order to measure any noticeable changes. In that way, an institution can assess whether its engagement strategies work and whether they achieve the desired effects (improvement of data management practice). Delft, EPFL, Cambridge and Illinois collaborated on the development of an agreed set of survey questions. Conducting this same survey across different institutions enables benchmarking and comparison of the different techniques and how effective they are in achieving cultural change in data management. In addition to this survey, the team also interviews some researchers in order to gather additional qualitative data and more detailed insights into data management practice. The hope is that carrying out these quantitative surveys and qualitative interviews periodically will correct for the potential problem of self-selecting participants.

In the future

Ultimately, it turned out that most of those attending the focus group discussion were already working actively to develop systems to measure impact and gather feedback. However, the possibility of carrying out long-term cross-institutional research that would allow comparisons between different data management programmes is very tantalising. The final takeaway from this focus group discussion was that the majority of those attending would be very keen to take part in such research, so watch this space!

Published 18 December 2017
Written by Katie Hughes and Lucy Welch
Creative Commons License

From data curators to intellectual entrepreneurs: observations from IFLA

In this blog post, Clair Castle, Librarian, University of Cambridge, Department of Chemistry reflects on her experience at the IFLA Satellite Meeting 2017 in Warsaw, Poland.

Earlier this year I was invited by the Office of Scholarly Communication (OSC) at the University of Cambridge to present a paper at the conference Data Curator’s Roles and Responsibilities: International and Interdisciplinary Perspectives. This was my first time writing a paper for a conference and presenting it; it was slightly daunting but exciting too!

IFLA is the International Federation of Library Associations and Institutions, the international body that represents the interests of library and information services and their users. It celebrates its 90th birthday in 2017. This conference was a pre-Congress Satellite Conference, taking place just before the IFLA World Library and Information Congress held in Wrocław, Poland, from 19–25 August.

There were three sessions of four presentations in the programme – which includes links to every presentation. You can find most of the papers that were presented here. The main conference hashtag on Twitter was #wlic2017 (learn more about the 2017 and upcoming 2018 congress by following @iflawic).

Conference focus

Data curation has emerged as a new area of responsibility for researchers, librarians, and information professionals in the digital environment. The huge variety and amount of data that needs to be processed, preserved, and disseminated is creating new roles, responsibilities and challenges for researchers and the library and information professionals who support them. The primary goal of the conference was to engage the international scholarly community in a conversation that led to a better understanding of these challenges, and to discuss the main trends in data curation and Research Data Management (RDM) practices and education.

To ‘curate’ means to ‘take care of’. What resonated with me the most from the conference was the fact that while we are curating data we are curating people as well. We are doing this by changing research culture, evolving the profession, changing research (and research support) practices, doing outreach and advocacy work, and liaising with related university support services. The conference presentations returned to this theme again and again.

I won’t discuss every presentation here; instead, I will collate and relate the ideas that I found most thought-provoking.

Intellectual entrepreneurship

This term was introduced to me by Nitecki and Davis’ presentation ‘Expanding librarians’ roles in the research life cycle’. The definition I have since found that explains this the best is from Charles J. Chumas at Stony Brook University:

“Take … the textbook definition of entrepreneur: A person who organizes and manages any enterprise, especially a business, usually with considerable initiative and risk. Now, switch out the words “enterprise” and “business” with words such as “research” or “education”. This is the concept of intellectual entrepreneurship. It is the concept of taking risk, seizing opportunity, discovering and creating knowledge and employing one’s own innovation and strategies, with the ultimate goal of solving problems in corporate, societal or governmental environments. An intellectual entrepreneur … actively seeks out their own education … The philosophy of IE embodies four core values: vision and discovery, ownership and accountability, integrative thinking and action, and collaboration and teamwork”.

I feel that this describes the role of data curators exactly: researchers and the people supporting them are planning data curation strategically and innovatively, acquiring the necessary knowledge and skills to develop it in their institution, and working to bring systems, services and people together to achieve their overall goal of managing data effectively.

Zhang’s presentation ‘Data curators: A glimpse at their roles at the academic libraries in the United States’ mentioned the Association of Research Libraries’ Strategic Thinking and Design Initiative: Extended and Updated Report (2016), which estimates that the research librarian will have shifted from knowledge service provider to collaborative partner within the research ecosystem by 2033. In one example of this, librarians have shifted from providing a service support role to working with researchers to further open science: the FOSTER portal is an e-learning platform that brings together the best training resources addressed to those who need to know more about Open Science, or need to develop strategies and skills for implementing Open Science practices in their daily workflows. It provides training materials for many different users – from early-career researchers, to data managers, librarians, research administrators, and graduate schools. This reflects the self-education aspect of intellectual entrepreneurship.

Upskilling librarians

Many library science curricula around the world do not (yet) include an RDM module. Experienced librarians may not therefore have the necessary knowledge or skills to support RDM. Many data curation post advertisements require leadership, partnership, outreach and collaborative responsibilities but not a professional library qualification. Data curation posts have been repurposed from experienced librarian posts, taken up by new graduates, contractors, PhDs, or sometimes are joint appointments with different academic units. A review of the library profession with regard to RDM skills and knowledge is required to inform future education and training.

Peters’ presentation ‘Reskilling academic librarians for data management services’ highlighted Lewis’ research data management pyramid for libraries (p.16). Areas of early engagement with RDM are situated at the bottom of the pyramid, and as you get to the top you can take on the world!

Role of IT in data curation

Several speakers touched upon this: after all, IT underpins everything and IT support staff are often closer to researchers than librarians are. However, there may be a perception that data curation is not an IT role, per se. In another example of intellectual entrepreneurship, IT and data librarians can work together to provide research data support services: IT can bring skills such as UX (User Experience) and systems design and project management, while data librarians can bring their expertise in areas such as repository infrastructures, digital preservation, and discovery and indexing methods.

The definition of data curation is evolving

The IFLA Library Theory and Research Panel Data Curation Project identified the role and responsibilities of data curators in an international context. The methodology included a review of literature and vocabulary describing data curation roles (using a cool keyphrase digger tool!) and a content analysis of job advertisements (in 35 countries). They found varying terms to describe data curation (e.g. data stewardship, digital preservation, data science, and RDM, the preferred term). Outreach and advocacy to researchers was found to be an important aspect of roles, which again relates back to the theme of intellectual entrepreneurship.

Central vs. discipline-specific RDM activities at the University of Cambridge

As I have mentioned, I presented my paper on behalf of the OSC. Since its establishment in 2015 the OSC has developed many services to support RDM at the University, including a central website, RDM training and support, and a data repository. It communicates with researchers and support staff including librarians and administrators across the University using a variety of methods. There is therefore a considerable amount of outreach into departments and faculties where research takes place. However, its resources are limited: it is not possible for it to deliver RDM training for example in every department or faculty in the University, especially on a discipline-specific basis.

Most departments and faculties in the University have an embedded library service, which is discipline-specific. Librarians are in a key position to be able to collaborate with the OSC and their own researchers in developing and implementing RDM services locally. My paper presents a case study of how centralised RDM services have been rolled out in the Department of Chemistry, thus adapting the central RDM messages to discipline-specific needs. I describe how customising centralised RDM training for all new graduate students in the Department, being a member of the University’s RDM Project Group, and being involved in the OSC’s Data Champions programme have benefited both the OSC and the Department.

Identity crisis?

The conference taught me that the identity of data curators is constantly evolving. Does it even matter what we call ourselves? Whatever the term used to describe us, we have similar roles and goals, and need to equip ourselves for future challenges. The concept of intellectual entrepreneurship is worth exploring further as a way of empowering ourselves.

The conference gave me a great opportunity to share and learn about RDM best practice from practitioners across the world. It reinforced for me the fact that we are all in it together, facing the same challenges and working together to come up with solutions.

Observations

The conference took place at the very impressive University of Warsaw Library, which is centrally located beside the Old Town in Warsaw, right next to the Vistula River. Around 40 delegates attended from all over the world.

Warsaw itself is a lively city with a rich, if at times tragic, history. After the conference dinner (a BBQ outside on a very warm evening!) we were treated to an entertaining evening bus tour around the city. We passed the amazing POLIN Museum of the History of Polish Jews, travelled through the area where the Warsaw Ghetto had been, and took in examples of communist era architecture (in particular the imposing Palace of Culture and Science).

Published 15 December 2017
Written by Clair Castle @chemlibcam
Creative Commons License