On 9-10 November 2015 Marta Teperek attended the fourteenth Research Data Management Forum (RDMF#14) in York, organised by the Digital Curation Centre. Research Data Management Fora are organised twice yearly to bring together the community of people supporting research data management across different institutions in the UK. This was the second RDMF meeting Marta has attended. These meetings offer an excellent opportunity for the community to get together and think strategically about how best to support the research data management needs of our researchers. Below are Marta’s impressions from the meeting.
Battling inertia
While the title of the meeting was ‘Research Data (and) Systems’, a more appropriate title could have been ‘Time to wake up and act’. Imperial College London’s Torsten Reimer had a message in his keynote speech: that the community should be grateful to the EPSRC for creating their policy on research data.
According to Torsten, the introduction of this policy is a consequence of our inertia – the community had the opportunity to address research data management needs before the EPSRC policy was introduced. The lack of initiative from the academic community, the passiveness on our side despite a clear need to develop guidelines on good practice in research data management, prompted the funders to tackle the problems for us.
Now both institutions and academics are complaining that the EPSRC policy is not realistic and that many issues remain unsolved. However, what right do we have to complain if we did not take action when we had the time?
Time to wake up and act
While we have not shown initiative in the past, it is not too late to be proactive.
First, Torsten called on the community to lead the process of interpreting the funders’ policies. Funders created their policies to help researchers manage and share their data. Now, and in line with words from Michael Ball from the BBSRC, it is up to institutions and researchers themselves to find the best discipline-specific solutions for data management and sharing.
If institutions (and researchers) do not come together to find solutions that address discipline-specific data management needs, the funders will again need to take the lead and perhaps develop detailed guidelines for particular situations. But is this something which we really need? Would it not be better to have guidelines and policies developed by the community and endorsed by the community?
Second, Torsten called upon the community to act together to develop joint minimal metadata standards to be adopted by all data repositories. There are numerous repositories all over the world which can be used by researchers to deposit and share their research data. The challenge is that there are no common metadata standards used by all these repositories.
This leads to problems. For example – how can institutions know about research data created by their researchers if there is no institutional affiliation associated with the submission of their data? How can funders know about outcomes resulting from their funding if researchers do not indicate who funded their research when submitting data to the repository? Torsten suggested that if we jointly decide what the minimal metadata standards are, we would jointly have the chance to get these standards implemented. As someone who manages the deposit of data sets, I personally could not agree more with this suggestion.
Shared RDM services
The biggest discussion during the meeting fitted extremely well with the topic of joint ventures – it was about the development of shared RDM services. John Kaye from Jisc spoke about the plans to develop shared RDM infrastructure and called for six to eight institutions to take part in the pilot.
There are numerous benefits to developing joint services. At the moment most institutions have their own data repositories, meaning that across the UK there are hundreds of repositories and even more repository managers and repository developers. Every institutional repository needs to be integrated with other institutional systems, which requires an even larger skilled technical workforce at every institution. In addition, who is providing the data storage capabilities? On what conditions? And who is doing all the negotiations with service providers?
These are all resource-hungry processes at an institutional level. Shared services could be more cost-efficient and result in taxpayers’ money being better spent.
But this idea does open up many questions:
- Given that most institutions have already invested substantial resources to create their own local solutions, is it too late to develop shared RDM services?
- Would institutions need to abandon their existing processes and contribute to the shared development?
- What would happen to the research data already stored locally?
- How sustainable are the shared solutions? Would funders support them?
- What is the business model behind shared solutions?
- And what would happen if the pilot failed?
There is a further problem in that even if the pilot succeeds, the solutions will not be available until 2017. This means piloting institutions will have to co-develop the shared solutions (investing time and resources), while continuing to support their own local solutions until the shared ones become available.
Deciding whether or not to join the pilot project is difficult. Cambridge debated this for a while, but in the end we decided that the long-term benefits and efficiency of the joint approach should substantially outweigh the short-term increase in the resources needed both to maintain the local solutions and to develop the joint services. As Torsten suggested, at Cambridge we believe that acting together, collaboratively, is the way forward. Lonely silos are inefficient, and at a time when funding and other resources are limited, we need to ensure that we invest them wisely, thinking of long-term benefits.
Suggestion for future RDM fora
Summarising, I would like to thank the Digital Curation Centre for bringing the whole community of research data managers together. RDM fora are always an excellent opportunity to exchange practice and views, and to share suggestions with colleagues at other institutions.
It was extremely useful that during RDMF#14 all presenters introduced their institutions – their size, the type of research done, the size of the RDM support team. What this made us realise is that, irrespective of these differences, we all share similar high-level needs and we all need similar high-level actions.
So my suggestion for future fora is to better leverage the fact that the whole community is gathered in one place and to focus more on actions. If we are to jointly decide what our needs are, or what we think the minimal metadata standards should be, why don’t we do it at the meeting while we are convened together? Perhaps we could actually produce some deliverables during breakout sessions?