
Data Diversity Podcast (#4) – Dr Stefania Merlo (1/2) 

Welcome back to the fourth instalment of Data Diversity, the podcast where we speak to Cambridge University Data Champions about their relationship with research data and highlight their unique data experiences and idiosyncrasies in their journeys as a researcher. In this edition, we speak to Data Champion Dr Stefania Merlo from the McDonald Institute of Archaeological Research, the Remote Sensing Digital Data Coordinator and project manager of the Mapping Africa’s Endangered Archaeological Sites and Monuments (MAEASaM) project and coordinator of the Metsemegologolo project. This is the first of a two-part series and in this first post, Stefania shares with us her experiences of working with research data and outputs that are part of heritage collections, and how her thoughts about research data and the role of the academic researcher have changed throughout her projects. She also shares her thoughts about what funders can do to ensure that research participants, and the data that they provide to researchers, can speak for themselves.   



I’ve been thinking for a while about the etymology of the word data. Datum in Latin means ‘given’. Whereas when we are collecting data, we always say we’re “taking measurements”. Upon reflection, it has made me come to a realisation that we should approach data more as something that is given to us and we hold responsibility for, and something that is not ours, both in terms of ownership, but also because data can speak for itself and tell a story without our intervention – Dr Stefania Merlo


Data stories (whose story is it, anyway?) 

LO: How do you use data to tell the story that you want to tell? To put it another way, as an archaeologist, what is the story you want to tell and how do you use data to tell that story?

SM: I am currently working on two quite different projects. One is Mapping Africa’s Endangered Archaeological Sites and Monuments (funded by Arcadia), which aims to create an Open Access database of information on endangered archaeological sites and monuments in Africa. In the project, we define “endangered” very broadly because ultimately, all sites are endangered. We’re doing this with a number of collaborators and the objective is to create a database that is mainly going to be used by national authorities for heritage management. There’s a little bit less storytelling there, but it has more to do with intellectual property: who are the custodians of the sites and the custodians of the data? A lot of questions are asked about Open Access, which is something that the funders of the projects have requested, but something that our stakeholders have got a lot of issues with. The issues surround where the digital data will be stored because currently, it is stored in Cambridge temporarily. Ideally all our stakeholders would like to see it stored in a server in the African continent at the least, if not actually in their own country. There are a lot of questions around this.

The other project stems out of the work I’ve been doing in Southern Africa for almost the past 20 years, and is about asking how do you articulate knowledge of the African past that is not represented in history textbooks? This is a history that is rarely taught at university and is rarely discussed. How do you avail knowledge to publics that are not academic publics? That’s where the idea of creating a multimedia archive and a platform where digital representations of archaeological, archival, historical, and ethnographic data could be used to put together stories that are not the mainstream stories. It is a work in progress. The datasets that we deal with are very diverse because it is required to tell a history in a place and in periods for which we don’t have written sources.  

It’s so mesmerizing and so different from what we do in contexts where history is written. It gives us the opportunity to put together so many diverse types of sources. From oral histories to missionary accounts with all the issues around colonial reports and representations of others as they were perceived at the time, putting together information on the past environment combining archaeological data. We have a collective of colleagues that work in universities and museums. Each performs different bits and pieces of research, and we are trying to see how we would put together these types of data sets. How much do we curate them to avail them to other audiences? We’ve used the concept of data curation very heavily, and we use it purposefully because there is an impression of the objectivity of data, and we know, especially as social scientists, that this just doesn’t exist. 

I’ve been thinking for a while about the etymology of the word data. Datum in Latin means ‘given’. Whereas when we are collecting data, we always say we’re taking measurements. Upon reflection, it has made me come to a realisation that we should approach data more as something that is given to us and we hold responsibility for, and something that is not ours, both in terms of ownership, but also because data can speak for itself and tell a story without our intervention. That’s the kind of thinking surrounding data that we’ve been going through with the project. If data are given, our work is an act of restitution, and we should also acknowledge that we are curating it. We are picking and choosing what we’re putting together and in which format and framework. We are intervening a lot in the way these different records are represented so that they can be used by others to tell stories that are perhaps of more relevance to us. 

So there’s a lot of work in this project that we’re doing about representation. We are explaining – not justifying but explaining – the choices that we have made in putting together information that we think could be useful to re-create histories and tell stories. The project will benefit us because we are telling our own stories using digital storytelling, and in particular story mapping, but it could become useful for others as resources that can be used to tell their own stories. It’s still a work in progress because we also work in low-resourced environments. The way in which people can access digital repositories and then use online resources is very different in Botswana and in South Africa, which are the two countries where I mainly work in this project. We also dedicate time to thinking about how useful the digital platform will be for the audiences that we would like to engage.

The intended output is an archive that can be used in a digital storytelling platform. We have tried to narrow down our target audience to secondary school and early university students of history (and archaeology). We hope that the platform will eventually be used more widely, but we realised that we had to identify an audience to be able to prepare the materials. We have also realised that we need to give guidance on how to use such a platform so in the past year, we have worked with museums and learnt from museum education departments about using the museum as a space for teaching and learning, where some of these materials could become useful. Teachers and museum practitioners don’t have a lot of time to create their own teaching and learning materials, so we’re trying to create a way of engaging with practitioners and teachers in a way that doesn’t overburden them. For these reasons, there is more intervention that needs to come from our side into pre-packaging some of these curations, but we’re trying to do it in collaboration with them so that it’s not something that is solely produced by us academics. We want this to be something that is negotiated. As archaeologists and historians, we have an expertise on a particular part of African history that the communities that live in that space may not know about and cannot know because they were never told. They may have learned about the history of these spaces from their families and their communities, but they have learned only certain parts of the history of that land, whereas we can go much deeper into the past. So, the question becomes, how do you fill the gaps of knowledge, without imposing your own worldview? It needs to be negotiated but it’s a very difficult process to establish. There is a lot of trial and error, and we still don’t have an answer. 

Negotiating communities and funders 

LO: Have you ever had to navigate funders’ policies and stakeholder demands?  

SM: These kinds of projects need to be long and they need continuous funding, but they have outputs that are not always necessarily valued by funding bodies. This brings to the fore what funding bodies are interested in – is it solely data production, as it is called, and then the writing up of certain academic content? Or can we start to acknowledge that there are other ways of creating and sharing knowledge? As we know, there has been a drive, especially with UK funding bodies, to acknowledge that there are different ways in which information and knowledge is produced and shared. There are alternative ways of knowledge production from artistic ones to creative ones and everything in between, but it’s still so difficult to account for the types of knowledge production that these projects may have. When I’m reporting on projects, I still find it cumbersome and difficult to represent these types of knowledge production. There’s so much more that you need to do to justify the output of alternative knowledge compared to traditional outputs. I think there needs to be change so that justifying alternative forms of knowledge becomes easier for researchers, rather than more difficult than justifying mainstream outputs.

One thing I would say is there’s a lot that we’ve learned with the (Mapping Africa’s Endangered Archaeological Sites and Monuments) project because there we engage directly with the custodians of the site and of the analog data. When they realise that the funders of the project expect to have this data openly accessible, then the questions come and the pushback comes, and it’s a pushback on a variety of different levels. The consequence is that basically we still haven’t been able to finalise our agreements with the custodians of the data. They trust us, so they have informed us that in the interim we can have the data as a project, but we haven’t been able to come to an agreement on what is going to happen to the data at the end of the project. In fact, the agreement at the moment is the data are not going to be going on a completely Open Access sphere. The negotiation now is about what they would be willing to make public, and what advantages they would have as a custodian of the data to make part, or all, of these data public.

This has created a disjuncture between what the funders thought they were doing. I’m sure they thought they were doing good by mandating that the data needs to be Open Access, but perhaps they didn’t consider that in other parts of the world, Open Access may not be desirable, or wanted, or acceptable, for a variety of very valid reasons. It’s a node that we still haven’t resolved and it makes me wonder: when funders are asking for Open Access, have they really thought about work outside of UK contexts with communities outside of the UK context? Have they considered these communities’ rights to data and their right to say, “we don’t want our data to be shared”? There’s a lot of work that has happened in North America in particular, because indigenous communities are the ones that put forward the concept of C.A.R.E., but in the UK we are still very much discussing F.A.I.R. and not C.A.R.E. I think the funders may have started thinking about it, but we’re not quite there. There is still this impression that Open Data and Open Access is a universal good without having considered that this may not be the case. It puts researchers that don’t work in the UK or the Global North in an awkward position. This is definitely something that we are still grappling with very heavily. My hope is that this work is going to help highlight that when it comes to Open Access, there are no universals. We should revisit these policies in light of the fact that we are interacting with communities globally, not only those in some countries of the world. Who is Open Access for? Who does it benefit? Who wants it and who doesn’t want it, and for what reasons? These are questions that we need to keep asking ourselves.

LO: Have you been in a position where you had to push back on funders or Open Access requirements before? 

SM: Not necessarily a pushback, but our funders have funded a number of similar projects in South Asia, in Mongolia, in Nepal and the MENA region and we have come together as a collective to discuss issues around the ethics and the sustainability of the projects. We have engaged with representatives of our funders trying to explain that what they wanted initially, which is full Open Access, may not be practicable. In fact, there has already been a change in the terminology that is used by the funders. From Open Access, they changed the concept to Public Access, and they have come back to us to say that they can change their contractual terms to be more nuanced and acknowledge the fact that we are in negotiation with national stakeholders and other stakeholders about what should happen to the data. Some of this has been articulated in various meetings, but some of it was trial and error on our side. In other words, with our new proposal for renewal of funding, which was approved, we just included these nuances in the proposal and in our commitment and they were accepted. So in the course of the past four years, through lobbying of the funded projects, we have been able to bring nuance to the way in which the funders themselves think about Open Access.


Stay tuned for part two of this conversation where Stefania will share some of the challenges of managing research data that are located in different countries!


Thoughts on the new White House OSTP open access memo

Dr. Samuel A. Moore, Scholarly Communication Specialist, Cambridge University Libraries

In the USA last Thursday, the White House Office of Science and Technology Policy announced its decision to mandate public access to all federally funded research articles and data. From 2026, the permitted embargo period of one year for funded publications will be removed and all publications arising from federal funding will have to be immediately accessible through a repository. Although more details are to be announced, my colleague Niamh Tumelty, the OSC’s Head of Open Research Services, shared a helpful summary of the policy and some initial reaction here. I want to offer my own personal assessment of what the new policy might mean from the perspective of open access to research articles, something we are working hard to promote and support throughout the university.

To be sure, the new OSTP memo is big news: the US produces a huge amount of research that will now be made immediately available without payment to the world at large. Following in the footsteps of Plan S in Europe, the open access policy landscape is rapidly evolving away from embargo periods and towards immediate access to research across all disciplines. Publishing industry consultants Clarke & Esposito have even argued that this intervention will make the subscription journal all the more unviable, eventually leading to its demise.

Indeed, responses from the publishing industry have been mixed. The STM Association, for example, offer a muted one-paragraph response claiming tepid support for the memo, while organisations such as the AAP were more vocally against what they see as a lack of ‘formal, meaningful consultation or public input’ on the memo, despite the fact that many more details are still to be announced (presumably, following consultation). A similar sense of frustration was displayed by some of the authors of the industry-supported Scholarly Kitchen blog. It’s fair to say that the publishing industry itself – at least the part of it that makes money from journal subscriptions – has not welcomed the new memo with open arms.

Understandably, funders and advocacy organisations have welcomed the news. Johan Rooryck from Coalition S called the memo a ‘game changer for scholarly publishing’, while the Open Research Funders Group ‘applauds bold OSTP action’ in its response. Open access advocates SPARC described the memo as a ‘historic win’ for open access and a ‘giant step towards realizing our collective goal of ensuring that sharing knowledge is a human right – for everyone’. Certainly, for those arguing in favour of greater public access to research, the memo will indeed result in just this. But I still have my reservations.

My PhD thesis analysed and assessed the creation and implementation of open access policy in the UK. As Cambridge researchers no doubt know, the open access policy landscape is composed of a number of mandates, with varying degrees of complexity, and affects the vast majority of UK researchers in one way or another. This is for better and for worse: there is an increase in bureaucracy associated with open access policy (particularly through repositories), even though it results in greater access to research. However, when you remove this bureaucracy through more seamless approaches to OA like transformative agreements, there is a risk of consolidating the power of large commercial publishers who dominate this space and make obscene profits (a fear also shared by Jeff Pooley in his write-up of the policy). There is therefore a delicate balance to be struck between simply throwing money at market-based solutions and requiring researchers and librarians to take on more of the burden of compliance.

The problem with indiscriminate policy mandates for public access to research, such as the OSTP’s memo, is that they shore up the idea that publishing has to be provided by a private industry that is not especially accountable to research communities or the university more broadly. This is precisely because these policies are indiscriminate and therefore apply to everyone equally, which for academic publishing means benefitting those already in a good position to profit. Larger commercial publishers have worked out better than anyone else how to monetise open access through a range of different business models. As long as researchers need to continue publishing with the bigger publishers, which they do for career reasons, these publishers will always be in a better position to benefit from open access policies. It is hard to imagine how the individual funding bodies could implement the OSTP memo in a way that does foreground a more bibliodiverse publishing system at the expense of commercialism (not least because this goal does not appear to be the target of the memo).  

I do not mean to overplay the pessimism here: it is great that we are heading for a world of much more open access research. The point now is to couple this policy with funding and support to continue building the capacity of an ethical and accountable publishing ecosystem, all while trying to embed these ethical alternatives within the mainstream. This kind of culture change cannot be achieved by mandates like the OSTP is proposing, but it can be achieved by the harder work of raising awareness of alternatives and highlighting the downsides of current approaches to publishing. It is also important to reveal the ways in which research cultures shape how researchers decide to publish their work – often at the expense of experimentation and openness – and how they can be changed for the better.

So I am interested to see how the memo is implemented in practice, especially how it is funded and the conditions set on immediate access to research. I am also keen to see what role, if any, rights retention plays in the implementation and how US libraries decide to support the policy and the changing environment more broadly. Ultimately, however, the move to a more scholar-led and scholar-governed ecosystem will not occur on an open/closed binary, nor on a top-down/bottom-up one, and so we must find a range of ways to support new cultures of knowledge production and dissemination in the university and beyond.

Image taken from Public Domain Pictures

US requirements for public access to research

Niamh Tumelty, Head of Open Research Services, Cambridge University Libraries

Yesterday it was announced that the White House Office of Science and Technology Policy has updated US policy guidance to make the results of taxpayer-supported research immediately available to the American public at no cost:
https://www.whitehouse.gov/ostp/news-updates/2022/08/25/ostp-issues-guidance-to-make-federally-funded-research-freely-available-without-delay/

Federal agencies have been asked to update their public access policies to make publications and supporting data publicly accessible without an embargo. This applies to all federal agencies (the previous policy only applied to those with more than $100 million in annual research and development expenditure) and allows for flexibility for the agencies to decide on some of the details while encouraging alignment of approaches. It applies to all peer-reviewed research articles in journals and includes the potential to also include peer-reviewed book chapters, editorials and peer-reviewed conference proceedings.

The emphasis on “measures to reduce inequities of, and access to, federally funded research and data” is particularly important in light of the serious risk that we will just move from a broken system with built-in inequities around access to information to a new broken system with built-in inequities around whose voices can be heard. Active engagement will be needed to ensure that the agencies take these issues into account and are not contributing to these inequities.

While there will be a time lag in terms of development/updating and implementation of agency policies and we don’t yet have the fine print around licences etc, this will bring requirements for US researchers more closely in line with what many of our researchers already need to do as a result of e.g. UKRI and Wellcome Trust policies. Closer alignment should help address some of the collaborator issues that have arisen following the recent cOAlition S policy updates – though of course a lot will depend on the detail of what each agency puts in place. Researchers availing of US federal funding need to engage now if they would like to influence the approach taken by those who fund their work.

There continues to be a very real question around sustainable business models both from publisher and institutional perspectives, alongside the other big questions around whether the current approaches to scholarly publishing are serving the needs of researchers adequately. It is essential that this doesn’t just become an additional cost for researchers or institutions as many of those who have commented in the past 24 hours fear. Many alternatives to the APC and transitional agreement/big deal approaches have been proposed, from diamond approaches through to completely reimagined approaches to publishing (e.g. Octopus).

There will be mixed feelings about this. While there is likely to be little sympathy for the publishers with the widest profit margins, this move is sure to push more of the smaller publishers, including many (but not all!) learned societies, to think differently. We need to ensure that we understand what researchers most value about these publishers and how to preserve those aspects in whatever comes in future – I am reminded of the thought-provoking comments from our recent working group on open research in the humanities on this topic.

These are big conversations that were already underway and will now take on greater urgency. The greatest challenge of all remains how to change the research culture such that researchers can have confidence in sharing their work and expertise in ways that maximise access to their work while also aligning with their (differing!) values and priorities.

10 years on and where are we at? COASP 2018

Last week, the 10th Conference on Open Access Scholarly Publishing (COASP), run by the Open Access Scholarly Publishers Association (OASPA), was held in Vienna. Much was covered over the two and a half days. A decade in, this conference considered the state of the open access (OA) movement, discussed different approaches to OA, considered inequity and the infrastructure required to meet this need and argued about language. Apologies – this is a long blog.

Fracturing of the ‘OA movement’?

In an early discussion, Paul Peters, OASPA President and CEO of Hindawi, noted that, like movements such as organic food or veganism, the OA ‘movement’ is not united in purpose. When what appear to be ‘fringe’ groups begin, it is easy to assume that all involved take a similar perspective. But the reasons for people’s involvement and the end point they are aiming for can be vastly different. Paul noted that this can be an issue for OASPA because there is not necessarily one goal for all the members. He posed the question about what this might mean for the organisation.

It also raises questions about approaches to ‘solving’ OA issues. Many different approaches were discussed at the event.

Unbundling

The concept of ‘unbundling’ the costs associated with publishing and offering these to people to engage with on an as needs basis was raised several times. This points to the concept put forward last year by Toby Green of the OECD. It also triggered a Twitter conversation about the analogy of the airline industry (and how poorly they treat their customers).

If the scholarly journal were unbundled, different players could deliver the functions. Kathleen Shearer, Executive Director of COAR, noted that not all functions of scholarly publishing need to take place on the same platform. She suggested next generation repositories as one of the options.

Jean-Claude Guedon provided several memorable quotes from the event, with the most pertinent being “We don’t need a ‘Version of Record’. We need a ‘record of versions’”. Kristen Ratan, of Coko Foundation, agreed in her talk on infrastructure, stating “we publish like it’s 1999”. The Version of Record is the one that matters and it is static in time. But it is not 1999, she noted, and we need to consider the full body of work in its entirety.

After all, it was observed elsewhere at the conference, nothing radical has changed in the format of publications over the past 25 years. We are simply not using the potential the internet offers. Kathleen quoted Einstein stating “You cannot solve a problem from the same consciousness that created it. You must learn to see the world anew”.

New subscribing models

Wilma van Wezenbeek, from TU Delft and Programme Manager, Open Access, VSNU discussed the approach to negotiations taken in The Netherlands. They are arguing that, when comparing how much is spent per article under the toll system with what it would cost to have everything published OA, enough money exists in the system. VSNU are being pragmatic, focusing on big publishers and going for gold OA (to avoid the duplication of journals). She also noted how important it is for libraries to have presidents of the University at the negotiation table. Her parting advice on negotiations was to hold your nerve, stay true to the principles and don’t waver.

This approach does not include smaller publishers and completely ignores fully gold publishers, an observation that was made a few times in the conference. An alternative approach, argued Kamran Naim, Director of Partnerships & Initiatives at Annual Reviews, was collective action. In his talk ‘Transitioning Subscriptions to OA Funding: How libraries can Subscribe to Open’ he asked what is required to flip the subscription cost to manage OA publication (instead of APCs). The challenge with this idea is it requires people to continue subscribing even when material is OA and they don’t have to. Another problem is the idea of ‘subscribing’ to OA material can become a procurement challenge. This cost can be classified as a ‘donation’ which is not allowed by some library budgets. So the suggestion is that subscribing libraries will be offered to subscribe to select journals and receive 5% off the subscription cost. The plan is to roll out the project to libraries in 2019 for 2020 models.

Study – downloading habits when material is OA

A very interesting study was presented by Salvatore Mele and Alexander Kohls from CERN and SCOAP3. Entitled ‘Preprints vs traditional journals vs Open Access journals – What do scientists download?’ the study compared downloads of the same scientific artefact as a preprint on arXiv and as a published article on a (flipped) journal platform.

Their findings, which came from arXiv, Elsevier and SpringerNature’s statistics, showed that there is significant use of the arXiv version during the first six months (when the only available version of the work is the one in arXiv), which drops off dramatically after the work is published (a point identified as when the DOI is minted).

They also compared downloads from 2013, before the journals flipped to gold under the SCOAP3 arrangement, with those from 2016, when the journals were open access. The pattern over time was similar, but accesses in 2016 were higher overall, and dramatically higher in the first three months after the DOI was minted.

 

The final slide demonstrated that having recent open access content was also driving up downloads of older works in the non-open access backfiles from the publisher platforms.

This work is not published “because we have day jobs”. I have included my poor images of the slides in this blog and will link to the slides when they are made available.

Nostalgia

As this was the 10th OASPA conference, there was some reminiscing throughout the presentations. In a keynote reflection on the Open Access movement, Rebecca Kennison from K|N Consultants noted several myths about OA publishing that existed 10 years ago and still persist. Rebecca discussed “library wishful thinking” when it came to OA. This has included thinking that OA would solve the serials crisis, that practice would change ‘if only the academic community were aware’, and that institutional repositories and mandates would solve OA. (Certainly one of my own observations over the 16 or so years I have been involved in OA is there is always a palpable sense of glee at OA events when ‘real’ researchers bother to turn up.)

David Prosser, Executive Director of Research Libraries UK, was outed as the architect of the ‘hybrid’ option, which he articulated in his 2003 paper “From here to there: a proposed mechanism for transforming journals from closed to open access”. David defended himself by noting that the whole concept did not work because it was proposed with an assumption about the “sincerity of the industry to engage”.

This made me consider the presentation I gave to another 10th anniversary conference this year – Repository Fringe at Edinburgh. In 1990 Stevan Harnad wrote about ‘Scholarly Skywriting’ and described the obstacles to the ‘revolution’ as including ‘old ways of thinking about scientific communication and publication’, ‘the current intellectual level of discussion on electronic networks is anything but inspiring’, ‘plagiarism’, ‘copyright’ and ‘academic credit and advancement’ amongst others. Little appears to have changed in the past 28 years.

The more perceptive readers will note how long ago these dates are. This OA palaver has been going on for decades. And it seems even longer because, as Guido Blechl from the University of Vienna noted, “open access time is shorter than normal time because it moves so fast”.

But none of this wishful thinking has come to fruition. Rebecca asked “what shift do we need in our thinking?” Well in many ways that shift has landed in the form of Plan S. See the related blog for the discussions about Plan S that happened at the conference.

Language matters

Rebecca also mentioned “our own special language”, which is, she observed, a barrier to entry to the discussion. Indeed language issues came up often during the few days of the conference.

There were a few references to the problems with the terms ‘green’ and ‘gold’, and specifically gold. This has long been a personal bugbear of mine because of the nonsensical nature of the labels, and the associations of ‘the best’ and ‘expensive’ with gold. There has been a co-opting of the term ‘gold’ by the commercial publishing sector to mean ‘pay to publish’. Of course all *hybrid* journals charge an APC, and more articles are published where an APC has been paid than not, which is possibly why the campaign has been successful – see the Twitter discussion here. But it is inaccurate. In truth, ‘gold’ means the work is open access from the point of publication. More fully gold open access journals do not charge an APC than do.

There was also concern raised about the term ‘Open Science’ which, while understood in Europe as an inclusive term covering all types of research, is not perceived this way in other parts of the world. There was strong support amongst the group for using the term ‘Open Scholarship’ as an alternative. This also brought up a discussion about using the term ‘publication’ rather than the more inclusive research ‘outputs’ or ‘works’, which encompass publishing options beyond the concept of a book or a journal.

Inequity

“Inclusivity is not optional! We need a global (information/publishing) system!” was the rallying cry of Kathleen Shearer in her talk.

For many in the OA space, equity of access to research outputs lies at the centre of what the end goal is. It is clear that knowledge published by academic journals is inaccessible to the majority of researchers in low- and middle-income countries. But if we move to a fully gold environment, with the potential to increase the cost of author participation in the publishing environment, then we might have simply reversed the problem. Instead of not being able to read research, academics in the Global South will be excluded from participating in the academic discussion.

There was a discussion about the change in global publishing output since 2007, which reflects a big increase in output from China and Brazil, but otherwise shows that output is uneven and not inclusive.

One possible solution to this issue would be for open access publishers to make it clearer to authors that they offer waivers for authors who are unable to pay the APC. There was discussion about the question ‘what form should OA publishing take in Eastern and Southern Europe?’. The answer was that it should be inexpensive and use infrastructure that is publicly owned and cannot be sold.

Infrastructure

Ahhhh infrastructure. We are working within a fast consolidating environment. Elsevier continues to buy up companies to ensure it has representation across all aspects of the scholarly ecosystem and Digital Science is developing and acquiring new services to a similar end. See ‘Virtual Suites of tools/platforms supported by the same funder’ and ‘Vertical integration resulting from Elsevier’s acquisitions’. These are obvious examples but Clarivate Analytics has recently acquired Publons and ProQuest has absorbed Ex Libris which has in turn bought Research Research and has plans to create Esploro – a cloud-based research services platform, so this is prevalent across the sector.

This raises some serious concerns for the concept of ‘openness’. In his excellent round up, Geoff Bilder, Director of Strategic Initiatives at Crossref, commented that we are looking in the rear view mirror at things that have already happened and we are not noticing what is in front of us. While we might end up in a situation where publications are open access, these are not representative of the discussions that occurred to allow the authors to come to those conclusions. The REAL communication happens in coffee shops and online discussions. If these conversations are using proprietary systems (such as Slack, for example), then these conversations are hidden from us.

Who owns the information about what is being researched and the data behind it when the scholarly infrastructure is held within a commercial ecosystem? “Is there an opportunity to reimagine?” asked Kristen Ratan, referencing SPARC’s action plan on ‘Securing community controlled infrastructure’. “In scholarly communication,” she summarised, “we have accepted the limitations of the infrastructure with a learned helplessness. It’s time that these days are over.”

There are multiple projects currently in place around the world to collectively manage and support infrastructure. Kathleen Shearer described several projects:

  • Consortia negotiations such as OA2020 and SCOAP3
  • The Global Sustainability Coalition for Open Science Services (SCOSS) is an international group of leading academic and advocacy organisations that came together in 2016 to help secure the vital infrastructure underpinning Open Access and Open Science. SPARC Europe is a founding member.
  • The 2.5% commitment is a call that “Every academic library should commit to contribute 2.5% of its total budget to support the common infrastructure needed to create the open scholarly commons”. This is primarily a US and Canadian discussion.
  • OA membership models
  • APC funds

There are actually a couple of other projects not mentioned at COASP 2018. In 2017, several major funding organisations met and came to a strong consensus that core data resources for life sciences should be supported through a coordinated international effort to both ensure long term sustainability and appropriately align funding with scientific impact. The ELIXIR Core Data Resources project is identifying resources defined as a set of European data resources that are of fundamental importance to the wider life-science community and the long-term preservation of biological data.

OA Monographs

The final day of the event looked at OA monographs. Having come from a British Academy event on OA monographs the week before (see the Twitter discussion), this debate is fairly top of mind at the moment for me.

Sven Fund, who is Managing Director of both Knowledge Unlatched and fullstopp, which is running a consultation on OA monographs for Universities UK, spoke about the OA monograph market. He noted that books are important, and not just because “people like to decorate their living rooms with them”. But he suggested that rather than just adding a few hyperlinks, we should be using the technology available to us with books. It has been the smaller publishers who have been innovating; large publishers have not been involved, which has limited the level of interest.

The OA book market is still small, with only 12,794 books and chapters listed in the Directory of OA Books (DOAB) compared to over 3 million articles listed in the Directory of OA Journals (DOAJ). But growth in OA books is still strong even though the OA journal market matures. Libraries are the bottleneck, Sven argued, because they need to change the funding model significantly. There have been 10-15 years of discussion and now is the time to act. Libraries need to make a significant commitment that X% goes into open access.

There are also problems with demonstrating proof of impact of the OA book. Sven argued we need transparency and simplicity in the market, and said that no-one is doing an analysis of which books should be OA or not based on impact and usage. This needs to happen.

Sven said that royalties are important to authors – not because of the money but because they show how much the work has been read. For this reason he argued we need publishers to share their usage data for specific OA titles with the author. As an aside, it seems extraordinary that publishers are not already doing this, and I asked Sven why they don’t. He replied that it seems that ‘data is the new gold’ and therefore they do not share the information. Download information about open books is often protected because of the risk of providing information that gives their competitors a commercial advantage.

But Sven also noted there needs to be context in the numbers. Libraries in the past have done a simple analysis of cost per download without taking into consideration the differences in topics. Of course some areas cost more per download than others, he said. There is also the risk that if you share this data then you might have a situation where a £10,000 book only has a few downloads which ‘looks bad’.

The profit imperative

There were some tensions at the meeting about profits. A question that arose early in the first panel discussion was: “Should we be ashamed as commercial publishers for making money?”. One response was that if you don’t make money you are not a commercial publisher. But the same person noted that the ‘anti commercial sentiment’ in these discussions indicates that something is wrong.

A secondary observation was that open access publishers are doing a good job “while the current incentive systems are in place”. This of course points to the academic reward system controlling the behaviour of all players in this game.

As is always the case at open meetings, the Journal Impact Factor was never far away, although Paul Peters noted that the JIF was partly responsible for the success of OA journals: PLOS ONE took off when it received an impact factor. It was noted in that discussion that OA journals obtaining and increasing their JIF is ‘not proof of success, it’s proof of adaptation’.

The final talk was from Geoff Bilder. One participant described his talk on Twitter as “the best part of the publishing conference, where Geoff Bilder tells us everything that we’re doing that’s wrong”. Geoff noted that throughout the conference people had used some terms fairly loosely, including ‘commercial’ and ‘for profit’. He noted that profit doesn’t necessarily mean taking money out of the system; often profit is channelled back into the business.

In the end

In all it was an interesting and thought-provoking conference. Possibly the most memorable part was the 12 flights of stairs between the lecture rooms and the breakout coffee and lunch space. This has been the first OA conference I have attended where participants improved their cardiovascular fitness as a side bonus to the event.

The Twitter hashtag is #COASP10

Published 24 September 2018
Written by Dr Danny Kingsley
Creative Commons License

Most Plan S principles are not contentious

This is a sister blog to “Relax everyone, Plan S is just the beginning of the discussion” and provides the ‘supplementary material’ to that blog. It discusses the points in the Plan S principles that are not particularly contentious.

At the end of this blog is a list of links and commentary to date on Plan S.

Not much new here

The Funders will ensure the establishment of robust criteria and requirements for the services that compliant high quality Open Access platforms and journals must provide.

This is perfectly reasonable. The amount of money being invested is huge and quite rightly, the funders want to articulate what they are prepared to pay for. It is also helpful from an institutional perspective to have guidelines that clearly identify which journals are compliant and which are not.

Indeed, there is a precedent. In 2017 the Wellcome Trust introduced a publisher requirement list stating that compliant publishers needed to deposit to PubMed Central Europe, apply the correct licence and provide invoices that contained complete and understandable information. They asked publishers to sign up to these principles to be listed on their ‘white list’.

Where applicable, Open Access publication fees are covered by Funding Agencies or universities…

This point reflects the status quo in the UK at least. Universities across the UK are currently managing open access payments through various funding models. In some instances, such as Cambridge, payments are only made from funds provided by funding bodies with no extra funds provided by the institution. Other institutions such as UCL provide central university funds in addition to those provided by funders. There are a small number of institutions which do not receive any funds from funders but do provide central funds for specific publications.

Of course, if journals were to flip to fully open access then funds currently being used to pay for subscriptions could be freed up to divert to expenditure on APCs for fully gold publications.

Funders will ask universities and libraries to align their policies and strategies, notably to ensure transparency.

While this might be a little tricky simply because of the individual governance arrangements at each institution, it is a sensible thing to aim for.

The above principles shall apply to all research outputs, but it is understood that the timeline to achieve Open Access for monographs and books may be longer than 1st January 2020.

Open Access monographs ARE contentious, don’t get me wrong. But in the context of this statement of principle, there is a concession that there is some work to be done in this space. And we already knew that UKRI intends to include monographs in the post-REF2021 exercise (as in, anything published from 1 January 2021). Wellcome Trust have had OA monographs in their policy for years.

The importance of open archives and repositories for hosting research outputs is acknowledged because of their long-term archiving function and their potential for editorial innovation.

Now I know this is contentious for us Open Access nerds because there is a sense that repositories are once again being pushed into the shadows, which is what happened with the Finch report. But as noted in the main blog, under Plan S, deposit of an Author’s Accepted Manuscript into a repository is compliant if it is there under a CC-BY licence and with a zero embargo.

Some issues are operational

In a few instances, the queries or concerns raised about Plan S are actually operational ones.

When APCs are applied, their funding is standardised and capped (across Europe)

Currently the RCUK (now UKRI) does cap funding to universities, using a complex algorithm to determine allocations in a given year to support the institutions meeting the open access policy. This has led some institutions (including Cambridge) to identify a preference for publishers exhibiting actions towards an open access future.

Manchester University has introduced new criteria for payment of APCs. They support “Publishers who are taking a sustainable and affordable approach to the transition to OA, e.g. by reducing the cost of publishing Gold OA in hybrid (subscription) journals via offsetting deals or membership schemes are listed below:…” They include a list of journals for which APCs will not be paid.

The alternative interpretation of this statement will be that individual APCs will be capped. This would have implications for all administrators of APCs. It would have particular implications for Cambridge University because of the relatively high proportion of papers published in expensive open access journals such as Nature Communications. The University would both have to find funds to supplement the cost, and also provide the administrative support for this process. This is where discussions need to happen about redirecting subscription budgets towards open access activities. While Plan S adds some urgency, there is time to have these.

The Funders will monitor compliance and sanction non-compliance.

This is the statement that has some administrative staff highly concerned. In the end it will fall upon them to ensure their research community is up to speed and doing the required activities. But we have had sanctions for non-compliance with Wellcome Trust policies since 2014, so this in itself is not new.

Relevant documents from Science Europe

Commentary, news stories & press releases

There has been considerable discussion about Plan S – here are just a few links that might be interesting. NOTE this list has been moved and is now being maintained on a separate blog: ‘Plan S – links, commentary and news items’.

Published 12 September 2018
Written by Dr Danny Kingsley
Creative Commons License

What I wish I’d known at the start – setting up an RDM service

In August, Dr Marta Teperek began her new role at Delft University in the Netherlands. In her usual style of doing things properly and thoroughly, she has contributed this blog reflecting on the lessons learned in the process of setting up Cambridge University’s highly successful Research Data Facility.

On 27-28 June 2017 I attended Jisc’s Research Data Network meeting at the University of York. I was one of several people invited to talk about experiences of setting up RDM services in a workshop organised by Stephen Grace from London South Bank University and Sarah Jones from the Digital Curation Centre. The purpose of the workshop was to share lessons learned and help those that were just starting to set up research data services within their institutions. Each of the presenters prepared three slides: 1. What went well, 2. What didn’t go so well, 3. What they would do differently. All slides from the session are now publicly available.

For me the session was extremely useful not only because of the exchange of practices and learning opportunity, but also because the whole exercise prompted me to critically reflect on Cambridge Research Data Management (RDM) services. This blog post is a recollection of my thoughts on what went well, what didn’t go so well and what could have been done differently, as inspired by the original workshop’s questions.

What went well

RDM services at Cambridge started in January 2015 – quite late compared to other UK institutions. The late start meant, however, that we were able to learn from others and to avoid some common mistakes when developing our RDM support. Jisc’s Research Data Management mailing list was particularly helpful, as it is a place used by professionals working with research data to look for help, ask questions, and share reflections and advice. In addition, the Research Data Management Fora organised by the Digital Curation Centre proved to be an excellent vehicle not only for knowledge and good practice exchange, but also for building networks with colleagues in similar roles. Cambridge also joined the Jisc Research Data Shared Service (RDSS) pilot, which aimed to create a joint research repository and related infrastructure. Being part of the RDSS pilot not only helped us to further engage with the community, but also allowed us to better understand the RDM needs at the University of Cambridge by undertaking the Data Asset Framework exercise.

In exchange for all the useful advice received from others, we aimed to be transparent about our work as well. We therefore regularly published blog posts about research data management at Cambridge on the Unlocking Research blog. There were several additional advantages of the transparent approach: it allowed us to reflect on our activities, it provided an archival record of what was done and rationale for this and it also facilitated more networking and comments exchange with the wider RDM community.

Engaging Cambridge community with RDM

Our initial attempts to engage the research community at Cambridge with RDM were compliance-based: we were telling our researchers that they must manage and share their research data because this was what their funders required. Unsurprisingly, however, this approach was rather unsuccessful – researchers were not prepared to devote time to RDM if they did not see the benefits of doing so. We therefore quickly revised the approach and changed the focus of our outreach to the (selfish) benefits of good data management and of effective data sharing. This allowed us to build an engaged RDM community, in particular among early career researchers. As a result, we were able to launch two dedicated programmes, further strengthening our community involvement in RDM: the Data Champions programme and the Open Research Pilot Project. Data Champions are (mostly) researchers who volunteered their time to act as local experts on research data management and sharing, providing advice and specialised training within their departments. The Open Research Pilot Project is looking at the benefits of and barriers to conducting Open Research.

In addition, ensuring that the wide range of stakeholders from across the University were part of the RDM Project Group and had oversight of the development and delivery of RDM services allowed us to develop our services quite quickly. As a result, the services developed were endorsed by a wide range of stakeholders at Cambridge and they were also developed in a relatively coherent fashion. As an example, effective collaboration between the Office of Scholarly Communication, the Library, the Research Office and the University Information Services allowed integration between the Cambridge research repository, Apollo, and the research information system, Symplectic Elements.

What didn’t go so well

One of the aspects of our RDM service development that did not go so well was the business case development. We started developing the RDM business case in early 2015. The business case went through numerous iterations, and at the time of writing of this blog post (August 2017), financial sustainability for the RDM services has not yet been achieved.

One of the strongest factors which contributed to the lack of success in business case development was insufficient engagement of senior leadership with RDM. We invested a substantial amount of time and effort in engaging researchers with RDM by moving away from compliance arguments, to the extent that we seem to have forgotten that compliance- and research integrity-based advocacy is necessary to ensure the buy-in of senior leadership.

In addition, while trying to move quickly with service development, and at the same time trying to gain trust and engagement in RDM service development from the various stakeholder groups at Cambridge, we ended up taking part in various projects and undertakings, which were sometimes only loosely connected to RDM. As a result, some of the activities lacked strategic focus and a lot of time was needed to re-define what the RDM service is and what it is not in order to ensure that expectations of the various stakeholder groups could be properly managed.

What could have been done differently

There are a number of things which could have been done differently and more effectively. Firstly, and to address the main problem of insufficient engagement with senior leadership, one could have introduced dedicated, short sessions for principal investigators on ensuring effective research data management and research reproducibility across their research teams. Senior researchers are ultimately those who make decisions at research-intensive institutions, and therefore their buy-in and their awareness of the value of good RDM practice is necessary for achieving financial sustainability of RDM services.

In addition, it would have been valuable to set aside time for strategic thinking and for defining (and re-defining, as necessary) the scope of RDM services. This is also related to the overall branding of the service. In Cambridge a lot of initial harm was done due to a negative association between Open Access to publications and RDM. Due to overarching funders’ and government’s requirements for Open Access to publications, many researchers started perceiving Open Access to publications merely as a necessary compliance condition. The advocacy for RDM at Cambridge started as ‘Open Data’ requirements, which led many researchers to believe that RDM is yet another requirement to comply with and that it was only about open sharing of research data. It took us a long time to change the messages and to rebrand the service as one that supports researchers in their day-to-day research practice, and to convey that proper management of research data leads to efficiency savings. After all, only research data which are managed properly from the very start of the research process can then be easily shared at the end of the project.

Finally, and also related to the focusing and defining of the service, it would have been useful to decide on a benchmarking strategy from the very beginning of the service creation. What are the goals of the service? Is it to increase the number of shared datasets? Is it to improve day-to-day data management practice? Is it to ensure that researchers know how to use novel tools for data analysis? Once the goals are decided, a strategy can be designed to benchmark progress towards achieving them. Otherwise it can be challenging to decide which projects and undertakings are worth continuing and which ones are less successful and should be revised or discontinued. In order to address one aspect of benchmarking, Cambridge led the creation of an international group aiming to develop a benchmarking strategy for RDM training programmes, which aims to create tools for improving RDM training provision.

Final reflections

My final reflection is to reiterate that the questions asked of me by the workshop leaders at the Jisc RDN meeting really inspired me to think more holistically about the work done towards development of RDM services at Cambridge. Looking forward, I think asking oneself the very same three questions (what went well, what did not go so well and what you would do differently) might become a useful regular exercise, ensuring that RDM service development is well balanced and on track towards its intended goals.


Published 24 August 2017
Written by Dr Marta Teperek

Creative Commons License

Open Resources: Who Should Pay?

This blog is the first in a series of three which considers the perspectives of researchers, funders and universities in relation to the support for open resources, coordinated and written by Dr Lauren Cadwallader. This post asks the question: What is the responsibility of national funders to research resources that are internationally important?

In January 2017 the Office of Scholarly Communication and Wellcome Trust started an Open Research Pilot Project to try to understand how we could help our researchers work more openly and what barriers they faced with making their work open. A common theme among the groups that we are working with is the sustainability of open resources.

The Virtual Fly Brain Example

Let’s take the Connectomics group I am working with for example. They investigate the connections of neurons in fly brains (Drosophila). They produce a lot of data and are committed to sharing this openly. They share their data via the Virtual Fly Brain platform (VFB).

This platform was set up in 2009 by a group of researchers in Cambridge and Edinburgh; some of the VFB team are now also involved in the Connectomics group so there is a close relationship between these projects. The platform was created as a domain-specific location to curate existing data, taken from the literature, on Drosophila neurons and for curating and sharing new data produced by researchers working in this area.

Initially it was set up thanks to a grant from the Biotechnology and Biological Sciences Research Council (BBSRC). After an initial three year grant, the BBSRC declined to fund the database further. One likely reason for this is that the BBSRC resources scheme explicitly favours resources with a large number of UK users. The number of UK researchers who use Drosophila brain image data is relatively small (<10 labs), whereas the number of international researchers who use this data is relatively large, with an estimated 200 labs working on this type of data in other parts of the world.

Subsequently, the Wellcome Trust stepped in with funding for a further three years, due to end in September 2017. Currently it is uncertain whether or not they will fund it in the future. By now, almost eight years after its creation, VFB has become the go-to source for openly available data on Drosophila brain information and images integrated into a queryable platform. No other resource like it exists and no other research group is making moves to curate Drosophila neurobiology data openly. The VFB case raises interesting and important questions about how resources are funded and the future of domain specific open infrastructures.

The status quo

On the one hand funders like the Wellcome Trust, Research Councils UK and National Institutes of Health (NIH) are encouraging researchers to use domain specific repositories for data sharing. Yet on the other, they are acknowledging that the current approaches for these resources are not necessarily sustainable.

A recent review on building and sustaining data infrastructures commissioned by the Wellcome Trust acknowledges that in light of the FAIR principles “it is clear that data is best made available through repositories where aggregation can add most value”, which is arguably in a domain-specific repository. Use of domain-specific repositories allows data to be aggregated with similar data recorded using the same metadata fields.

It is also clear that publishers can influence where data is deposited, with publishers such as Nature Publishing Group, PLOS and F1000 all recommending subject-specific repositories as the first choice place for deposition. If no subject-specific repository is available then unstructured repositories, such as Dryad or figshare are often recommended instead, which complicates infrastructure needs and therefore provisions.

The economic model for supporting data infrastructures is something the Wellcome Trust are considering, with reports recently published by other funding agencies (here, here and here). The Wellcome Trust’s commissioned review noted that project-based funding for data infrastructures is not sustainable in the long term.

However, historically funders have encouraged, and still encourage, the use of domain specific resources, which have been born from project-based funding because of a lack of provision elsewhere. This has created a complex situation – researchers created domain specific data infrastructures using their project funding; these have become the subject norm; funders encourage their use, but now don’t have the mechanisms to be able to pledge sustained long-term funding.

National interests?

What is the responsibility of national funders to research resources that are internationally important? Academic research is collaborative. It crosses borders and utilises shared knowledge regardless of where it was generated and this is acknowledged by funders who see the benefits of collaboration. Yet, the strategic goals of funders, such as the BBSRC, are often focused on the national level when it comes to relevance and importance.

On the one hand it is understandable that funders concentrate on national interests – taxpayers’ money goes into the funder’s coffers and therefore they have a responsibility to those taxpayers to ensure that the money is spent on research that benefits the nation.

But, one could argue that international collaboration is in the national interest. The US-based NIH funds resources that are of international importance, including most of the model organism databases and genomic resources, such as the Gene Expression Omnibus. These are highly used by US researchers so one could argue that NIH are acting in the national interest but they are open to researchers all over the world and therefore constitute a resource of international importance.

Wellcome Trust do have a global outlook when it comes to funding, with 21% of their total spend (2015-6) going to projects outside of the UK. Yet, the VFB resource is still vulnerable despite being an internationally important resource.

One of the motivations for the Connectomics group to participate in the Open Research Pilot is to open a dialogue with the Wellcome Trust about these issues. The Wellcome Trust are committed to strategically investing in Open Research and encourage the use of domain-specific resources. The Connectomics group are interested in how this strategic investment will translate into actual funding decisions now and into the future.

Issues on which researchers would like clarification

All the researchers who are part of the Open Research Pilot have had the opportunity to contribute to questions on open resources sustainability. Posts on the funder’s and University’s perspective will be published as parts 2 and 3 of this blog.

  1. What do you think is the responsibility of national funders towards research resources that are of more international benefit than national?
  2. How do you think the funding landscape will react to the move towards open research in terms of supporting the sustainability of resources used for curating and sharing data?
  3. Researchers are asked to share their data in domain specific resources if they are available. There are 1598 discipline specific repositories listed on re3data.org and each one needs to be supported. How big does a research community need to be to expect support?
  4. What percentage of financial support should be focussed on resources versus primary research?
  5. If funders are reluctant to pay for domain specific resources, is there a need to move to a researcher pays model for data sharing rather than centrally funding resources in some circumstances? Why? How do they envisage this being paid for?
  6. How can we harmonise the approach to sustainable open resources across a global research community? Should we move to centralised infrastructures like the European Open Science Cloud?
  7. More generally how can funders and employers help to incentivise open research (carrot or stick?)
  8. Wellcome often tries to act in a way to bring about change (e.g. open access publishing): Do they envisage that the long term funding of open research (10-20 years from now) will be very different from the situation over e.g. the next 5 years?

Published 23 June 2017
Written by Dr Lauren Cadwallader

Creative Commons License

Strategies for engaging senior leadership with RDM – IDCC discussion

This blog post gathers key reflections and take-home messages from a Birds of a Feather discussion on the topic of senior management engagement with RDM, and while written by a small number of attendees, the content reflects the wider discussion in the room on the day. [Authors: Silke Bellanger, Rosie Higman, Heidi Imker, Bev Jones, Liz Lyon, Paul Stokes, Marta Teperek*, Dirk Verdicchio]

On 20 February 2017, stakeholders interested in different aspects of data management and data curation met in Edinburgh to attend the 12th International Digital Curation Conference, organised by the Digital Curation Centre. Apart from discussing novel tools and services for data curation, the take-home message from many presentations was that successful development of Research Data Management (RDM) services requires the buy-in of a broad range of stakeholders, including senior institutional leadership.

Summary

The key strategies for engaging senior leadership with RDM that were discussed were:

  • Refer to doomsday scenarios and risks to reputations
  • Provide high profile cases of fraudulent research
  • Ask senior researchers to self-reflect and ask them to imagine a situation of being asked for supporting research data for their publication
  • Refer to the institutional mission statement / value statement
  • Collect horror stories of poor data management practice from your research community
  • Know and use your networks – know who your potential allies are and how they can help you
  • Work together with funders to shape new RDM policies
  • Don’t be afraid to talk about the problems you are experiencing – most likely you are not alone and you can benefit from exchanging best practice with others

Why is it important to talk about engaging senior leadership in RDM?

Endorsement of RDM services by senior management is important because frequently it is a prerequisite for the initial development of any RDM support services for the research community. However, the sensitive nature of the topic (both financially and sometimes politically as well) means there are difficulties in openly discussing the issues that RDM service developers face when proposing business cases to senior leadership. This means the scale of the problem is unknown and is often limited to occasional informal discussions between people in similar roles who share the same problems.

This situation prevents those developing RDM services from exchanging best practice and addressing these problems effectively. In order to flesh out common problems faced by RDM service developers and to start identifying possible solutions, we organised an informal Birds of a Feather discussion on the topic during the 12th IDCC conference. The session was attended by approximately 40 people, including institutional RDM service providers, senior organisational leaders, researchers and publishers.

What is the problem?

We started by fleshing out the problems, which vary greatly between institutions. Many participants said that their senior management was disengaged with the RDM agenda and did not perceive good RDM as an area of importance to their institution. Others complained that they did not even have the opportunity to discuss the issue with their senior leadership. So the problems identified were both with the conversations themselves, as well as with accessing senior management in the first place.

We explored the type of senior leadership groups that people had problems engaging with. Several stakeholders were identified: top level institutional leadership, heads of faculties and schools, library leadership, as well as some research team leaders. The types of issues experienced when interacting with these various stakeholder groups also differed.

Common themes

Next we considered if there were any common factors shared between these different stakeholder groups. One of the main issues identified was that people’s personal academic/scientific experience and historic ideals of scientific practice were used as a background for decision making.

Senior leaders, like many other people, tend to look at problems with their own perspective and experience in mind. In particular, within the rapidly evolving scholarly communication environment what they perceive as community norms (or in fact community problems) might be changing and may now be different for current researchers.

The other common issue was the lack of tangible metrics to measure and assess the importance of RDM which could be used to persuade senior management of RDM’s usefulness. The difficulties in applying objective measures to RDM activities are mostly due to the fact that every researcher undertakes some amount of RDM by default, so it is challenging to find an example of a situation without any RDM activities that could be used as a baseline for an evidence-based cost-benefit analysis of RDM. The work conducted by Jisc in this area might be able to provide some solutions for this. Current results from this work can be found on the Research Data Network website.

What works?

The core of our discussion was focused on exchanging effective methods of convincing managers and how to start gathering evidence to support the case for an RDM service within an institution.

Doomsday scenarios

We all agreed that one strategy that works for almost all possible audience types is the doomsday scenario – a disaster that can happen when researchers do not adhere to good RDM practice. This could be as simple as asking individual senior researchers what they would do if someone accused them of falsifying research data five years after they have published their corresponding research paper. Would they have enough evidence to reject such accusations? The possibility of being confronted with their own potential undoing helped convince many senior managers of the importance of RDM.

Other doomsday scenarios which seem to convince senior leaders were related to broader institutional crises, such as risk of fire. Useful examples are the fire which destroyed the newly built Chemistry building at the University of Nottingham, the fire which destroyed valuable equipment and research at the University of Southampton (£120 million worth of equipment and facilities), the recent fire at the Cancer Research UK Manchester Institute and a similar disaster at the University of California, Santa Cruz.

Research integrity and research misconduct

Discussion of doomsday scenarios led us to talk about research integrity issues. Reference to documented cases of fraudulent research helped some institutions convince their senior leadership of the importance of good RDM. These cases included the fraudulent research by Diederik Stapel from Tilburg University or by Erin Potts-Kant from Duke University, where $200 million in grants was awarded based on fake data. This led to a longer discussion about research reproducibility and who owns the problem of irreproducible research – individual researchers, funders, institutions or perhaps publishers. We concluded that responsibility is shared, and that perhaps the main reason for the current reproducibility crisis lies in the flawed reward system for researchers. 

Research ethics and research integrity are directly connected to good RDM practice and are also the core ethical values of academia. We therefore reflected on the importance of referring to the institutional value statement/mission statement or code of conduct when advocating/arguing for good RDM. One person admitted adding a clear reference to the institutional mission statement whenever asking senior leadership for endorsement for RDM service improvements. The UK Concordat on Open Research Data is a highly regarded external document listing core expectations on good research data management and sharing, which might be worth including as a reference. In addition, most higher education institutions will have mandates in teaching and research, which might allow good RDM practice to be endorsed through their central ethics committees.

Bottom up approaches to reach the top

The discussion about ethics and the ethos of being a researcher started a conversation about the importance of bottom up approaches in empowering the research community to drive change and bring innovation. As many researcher champions as possible should convince senior leadership about important services. Researcher voices are often louder than those of librarians, or those running central support services, so consider who will best help to champion your cause.

Collecting testimonies from researchers about the difficulties of working with research data when good data management practice was not adhered to is also a useful approach. Shared examples of these included horror stories such as data loss from stolen laptops (when data had not been backed up), newly started postdocs inheriting projects and the need to re-do all the experiments from scratch due to lack of sufficient data documentation from their predecessor, or lost patent cases. One person mentioned that what worked at their institution was an ‘honesty box’ where researchers could anonymously share their horror data management stories.

We also discussed the potential role of whistle-blowers, especially given the fact that reputational damage is extremely important for institutions. There was a suggestion that institutions should add consequences of poor data management practice to their institutional risk registers. The argument that good data management practice leads to time and efficiency savings also seems to be powerful when presented to senior leadership.

The importance of social networks

We then discussed the importance of using one’s relationships in getting senior management’s endorsement for RDM. The key to this is getting to know the different stakeholders, their interests and priorities, and thinking strategically about target groups: who are potential allies? Who are the groups who are most hesitant about the importance of RDM? Why are they hesitant? Could allies help with any of these discussions? A particularly powerful example was from someone who had a Nobel Prize winner ally, who knew some of the senior institutional leaders and helped them to get institutional endorsement for their cause.

Can people change?

The question was asked whether anyone had an example of a senior leader changing their opinion, not necessarily about RDM services. Someone suggested that in case of unsupportive leadership, persistence and patience are required and that sometimes it is better to count on a change of leadership than a change of opinions. Another suggestion was that rebranding the service tends to be more successful than hoping for people to change. Again, knowing the stakeholders and their interests is helpful in getting to know what is needed and what kind of rebranding might be appropriate. For example, shifting the emphasis from sharing of research data and open access to supporting good research data management practice and increasing research efficiency was something that had worked well at one institution.

This also led to a discussion about the perception of RDM services and whether their governance structure made a difference to how they were perceived. There was a suggestion that presenting RDM services as endeavours from inside or outside the Library could make a difference to people’s perceptions. At one science-focused institution anything coming from the library was automatically perceived as a waste of money and not useful for the research community and, as a result, all business cases for RDM services were bound to be unsuccessful due to the historic negative perception of the library as a whole. Opinion seemed to confirm that in places where libraries had not yet managed to establish themselves as relevant to 21st century academics, pitching library RDM services to senior leadership was indeed difficult. A suggested approach is to present RDM services as collaborative endeavours, and as joint ventures with other institutional infrastructure or service providers, for example as a collaboration between the library and the central IT department. Again, strong links and good relationships with colleagues at other University departments proved to be invaluable in developing RDM services as joint ventures.

The role of funding bodies

We moved on to discuss the need for endorsement for RDM at an institutional level occurring in conjunction with external drivers. Institutions need to be sustainable and require external funding to support their activities, and therefore funders and their requirements are often key drivers for institutional policy changes. This can happen on two different levels. Funding is often provided on the condition that any research data generated as a result needs to be properly managed during the research lifecycle, and is shared at the end of the project.

Non-compliance with funders’ policies can result in financial sanctions on current grants or ineligibility for individual researchers to apply for future grant funding, which can lead to a financial loss for the University overall. Some funders, such as the Engineering and Physical Sciences Research Council (EPSRC) in the United Kingdom, have clear expectations that institutions should support their researchers in adhering to good research data management practice by providing adequate infrastructure and policy framework support, therefore directly requesting institutions to support RDM service development.

Could funders do more?

There was consensus that funding bodies could perhaps do more to support good research data management, especially given that many non-UK funders do not yet have requirements for research data management and sharing as a condition of their grants. There was also a useful suggestion that funders should make more effort to ensure that their policies on research data management and sharing are adhered to, for example by performing spot-checks on research papers acknowledging their funding to see if supporting research data was made available, as the EPSRC have been doing recently.

Similarly, if funders did more to review and follow up on data management plans submitted as part of grant applications, it would be useful in convincing researchers and senior leadership of the importance of RDM. Currently not all funders require that researchers submit data management plans as part of grant applications. Although some pioneering work aiming to implement active data management plans has started, people taking part in the discussion were not aware of any funding body having a structured process in place to review and follow up on data management plans. There was a suggestion that institutions should perhaps be more proactive in working together with funders in shaping new policies. It would be useful to have institutional representatives at funders’ meetings to ensure greater collaboration.

Future directions and resources

Overall we felt that it was useful to exchange tips and tricks so we can avoid making the same mistakes. Also, for those who had not yet managed to secure endorsement for RDM services from their senior leaders it was reassuring to understand that they were not the only ones having difficulty. Community support was recognised as valuable and worth maintaining. We discussed what would be the best way of ensuring that the advice exchanged during the meeting was not lost, and also how an effective exchange of ideas on how best to engage with senior leadership should be continued. First of all we decided to write up a blog post report of the meeting and to make it available to a wider audience.

Secondly, Jisc agreed to compile the various resources and references mentioned and to create a toolkit of techniques with examples for making business cases for RDM. An initial set of resources useful in making the case can be found on the Research Data Network webpages. The current resources include A High Level Business Case, some Case studies and Miscellaneous resources – including videos, slide decks, infographics, links to external toolkits, etc. Further resources are under development and are being added on a regular basis.

The final tip to all RDM service providers was that the key to success was making the service relevant and that persistence in advocating for the good cause is necessary. RDM service providers should not be shy about sharing the importance of their work with their institution, and should be proud of the valuable work they are doing. Research datasets are vital assets for institutions, and need to be managed carefully, and being able to leverage this is the key in making senior leadership understand that providing RDM services is essential in supporting institutional business.

Published 5 May 2017
Written by Silke Bellanger, Rosie Higman, Heidi Imker, Bev Jones, Liz Lyon, Paul Stokes, Dr Marta Teperek and Dirk Verdicchio

Creative Commons License

Open Research Project, first thoughts

Dr Laurent Gatto is one of the participants in the Office of Scholarly Communication’s Open Research Pilot. He has recently blogged about his first impressions of the pilot. With his permission we have re-blogged it here.

I am proud to be one of the participants in the Wellcome Trust Open Research Project (and here). The call was initially opened in December 2016 and was pitched like this:

Are you in favour of more transparency in research? Are you concerned about research reproducibility? Would you like to get better recognition and credit for all outputs of your research process? Would you like to open up your research and make it more available to others?

If you responded ‘yes’ to any of these questions, we would like to invite you to participate in the Open Research Pilot Project, organised jointly by the Open Research team at the Wellcome Trust and the Office of Scholarly Communication at the University of Cambridge.

This of course sounded like a great initiative for me and I promptly filed an application.

We had our kick-off meeting on the 27th January, with the aim of getting to know each other and somehow define/clarify some of the objectives of the project. This post summarises my take on it.

Here’s how I introduced myself.

Who are you?

Laurent Gatto, Senior Research Associate in the Department of Biochemistry, physically located in Systems Biology and the Maths Department. SSI fellow and Software/Data Carpentry instructor and generally involved in the Open community in Cambridge, such as OpenConCam and Data Champions initiative.

What is your research about and what kind of data does your research generate?

My area of research is computational biology, with special focus on high-throughput proteomics and integration of different data and annotations. I use raw data produced by third parties, in particular the Cambridge Centre for Proteomics (mass spectrometry data), and produce processed/annotated/interactive data and a lot of software (and also here).

What motivated you to participate in the Pilot?

Improve openness/transparency (and hence reproducibility/rigour) in my research and communication, and participate in improving openness (and hence reproducibility/rigour) more widely.

What kind of outputs are you planning to share? Do you foresee any difficulties in sharing?

My direct outputs are systematically shared openly early on: open source software (before publication), pre-prints, improved data (as data packages). Difficulties, if any, generally stem from collaborators less willing to share early and openly.

A personal take on the project

It is a long project, 2 years, and hence a rather ambitious one, of a unique kind. Hence, we will have to define its overall goals as we go. The continued involvement of the participants over time will play a major role in the project’s success.

What are attainable goals?

It is important to note that there is no funding for the participants. We are driven by a desire to be open, benefit from being open and the visibility that we can gain through the project, and the prospect that the Wellcome Trust will learn from our experience and implement any lessons learnt. We get to interact with each other and with research support librarians, who will help us throughout the duration of the project. We also commit to sharing of research outputs beyond traditional publications and to engage with the Project, by participating in Project meetings and contributing to Project publications.

A lot of our initial discussions centred around rewards for open research or, actually, lack thereof and perceived associated risks. Indeed, the traditional academic rewarding system and the competitiveness in research leave little room for reproducibility and openness. It is, I believe, the hope of all participants that this project will benefit us, in some form or another.

A critical point that is missing is the academic promotion of open research and open researchers, as a way to promote a more rigorous and sound research process and tackle the reproducibility crisis. What should the incentives be? How to make sure that the next generation of academics genuinely value openness and transparency as a foundation of rigorous research?

Some desired outputs

Ideally, I would like the Wellcome Trust’s famous Research investigator awards to be de facto Open research investigator awards. There’s currently a split (opposition?) between doing research and supporting open science when doing research. In every grant I have written, I had to demonstrate that the team had a track record, or was in a good position to successfully pursue the proposed project. Well, how about demonstrating a track record in being good at opening and sharing science outputs? Every researcher submitting a grant should convincingly demonstrate that they are, have been and/or will be a proactive open researcher and openly disseminate all the outputs. By leading by example in the frame of this Open Research Project, this is something that the Wellcome Trust could take away from it.

Unfortunately, it is a fact that open science is not on the agenda of many (most?) more senior researchers and that they are neither in a position to be open nor consider open science a priority at all. I find it particularly disheartening that many senior academics (i.e. those that will sit on the panel deciding if I’m worth my next job) consider investing time in open science and the promotion of open science as time wasted instead of actually doing research. A bit like time for outreach and promotion of science to the wider public is sometimes looked down on, as not being the real stuff.

Another desire is that this project will enable us to influence funders, such as the Wellcome Trust, of course, but also more widely the research councils.

As a concrete example, I would like all grants that are accepted to be openly published beyond the daft layman summary. Published grants after acceptance should include the data management plan, the pathway to impact, possibly more, and these could then be used to assess to what extent the project delivered as promised.

This serves at least two purposes. First, it is a way to promote transparency and accountability towards the funder, scientific community and public. Also, it is a great resource for early career researchers. Unless there is specific support in place, writing a first grant is not an easy job, especially given the multitude of documents to prepare in addition to the scientific case for support. And even for more experienced researchers, it can’t harm to explore different approaches to grant writing.

Another concrete output is the requirement for a dedicated software management plan for each grant that involves any software development. I certainly consider my software to be equivalent to data and document it as such in my DMPs, but there seems to be a need for clarification.

I believe that I do a pretty decent job in conducting open science: pre-prints, open access, release data, … In the frame of this project, I shall do a better job at promoting open science for its own sake.

I also hope that by bringing some of my projects under the umbrella of the Open Research Project, I will benefit from a broader dissemination that will, directly or indirectly, be beneficial for my career (see the importance of benefits and rewards above).

Next steps

It is important to make the most out of this unique opportunity. We need to create a momentum, define ambitious goals, and work hard to reach them. But I also think that it is important to get as much input as possible from the community. Nothing beats collective intelligence for such open-ended projects, in particular for open projects.

So please, do not hesitate to comment, discuss on twitter or elsewhere, or email me directly if you have ideas you would like to promote and/or discuss.

Published 08 March 2017
Written by Dr Laurent Gatto
Creative Commons License

Show me the money – the path to a sustainable Research Data Facility

Like many institutions in the UK, Cambridge University has responded to research funders’ requirements for data management and sharing with a concerted effort to support our research community in good data management and sharing practice through our Research Data Facility. We have written a few times on this blog and presented to describe our services. This blog is a description of the process we have undertaken to support these services in the long term.

Funders expect that researchers make the data underpinning their research available and provide a link to this data in the paper itself. The EPSRC started checking compliance with their data sharing requirement on 1 May 2015. When we first created the Research Data Facility we spoke to many researchers across the institution and two things became very clear. One was that there was considerable confusion about what actually counts as data, and the second was that sharing data on publication is not something that can be easily done as an afterthought if the data was not properly managed in the first place.

We have approached these issues separately. To try and determine what is actually required from funders beyond the written policies we have invited representatives from our funders to come to discussions and forums with our researchers to work out the details. So far we have hosted Ben Ryan from the EPSRC, Michael Ball from the BBSRC and most recently David Carr and Jamie Enoch from the Wellcome Trust and CRUK respectively.

Dealing with the need for awareness of research data management has been more complex. To raise awareness of good practice in data management and sharing we embarked on an intense advocacy programme and in the past 15 months have organised 71 information sessions about data sharing (speaking with over 1,700 researchers). But we also needed to ensure the research community was managing its data from the beginning of the research process. To assist this we have developed workshops on various aspects of data management (hosting 32 workshops in the past year), a comprehensive website, a service to support researchers with their development of their research data management plans and a data management consultancy service.

So far, so good. We have had a huge response to our work, and while we encourage researchers to use the data repository that best suits their material, we do offer our institutional repository Apollo as an option. We are, as of today, hosting 499 datasets in the repository. The message is clearly getting through.

Sustainability

The word sustainability (particularly in the scholarly communication world) is code for ‘money’. And money has become quite a sticking point in the area of data management. The way Cambridge started the Research Data Facility was by employing a single person, Dr Marta Teperek, for one year, supported by the remnants of the RCUK Transition Fund. It quickly became obvious that we needed more staff to manage the workload and now the Facility employs half an Events and Outreach Coordinator and half a Repository Manager plus a Research Data Adviser who looks after the bulk of the uploading of data sets into the repository.

Clearly there was a need to work out the longer term support for staffing the Facility – a service for which there are no signs of demand slowing. Early last year we started scouting around for options.  In April 2013 the RCUK released some guidance that said it was permissible to recover costs from grants through direct charges or overheads – but noted institutions could not charge twice. This guidance also mentioned that it was permissible for institutions to recover costs of RDM Facilities as other Small Research Facilities, “provided that such facilities are transparently charged to all projects that use them”.

Transparency

On the basis of that advice we established a Research Data Facility as a Small Research Facility according to the Transparent Approach to Costing (TRAC) methodology. Our proposal was that the Facility’s costs would be recovered from grants as directly allocated costs. We chose this option rather than overheads because of the advantage of transparency to the funder of our activities. Charging grants this way meant a bigger advocacy and education role for the Facility. But the advantage is that it would make researchers aware that they need to consider research data management seriously, that this involves both time and money, and that it is an integral part of a grant proposal.

Dr Danny Kingsley has argued before (for example in a paper ‘Paying for publication: issues and challenges for research support services’) that by centralising payments for article processing charges, the researchers remain ignorant of the true economics of the open access system in the way that they are generally unaware of the amounts spent on subscriptions. If we charged the costs of the Facility into overheads, it would become yet another hidden cost and another service that ‘magically’ happens behind the scenes from the researcher’s point of view.

In terms of the actual numbers, direct costs of the Research Data Facility included salaries for 3.2 FTEs (a Research Data Facility Manager, Research Data Adviser, 0.5 Outreach and Engagement Coordinator, 0.5 Repository Manager, 0.2 Senior Management time), hardware and hardware maintenance costs, software licences, costs of organising events as well as the costs of staff training and conference attendance. The total direct annual cost of our Facility was less than £200,000. These are the people costs of the Facility and are not to be confused with the repository costs (for which we do charge directly).

Determining how much to charge

Throughout this process we have explored many options for trying to assess a way of graduating the costing in relation to what support might be required. Ideally, we would want to ensure that the Facility costs can be accurately measured based on what the applicant indicated in their data management plan. However, not all funders require data management plans. Additionally, while data management plans provide some indication of the quantity of data (storage) to be generated, they do not allow a direct estimate of the amount of data management assistance required during the lifetime of the grant. Because we could not assess the level of support required for a particular research project from a data management plan, we looked at an alternative charging strategy.

We investigated charging according to the number of people on a team, given that the training component of the Facility is measurable by attendance at workshops. However, after investigation we were unable to easily extract that type of information about grants, and this approach also created a problem for charging collaborative grants. We then looked at charging a small flat charge on every grant requiring the assistance of the Facility and at charging proportionally to the size (percentage of value) of the grant. Since we did not have any compelling evidence that bigger grants require more Facility assistance, we proposed a model of flat charging on all grants that require Facility assistance. This model was also the most cost-effective from an administrative point of view.
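To make the difference between the two charging models concrete, here is a minimal sketch in Python. It is an illustration only and not the calculation used in our Business Case: the function names, the number of grants and the grant values are all hypothetical, and the annual cost of £200,000 is simply the upper bound quoted above.

```python
def flat_charge(annual_cost, number_of_grants):
    """Flat model: every grant that needs the Facility pays the same amount."""
    if number_of_grants <= 0:
        raise ValueError("at least one grant is needed to recover the cost")
    return annual_cost / number_of_grants


def proportional_charges(annual_cost, grant_values):
    """Proportional model: each grant pays in proportion to its value."""
    total_value = sum(grant_values)
    if total_value <= 0:
        raise ValueError("total grant value must be positive")
    return [annual_cost * value / total_value for value in grant_values]


# Hypothetical figures, for illustration only.
ANNUAL_COST = 200_000  # assumed upper bound on the Facility's annual cost (GBP)
GRANT_VALUES = [100_000, 300_000, 600_000, 1_000_000]  # made-up grant values (GBP)

flat = flat_charge(ANNUAL_COST, len(GRANT_VALUES))
proportional = proportional_charges(ANNUAL_COST, GRANT_VALUES)

for value, charge in zip(GRANT_VALUES, proportional):
    print(f"grant of £{value:,}: flat £{flat:,.0f}, proportional £{charge:,.0f}")
```

Both models recover the same total. The flat model only needs a count of the grants that use the Facility, whereas the proportional model needs the value of every grant, which is part of why the flat model is cheaper to administer.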

As an indicator of the amount of work involved in the development of the Business Case, and the level of work and input that we have received relating to it, the document is now up to version 18 – each version representing a recalculation of the costings.

Collaborative process

A proposal such as we were suggesting – that we charge the costs of the Facility as a direct charge against grants – is reasonably radical. It was important that we ensure the charges would be seen as fair and reasonable by the research community and the funders. To that end we have spent the best part of a year in conversation with both communities.

Within the University we had useful feedback from the Open Access Project Board (OAPB) when we first discussed the option in July last year. We are also grateful to the members of our community who subsequently met with us in one-to-one meetings to discuss the merits of the Facility and the options for supporting it. At the November 2015 OAPB meeting, we presented a mature Business Case. We have also had to clear the Business Case through meetings of the Resource Management Committee (RMC).

Clearly we needed to ensure that our funders were prepared to support our proposal. Once we were in a position to share a Business Case with the funders we started a series of meetings and conversations with them.

The Wellcome Trust was immediate in its response: they would not allow direct charging to grants as they consider this to be an overhead cost, which they do not pay. We met with Cancer Research UK (CRUK) in January 2016 and there was a positive response about our transparent approach to costing and the comprehensiveness of services that the Facility provides to researchers at Cambridge. These issues are being discussed with senior management at CRUK, and the discussions are still ongoing at the time of writing this report (May 2016). [Update 24 May: CRUK agreed to consider research data management costs as direct costs on grant applications on a case by case basis, if justified appropriately in the context of the proposed research].

We encourage open dialogue with the RCUK funders about data management. In May 2015 we invited Ben Ryan to come to the University to talk about the EPSRC expectations on data management and how Cambridge meets these requirements. In August 2015 Michael Ball from the BBSRC came to talk to our community. We had an indication from the RCUK that our proposal was reasonable in principle. Once we were in a position to show our Business Case to the RCUK we invited Mark Thorley to discuss the issue, and he has since been consulting the individual councils for their input so that we can be given a final answer.

Administrative issues

Timing in a decision like this is challenging because of the large number of systems within the institution that would be affected if a change were to occur. In anticipation of a positive response we started the process of ensuring our management and financial systems were able to manage the costing into grants, so that we would be ready if a green light were given. To that end we have held many discussions with the Research Office on the practicalities of building the costing into our systems to make sure the charge is easy to add in our grant costing tool. We also had numerous discussions on how to embed these procedures in their workflows for validating whether the Facility services are needed and what to do if researchers forget to add them. That development is now complete.

A second consideration is the necessity to ensure that all of the administrative staff involved in managing research grants (at Cambridge this is a group of over 100 people) are aware of the change and know how to manage both the change to the grant management system and the questions from their research community. At the same time we were involved in numerous discussions with our invaluable TRAC team at the Finance Division at the University, who helped us validate all the Facility costs (to ensure that none of the costs are charged twice) and establish cost centres and workflows for recovering money from grants.

Meanwhile we have had to keep our Facility staff on temporary contracts until we are in a position to advertise the roles. There is a huge opportunity cost in training people up in this area.

Conclusion

As it happened, the RCUK has come back to us to say that we can charge this cost to grants but as an overhead rather than direct cost. Having this decision means we can advertise the positions and secure our staffing situation. But we won’t be needing the administrative amendments to the system, nor the advocacy programme.

It has been a long process given we began preparing the Business Case in March 2015. The consultation throughout the University and the engagement of our community (both research and funder) has given us an opportunity to discuss the issues of research data management more widely. It is a shame – from our perspective – that we will not be able to be transparent about the costs of managing data effectively.

The funders and the University are all working towards a shared goal: we want a culture change towards more open research, including the sharing of research data. To achieve this we need a research community that is more aware of and engaged with these matters. There is much advocacy to do.

Published 8 May 2016
Written by Dr Danny Kingsley and Dr Marta Teperek
Creative Commons License