Compliance is not the whole story

Today, Research England released Monitoring sector progress towards compliance with funder open access policies, the results of a survey it ran in August last year in conjunction with RCUK, the Wellcome Trust and Jisc.

Cambridge University was one of the 113 institutions that answered a significant number of questions about how we were managing compliance with various open access policies, what systems we were using and our decision-making processes. Reading the collective responses has been illuminating.

The rather celebratory commentary from UKRI has focused on the compliance aspect – see Research England's press release, Over 80% of research outputs meet requirements of REF 2021 open access policy, and the post by David Sweeney, Executive Chair of Research England, Open access – are we almost there for REF?

What’s it all about?

At the risk of putting a dampener on the party, I'd like to point a few things out. For a start, compliance with a policy is not an end in itself. While the UK policies of the past five years have clearly increased the amount of UK research that is available open access, we do need to ask ourselves 'so what?'.

What we are not measuring, or indeed even discussing, is the reason why we are doing this.

While the open access policies of other funders such as the Wellcome Trust and the Bill and Melinda Gates Foundation articulate an end goal – to “foster a richer research culture” in the former and “information sharing and transparency” in the latter – the REF2021 policy is surprisingly perfunctory. It simply states: “certain research outputs should be made open-access to be eligible for submission to the next Research Excellence Framework”.

It would be enormously helpful to those responsible for ‘selling’ the idea to our research community if there were some evidence to demonstrate the value in what we are all doing. A stick only goes so far.

It’s really hard, people

Part of the reason why we are having so much difficulty selling the idea to both our research community and the administration of the University is because open access compliance is expensive and complicated, as this survey amply demonstrates.

While there may have been an idea that requiring the research community to provide their work on acceptance would make them more aware of and engaged with Open Access, it seems this has not been achieved. Given that 71% of HEIs reported that AAMs are deposited by a member of staff from professional services, it is safe to say the six years since the Finch Report have not significantly changed author behaviour.

With 335 staff at 1.0 FTE recorded as “directly engaged in supporting and implementing OA at their institution”, it is clear that compliance is a highly resource-hungry endeavour. This is driving decision making at institutional level. While “the intent of funders’ OA policies is to make as many outputs freely available as possible”, institutions are focusing on the outputs that are likely to be chosen for the REF (as opposed to making everything available).

I suspect this is ideology meeting pragmatism. Not only are institutions unable to support the broader openness agenda, but these policies also seem to further entrench the limited reward systems we currently use in academia.

The infrastructure problem

The first conclusion of the report was that “systems which support and implement OA are largely manual, resource-intensive processes”. The report notes that compliance checking tools are inadequate, partly because of the complexity of funder policies and the labyrinth that is publisher embargo policies. It goes on to say the findings “demonstrate the need for CRIS systems, and other compliance tools used by institutions [to] be reviewed and updated”.

This may be the case, but buried in that suggestion are years of work and considerable cost. We know this from experience: it has taken us at Cambridge two and a half years and a very significant investment to link our CRIS (Symplectic Elements) to our DSpace repository, Apollo. And we are still not there in terms of being able to provide meaningful reports to our departments.

Who is paying for all of this?

When we say ‘open’…

The report touches on a serious problem in the process. Because we obtain works at the time of acceptance (an aspect of the policy Cambridge supports), and embargo periods cannot be set until the date of publication is known, there is a significant body of material languishing under indefinite embargoes, waiting to be manually checked and updated.

The report notes that ‘there is no clear preference…as to how AAMs are augmented or replaced in repositories following the release of later versions’. Given the lack of any automated way of checking this information, the problem is unmanageable without huge human intervention.
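
To make the automation gap concrete, here is a minimal sketch of the kind of periodic job that would be needed, assuming a repository that stores a DOI and an embargo length for each deposit, and using the public Crossref REST API to discover publication dates. The function names are our own illustration, not any existing repository's API:

```python
import datetime
import requests

CROSSREF_API = "https://api.crossref.org/works/{doi}"

def publication_date(doi):
    """Ask Crossref for the publication date of a DOI, if one is recorded.

    Returns a datetime.date, or None if the work is still in press.
    """
    response = requests.get(CROSSREF_API.format(doi=doi), timeout=30)
    response.raise_for_status()
    message = response.json()["message"]
    for field in ("published-print", "published-online"):
        parts = message.get(field, {}).get("date-parts", [[None]])[0]
        if parts and parts[0]:
            # Crossref dates may omit month/day; default missing parts to 1.
            year, month, day = (list(parts) + [1, 1])[:3]
            return datetime.date(year, month, day)
    return None

def embargo_lift_date(doi, embargo_months):
    """Convert an indefinite embargo into a dated one, once publication is known."""
    published = publication_date(doi)
    if published is None:
        return None  # no publication date yet: check again on the next run
    months = published.month - 1 + embargo_months
    # Clamp the day to avoid invalid dates such as 31 February.
    return datetime.date(published.year + months // 12,
                         months % 12 + 1,
                         min(published.day, 28))
```

Even a crude job like this only works when the publisher's metadata is timely and the applicable embargo length is unambiguous – and, as the report makes clear, neither can be assumed.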

At Cambridge we offer a ‘Request a Copy’ service which at least makes the works accessible, but this is an already out-of-control situation that compounds as time progresses.

Solutions?

We really need to focus on sector solutions rather than each institution investing independently. Indeed, the second-to-last conclusion is that “the survey has demonstrated the need for publishers, funders and research institutions to work towards reducing burdensome manual processes”. One such solution, which gets a sole mention in the report, is the UK Scholarly Communications Licence as a way of managing the host of licences.

Right at the end of the report, in the second-to-last point, something dear to my heart was mentioned: “Finally, respondents highlighted the need for training and skills at an institutional level to ensure that staff are kept up to date with resources and tools associated with OA processes.” Well, yes. This is something we have been trying to address at a sector level, and the solutions are not yet obvious.

This report is an excellent snapshot and will allow institutions such as ours some level of benchmarking. But it does highlight that we have a long way to go.

Published 14 June 2018
Written by Dr Danny Kingsley

Strategies for engaging senior leadership with RDM – IDCC discussion

This blog post gathers key reflections and take-home messages from a Birds of a Feather discussion on the topic of senior management engagement with RDM, and while written by a small number of attendees, the content reflects the wider discussion in the room on the day. [Authors: Silke Bellanger, Rosie Higman, Heidi Imker, Bev Jones, Liz Lyon, Paul Stokes, Marta Teperek*, Dirk Verdicchio]

On 20 February 2017, stakeholders interested in different aspects of data management and data curation met in Edinburgh to attend the 12th International Digital Curation Conference (IDCC), organised by the Digital Curation Centre. Apart from discussing novel tools and services for data curation, the take-home message from many presentations was that successful development of Research Data Management (RDM) services requires the buy-in of a broad range of stakeholders, including senior institutional leadership.

Summary

The key strategies for engaging senior leadership with RDM that were discussed were:

  • Refer to doomsday scenarios and risks to reputations
  • Provide high profile cases of fraudulent research
  • Ask senior researchers to self-reflect: invite them to imagine being asked for the supporting research data for one of their publications
  • Refer to the institutional mission statement / value statement
  • Collect horror stories of poor data management practice from your research community
  • Know and use your networks – know who your potential allies are and how they can help you
  • Work together with funders to shape new RDM policies
  • Don’t be afraid to talk about the problems you are experiencing – most likely you are not alone and you can benefit from exchanging best practice with others

Why is it important to talk about engaging senior leadership in RDM?

Endorsement of RDM services by senior management is important because it is frequently a prerequisite for the initial development of any RDM support services for the research community. However, the sensitive nature of the topic (both financially and, sometimes, politically) means there are difficulties in openly discussing the issues that RDM service developers face when proposing business cases to senior leadership. This means the scale of the problem is unknown, and discussion of it is often limited to occasional informal exchanges between people in similar roles who share the same problems.

This situation prevents those developing RDM services from exchanging best practice and addressing these problems effectively. In order to flesh out common problems faced by RDM service developers and to start identifying possible solutions, we organised an informal Birds of a Feather discussion on the topic during the 12th IDCC conference. The session was attended by approximately 40 people, including institutional RDM service providers, senior organisational leaders, researchers and publishers.

What is the problem?

We started by fleshing out the problems, which vary greatly between institutions. Many participants said that their senior management was disengaged from the RDM agenda and did not perceive good RDM as an area of importance to their institution. Others complained that they did not even have the opportunity to discuss the issue with their senior leadership. So the problems identified were both with the conversations themselves and with getting access to senior management in the first place.

We explored the type of senior leadership groups that people had problems engaging with. Several stakeholders were identified: top level institutional leadership, heads of faculties and schools, library leadership, as well as some research team leaders. The types of issues experienced when interacting with these various stakeholder groups also differed.

Common themes

Next we considered if there were any common factors shared between these different stakeholder groups. One of the main issues identified was that people’s personal academic/scientific experience and historic ideals of scientific practice were used as a background for decision making.

Senior leaders, like many other people, tend to look at problems through the lens of their own perspective and experience. Within the rapidly evolving scholarly communication environment, however, what they perceive as community norms (or indeed community problems) may have changed and may now be different for current researchers.

The other common issue was the lack of tangible metrics to measure and assess the importance of RDM, which could be used to persuade senior management of RDM's usefulness. The difficulty in applying objective measures to RDM activities is mostly due to the fact that every researcher undertakes some RDM by default, so it is challenging to find a situation without any RDM activity that could serve as a baseline for an evidence-based cost-benefit analysis of RDM. The work conducted by Jisc in this area might be able to provide some solutions. Current results from this work can be found on the Research Data Network website.

What works?

The core of our discussion was focused on exchanging effective methods of convincing managers and on how to start gathering evidence to support the case for an RDM service within an institution.

Doomsday scenarios

We all agreed that one strategy that works for almost all audience types is the doomsday scenario – the disaster that can happen when researchers do not adhere to good RDM practice. This could be as simple as asking individual senior researchers what they would do if someone accused them of falsifying research data five years after they had published the corresponding research paper. Would they have enough evidence to reject such accusations? The possibility of being confronted with their own potential undoing helped convince many senior managers of the importance of RDM.

Other doomsday scenarios which seem to convince senior leaders were related to broader institutional crises, such as the risk of fire. Useful examples are the fire which destroyed the newly built Chemistry building at the University of Nottingham, the fire which destroyed valuable equipment and research at the University of Southampton (£120 million worth of equipment and facilities), the recent fire at the Cancer Research UK Manchester Institute and a similar disaster at the University of Santa Cruz.

Research integrity and research misconduct

Discussion of doomsday scenarios led us to talk about research integrity issues. Reference to documented cases of fraudulent research helped some institutions convince their senior leadership of the importance of good RDM. These cases included the fraudulent research by Diederik Stapel from Tilburg University and by Erin Potts-Kant from Duke University, where $200 million in grants was awarded based on fake data. This led to a longer discussion about research reproducibility and who owns the problem of irreproducible research – individual researchers, funders, institutions or perhaps publishers. We concluded that the responsibility is shared, and that perhaps the main reason for the current reproducibility crisis lies in the flawed reward system for researchers.

Research ethics and research integrity are directly connected to good RDM practice and are core ethical values of academia. We therefore reflected on the importance of referring to the institutional value statement, mission statement or code of conduct when advocating for good RDM. One person admitted adding a clear reference to the institutional mission statement whenever asking senior leadership for endorsement of RDM service improvements. The UK Concordat on Open Research Data is a highly regarded external document listing core expectations on good research data management and sharing, which might be worth including as a reference. In addition, most higher education institutions have mandates in teaching and research, which might allow good RDM practice to be endorsed through their central ethics committees.

Bottom up approaches to reach the top

The discussion about ethics and the ethos of being a researcher started a conversation about the importance of bottom-up approaches in empowering the research community to drive change and bring innovation. Recruiting as many researcher champions as possible to make the case to senior leadership is valuable: researcher voices often carry further than those of librarians or of those running central support services, so consider who will best help to champion your cause.

Collecting testimonies from researchers about the difficulties of working with research data when good data management practice was not adhered to is also a useful approach. Shared examples of these included horror stories such as data loss from stolen laptops (when data had not been backed up), newly started postdocs inheriting projects and the need to re-do all the experiments from scratch due to lack of sufficient data documentation from their predecessor, or lost patent cases. One person mentioned that what worked at their institution was an ‘honesty box’ where researchers could anonymously share their horror data management stories.

We also discussed the potential role of whistle-blowers, especially given the fact that reputational damage is extremely important for institutions. There was a suggestion that institutions should add consequences of poor data management practice to their institutional risk registers. The argument that good data management practice leads to time and efficiency savings also seems to be powerful when presented to senior leadership.

The importance of social networks

We then discussed the importance of using one’s relationships in getting senior management’s endorsement for RDM. The key to this is getting to know the different stakeholders, their interests and priorities, and thinking strategically about target groups: who are potential allies? Who are the groups who are most hesitant about the importance of RDM? Why are they hesitant? Could allies help with any of these discussions? A particularly powerful example was from someone who had a Nobel Prize winner ally, who knew some of the senior institutional leaders and helped them to get institutional endorsement for their cause.

Can people change?

The question was asked whether anyone had an example of a senior leader changing their opinion, not necessarily about RDM services. Someone suggested that, in the case of unsupportive leadership, persistence and patience are required, and that sometimes it is better to count on a change of leadership than on a change of opinions. Another suggestion was that rebranding the service tends to be more successful than hoping for people to change. Again, knowing the stakeholders and their interests helps in working out what is needed and what kind of rebranding might be appropriate. For example, shifting the emphasis from sharing of research data and open access to supporting good research data management practice and increasing research efficiency was something that had worked well at one institution.

This also led to a discussion about the perception of RDM services and whether their governance structure made a difference to how they were perceived. There was a suggestion that presenting RDM services as endeavours from inside or outside the Library could make a difference to people’s perceptions. At one science-focused institution anything coming from the library was automatically perceived as a waste of money and not useful for the research community and, as a result, all business cases for RDM services were bound to be unsuccessful due to the historic negative perception of the library as a whole. Opinion seemed to confirm that in places where libraries had not yet managed to establish themselves as relevant to 21st century academics, pitching library RDM services to senior leadership was indeed difficult. A suggested approach is to present RDM services as collaborative endeavours, and as joint ventures with other institutional infrastructure or service providers, for example as a collaboration between the library and the central IT department. Again, strong links and good relationships with colleagues at other University departments proved to be invaluable in developing RDM services as joint ventures.

The role of funding bodies

We moved on to discuss how endorsement of RDM at an institutional level often occurs in conjunction with external drivers. Institutions need to be sustainable and require external funding to support their activities, and therefore funders and their requirements are often key drivers for institutional policy change. This can happen on two different levels. First, funding is often provided on the condition that any research data generated needs to be properly managed during the research lifecycle and shared at the end of the project.

Non-compliance with funders' policies can result in financial sanctions on current grants, or in individual researchers becoming ineligible to apply for future grant funding, which can mean a financial loss for the University overall. Second, some funders, such as the Engineering and Physical Sciences Research Council (EPSRC) in the United Kingdom, have clear expectations that institutions should support their researchers in adhering to good research data management practice by providing adequate infrastructure and policy frameworks, thereby directly requiring institutions to support RDM service development.

Could funders do more?

There was consensus that funding bodies could perhaps do more to support good research data management, especially given that many non-UK funders do not yet have requirements for research data management and sharing as a condition of their grants. There was also a useful suggestion that funders should make more effort to ensure that their policies on research data management and sharing are adhered to, for example by performing spot-checks on research papers acknowledging their funding to see if supporting research data was made available, as the EPSRC have been doing recently.

Similarly, if funders did more to review and follow up on the data management plans submitted as part of grant applications, it would help convince researchers and senior leadership of the importance of RDM. Currently, not all funders require researchers to submit data management plans as part of grant applications. Although some pioneering work on implementing active data management plans has started, people taking part in the discussion were not aware of any funding body having a structured process in place to review and follow up on data management plans. There was a suggestion that institutions should perhaps be more proactive in working with funders to shape new policies. It would be useful to have institutional representatives at funders' meetings to ensure greater collaboration.

Future directions and resources

Overall we felt that it was useful to exchange tips and tricks so we can avoid making the same mistakes. Also, for those who had not yet managed to secure endorsement for RDM services from their senior leaders it was reassuring to understand that they were not the only ones having difficulty. Community support was recognised as valuable and worth maintaining. We discussed what would be the best way of ensuring that the advice exchanged during the meeting was not lost, and also how an effective exchange of ideas on how best to engage with senior leadership should be continued. First of all we decided to write up a blog post report of the meeting and to make it available to a wider audience.

Secondly, Jisc agreed to compile the various resources and references mentioned and to create a toolkit of techniques and examples for making RDM business cases. An initial set of resources useful in making the case can be found on the Research Data Network webpages. The current resources include A High Level Business Case, some Case studies and Miscellaneous resources – including videos, slide decks, infographics, links to external toolkits, etc. Further resources are under development and are being added on a regular basis.

The final tip to all RDM service providers was that the key to success is making the service relevant, and that persistence in advocating for the cause is necessary. RDM service providers should not be shy about sharing the importance of their work with their institution, and should be proud of the valuable work they are doing. Research datasets are vital institutional assets that need to be managed carefully, and being able to leverage this is key to making senior leadership understand that providing RDM services is essential to supporting institutional business.

Published 5 May 2017
Written by Silke Bellanger, Rosie Higman, Heidi Imker, Bev Jones, Liz Lyon, Paul Stokes, Dr Marta Teperek and Dirk Verdicchio


Service Level Agreements for TDM

Librarians expect publishers to support our researchers' rights to Text and Data Mining, and not to cut off a library's access when they see 'suspicious' activity before they have established whether it is legitimate. These were the conclusions of a group who met at a workshop in March to discuss provision of Text and Data Mining services. The final conclusions were:

Expectations libraries have of publishers over TDM

The workshop concluded with very different expectations to what was originally proposed. The messages to publishers that were agreed were:

  1. Don’t cut us off over TDM activity! Have a conversation with us first if you notice abnormal behaviour*
  2. If you do cut us off and it turns out to be legitimate then we expect compensation for the time we were cut off
  3. Mechanisms for TDM, with expected behaviours specified, need to be built into separate TDM licensing agreements

*And if you do want to cut us off – please first demonstrate that all this illegal TDM activity is actually happening in the UK

Workshop on TDM

The workshop “Developing a research library position statement on Text and Data Mining in the UK” was part of the recent RLUK2017 conference. My colleagues Dr Debbie Hansen from the Office of Scholarly Communication and Anna Vernon from Jisc and I wanted to open up the discussion about Text and Data Mining (TDM) with our library community. We have made the slides available; they contain a summary of all the discussions held during the event. This short blog post is an analysis of that discussion.

We started the workshop with a quick analysis of who was in the room, using a live survey tool called Mentimeter. Eleven participants came from research institutions – six large, four small and one from an ‘other research institution’. There were two publishers, and four people who identified as ‘other’ – these were intermediaries. Of the 19 attendees, 14 worked in a library. Only one person said they had extensive experience in TDM and four said they were TDM practitioners, but the largest group were the 14 who classified themselves as having ‘heard of TDM but have had no practical experience’.

The workshop then covered what TDM is, what the legal situation is and what publishers are currently saying about TDM. We then opened up the discussion.

Experiences of TDM for participants

In the initial discussion about participants' experiences, a few issues were raised that libraries would face if they were to offer TDM services. Indeed, there was a question whether this should form part of library service delivery at all. The issue is partly that this is new legislation, so publishers and institutions are currently reactive rather than strategic in relation to TDM. We agreed:

  • There is a need for a clearer understanding of the licensing situation, and for better information about it
  • We also need to create a mechanism for knowing where to go for advice, both within the institution and at the publisher
  • We need to develop procedures for handling requests – which is a policy issue
  • Researcher behaviour is a factor – academics are often unconcerned about copyright.

Offering TDM represents a change in the role of the library – traditionally, libraries have existed to preserve access to items. The group agreed we would like to be enabling this activity rather than saying “no you can’t”. There are implications for libraries in offering support for TDM, not least that librarians are not always aware of TDM taking place within their institution, which makes it difficult to be the central point for the activity. In addition, TDM activity could threaten access by triggering a cut-off, and this is causing internal disquiet.

TDM activity underway in Europe & UK

We then presented to the workshop some of the TDM activities happening internationally, such as the FutureTDM project. There was also a short rundown of the new copyright exception proposed by the European Commission for research organisations carrying out research in the public interest, which would allow researchers to carry out TDM of copyright-protected content to which they have lawful access (e.g. via subscription) without prior authorisation.

ContentMine is a not-for-profit organisation that supplies open source TDM software to access and analyse documents. It is currently partnering with the Wikimedia Foundation, under a grant, to develop WikiFactMine, a project aiming to make scientific data available to editors of Wikidata and Wikipedia.

ChemDataExtractor is a tool built by the Molecular Engineering Group at the University of Cambridge. It is an open source software package that extracts chemical information from scientific documents (e.g. text, tables); the extracted data can be used for onward analysis. There is more information in a paper in the Journal of Chemical Information and Modeling: “ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature”.
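
As a rough indication of how such a tool is used in practice – this follows ChemDataExtractor's published Python interface, though treat the details as indicative rather than authoritative:

```python
from chemdataextractor import Document

# Parse a full-text article; HTML, XML and PDF inputs are supported.
with open('paper.html', 'rb') as f:
    doc = Document.from_file(f)

# Each record is a structured set of chemical information (compound
# names, properties such as melting points, spectra) pulled from the
# running text and tables, ready for onward analysis.
for record in doc.records.serialize():
    print(record)
```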

The Manchester Institute of Biotechnology hosts the National Centre for Text Mining (NaCTeM), which works with research partners to provide text mining tools and services in the biomedical field.

The British Library had a call for applications for a PhD student placement to undertake text mining on the 150,000 theses held in EThOS, to extract new metadata such as the names of supervisors. Applications closed on 20 February 2017 but, according to an EThOS newsletter from March, they had received no applications for the placement. The suggestion is that perhaps “few students have content mining skills sufficiently well developed to undertake such a challenging placement”.

The problem with supporting TDM in libraries

We proposed to the workshop group that libraries are worried about publishers cutting off their subscription access because of the large downloads of papers that TDM activity involves. Publishers' systems are pre-programmed to react to suspicious activity, so if TDM triggers an automated investigation, it may cause an access block.
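
We cannot see inside any publisher's platform, but the kind of rule that flags TDM as 'suspicious' is easy to imagine: a threshold on downloads per address within a time window. A purely hypothetical sketch of such a rule:

```python
import time
from collections import deque

WINDOW_SECONDS = 60    # length of the sliding window
MAX_DOWNLOADS = 100    # downloads allowed per window (illustrative figure)

recent_downloads = {}  # IP address -> timestamps of recent downloads

def looks_suspicious(ip, now=None):
    """Record a download and report whether it trips the threshold rule."""
    now = time.time() if now is None else now
    window = recent_downloads.setdefault(ip, deque())
    window.append(now)
    # Discard timestamps that have fallen out of the sliding window.
    while window and window[0] < now - WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_DOWNLOADS
```

A legitimate TDM crawl will trip a rule like this almost immediately, which is precisely why the conversation about legitimate use needs to happen before the block, not after it.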

However, universities need to maintain support mechanisms to ensure continuity of access. For this to occur we require workflows for swift resolution, fast communication and a team of communicators. It also requires educating researchers about the potential issues.

We asked the group to discuss this issue – noting the reasons why their organisation is not actively supporting TDM or, if it is, the main challenges it faces.

Discussion about supporting TDM in libraries

The reasons put forward for not supporting TDM included practical issues such as the challenges of handling physical media and the risk of lockout.

The point was made that there was a lack of demand for the service. This is possibly because researchers are not coming to the Library for help. There may be a lack of awareness in IT departments that the Library can help, so they may not even pass on the queries. This points to the need for internal discussion within institutions.

It was noted that there was an assumption in the discussion that the Library is at the centre of this type of activity; however, we are not joined up as organisations. The question is: who is responsible for this activity? There is often no institutional view on TDM because the issues are not raised at an academic level. Policy is required.

Even if researchers do come to the library, there are questions about how we can provide a service. Initially we would be responding to individual queries, but how do we scale it up?

The challenges raised included the need for libraries to ensure everyone understands the requirements at the content-owner level. The library, as the coordinator of this work, would need to ensure the TDM is not for commercial use and that people know their responsibilities. This means the library is potentially intruding on the research process.

Service Level Agreement proposal

The proposal we put forward to the group was that we draft a statement for a Service Level Agreement under which publishers assure us that if the library is cut off but the activity turns out to be legal, access will be reinstated within an agreed period of time. We asked the group to discuss the issues involved if we were to do this.

Expectations of publishers

The discussion raised several issues libraries had experienced with publishers over TDM. One participant said the contract with a particular publisher to allow their researchers to do TDM took two years to finalise.

There was a recognition that identifying genuine TDM might require some sort of registry of TDM activity, which might not be an administrative task all libraries want to take on. The alternative suggestion was a third-party IP registry, which could avoid some of the manual work. Given that LOCKSS crawls publisher platforms without getting trapped, this could work in the same way, with a bank of IP addresses secured for the purpose.
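
As a sketch of how such a registry might work on the publisher side – the registry entries here are hypothetical, and the check uses only the Python standard library:

```python
import ipaddress

# IP ranges an institution has registered for legitimate TDM crawling
# (hypothetical entries, using reserved documentation addresses).
REGISTERED_TDM_RANGES = [
    ipaddress.ip_network("192.0.2.0/28"),
    ipaddress.ip_network("198.51.100.64/26"),
]

def is_registered_tdm(ip_string):
    """True if a request comes from a declared TDM range, meaning bulk
    downloading from it should not trigger an automatic cut-off."""
    ip = ipaddress.ip_address(ip_string)
    return any(ip in network for network in REGISTERED_TDM_RANGES)

# A crawler at 192.0.2.5 would be recognised as registered TDM,
# while 203.0.113.9 would remain subject to the normal rules.
```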

Some solutions that publishers could help with include delivering material in different ways – not on a hard drive. The suggestion was that this could be part of a platform, with the material produced in a format that allows TDM (at no extra cost).

Expectations of libraries

There was some distaste amongst the group for libraries taking on the responsibility of maintaining a TDM activity register. However, libraries could create a safe space for TDM, such as a virtual private network.

Licences are the responsibility of libraries, so we are involved whether we wish to be or not. Large-scale computational reading is completely different from current library provision, and there are concerns that licensing via the library could be unsuitable for some institutions. This raises issues of delivery and legal responsibility. One solution for TDM could be to record IP address ranges in licence agreements. We need to consider:

  • How do we manage the licences we are currently signed up to?
  • How do we manage licensing into the future so that we separate different uses? Should we have a separate TDM ‘bolt-on’ agreement?

The Service Level Agreement (SLA) solution

The group noted that, particularly given the amount publisher licences cost libraries, being cut off for a week or two with no redress is unusual at best in a commercial environment. At a minimum, publishers should contact the library and give it a grace period to investigate, rather than cutting access off automatically.

The basis for the conversation over the SLA includes the fact that the law is on the subscriber's side if everyone is acting legally. It would help to have an understanding of the extent of infringing activity occurring on university networks (considering that people can 'mask' themselves). This would be useful for thinking about thresholds.

Next steps

We need to open up the conversation to a wider group of librarians. We are hoping that we might be able to work with RLUK and the funding councils to come to an agreed set of requirements that we can have endorsed by the community and then take to publishers.

Debbie Hansen and Danny Kingsley attended the RLUK conference thanks to the support of the Arcadia Fund, a charitable fund of Lisbet Rausing and Peter Baldwin.

Published 30 March 2017
Written by Dr Danny Kingsley