Monthly Archives: May 2017

Cambridge RCUK Block Grant spend for 2016-2017

Much to our relief, last Friday we sent off our most recent report on our expenditure of the RCUK Block Grant fund. The report is available in our repository. Cambridge makes all of its information about spend on Open Access publicly available. This post follows on from our earlier blogs describing our spend from 2009 – 2016 and our open access spend in 2014.

Compliance

We are pleased to be able to report that we reached 80% compliance in this reporting period, up from 76% last year. The RCUK is expecting 75% compliance by the end of the transition period on 31 March 2018, so we are well over target.

According to our internal helpdesk system ZenDesk, our compliance is shared between 52% gold (publication in an Open Access journal or payment for hybrid Open Access) and 28% green (placement of the work into our institutional repository, Apollo). We do not have a breakdown of how many of the gold APC payments were for hybrid. In the past we have had an overall 86.8% spend on hybrid.

The increase in our overall compliance rate from 76% to 80% is all the more impressive given a 15% increase in the number of research outputs acknowledging RCUK funding. A Web of Science search for articles, reviews and proceedings papers indicates that Cambridge published 2,400 RCUK-funded papers in 2016; the same search for 2015 counted 2,080 RCUK-funded research outputs.

Headline numbers

  • In total Cambridge spent £1.68 million of RCUK funds on APCs (up from £1.28 million last year)
  • 1920 articles identified as being RCUK funded were submitted to the Open Access Service, of which 1248 required payment for RCUK*
  • The average article processing charge was £1850 – this is significantly less than the £2008 average last year, reflecting the value of the memberships we have (see below)

*Note these numbers will differ slightly from the report due to the difference in dates between the calendar and financial years (see below).

Non APC spend

In total Cambridge spent £1.94 million of RCUK funds in this reporting period, of which £1.68 million was on APCs. Approximately 13% was spent on other costs, distributed primarily across staffing, infrastructure and memberships. The largest of these was staffing, at £95,000. Memberships were the next largest category, mostly arrangements to reduce the cost of APCs, including:

  • £42,000 on the open access component of the Springer Compact
  • £22,000 on memberships to obtain discounts – there is a list of these on the OSC website
  • £18,000 on the University’s SCOAP3 subscription

The RCUK fund has also supported the infrastructure for Open Access at Cambridge, with £62,000 covering the cost of several upgrades of DSpace and general support for the repository. This has allowed us to implement new services such as the minting of DOIs and our hugely successful Request a Copy service, which allows people to contact the authors of embargoed material in the repository and ask them to send through the author's accepted manuscript. This category also covers the licence for our helpdesk system, ZenDesk, which helps the Open Access team manage an average of 60 queries a day. We are also able to run most of our reporting out of ZenDesk.

There are some other smaller items in the non APC category, including £1500 on bank charges that for various reasons we have not been able to allocate to specific articles.

Are these deals good value?

Some are. The Springer Compact is shown as a single charge in the report with the articles listed individually. The RCUK Block Grant contributed £46,020 to the Springer Compact and 128 Cambridge papers were published by Springer that acknowledged RCUK funding. This gives us an average APC cost per paper to the RCUK fund* of £359.53 including VAT. This represents excellent value, given that the average APC for Springer is $3,000 (about £2,300).

*Note that in some instances the papers acknowledging RCUK may also have acknowledged COAF in which case the overall cost for the APC for those papers will be higher.

Cambridge has now completed a year of its prepayment arrangement with Wiley. Over this time we contributed £108,000 to the account and published 68 papers acknowledging RCUK. This works out at an average Wiley APC of £1,588 per paper. As with Springer, the average Wiley APC is approximately £2,300, so this appears to be good value.

However, the RCUK has contributed a higher proportion to the Wiley account than COAF because we had run low on COAF funds at the time the account was established. Because the University does not provide any of its own funds for Open Access, there was no option other than to use RCUK funds. We will need to do some calculations to ensure that the correct proportions of COAF and RCUK funds are supporting this account. This reflects the challenges we face on a rolling basis when the dates are fluid (see below).

It appears we need to look very closely at our membership with Oxford University Press. We spent £44,000 of RCUK funds on this, and published 22 articles acknowledging RCUK funding. This works out at an APC of £2,000 per article, which is not dissimilar to an average OUP APC, and therefore represents no saving at all. This is possibly because our allocation of the expense of the membership between COAF and RCUK does not reflect what has been published with OUP. We need to investigate further.
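To make the arithmetic behind these comparisons explicit, the sketch below simply divides each deal's RCUK contribution by the number of RCUK-acknowledging papers published under it, using the figures quoted above. It is an illustrative calculation only (the small data structure is a hypothetical stand-in, not part of our reporting workflow), and it reproduces the per-paper averages discussed in this section.

```python
# Illustrative only: per-paper cost of each deal to the RCUK fund,
# calculated as (RCUK contribution) / (papers acknowledging RCUK).
# Figures are those quoted in this post.

deals = {
    "Springer Compact": {"rcuk_contribution": 46_020, "papers": 128},
    "Wiley prepayment": {"rcuk_contribution": 108_000, "papers": 68},
    "OUP membership": {"rcuk_contribution": 44_000, "papers": 22},
}

for name, d in deals.items():
    per_paper = d["rcuk_contribution"] / d["papers"]
    print(f"{name}: £{per_paper:,.2f} per paper")

# Expected output:
# Springer Compact: £359.53 per paper
# Wiley prepayment: £1,588.24 per paper
# OUP membership: £2,000.00 per paper
```

Comparing these per-paper figures with a typical list-price APC of roughly £2,300 is what makes the Springer and Wiley arrangements look like good value and the OUP membership look questionable.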

Caveat – the date problem

We manage Open Access funds that operate on different cycles. The COAF funds follow the academic year, with new grants starting on 1 October each year. The RCUK works on a financial year, starting on 1 April each year. Many of our memberships and offset deals work on the calendar year.

To add to the confusion, the RCUK is behind in its payments, so for the current year, which started on 1 April 2017, we will not receive our first part-payment until 1 June. That amount will not cover the commitments we had already made by the end of 2016, let alone those made between 1 April, when this year started, and 1 June, when the money arrives. This means we will remain in the red. Cambridge is carrying half a million pounds in commitments at any given time. The situation makes it very difficult to balance the books.

Our recent RCUK report covers the period 1 April 2016 – 31 March 2017 and refers only to invoices paid in this period. Some dates in the report fall after 31 March 2017 because reconciliation in the system sometimes takes longer, so items are logged with later dates even though the payment was made within the period. The publication dates for the articles these invoices relate to vary widely, and many of the articles have not yet been published because the delay between acceptance and publication ranges from days to years.

This means working out averages is an inexact science. It is only possible to filter Web of Science by year, so we are only able to establish the number of papers published in a given calendar year. This set of papers is not the same set for which we have paid, but we can compare year on year and identify some trends that make sense.

Published 22 May 2017
Written by Dr Danny Kingsley

Creative Commons License

Open at scale: sharing images in the Open Research Pilot

Dr Ben Steventon is one of the participants in the Open Research Pilot. He is working with the Office of Scholarly Communication to make his research process more open and here reports on some of the major challenges he perceives at the beginning of the project.

The Steventon Group, established last year, studies embryonic development, focusing in particular on the zebrafish. To investigate problems in this area the group uses time-lapse imaging and tracks cells in 3D visualisations, an approach that presents many data sharing challenges which they hope to address through the Wellcome Trust Open Research Pilot. Whilst the difficulties this group faces are specific to a particular type of research, they highlight some common challenges across open research: sharing large files, dealing with proprietary software and joining up the different outputs of a group.

Sharing imaging data 

The data created by time-lapse imaging and cell tracking is frequently on a scale that presents a technical, as well as financial, challenge. The raw data consists of several terabytes of film, which is then compressed for analysis into 500GB files. These compressed files are of high enough quality to be used for analysis, but they are still not small enough to be shared easily. In addition, the group generates spreadsheets of tracking data, which can be shared easily but are meaningless without the original imaging files and the specific software needed to connect the two. One solution we are considering is the Image Data Resource, which is working to make life-sciences imaging datasets that have previously been unshareable because of their size available to the scientific community for re-use.

Making it usable

The software used in this type of research is a major barrier to making the group's work reproducible. The Imaris software the group uses costs thousands of pounds, so anything shared in its proprietary formats is only accessible to an extremely small group of researchers at wealthier institutions, which is in direct opposition to the principles of Open Research. It is possible to use Fiji, an open source alternative, to recreate the tracking from the imaging files and tracking spreadsheets; however, the data annotation originally performed in Imaris will be lost if the images are not saved in the proprietary formats.

An additional problem in such analyses is sharing the protocols that detail the methodologies applied, from the preparation of the samples all the way through to data generation and analysis. This is a common problem with standard peer-reviewed journals, which are often limited in the space available for describing methods. The group are exploring new ways to communicate their research protocols and have created an article for the Journal of Visualised Experiments, but these are time-consuming to create and so are not always possible. Open peer-review platforms potentially offer a way to share detailed protocols more rapidly, as do specialist platforms such as Wellcome Open Research and Protocols.io.

Increasing efficiency by increasing openness

Whilst the file sizes and proprietary software in this type of research present some barriers to sharing, there are also opportunities for sharing to improve practice across the community. Several different software packages are currently used for visualisation and tracking, so sharing more imaging data would allow groups to try out different types of images in different tools and make better purchasing decisions with their grant money. Furthermore, there is great frustration in this area that many people are working on different algorithms for different datasets; greater sharing of these algorithms could reduce the time wasted creating a new algorithm when it might be possible to adapt a pre-existing one.

Shifting models of scholarly communication

As we move towards a model of greater openness, research groups are facing a new difficulty in working out how best to present their myriad outputs. The Steventon group intends to publish data (in some form), protocols and a preprint at the same time as submitting their papers to a traditional journal. This will make their work more reproducible, and it will also allow researchers who are interested in different aspects of their work to access the parts that interest them. These outputs will link to one another through citations, but this relies on close reading of the different outputs and checking references. The Steventon group would like to make the links between the different aspects of their work more obvious and browsable, so the context is clear to anyone interested in the lab's work. As the research of the group is so visual, it would be appropriate to represent the different aspects of their work in a more appealing form than a list of links.
The Steventon lab is attempting to link and contextualise their work through their website, and it is possible to cross-reference resources in many repositories (including Cambridge’s Apollo), but they would like there to be a more sustainable solution. They work in areas with crossovers to other disciplines – some people may be interested in their methodologies, others the particular species they work on, and others still the particular developmental processes they are researching. There are opportunities here for openness to increase the discoverability of interdisciplinary research and we will be exploring this, as well as the issues around sharing images and proprietary software, as part of the Open Research Pilot.

Published 8 May 2017
Written by Rosie Higman and Dr Ben Steventon

Creative Commons License

Strategies for engaging senior leadership with RDM – IDCC discussion

This blog post gathers key reflections and take-home messages from a Birds of a Feather discussion on the topic of senior management engagement with RDM, and while written by a small number of attendees, the content reflects the wider discussion in the room on the day. [Authors: Silke Bellanger, Rosie Higman, Heidi Imker, Bev Jones, Liz Lyon, Paul Stokes, Marta Teperek*, Dirk Verdicchio]

On 20 February 2017, stakeholders interested in different aspects of data management and data curation met in Edinburgh to attend the 12th International Digital Curation Conference (IDCC), organised by the Digital Curation Centre. Beyond the discussions of novel tools and services for data curation, the take-home message from many presentations was that successful development of Research Data Management (RDM) services requires the buy-in of a broad range of stakeholders, including senior institutional leadership.

Summary

The key strategies for engaging senior leadership with RDM that were discussed were:

  • Refer to doomsday scenarios and risks to reputations
  • Provide high profile cases of fraudulent research
  • Ask senior researchers to imagine being asked to produce the supporting research data for one of their publications
  • Refer to the institutional mission statement / value statement
  • Collect horror stories of poor data management practice from your research community
  • Know and use your networks – know who your potential allies are and how they can help you
  • Work together with funders to shape new RDM policies
  • Don’t be afraid to talk about the problems you are experiencing – most likely you are not alone and you can benefit from exchanging best practice with others

Why is it important to talk about engaging senior leadership in RDM?

Endorsement of RDM services by senior management is important because it is frequently a prerequisite for the initial development of any RDM support services for the research community. However, the sensitive nature of the topic (both financially and sometimes politically) makes it difficult to openly discuss the issues that RDM service developers face when proposing business cases to senior leadership. This means the scale of the problem is unknown, and discussion of it is often limited to occasional informal conversations between people in similar roles who share the same problems.

This situation prevents those developing RDM services from exchanging best practice and addressing these problems effectively. In order to flesh out common problems faced by RDM service developers and to start identifying possible solutions, we organised an informal Birds of a Feather discussion on the topic during the 12th IDCC conference. The session was attended by approximately 40 people, including institutional RDM service providers, senior organisational leaders, researchers and publishers.

What is the problem?

We started by fleshing out the problems, which vary greatly between institutions. Many participants said that their senior management was disengaged with the RDM agenda and did not perceive good RDM as an area of importance to their institution. Others complained that they did not even have the opportunity to discuss the issue with their senior leadership. So the problems identified were both with the conversations themselves, as well as with accessing senior management in the first place.

We explored the type of senior leadership groups that people had problems engaging with. Several stakeholders were identified: top level institutional leadership, heads of faculties and schools, library leadership, as well as some research team leaders. The types of issues experienced when interacting with these various stakeholder groups also differed.

Common themes

Next we considered if there were any common factors shared between these different stakeholder groups. One of the main issues identified was that people’s personal academic/scientific experience and historic ideals of scientific practice were used as a background for decision making.

Senior leaders, like many other people, tend to look at problems through the lens of their own perspective and experience. Within the rapidly evolving scholarly communication environment, however, what they perceive as community norms (or indeed community problems) may have changed and may now be quite different for current researchers.

The other common issue was the lack of tangible metrics to measure and assess the importance of RDM which could be used to persuade senior management of RDM's usefulness. The difficulty in applying objective measures to RDM activities arises mostly because every researcher already undertakes some RDM by default, so it is hard to find a situation with no RDM activity at all that could serve as a baseline for an evidence-based cost-benefit analysis. The work conducted by Jisc in this area might be able to provide some solutions; current results can be found on the Research Data Network website.

What works?

The core of our discussion was focused on exchanging effective methods of convincing managers and how to start gathering evidence to support the case for an RDM service within an institution.

Doomsday scenarios

We all agreed that one strategy that works for almost all possible audience types is the doomsday scenario – the disaster that can happen when researchers do not adhere to good RDM practice. This could be as simple as asking individual senior researchers what they would do if someone accused them of falsifying research data five years after they published the corresponding research paper. Would they have enough evidence to reject such accusations? The possibility of being confronted with their own potential undoing helped convince many senior managers of the importance of RDM.

Other doomsday scenarios which seem to convince senior leaders relate to broader institutional crises, such as the risk of fire. Useful examples are the fire which destroyed the newly built Chemistry building at the University of Nottingham, the fire which destroyed valuable equipment and research at the University of Southampton (£120 million worth of equipment and facilities), the recent fire at the Cancer Research UK Manchester Institute and a similar disaster at the University of Santa Cruz.

Research integrity and research misconduct

Discussion of doomsday scenarios led us to talk about research integrity issues. Reference to documented cases of fraudulent research helped some institutions convince their senior leadership of the importance of good RDM. These cases included the fraudulent research by Diederik Stapel from Tilburg University and by Erin Potts-Kant from Duke University, where $200 million in grants was awarded based on fake data. This led to a longer discussion about research reproducibility and who owns the problem of irreproducible research – individual researchers, funders, institutions or perhaps publishers. We concluded that responsibility is shared, and that perhaps the main reason for the current reproducibility crisis lies in the flawed reward system for researchers.

Research ethics and research integrity are directly connected to good RDM practice and are core ethical values of academia. We therefore reflected on the importance of referring to the institutional value statement, mission statement or code of conduct when advocating for good RDM. One person admitted adding a clear reference to the institutional mission statement whenever asking senior leadership for endorsement of RDM service improvements. The UK Concordat on Open Research Data is a highly regarded external document listing core expectations on good research data management and sharing, which might be worth including as a reference. In addition, most higher education institutions have mandates in teaching and research, which might allow good RDM practice to be endorsed through their central ethics committees.

Bottom up approaches to reach the top

The discussion about ethics and the ethos of being a researcher started a conversation about the importance of bottom-up approaches in empowering the research community to drive change and bring innovation. Recruiting as many researcher champions as possible to make the case to senior leadership is valuable: researcher voices are often louder than those of librarians or those running central support services, so consider who will best help to champion your cause.

Collecting testimonies from researchers about the difficulties of working with research data when good data management practice was not adhered to is also a useful approach. Shared examples included horror stories such as data loss from stolen laptops (when data had not been backed up), newly started postdocs having to re-do all of a project's experiments from scratch because their predecessor left insufficient data documentation, or lost patent cases. One person mentioned that what worked at their institution was an 'honesty box' where researchers could anonymously share their horror data management stories.

We also discussed the potential role of whistle-blowers, especially given the fact that reputational damage is extremely important for institutions. There was a suggestion that institutions should add consequences of poor data management practice to their institutional risk registers. The argument that good data management practice leads to time and efficiency savings also seems to be powerful when presented to senior leadership.

The importance of social networks

We then discussed the importance of using one’s relationships in getting senior management’s endorsement for RDM. The key to this is getting to know the different stakeholders, their interests and priorities, and thinking strategically about target groups: who are potential allies? Who are the groups who are most hesitant about the importance of RDM? Why are they hesitant? Could allies help with any of these discussions? A particularly powerful example was from someone who had a Nobel Prize winner ally, who knew some of the senior institutional leaders and helped them to get institutional endorsement for their cause.

Can people change?

The question was asked whether anyone had an example of a senior leader changing their opinion, not necessarily about RDM services. Someone suggested that in the case of unsupportive leadership, persistence and patience are required, and that sometimes it is better to count on a change of leadership than a change of opinion. Another suggestion was that rebranding the service tends to be more successful than hoping for people to change. Again, knowing the stakeholders and their interests helps in working out what is needed and what kind of rebranding might be appropriate. For example, shifting the emphasis from sharing of research data and open access to supporting good research data management practice and increasing research efficiency was something that had worked well at one institution.

This also led to a discussion about the perception of RDM services and whether their governance structure made a difference to how they were perceived. There was a suggestion that presenting RDM services as endeavours from inside or outside the Library could make a difference to people’s perceptions. At one science-focused institution anything coming from the library was automatically perceived as a waste of money and not useful for the research community and, as a result, all business cases for RDM services were bound to be unsuccessful due to the historic negative perception of the library as a whole. Opinion seemed to confirm that in places where libraries had not yet managed to establish themselves as relevant to 21st century academics, pitching library RDM services to senior leadership was indeed difficult. A suggested approach is to present RDM services as collaborative endeavours, and as joint ventures with other institutional infrastructure or service providers, for example as a collaboration between the library and the central IT department. Again, strong links and good relationships with colleagues at other University departments proved to be invaluable in developing RDM services as joint ventures.

The role of funding bodies

We moved on to discuss how endorsement of RDM at an institutional level often occurs in conjunction with external drivers. Institutions need to be sustainable and require external funding to support their activities, so funders and their requirements are often key drivers of institutional policy change. This can happen on two levels. First, funding is often provided on the condition that any research data generated as a result is properly managed during the research lifecycle and shared at the end of the project.

Non-compliance with funders' policies can result in financial sanctions on current grants or ineligibility for individual researchers to apply for future grant funding, which can lead to a financial loss for the University overall. Second, some funders, such as the Engineering and Physical Sciences Research Council (EPSRC) in the United Kingdom, have clear expectations that institutions should support their researchers in adhering to good research data management practice by providing adequate infrastructure and policy frameworks, thereby directly requiring institutions to support RDM service development.

Could funders do more?

There was consensus that funding bodies could perhaps do more to support good research data management, especially given that many non-UK funders do not yet have requirements for research data management and sharing as a condition of their grants. There was also a useful suggestion that funders should make more effort to ensure that their policies on research data management and sharing are adhered to, for example by performing spot-checks on research papers acknowledging their funding to see if supporting research data was made available, as the EPSRC have been doing recently.

Similarly, if funders did more to review and follow up on the data management plans submitted as part of grant applications, this would help convince researchers and senior leadership of the importance of RDM. Currently not all funders require researchers to submit data management plans as part of grant applications, and although some pioneering work on implementing active data management plans has started, participants in the discussion were not aware of any funding body with a structured process in place to review and follow up on data management plans. There was a suggestion that institutions should perhaps be more proactive in working with funders to shape new policies; it would be useful to have institutional representatives at funders' meetings to ensure greater collaboration.

Future directions and resources

Overall we felt that it was useful to exchange tips and tricks so we can avoid making the same mistakes. For those who had not yet managed to secure endorsement for RDM services from their senior leaders, it was also reassuring to learn that they were not the only ones having difficulty. Community support was recognised as valuable and worth maintaining. We discussed how best to ensure that the advice exchanged during the meeting was not lost, and how to continue an effective exchange of ideas on engaging senior leadership. First of all we decided to write up a blog post report of the meeting and to make it available to a wider audience.

Secondly, Jisc agreed to compile the various resources and references mentioned and to create a toolkit of techniques and examples for making RDM business cases. An initial set of resources useful in making the case can be found on the Research Data Network webpages. The current resources include a high-level business case, case studies and miscellaneous resources, including videos, slide decks, infographics and links to external toolkits. Further resources are under development and are being added on a regular basis.

The final tip to all RDM service providers was that the key to success is making the service relevant, and that persistence in advocating for the good cause is necessary. RDM service providers should not be shy about sharing the importance of their work with their institution, and should be proud of the valuable work they are doing. Research datasets are vital institutional assets that need to be managed carefully, and being able to convey this is key to making senior leadership understand that providing RDM services is essential to supporting institutional business.

Published 5 May 2017
Written by Silke Bellanger, Rosie Higman, Heidi Imker, Bev Jones, Liz Lyon, Paul Stokes, Dr Marta Teperek and Dirk Verdicchio

Creative Commons License