Tag Archives: open access

Is CC-BY really a problem or are we boxing shadows?

Comments from researchers and colleagues have indicated some disquiet in parts of the academic community about the Creative Commons Attribution (CC-BY) licence. However, in conversation with some legal people and contemporaries at other institutions (some of these exchanges are replicated at the end of the blog), one observation was that academics are generally not fully cognizant of what the licences offer or, indeed, of what protections are available under regular copyright.

To try to determine whether this was an education and advocacy problem or whether there are real issues, we held a roundtable discussion on 29 February at Cambridge University. It was attended by about 35 people, a mixture of academics, administrators, publishers and legal practitioners. The discussion centred on some of the objections raised in the information circulated before the meeting (which is summarised at the end of this blog). For ease of description each objection is addressed in turn.

Background

Creative Commons provides a series of licences that people who create work can attach to it to tell users what they can and cannot do with it. The licences range from no restrictions at all (CC-0) to the fairly restrictive CC-BY-NC-ND-SA*, where the user must attribute the author, must not amend the work, cannot make any financial gain from it and must put the same licence on anything they produce using this work.

There are increasing requirements from funders such as the Wellcome Trust and RCUK in the UK that any work published open access must have a Creative Commons Attribution (CC-BY) licence attached to it. The rationale behind this is that research needs to be available for other researchers to read, reuse, and text and data mine without fear of copyright breaches. Work that is available under a CC-BY licence can also be easily incorporated into course reading lists without copyright complications.

* Note added 8 March – a comment has been sent through that CC-BY-NC-ND-SA is impossible to apply because the share-alike and no-derivatives clauses are mutually exclusive and cannot be applied together. See this explanation.
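
The mutual exclusivity described in the note above can be made concrete with a small sketch. This is illustrative only and is not an official Creative Commons tool: the function name `is_valid_cc_licence` and the way the rules are written down are assumptions made for this example. It encodes three constraints: every licence in the current Creative Commons suite includes attribution (BY), NoDerivatives (ND) and ShareAlike (SA) cannot be combined, and CC-0 is a separate public domain dedication that takes no elements.

```python
# Illustrative sketch only: not an official Creative Commons tool.
# BY, NC, ND and SA are the standard Creative Commons element codes;
# the function name and structure are hypothetical.

KNOWN_ELEMENTS = {"BY", "NC", "ND", "SA"}


def is_valid_cc_licence(code: str) -> bool:
    """Return True if a code such as 'CC-BY-NC-ND' is a combination that can exist."""
    normalised = code.upper().replace(" ", "-")
    # CC-0 (also written CC0) is a public domain dedication with no elements.
    if normalised in {"CC0", "CC-0"}:
        return True
    parts = normalised.split("-")
    if parts[0] != "CC":
        return False
    elements = parts[1:]
    if len(elements) != len(set(elements)):
        return False  # an element was repeated
    chosen = set(elements)
    if not chosen <= KNOWN_ELEMENTS:
        return False  # unknown element code
    if "BY" not in chosen:
        return False  # attribution is part of every current licence
    if "ND" in chosen and "SA" in chosen:
        return False  # share-alike governs derivatives, which ND forbids
    return True


for candidate in ["CC-BY", "CC-BY-NC-ND", "CC-BY-NC-ND-SA"]:
    print(candidate, is_valid_cc_licence(candidate))
```

The check on ND and SA is the one the comment refers to: share-alike sets terms for adaptations, so it has nothing to act on when adaptations are not permitted.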

Summary of the discussion

The general feeling in the discussions was that academics do want to share their work but they don’t want things to be used incorrectly. The outcome of the discussion was that while there is some confusion in this area, and we could do some work on advocacy and educational materials, there are also some specific cases where CC-BY has the potential to cause issues. In a small number of cases issues have actually occurred.

Is CC-BY a problem? For whom?

We should note here that CC-BY only affects a proportion of research published in the UK. While all research is potentially affected by the HEFCE requirement to make work available, the route preferred is through placing a copy in a repository. So this discussion affects only those researchers who have a specific grant from the Charities Open Access Fund (Wellcome Trust) or the RCUK. Humanities researchers tend not to hold grants, and for those that do, it is their articles, not their monographs that are affected by this requirement.

While there are some concrete examples of issues for researchers in the Arts and Humanities, many of the problems discussed here concern what could happen rather than what has happened. There was a comment from a scientific publisher that the sciences also had some concerns about CC-BY when it was first introduced, but none of the concerns have actually come to fruition. Another person noted there have been hundreds of thousands of pieces of content published under CC-BY licences, with very few known problem cases or harm. This is telling. The question was raised: Are we just repeating myths?

On the other hand, just because issues haven’t happened yet does not mean that they would not be a serious problem should they occur. One of the questions at the end of the discussion was: “Are the ethical norms of society strong enough to stop these concerns happening?” It would appear that to date they have been in the sciences.

Moral rights

CC-BY is an attribution licence. This means the moral right for the originator of the work to be identified is retained. However the moral right for the integrity of the research is not protected. The discussion centred around this.

If someone uses work under a CC-BY licence and makes alterations to it, they do need to indicate they have changed a work but not how they have altered it. The concern in the group was that the work could be altered so the meaning is entirely changed and it would still be attributed to the original author.

Authors can object to the derogatory treatment of their work. The recourse of being able to ask to have the originator’s name taken off the work was not seen as satisfactory because then the person who has adapted the work is potentially able to publish the work, which is based substantially on someone else’s work, as their own.

That said, one comment was that academic works are always open to interpretation, whether quoted or not and whether available under a CC-BY licence or not.

Translation

The area of translations does appear to have some concrete examples of problems caused by CC-BY for Humanities & Social Science authors. One of the issues is that it is very difficult to check a translation unless the original author can read the language into which their work has been translated.

Plagiarism

Of all of the areas of discussion, plagiarism raised the most opinions. The accusation that CC-BY somehow ‘encourages’ plagiarism is often levelled. Some argue that making work available under a Creative Commons licence protects authors against plagiarism rather than encouraging it. Works that are openly available online are far more easily identified as the original work than something published on paper and held on a library shelf, for example.

There was a debate about what actually constitutes plagiarism. One opinion was that ‘It’s plagiarism unless it’s in quotes’. However while the use of quote marks would protect the integrity of the work, there is nothing legally wrong with a derivative use of a work that is available under CC-BY – legally this is not plagiarism.

Nothing about the CC-BY licence overrides UK law about fair dealing. One of the lawyers present noted that academics don’t understand the details of copyright. Academics want full protection but also full sharing. In the world of the internet there’s a free-for-all – people copy-and-paste from wherever they want. No-one respects licences, so an academic work is not necessarily protected under current rules.

It was noted that plagiarism occurs all the time, even when articles are all rights reserved and under traditional copyright. And while Open Access publishing does make plagiarism easier (regardless of the licence), it doesn’t change the underlying principle that it’s unethical. Ethical behaviour in academia sits separately from copyright law.

Sensitive information

The area of sensitive information seems to have the strongest case for not using a CC-BY licence. Researchers working in areas that might contain sensitive information – such as medical or criminal areas – spend a great deal of time ensuring that their findings are presented sensitively and that their distribution is appropriate. The concern is that under a CC-BY licence these findings could be misconstrued, which would be damaging to the researcher and could go back to the participants and affect them. If presented in the wrong way, altered research outputs could affect not just the research but also the participants.

There is also an issue about the dialogue with the people who are being studied, and whether they have any moral rights over how the information is used.

An example that was given was in anthropology, working with a community of Native Americans in northern California, who released sensitive data and stories from their cultural past which they want to be accessed. However because they have been exploited in the past they wanted some form of restriction on how these things can be reused. This is an example where a CC-BY licence would not be appropriate.

An oral historian discussed the type of work they do with subjects talking about traumatic periods of their life. In these cases the researcher enters into a covenant with them about how their work can be used. This could not be dealt with ethically under a CC-BY licence. The issue is about subsequent control over reuse of research, with concern about it being co-opted and used in another context.

The question about ethical use of material was raised again, with someone noting that no matter what licence it is available under you can’t control what people do with your work if they disagree with you.

Items containing third party copyright

Being required to publish work under a CC-BY licence does cause problems for people whose work contains a large amount of third party material. This is because the burden on the author to obtain permissions for all of the works would be both time consuming and expensive. Many researchers have raised questions about whether they can even do their work if they’re required to publish under CC-BY.

That said, if researchers are themselves using CC-BY works this issue is mitigated because they automatically have permission to use the material. This raises the question: does CC-BY make it more difficult or easier?

Commercialisation

There were some examples raised where a series of works that were freely available had been packaged up and sold. This raised the question: Who is being harmed in commercial exploitation of academic works?

Academics do not publish in journals for money, so the originator of a work that is subsequently sold on is not personally losing a revenue stream. There was a distinction between the academic and non-academic publishing environment. It was agreed that the people buying these works are being scammed. The concern is that people are being exploited by being made to pay for things that should be freely available.

The discussion moved to whether a Non Commercial licence would solve this problem. The issue here is the confusion over the definition of ‘commercial’ in this context. An institution that has a revenue stream from student fees could be seen to be commercial and therefore unable to include CC-BY-NC items on their reading lists.

It was noted that CC-BY-NC-ND is extremely restrictive about ways works can be used.

Academic freedom

The discussion several times touched on the broader issue of the government placing an increasing number of requirements on researchers. The questions raised were: “Does someone who is fronting up with the money have the rights to enforce a particular licence? What about the subjects of a study?”

There is supposed to be an arm’s-length relationship between funders and universities, but a concern is that funding bodies want to have more power to tell academics what to work on.

Next steps

In summary, the discussion indicated that CC-BY licences do not encourage plagiarism, or issues with commercialisation within academia (although there is a broader ethical issue). However, in some cases CC-BY licences could pose problems for the moral integrity of the work and cause issues with translations. CC-BY licences do create challenges for works containing sensitive information and for works containing third party copyright.

There is an expectation amongst the academic community that people behave ethically and within cultural norms.

As agreed with the group we have published this blog post which summarises the discussions held this week. In discussions about the Open Access Policy Framework for the University it would be helpful to include a statement that there is concern about CC-BY licences for some disciplines and types of research.

Background information sent to participants prior to the discussion

Commentary on CC-BY in published reports

The issue of the CC-BY licences was a recurrent theme in A review of the RCUK review of implementation of its OA policy (March 2015). Many arts, humanities and social science disciplines hold ‘principled and practical objections to the use of CC-BY licences’ (p18). This is partly because work under a CC-BY licence ‘could be both used commercially in ways of which the author does not approve and also might not be properly acknowledged as their work’ (pp19-20).

The Royal Historical Society evidence to the RCUK review noted that humanities scholars have particular objections to certain kinds of ‘derivative use’ that amount to the encouragement of plagiarism. Because the ‘attribution’ requirement in CC BY is very loose, it is possible for a reuser of a humanities article to alter it and reissue it under their own name, specifying only that it is an adaptation of the original, but without specifying how it has been adapted. In this way reusers may adopt the style, argument and ‘personality’ of the original work under their own name (and even copyright it). This represents a violation of the specific moral right of the author to the integrity of the work, and the only recourse offered to the author by CC BY is to have their name removed from the attribution (which makes the violation worse). This kind of re-use is as likely to degrade as to enhance the public benefit of the research.

The British Academy’s response to the Commons Select Committee (2013) noted that many articles in HSS subjects are the product of single-author scholarship, where there is more of a claim on ‘moral rights’ that are not adequately protected under an unrestricted CC-BY licence. There were also concerns about commercial reuse of work that contains third party copyright, involving complicated permissions. The response suggests that it should be possible to vary Creative Commons licences according to the usages and requirements of different subject areas – and that an ‘Attribution-NonCommercial-NoDerivs’ licence (CC-BY-NC-ND) may very often be more appropriate.

Notes on an April 2013 Royal Historical Society position changing workshop on CC-BY and Humanities (chaired by Peter Mandler) recorded that the editors of a number of history journals have suggested that the CC-BY licence facilitates and promotes commercial re-use and uses akin to plagiarism; that the licence therefore amounts to an infringement of authors’ moral and intellectual property rights; and that it is likely to damage the quality of education.

The HistoryUK Submission to the 2013 Business, Innovation and Skills Committee Enquiry on Open Access Publishing raised issues about the loss of protection of intellectual property, the dangers associated with allowing derivative works in sensitive areas of research, and the possible increased costs or embargoes that publishers may feel compensate for the transfer of a commercial asset to a third party.

Comments from researchers and administrators

In preparation for the round table, Danny Kingsley asked her community across the sector what kinds of objections different people in an administrative or library role had heard from researchers. These are summarised below.

English researcher at Cambridge – “I would prefer not to make my work, produced with the benefit of public funding, available in a form that would allow others to exploit it commercially, as the simple CC-BY licence does. My preference would be for the CC BY-NC-SA licence.”

Research Information Specialist – One question to ask here is whether traditional publishing models – such as signing over copyright itself – are really more beneficial to authors, and of course to weigh the risk of a negative CC experience against the benefits of positive ones.

Concerns raised in discussion with academics in the Humanities (reflected in two responses)

  1. A belief that CC BY encourages plagiarism
  2. That content licensed under CC BY is not monitored for copyright and other infringement to the same extent as more restrictive licences (a misguided belief that publishers actively monitor use and reuse of content I think)
  3. I have also heard the more vague concern about ideas being manipulated or twisted in some way and then re-published under the author’s name
  4. That encouraging reuse, especially derivatives, means the author has no control over what people do with the information (and therefore is associated with something that they would rather not be)

Advice provided on Creative Commons and licensing

Published 3 March 2016
Written by Dr Danny Kingsley, with thanks to Dr Philip Boyes and Dr Joyce Heckman for their notes.


‘It is all a bit of a mess’ – observations from Researcher to Reader conference

“It is all a bit of a mess. It used to be simple. Now it is complicated.” This was the conclusion of Mark Carden, the coordinator of the Researcher to Reader conference, after two days of discussion, debate and workshops about scholarly publication.

The conference bills itself as: ‘The premier forum for discussion of the international scholarly content supply chain – bringing knowledge from the Researcher to the Reader.’ It was unusual because it mixed ‘tribes’ who usually go to separate conferences. Publishers made up 47% of the group, Libraries were next with 17%, Technology 14%, Distributors were 9% and there were a small number of academics and others.

In addition to talks and panel discussions there were workshop groups that used the format of smaller groups that met three times and were asked to come up with proposals. In order to keep this blog to a manageable length it does not include the discussions from the workshops.

The talks were filmed and will be available. There was also a very active Twitter discussion at #R2RConf. This blog is my attempt to summarise the points that emerged from the conference.

Suggestions, ideas and salient points that came up

  • Journals are dead – the publishing future is the platform
  • Journals are not dead – but we don’t need issues any more as they are entirely redundant in an online environment
  • Publishing in a journal benefits the author not the reader
  • Dissemination is no longer the value added offered by publishers. Anyone can have a blog. The value-add is branding
  • The drivers for choosing research areas are what has been recently published, not what is needed by society
  • All research is generated from what was published the year before – and we can prove it
  • Why don’t we disaggregate the APC model and charge for sections of the service separately?
  • You need to provide good service to the free users if you want to build a premium product
  • The most valuable commodity as an editor is your reviewer time
  • Peer review is inconsistent and systematically biased
  • The greater the novelty of the work, the more likely it is to receive a negative review
  • Poor academic writing is rewarded

Life After the Death of Science Journals – How the article is the future of scholarly communication

Vitek Tracz, the Chairman of the Science Navigation Group, which produces the F1000Research series of publishing platforms, was the keynote speaker. He argued that we are coming to the end of journals. One of the issues is that the essence of journals is selection. The referee system is secret – the editors won’t usually tell the author who the referee is because the referee is working for the editor, not the author. The main task of peer review is to accept or reject the work, although there may be some suggestions for improving the paper. But that decision is not taken by the referees; it is taken by the editor, who has the Impact Factor to consider.

This system allows for information to be published that should not be published – eventually all publications will find somewhere to publish. Even in high level journals many papers cannot be replicated. A survey by PubMed found there was no correlation between impact factor and likelihood of an abstract being looked at on PubMed.

Readers can now get the papers they want by themselves and create their own collections that interest them. But authors need journals because the Impact Factor (IF) is so deeply embedded. Placement in a prestigious journal doesn’t increase readership, but it does increase the likelihood of getting tenure. So authors need journals, readers don’t.

Vitek noted F1000Research “are not publishers – because we do not own any titles and don’t want to”. Instead they offer tools and services. It is not publishing in the traditional sense because there is no decision to publish or not publish something – that process is completely driven by authors. He predicted that the future of science publishing will shift from journals to services (there will be more tools & publishing directly on funder platforms).

In response to a question about impact factor and author motivation change, Vitek said “the only way of stopping impact factors as a thing is to bring the end of journals”. This aligns with the conclusions in a paper I co-authored some years ago, ‘The publishing imperative: the pervasive influence of publication metrics’.

Author Behaviours

Vicky Williams, the CEO of research communications company Research Media, discussed “Maximising the visibility and impact of research” and talked about the need to translate complex ideas in research into understandable language.

She noted that the public does want to engage with research. A large percentage of the public want to know about research while it is happening. However, they see communication about research as poor. There is low trust in science journalism.

Vicky noted the different funding drivers – now funding is very heavily distributed. Research institutions have to look at alternative funding options. Now we have students as consumers – they are mobile and create demand. Traditional content formats are being challenged.

As a result institutions are needing to compete for talent. They need to build relationships with industry – and promotion is a way of achieving that. Most universities have a strong emphasis on outreach and engagement.

This means we need a different language, different tone and a different medium. However academic outputs are written for other academics. Most research is impenetrable for other audiences. This has long been a bugbear of mine (see ‘Express yourself scientists, speaking plainly isn’t beneath you’).

Vicky outlined some steps to showcase research – having a communications plan, networking with colleagues, creating a lay summary, using visual aids and engaging. She argued that this acts as a research CV.

Rick Anderson, the Associate Dean of the University of Utah, talked about the Deeply Weird Ecosystem of publishing. Rick noted that publication is deeply weird, with many different players – authors (send papers out), publishers (send out publications), readers (demand subscriptions), libraries (subscribe or cancel). All players send signals out into the scholarly communications ecosystem, and when we send signals out we get partial and distorted signals back.

An example is that publishers set prices without knowing the value of the content. The content they control is unique – there are no substitutable products.

He also noted there is a growing prevalence of funding with strings attached. Funders are now imposing conditions on how you publish not just the narrative of the research but also the underlying data. In addition, the institution you work for might have rules requiring you to publish in particular ways.

Rick urged authors to answer the question ‘what is my main reason for publishing’ – not for writing. In reality it is primarily to have high impact publishing. By choosing to publish in a particular journal an author is casting a vote for their future. ‘Who has power over my future – do they care about where I publish? I should take notice of that’. He said that ‘If I publish with Elsevier I turn control over to them, publishing in PLOS turns control over to the world’.

Rick mentioned some journal selection tools. JANE is a system (oriented to biological sciences) where authors can paste an abstract into a search box; it analyses the language and comes up with a suggested list of journals. The Committee on Publication Ethics (COPE) member list provides a ‘white list’ of publishers. Journal Guide helps researchers select an appropriate journal for publication.

A tweet noted that “Librarians and researchers are overwhelmed by the range of tools available – we need a curator to help pick out the best”.

Peer review

Alice Ellingham, who is Director of Editorial Office Ltd, which runs online journal editorial services for publishers and societies, discussed ‘Why peer review can never be free (even if your paper is perfect)’. Alice discussed the different processes associated with securing and chasing peer review.

She said the unseen cost of peer review is communication, as the editorial office provides assistance to all participants. She estimated that it takes about 45-50 minutes per submission to manage the peer review.

Editorial Office tasks include looking for scope of a paper, the submission policy, checking ethics, checking declarations like competing interests and funding requests. Then they organise the review, assist the editors to make a decision, do the copy editing and technical editing.

Alice used an animal analogy – the cheetah represented the speed of peer review that authors would like to see, while the tortoise represented what they experience. This was very interesting given the Nature news piece that was published on 10 February, “Does it take too long to publish research?”

Will Frass, a Research Executive at Taylor & Francis, discussed the findings of a T&F study, “Peer review in 2015 – A global view”. This is a substantial report and I won’t be able to do his talk justice here. There is some information about the report here, and a news report about it here.

One of the comments that struck me was that researchers in the sciences are generally more comfortable with single blind review than in the humanities. Will noted that because there are small niches in STM, double blind often becomes single blind anyway as they all know each other.

A questioner from the floor noted that reviewers spend eight hours on a paper and that their time is more important than publishers’, and asked what publishers can do to support peer review. While this was not really answered on the floor*, it did cause a bit of a flurry on Twitter, with a discussion about whether the time spent is indeed five hours or eight hours, quoting different studies.

*As a general observation, given that almost half of the participants at the conference were publishers, they were very underrepresented in the comment and discussion. This included the numerous times when a query or challenge was put out to the publishers in the room. As someone who works collaboratively and openly, I found this somewhat frustrating.

The Sociology of Research

Professor James Evans, a sociologist at the University of Chicago who looks at the science of science, spoke about ‘How research scientists actually behave as individuals and in groups’.

His work focuses on the idea of using data from the publication process to tell rich stories about the process of science. James spoke about some recent research results relating to the reading and writing of science, including peer review and the publication of science, and to researching and rewarding science.

James compared the effect of writing styles to see what is effective in terms of reward (citations). He pitted ‘clarity’ – using few words and sentences, the present tense, and maintaining the message on point against ‘promotion’ – where the author claims novelty, uses superlatives and active words.

The research found that writing with clarity is associated with fewer citations and writing in a promotional style is associated with more citations. So redundancy, length of clauses and mixed metaphors end up enhancing a paper’s searchability. This harks back to the conversation about poor academic writing the day before – bad writing is rewarded.

Scientists write to influence reviewers and editors in the process. Scientists strategically understand the class of people who will review their work and know they will be flattered when they see their own research. They use strategic citation practices.

James noted that even though peer review is the gold standard for evaluating the scientific record, his research shows that, in terms of determining the importance or significance of scientific works, peer review is inconsistent and systematically biased. Greater reviewer distance results in more positive reviews. This is possibly because a person reviewing work close to their speciality can see everything there is to criticise. The greater the novelty of the work, the more likely it is to receive a negative review. It is possible to ‘game’ this by driving the peer review panels. James expressed his dislike of the institution of suggesting reviewers. Suggested reviewers provide more positive, more influential and worse reviews (according to the editors).

Scientists understand the novelty bias so they downplay the new elements relative to the old elements. James discussed Thomas Kuhn’s concept of the ‘essential tension’ between the classes of ‘career considerations’ – which result in job security, publication, tenure (following the crowd) – and ‘fame’ – which results in Nature papers, and hopefully a Nobel Prize.

This is a challenge because the optimal question for science is often not the optimal question for a scientific career. We are sacrificing the pursuit of a diffuse range of research areas for hubs of research areas because of the career issue.

The centre of the research cycle is publication rather than the ‘problems in the world’ that need addressing. Publications bear the seeds of discovery and represent how science as a system thinks. Data from the publication process can be used to tune, critique and reimagine that process.

James demonstrated his research that clearly shows that research today is driven by last year’s publications. Literally. The work takes a given paper and extracts the authors, the diseases, the chemicals etc and then uses a ‘random walk’ program. The result ends up predicting 95% of the combinations of authors and diseases and chemicals in the following year.

However scientists think they are getting their ideas, the actual origin is traceable in the literature. This means that research directions are not driven by global or local health needs for example.
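
The kind of ‘random walk’ prediction described above can be illustrated with a toy sketch. This is not James Evans’s actual method or data: the miniature set of ‘papers’, the entity names and the functions `build_graph` and `random_walk_pairs` are all invented for illustration. The sketch assumes that last year’s papers are turned into a network linking entities that appeared together, and that a short random walk from one entity to another suggests a combination that might appear in a later paper.

```python
import random
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each "paper" is the set of entities it mentions.
LAST_YEARS_PAPERS = [
    {"author_A", "disease_X", "chemical_1"},
    {"author_B", "disease_X", "chemical_2"},
    {"author_C", "disease_Y", "chemical_2"},
]


def build_graph(papers):
    """Link every pair of entities that appear together in a paper."""
    graph = defaultdict(set)
    for paper in papers:
        for a, b in combinations(sorted(paper), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def random_walk_pairs(graph, steps=3, walks=200, seed=0):
    """Count how often short random walks connect a start and an end entity."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same counts
    nodes = sorted(graph)
    counts = defaultdict(int)
    for _ in range(walks):
        start = rng.choice(nodes)
        current = start
        for _ in range(steps):
            current = rng.choice(sorted(graph[current]))
        if current != start:
            counts[tuple(sorted((start, current)))] += 1
    return counts


graph = build_graph(LAST_YEARS_PAPERS)
counts = random_walk_pairs(graph)
# Pairs never seen together in a paper are the candidate "new" combinations.
new_pairs = {pair: n for pair, n in counts.items() if pair[1] not in graph[pair[0]]}
for pair, n in sorted(new_pairs.items(), key=lambda item: -item[1]):
    print(pair, n)
```

In this toy version, pairs that the walk reaches often but that have not yet appeared together are the ‘predictions’. Comparing them with the following year’s papers would be the analogue of the 95% figure quoted above, which comes from the real study and not from this sketch.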

Panel: Show me the Money

I sat on this panel discussion about ‘The financial implications of open access for researchers, intermediaries and readers’ which made it challenging to take notes (!) but two things that struck me in the discussions were:

Rick Anderson suggested that when people talk about ‘percentages’ in terms of research budgets they don’t want you to think about the absolute number, noting that 1% of the Wellcome Trust research budget is $7 million and 1% of the NIH research budget is $350 million.

Toby Green, the Head of Publishing for the OECD, put out a challenge to the publishers in the audience. He noted that airlines have split up the cost of travel into different components (you pay for food or luggage etc, or can choose not to), and suggested that publishers split APCs to pay for different aspects of the service they offer and allow people to choose different elements. The OECD has moved to a Freemium model where the payment comes from a small number of premium users – that funds the free side.

As – rather depressingly – is common in these kinds of discussions, the general feeling was that open access is all about compliance and is too expensive. While I am on the record as saying that the way the UK is approaching open access is not financially sustainable, I do tire of the ‘open access is code for compliance’ conversation. This is one of the unexpected consequences of the current UK open access policy landscape. I was forced to yet again remind the group that open access is not about compliance, it is about providing public access to publicly funded research so people who are not in well resourced institutions can also see this research.

Research in Institutions

Graham Stone, the Information Resources Manager, University of Huddersfield talked about work he has done on the life cycle of open access for publishers, researchers and libraries. His slides are available.

Graham discussed how to get open access to work to our advantage, saying we need to get it embedded. OAWAL is trying to get librarians who have had nothing to do with OA into OA.

Graham talked the group through the UK Open Access Life Cycle, which maps the research lifecycle for librarians and repository managers, for research managers, for authors (who think magic happens) and for publishers.

My talk was titled ‘Getting an Octopus into a String Bag’. This discussed the complexity of communicating with the research community across a higher education institution. The slides are available.

The talk discussed the complex policy landscape, the tribal nature of the academic community, the complexity of the structure in Cambridge and then looked at some of the ways we are trying to reach out to our community.

There was nothing really new here from my perspective: it is well known in research management circles that communicating with the research community, as an independent and autonomous group, is challenging. This is of course further complicated by the structure of Cambridge. But in preliminary discussions about the conference, Mark Carden, the conference organiser, assured me that this would be news to the large number of publishers and others in the audience who are not in a higher education institution.

Summary: What does everybody want?

Mark Carden summarised the conference by talking about the different things that different stakeholders in the publishing game want.

Researchers/Authors – mostly they want to be left alone to get on with their research. They want to get promoted and get tenure. They don’t want to follow rules.

Readers – want content to be free or cheap (or really expensive as long as someone else is paying). Authors (who are readers) do care about a journal being cancelled if it is one they are published in. They want a nice, clear, easy interface because they are accessing research on different publishers’ webpages. They don’t think about ‘you get what you pay for.’

Institutions – don’t want to be in trouble with the regulators, want to look good in league tables, don’t want to get into arguments with faculty, don’t want to spend any money on this stuff.

Libraries – hark back to the good old days. They wanted manageable journal subscriptions, free stuff, and expensive subscriptions that justified ERM. Now libraries are reaching out for new roles and asking whether they should be publishers, take over the Office of Research, run a repository or manage APCs.

Politicians – want free public access to publicly funded research. They love free stuff to give away (especially other people’s free stuff).

Funders – want to be confusing, want to be bossy or directive. They want to mandate the output medium and mandate copyright rules. They want possibly to become publishers. Mark noted there are some state controlled issues here.

Publishers – “want to give huge piles of cash to their shareholders and want to be evil” (a joke). Want to keep their business model – there is a conservatism in there. They like to be able to pay their staff. Publishers would like to realise their brand value, attract paying subscribers, and go on doing most of the things they do. They want to avoid Freemium. Publishers could be a platform or a mega journal. They should focus on articles and forget about issues and embrace continuous publishing. They need to manage versioning.

Reviewers – apparently want to do less copy editing, but this is a lot of what they do. Reviewers are conflicted. They want openness and anonymity, slick processes and flexibility, fast turnaround and lax timetables. Mark noted that while reviewers want credit or points or money or something, you would need to pay peer reviewers a lot for it to be worthwhile.

Conference organisers – want the debate to continue. They need publishers and suppliers to stay in business.

Published 18 February 2016
Written by Dr Danny Kingsley
Creative Commons License

Charities’ perspective on research data management and sharing

In 2015 the Cambridge Research Data Team organised several discussions between funders and researchers. In May 2015 we hosted Ben Ryan from EPSRC, which was followed by a discussion with Michael Ball from BBSRC in August. Now we have invited our two main charity funders to discuss their views on data management and sharing with Cambridge researchers.

David Carr from the Wellcome Trust and Jamie Enoch from Cancer Research UK (CRUK) met with our academics on Friday 22 January at the Gurdon Institute. The Gurdon Institute was founded jointly by the Wellcome Trust and CRUK to promote research in the areas of developmental biology and cancer biology, and to foster a collaborative environment for independent research groups with diverse but complementary interests.

This blog summarises the presentations and discusses the data sharing expectations from Wellcome Trust and CRUK. A second related blog, ‘In conversation with Wellcome Trust and CRUK’, summarises the question and answer session that was held with a group of researchers on the same day.

Wellcome Trust’s requirements for data management and sharing

Sharing research data is key for Wellcome’s goal of improving health

David Carr started his presentation by explaining that the Wellcome Trust’s mission is to support research with the goal of improving health. Therefore, the Trust is committed to ensuring research outputs (including research data) can be accessed and used in ways that will maximise health and societal benefits. David reminded the audience of the benefits of data sharing. Data which is shared has the potential to:

  • Enable validity and reproducibility of research findings to be assessed
  • Increase the visibility and use of research findings
  • Enable research outputs to be used to answer new questions
  • Reduce duplication and waste
  • Enable access to data to other key communities – public, policymakers, healthcare professionals etc.

Data sharing goes mainstream

David gave an overview of data sharing expectations from various angles. He started by referring to the Royal Society’s 2012 report, Science as an open enterprise, which sets sharing as the standard for doing science. He also mentioned other initiatives: the G8 Science Ministers’ statement; the joint report from the Academy of Medical Sciences, BBSRC, MRC and Wellcome Trust on reproducibility and reliability of biomedical research; and the UK Concordat on Open Research Data. The take-home message was that sharing data and other research outputs is increasingly becoming a global expectation, and a core element of good research practice.

Wellcome Trust’s policy for open data

The next aspect of David’s presentation was Wellcome Trust’s policy on data management and sharing. The policy was first published almost a decade ago (2007) with subsequent modifications in 2010. The principle of the policy is simple: research data should be shared and preserved in a manner which maximises its value to advance research and improve health. Wellcome Trust also requires data management plans as a compulsory part of grant applications where the proposed research is likely to generate a dataset that will have significant value to researchers and other users. This is to ensure that researchers understand the importance of data management and sharing and plan for it from the start of their projects.

Cost of data sharing

Planning for data management and sharing involves costing for these activities in the grant proposal. The Wellcome Trust’s FAQ guidance on data sharing policy says that: “The Trust considers that timely and appropriate data management and sharing should represent an integral component of the research process. Applicants may therefore include any costs associated with their proposed approach as part of their proposal.” David then outlined the types of costs that can be included in grant applications (including for dedicated staff, hardware and software, and data access costs). He noted that, in the current draft guidance on costing for data management, estimated costs for long-term preservation that extend beyond the lifetime of the grant are not eligible, although costs associated with the deposition of data in recognised data repositories can be requested.

Key priorities and emerging areas in data management and sharing

Infrastructure

The Wellcome Trust also identified key priorities and emerging areas where work needs to be done to better support data management and sharing. The first one was to provide resources and platforms for data sharing and access. David pointed out that wherever available, discipline-specific data repositories are the best home for research data, as they provide rich metadata standards, community curation and better discoverability of datasets.

However, the sustainability of discipline-specific repositories is sometimes uncertain. These resources are often perceived as ‘free’, but research data submitted to ‘free’ data repositories has to be stored somewhere, and the amount of data produced and shared is growing exponentially: someone has to pay for the cost of storage and long-term curation. An additional point for consideration is that many disciplines do not have their own repositories and therefore need to rely heavily on institutional support.

Access

Wellcome Trust funds a large number of projects in clinical areas. Dealing with patient data requires careful ethical considerations and planning from the very start of the project to ensure that data can be successfully shared at the end of the project. To support researchers in dealing with patient data, the Expert Advisory Group on Data Access (a cross-funder advisory body established by MRC, ESRC, Cancer Research UK and the Wellcome Trust) has developed guidance documents and practice papers about the handling of sensitive data: how to ask for informed consent, how to anonymise data and the procedures that need to be in place when granting access to data. David stressed that a balance needs to be struck between maximising the use of data and the need to safeguard research participants.

Incentives for sharing

Finally, if sharing is to become the normal thing to do, researchers need incentives to do so. Wellcome Trust is keen to work with others to ensure that researchers who generate and share datasets of value receive appropriate recognition for their efforts. A recent report from the Expert Advisory Group on Data Access proposed several recommendations to incentivise data sharing, with specific roles for funders, research leaders, institutions and publishers. Additionally, in order to promote data re-use, the Wellcome Trust joined forces with the National Institutes of Health and the Howard Hughes Medical Institute and launched the Open Science Prize competition to encourage prototyping and development of services, tools or platforms that enable open content.

Cancer Research UK’s views on data sharing

The next talk was by Jamie Enoch from Cancer Research UK. Jamie started by saying that because Cancer Research UK (CRUK) is a charity funded by the public, it needs to ensure it makes the most of its funded research: sharing research data is essential to this. Making the most of the data generated through CRUK grants could help accelerate progress towards the aim set out in the charity’s research strategy, to see three quarters of people surviving cancer by 2034. Jamie explained that his post, Research Funding Manager (Data), was created as a reflection of data sharing being increasingly important for CRUK.

The policy

Jamie introduced the key principles of the CRUK data sharing policy by presenting the main issues around research data sharing and explaining CRUK’s position in relation to them:

  • What needs to be shared? All research data, including unpublished data, source code, databases etc, if it is feasible and safe to do so. CRUK is especially keen to ensure that data underpinning publications is made available for sharing.
  • Metadata: Researchers should adhere to community standards/minimum information guidelines where these exist.
  • Discoverability: Groups should be proactive in communicating the contents of their datasets and showcasing the data available for sharing

Jamie explained that CRUK really wants to increase the discoverability of data. For example, clinical trials units should ideally provide information on their websites about the data they generate and clear information about how it can be accessed.

  • Modes of sharing: Via community or generalist repositories, under the auspices of the PI or a combination of methods

Jamie explained that not all data can be/should be made openly available. Due to ethical considerations sometimes access to data will have to be restricted. Jamie explained that as long as restrictions are justified, it is entirely appropriate to use them. However, if access to data is restricted, the conditions on which access will be granted should be considered at the project outset, and these conditions will have to be clearly outlined in metadata descriptions to ensure fair governance of access.

  • Timeframes: Limited period of exclusive use permitted where justified

Jamie suggested adhering to community standards when thinking about any periods of exclusive use of generated research data. In some communities research data is made accessible at the time of publication. Other communities will expect data release at the time of generation (especially in collaborative genomics projects). Jamie further explained that particularly in cases where new data can affect policy development, it is key that research data is released as soon as possible.

  • Preservation: Data to be retained for at least 5 years after grant end
  • Acknowledgement: Secondary users of data should credit original researcher and CRUK
  • Costs: Appropriately justified costs can be included in grant proposals

As of late 2015, financial support for data management and sharing can be requested as a running cost in grant applications. Jamie explained that there are no particular guidelines in place explaining eligible and non-eligible costs and that the most important aspect is whether the costs are well justified or not, and reasonable in the context of the research envisaged.

Jamie stressed that the key point of the CRUK policy is to facilitate data sharing and to engage with the research community, recognising the challenges of data sharing for different projects and the need to work through these collaboratively, rather than enforce the policy in a top-down fashion.

Policy implementation

Subsequently, the presentation discussed ways in which the CRUK policy is implemented. Jamie explained that the main tool for policy implementation is the new requirement for data management plans as a compulsory part of grant applications.

Two of the three main response mode committees, the Science Committee and the Clinical Research Committee, have a two-step process for writing a data management plan. During the grant application stage researchers need to write a short, free-form description of how they plan to adhere to CRUK’s policy on data sharing. Only if the grant is accepted will the beneficiary be asked to write a more detailed data management plan, in consultation with CRUK representatives.

This approach serves two purposes as it:

  • ensures that all applicants are aware of CRUK’s expectations on data sharing (they all need to write a short paragraph about data sharing)
  • saves researchers’ time: only those applicants who were successful will have to provide a detailed data management plan, and it allows the CRUK office to engage with successful applicants on data sharing challenges and opportunities

In contrast, applicants for the other main CRUK response mode committee, the Population Research Committee, all fill out a detailed data management and sharing plan at application stage because of the critical importance of sharing data from cohort and epidemiological studies.

Outlooks for the future

Like the Wellcome Trust, CRUK has realised that cultural change is needed for sharing to become the norm. CRUK has initiated many national and international partnerships to help reward data sharing.

One of them is a collaboration with the YODA (Yale Open Data Access) project, which aims to develop metrics to monitor and evaluate data sharing. Other areas of joint work with other funders include the development of guidelines on the ethics of data management and sharing, platforms for data preservation and discoverability, and procedures for working with population and clinical data. Jamie stressed that the key thing for CRUK is to work closely with researchers and research managers: to understand the challenges and work through these collaboratively, and to consider exciting new initiatives to move the data sharing field forwards.

Published 5 February 2016
Written by Dr Marta Teperek, verified by David Carr and Jamie Enoch
Creative Commons License