Watch this space – the first OSI workshop

It was always an ambitious project – trying to gather 250 high level delegates from all aspects of the scholarly communication process with the goal of better communication and idea sharing between sectors of the ecosystem. The first meeting of the Open Scholarship Initiative (OSI) happened in Fairfax, Virginia last week. Kudos to the National Science Communication Institute for managing the astonishing logistics of an exercise like this – and basically pulling it off.

This was billed as a ‘meeting between global, high-level stakeholders in research’ with a goal to ‘lay the groundwork for creating a global collaborative framework to manage the future of scholarly publishing and everything these practices impact’. The OSI is being supported by UNESCO, which has committed to the full 10-year life of the project. As things currently stand, the plan is to repeat the meeting annually for a decade.

Structure of the event

The process began in July last year with emailed invitations from Glenn Hampson, the project director. For those who accepted the invitation, a series of emails from Glenn started with tutorials attached to try and ensure the delegates were prepared and up to speed. The emails gathered momentum with online discussions between participants. Indeed much was made of the (many) hundreds of emails the event had generated.

The overall areas the Open Scholarship Initiative hopes to cover include research funding policies, interdisciplinary collaboration efforts, library budgets, tenure evaluation criteria, global institutional repository efforts, open access plans, peer review practices, postdoc workload, public policy formulation, global research access and participation, information visibility, and others. Before arriving, delegates had chosen their workgroup topic from the following list:

  • Embargos
  • Evolving open solutions (1)
  • Evolving open solutions (2)
  • Information overload & underload
  • Open impacts
  • Peer review
  • Usage dimensions of open
  • What is publishing? (1)
  • What is publishing? (2)
  • Impact factors
  • Moral dimensions of open
  • Participation in the current system
  • Repositories & preservation
  • What is open?
  • Who decides?

The 190+ delegates from 180+ institutions, 11 countries and 15 stakeholder groups gathered together at George Mason University (GMU), and after preliminary introductions and welcomes the work began immediately with everyone splitting into their workgroups. We spent the first day and a half working through our topics and preparing a short presentation for feedback on the second afternoon. There was then another working session to finalise the presentations before the live-streamed final presentations on the Friday morning. These presentations are all available in Figshare (thanks to Micah Vandegrift).

The event is trying to address some heady and complex questions and it was clear from the first set of presentations that in some instances it had been difficult to come to a consensus, let alone a plan for action. My group had the relative luxury of a topic that is fairly well defined – embargoes. It might be useful for the next event to focus on specific topics and move from the esoteric to the practical.

In addition the meeting had a team of ‘at large’ people who floated between groups to try and identify themes. Unsurprisingly, the ‘Primacy of Promotion and Tenure’ was a recurring theme throughout many of the presentations. It has been clear for some time that until we can achieve some reform of the promotion and tenure process, many of the ideas and innovations in scholarly communication won’t take hold. I would suggest that the different aspects of the reward/incentive system would be a rich vein to mine at OSI2017.

Closed versus open

In terms of outcomes there was some disquiet beforehand, by people who were not attending, about the workshop effectively being ‘closed’. This was because there was a Chatham House Rule for the workgroups to allow people to speak freely about their own experiences.

There was also some disquiet among those people who were attending about a request that the workgroups remain device-free. This was to try and discourage people from checking emails and not participating. However people revert to type – in our group we all used our devices to collaborate on our documents. In the end we didn’t have much of a choice: the incredibly high-tech room we were using in the modern GMU library flummoxed us and we were unable to get the projector to work.

That all said, there is every intention to disseminate the findings of the workshops widely and openly. During the feedback and presentations sessions there was considerable Twitter discussion at #OSI2016 – there is a downloadable list of all tweets in figshare – note there were enough to make the conference trend on Twitter at one point. This networked graphic shows the interrelationships across Twitter (thanks to Micah and his colleague). In addition there will be a report published by George Mason University Press incorporating the summary reports from each of the groups.

Team Embargo

Our workgroup, like all of them, represented a wide mix of interest groups. We were:

  • Ann Riley, President, Association of College and Research Libraries
  • Audrey McCulloch, Chief Executive, Association of Learned and Professional Societies
  • Danny Kingsley, Head of Scholarly Communication, Cambridge University
  • Eric Massant, Senior Director of Government and Industry Affairs, RELX Group
  • Gail McMillan, Director of Scholarly Communication, Virginia Tech
  • Glenorchy Campbell, Managing Director, British Medical Journal North America
  • Gregg Gordon, President, Social Science Research Network
  • Keith Webster, Dean of Libraries, Carnegie Mellon University
  • Laura Helmuth, incoming president, National Association of Science Writers
  • Tony Peatfield, Director of Corporate Affairs, Medical Research Council, Research Councils UK
  • Will Schweitzer, Director of Product Development, AAAS/Science

It might be worth noting here that our workgroup was naughty and did not agree beforehand on who would facilitate, so therefore no-one had attended the facilitation pre-workshop webinar. This meant our group was gloriously facilitator and post-it note free – we just got on with it.

Banishing ghosts

We began with some definitions about what embargoes are, noting that press embargoes, publication embargoes and what we called ‘security’ embargoes (like classified documents) all serve different purposes.

Embargoes are not ‘all bad’. In the instance of press embargoes they allow journalists early access to the publication in order for them to be able to investigate and write/present informed pieces in the media. This benefits society because it allows for stronger press coverage. In terms of security embargoes they protect information that is not meant to be in the public domain. However embargoes on Author’s Accepted Manuscripts in repositories are more contentious, with qualified acceptance that these are a transitional mechanism in a shift to full open access.

The causal link of green open access resulting in subscription loss is not yet proven. The September 2013 UK Business, Innovation and Skills Committee Fifth Report: Open Access stated “There is no available evidence base to indicate that short or even zero embargoes cause cancellation of subscriptions”. In 2012 the Committee for Economic Development Digital Connections Council in The Future of Taxpayer-Funded Research: Who Will Control Access to the Results? concluded that “No persuasive evidence exists that greater public access as provided by the NIH policy has substantially harmed subscription-supported STM publishers over the last four years or threatens the sustainability of their journals”.

However there is no argument that traffic on websites for journals that rely on advertising dollars (such as medical journals) suffers when the attention is pulled to another place. This potentially affects advertising revenue, which in turn can impact on the financial model of those publications.

During our discussions about the differences between press embargoes and publication embargoes I mentioned some recent experiences in Cambridge. The HEFCE Open Access Policy requires us to collect Author’s Accepted Manuscripts at the time of acceptance and make the metadata about them available, ideally before publication. We respect publishers’ embargoes and keep the document itself locked down until these have passed post-publication. However we have been managing calls from sometimes distressed members of our research community who are worried that making the metadata available prior to publication will result in the paper being ‘pulled’ by the journal. Whether this has ever actually happened I do not know – and indeed would be happy to hear from anyone who has a concrete example so we can start managing reality instead of rumour. The problem in these instances is the researchers are confusing the press embargo with the publication embargo.

And that is what this whole embargo discussion comes down to. Much of the discourse and argument about embargoes is not evidence based. There is precious little evidence to support the tenet that sits behind embargoes – which is that if publishers allow researchers to make copies of their work available open access then they will lose subscriptions. The lack of evidence does not rule out the possibility that it is true, however – and that is why we need to settle the situation once and for all. If there is a sustainability issue for journals because of wider green open access then we need to put some longer term management in place and work towards full open access.

It is possible the problem is not repositories, institutional or subject-based. Many authors are making the final version of their published work available in contravention of their Copyright Transfer Agreement in ResearchGate or Academia.edu. It might be that this availability of work is having an impact on researchers’ usage of work on the publishers’ sites. Given that in institutional repositories repository managers make huge efforts to comply with complicated embargoes, it is quite possible that repositories are not the problem. Indeed, only a small proportion of work is made available through repositories according to the August 2015 Monitoring the Transition to Open Access report (look at ‘Figure 9. Location of online postings (including illicit postings)’ on page 38). If this is the case, requiring institutions to embargo the Author’s Accepted Manuscripts they hold in their repositories for long periods will not make any difference. They are not the solution.

Our conclusion from our preliminary discussions was that there needs to be some concrete, rigorous research into the rationale behind embargoes to inform publishers, researchers and funders.

Our proposal – research questions

In response to this the Embargo workgroup decided that the most effective solution was to collaborate on an agreed research process that will have the buy-in of all stakeholders. The overarching question that we want to try and answer is ‘What are the impacts of embargoes on scholarly communication?’ with the goal to create an evidence base for informed discussion on embargoes.

In order to answer that question we have broken the big issue into a series of smaller questions:

  • How are embargoes determined?
  • How do researchers/students find research articles?
  • Who needs access?
  • Impact of embargoes on researchers/students?
  • Effect of embargoes on other stakeholders?

We decided that if the research found there was a case for publication embargoes then agreement on the metrics that should be used to determine the length of an embargo would be helpful. We are hoping that this research will allow standards to be introduced in the area of embargoes.

Discoverability and the issue of searching behaviour are extremely relevant in this space. Our hypothesis is that if people are following publishers’ journal pages to find material then the fact that some of the same information is dispersed amongst lots of repositories means that the publisher arguments that embargoes threaten their finances are weakened. However if people are primarily using centralised search engines such as Google Scholar (which favours open versions of articles over paid ones) then that strengthens the publisher argument that they need embargoes to protect revenue.

The other question is whether access really is an issue for researchers. The March 2015 STM Report looked at the research in this area, which indicates that well over 90% of researchers surveyed in separate studies said research papers were easy or fairly easy to access. On the face of it, this suggests there is little problem in the way of access (look for the ‘Researchers’ access to journals’ section starting p83). Rather than repeating these surveys, indicators for how much embargoes restrict access to researchers could include:

  • The usage of Request a Copy buttons in repositories
  • The number of ‘turn-aways’ from publishers’ platforms
  • The take-up level of Pay Per View options on publisher sites
  • The level of usage of ‘Get it Now’ – where the library obtains a copy through interlibrary loan or document delivery and absorbs the cost.

Our proposal – Research structure

The project will begin with a Literature Review and an investigation into the feasibility of running some Case Studies.

Two clear Case Studies could provide direct evidence if the publishers were willing to share what they have learned. In both cases, there has been a move from an embargo period for green OA to removing embargoes completely. In the first instance, Taylor and Francis began a trial in 2011 to allow immediate green OA for their library and information science journals, meaning that authors published in 35 library and information science journals have the right to deposit their Accepted Manuscript into their institutional repository and make it immediately available. Authors who choose to publish in these journals are no longer asked to assign copyright. They now sign a license to publish, which allows Taylor & Francis to publish the Version of Record. Additionally, authors can choose to make their work green open access with no embargoes applied. In 2014 the pilot was extended for ‘at least a further year’.

As part of the pilot, Taylor and Francis say a survey was conducted by Routledge to canvass opinions on the Library & Information Science Author Rights initiative and also investigated author and researcher behaviour and views on author rights policies, embargoes and posting work to repositories. The survey elicited over 500 responses, including: “Having the option to upload their work to a repository directly after publication is very important to these authors: more than 2/3 of respondents rated the ability to upload their work to repositories at 8, 9, or 10 out of 10, with the vast majority saying they feel strongly that authors should have this right”. There are no links to this survey that I have been able to uncover. It would be useful to include this survey in the Literature Review and possibly build on it for other stakeholders.

The second Case Study is Sage, which in 2013 decided to move to an immediate green policy. Both examples would have enough data by now to indicate if these decisions have resulted in subscription cancellations. I have proposed this type of study before, to no avail. Hopefully we might now have more traction.

The Literature Review and Case Studies will then inform the development of a Survey of different stakeholders – which may have to be slightly altered depending on the audience being surveyed. This is an ambitious goal – because the intention is to have at least preliminary findings available for discussion at the next OSI in 2017.

There was some lively Twitter discussion in the room about our proposal to do the study. Some were saying that the issue is resolved. I would argue that anyone who is negotiating the embargo landscape at the moment (such as repository managers) would strongly disagree with the position. Others referred to research already done in this space, for example the Publishing and Ecology of European Research (PEER) project. This study does discuss embargoes but approached the question with a position that embargoes are valid. The study we are proposing is asking specifically if there is any evidence base for embargoes.

Next steps

We will be preparing a project brief and our report for the OSI publication over the next couple of weeks.

The biggest issue for the project will be for us to gather funding. We have done a preliminary assessment of the time required to do the work so we could work out a ballpark figure for the fundraising goal. Note that our estimate of the number of workdays required for the project was deemed ‘ludicrously low’ by a consultant in discussion later.

It was noted by a funder in casual discussions that because publishers have a vested interest in embargoes they should fund research that investigates their validity. Indeed Elsevier have already offered to assist financially for which we are grateful, but for this work to be considered robust and for it to be widely accepted it will need to be funded from a variety of sources. To that end we intend to ‘crowd fund’ the research in batches of $5000. The number of those batches will depend on the level of our underestimation of the time required to undertake the work (!).

In terms of governance, Team Embargo (perhaps we might need a better name…) will be working together as the steering committee to develop the brief, organise funding and choose the research team to do the work. We will need to engage an independent researcher or research group to ensure impartiality.

Wrap up summary of the workshop

There were a few issues relating to the organisation of the workshop. Much was made of the many hundreds of emails that were sent both from the organising group and also amongst the delegates beforehand. This level of preliminary discussion was beneficial but using another tool might help. It was noted that the volume of email was potentially the reason why some of the delegates who were invited did not attend.

There was a logistic issue in having 190+ delegates staying in a hotel situated in the middle of a set of highways that was a 30 minute bus ride away from the conference location at George Mason University (also situated in an isolated location). The solution was a series of buses to ferry us each way each day, and to and from the airport. We ate breakfast, lunch and dinner together at the workshop location. This combined with the lack of alcohol because we were at an undergraduate American campus (where the legal drinking age is 21) gave the experience something of a school camp feel. Coming from another planned capital city (Canberra, Australia) I am sure that Washington is a beautiful and interesting place. This was not the visit to find that out.

These minor gripes aside, as is often the case, the opportunity to meet people face to face was fantastic. Because there was a heavy American flavour to the attendees, I have now met in person many of the people I ‘know’ well through virtual exchanges. It was also a very good process to work directly with a group of experienced and knowledgeable people who all contributed to a tangible outcome.

OSI is an ambitious project, with plans for annual meetings over the next decade. It will be interesting to see if we really can achieve change.

Published 24 April 2016
Written by Dr Danny Kingsley
Creative Commons License

Is CC-BY really a problem or are we boxing shadows?

Comments from researchers and colleagues have indicated some disquiet about the Creative Commons (CC-BY) licence in some areas of the academic community. However, in conversation with some legal people and contemporaries at other institutions (some of these exchanges are replicated at the end of the blog) one of the observations was that generally academics are not necessarily cognizant of what the licences offer and indeed what protections are available under regular copyright.

To try and determine whether this was an education and advocacy problem or if there are real issues we had a roundtable discussion on 29 February at Cambridge University attended by about 35 people who were a mixture of academics, administrators, publishers and legal practitioners. The discussion centred on some of the objections raised in the information circulated before the meeting (which is summarised at the end of this blog). For ease of description each objection is addressed in turn.

Background

Creative Commons provides a series of licences that people who create work can add to it to tell users what they can or cannot do with it. The licences range from no restrictions at all (CC-0) to the fairly restrictive CC-BY-NC-ND-SA*, where the user must attribute the author, must not amend the work, cannot make any financial gain from it and must put the same licence on anything they produce using this work.

There are increasing requirements from funders such as the Wellcome Trust and RCUK in the UK that any work published open access must have a Creative Commons Attribution (CC-BY) licence attached to it. The rationale behind this is that research needs to be available for other researchers to both read and reuse, but also to text and data mine without fear of copyright breaches. Work that is available under a CC-BY licence can be easily incorporated into course reading lists without copyright complications.

* Note added 8 March – a comment has been sent through that the CC-BY-NC-ND-SA licence is impossible to apply because the share-alike and no-derivatives clauses are mutually exclusive and cannot be applied together. See this explanation.

Summary of the discussion

The general feeling in the discussions was that academics do want to share their work but they don’t want things to be used incorrectly. The outcome of the discussion was that while there are some confusions in this area, and we could do some work on advocacy and educational materials, there are also some specific cases where CC-BY has the potential to cause issues. In a small number of cases issues have actually occurred.

Is CC-BY a problem? For whom?

We should note here that CC-BY only affects a proportion of research published in the UK. While all research is potentially affected by the HEFCE requirement to make work available, the route preferred is through placing a copy in a repository. So this discussion affects only those researchers who have a specific grant from the Charities Open Access Fund (Wellcome Trust) or the RCUK. Humanities researchers tend not to hold grants, and for those that do, it is their articles, not their monographs that are affected by this requirement.

While there are some actual concrete examples of issues for researchers in the Arts and Humanities, many of the problems discussed here are what could happen. There was a comment from a scientific publisher that the sciences also had some concerns about CC-BY when it was first introduced, but none of the concerns have actually come to fruition. Another person noted there have been hundreds of thousands of pieces of content published under CC-BY licences, with very few known problem cases or harm. This is telling. The question was raised: Are we just repeating myths?

On the other hand, just because issues haven’t happened yet does not mean that it would not be a serious problem should they occur. One of the questions at the end of the discussion was: “Are the ethical norms of society strong enough to stop these concerns happening?” It would appear that to date they have been in the sciences.

Moral rights

CC-BY is an attribution licence. This means the moral right for the originator of the work to be identified is retained. However the moral right for the integrity of the research is not protected. The discussion centred around this.

If someone uses work under a CC-BY licence and makes alterations to it, they do need to indicate they have changed a work but not how they have altered it. The concern in the group was that the work could be altered so the meaning is entirely changed and it would still be attributed to the original author.

Authors can object to the derogatory treatment of their work. The recourse of being able to ask to have the originator’s name taken off the work was not seen as satisfactory because then the person who has adapted the work is potentially able to publish the work, which is based substantially on someone else’s work, as their own.

That said, one comment was that academic works are always open to interpretation, whether quoted or not and whether available under a CC-BY licence or not.

Translation

The area of translations does appear to have some concrete examples of problems caused by CC-BY for Humanities & Social Science authors. One of the issues is it is very difficult to check a translation unless the original author can read the language into which their work has been translated.

Plagiarism

Of all of the areas of discussion, plagiarism raised the most opinions. The accusation that CC-BY somehow ‘encourages’ plagiarism is often levelled. Some argue that making work available under a Creative Commons licence protects authors against plagiarism rather than encouraging it. Works available in the public domain are far more easily identified as the original work than something published on paper and held on a library shelf, for example.

There was a debate about what actually constitutes plagiarism. One opinion was that ‘It’s plagiarism unless it’s in quotes’. However while the use of quote marks would protect the integrity of the work, there is nothing legally wrong with a derivative use of a work that is available under CC-BY – legally this is not plagiarism.

Nothing about the CC-BY licence overrides UK law about fair dealing. One of the lawyers present noted that academics don’t understand the details of copyright. Academics want full protection but also full sharing. In the world of the internet there’s a free-for-all – people copy-and-paste from wherever they want. No-one respects licences, so an academic work is not necessarily protected under current rules.

It was noted that plagiarism occurs all the time, even when articles are all rights reserved and under traditional copyright. And while Open Access publishing does make plagiarism easier (regardless of the licence), it doesn’t change the underlying principle that it’s unethical. Ethical behaviour in academia sits separately from copyright law.

Sensitive information

The area of sensitive information seems to have the strongest case for not using a CC-BY licence. Researchers working in areas that might contain sensitive information – such as medical or criminal areas – spend a great deal of time ensuring that their findings are presented sensitively and ensuring their distribution is appropriate. The concern with CC-BY licences is that these findings can be misconstrued, which would be damaging to the researcher and could go back to the participants and affect them. If presented in the wrong way, altered research outputs could affect not just their research but also participants.

There is an issue about the dialogue between the people that are being studied and if they have any moral rights about how the information is being used.

An example that was given was in anthropology, working with a community of Native Americans in northern California, who released sensitive data and stories from their cultural past which they want to be accessed. However because they have been exploited in the past they wanted some form of restriction on how these things can be reused. This is an example where a CC-BY licence would not be appropriate.

An oral historian discussed the type of work they do with subjects talking about traumatic periods of their life. In these cases the researcher enters in a covenant with them about how their work can be used. This would not be able to be dealt with ethically under a CC-BY licence. The issue is about subsequent control over reuse of research, with concern about it being co-opted and used in another context.

The question about ethical use of material was raised again, with someone noting that no matter what licence it is available under you can’t control what people do with your work if they disagree with you.

Items containing third party copyright

Being required to publish work under a CC-BY licence does cause problems for people whose work contains a large amount of third party material. This is because the burden on the author to obtain permissions for all of the works would be both time consuming and expensive. Many researchers have raised questions about whether they can even do their work if they’re required to publish under CC-BY.

That said, if researchers are themselves using CC-BY works this issue is mitigated because they automatically have permission to use the material. This raises the question: does CC-BY make it more difficult or easier?

Commercialisation

There were some examples raised where a series of works that were freely available had been packaged up and sold. This raised the question: Who is being harmed in commercial exploitation of academic works?

Academics do not publish in journals for money, so the originator of a work that is subsequently sold on is not personally losing a revenue stream. There was a distinction between the academic and non-academic publishing environment. It was agreed that the people buying these works are being scammed. The concern is that people are being exploited by being made to pay for things that should be freely available.

The discussion moved to whether a Non Commercial licence would solve this problem. The issue here is the confusion over the definition of ‘commercial’ in this context. An institution that has a revenue stream from student fees could be seen to be commercial and therefore unable to include CC-BY-NC items on their reading lists.

It was noted that CC-BY-NC-ND is extremely restrictive about ways works can be used.

Academic freedom

The discussion several times touched on the broader issue of the government placing an increasing number of requirements on researchers. The questions raised were: “Does someone who is fronting up with the money have the rights to enforce a particular licence? What about the subjects of a study?”

There is supposed to be an arm’s-length relationship between funders and universities, but a concern is that funding bodies want to have more power to tell academics what to work on.

Next steps

In summary, the discussion indicated that CC-BY licences do not encourage plagiarism or cause issues with commercialisation within academia (although there is a broader ethical issue). However in some cases CC-BY licences could pose problems for the moral integrity of the work and cause issues with translations. CC-BY licences do create challenges for works containing sensitive information and for works containing third party copyright.

There is an expectation amongst the academic community that people behave ethically and within cultural norms.

As agreed with the group we have published this blog post which summarises the discussions held this week. In discussions about the Open Access Policy Framework for the University it would be helpful to include a statement that there is concern about CC-BY licences for some disciplines and types of research.

Background information sent to participants prior to the discussion

Commentary on CC-BY in published reports

The issue of CC-BY licences was a recurrent theme in A review of the RCUK review of implementation of its OA policy (March 2015). Many arts, humanities and social science disciplines hold ‘principled and practical objections to the use of CC-BY licences’ (p18). This is partly because work under a CC-BY licence ‘could be both used commercially in ways of which the author does not approve and also might not be properly acknowledged as their work’ (pp19-20).

The Royal Historical Society evidence to the RCUK review noted that humanities scholars have particular objections to certain kinds of ‘derivative use’ that amount to the encouragement of plagiarism. Because the ‘attribution’ requirement in CC BY is very loose, it is possible for a reuser of a humanities article to alter it and reissue it under their own name, specifying only that it is an adaptation of the original, but without specifying how it has been adapted. In this way reusers may adopt the style, argument and ‘personality’ of the original work under their own name (and even copyright it). This represents a violation of the specific moral right of the author to the integrity of the work, and the only recourse offered to the author by CC BY is to have their name removed from the attribution (which makes the violation worse). This kind of re-use is as likely to degrade as to enhance the public benefit of the research.

The British Academy’s response to the Commons Select Committee (2013) noted that many articles in HSS subjects are the product of single-author scholarship, where there is more of a claim on ‘moral rights’ that are not adequately protected under an unrestricted CC-BY licence. There were also concerns about commercial reuse of work that contains third party copyright, involving complicated permissions. The response suggests that it should be possible to vary Creative Commons licences according to the usages and requirements of different subject areas – and that an ‘Attribution-NonCommercial-NoDerivs’ licence (CC-BY-NC-ND) may very often be more appropriate.

Notes on an April 2013 Royal Historical Society position changing workshop on CC-BY and Humanities (chaired by Peter Mandler) recorded that the editors of a number of history journals have suggested that the CC-BY licence facilitates and promotes commercial re-use and uses akin to plagiarism; that the licence therefore amounts to an infringement of authors’ moral and intellectual property rights; and that it is likely to damage the quality of education.

The HistoryUK Submission to the 2013 Business, Innovation and Skills Committee Enquiry on Open Access Publishing raised issues about the loss of protection of intellectual property, the dangers associated with allowing derivative works in sensitive areas of research, and the possibility that publishers may impose increased costs or embargos to compensate for the transfer of a commercial asset to a third party.

Comments from researchers and administrators

In preparation for the round table, Danny Kingsley asked her community across the sector what kinds of objections different people in an administrative or library role had heard from researchers. These are summarised below.

English researcher at Cambridge – “I would prefer not to make my work, produced with the benefit of public funding, available in a form that would allow others to exploit it commercially, as the simple CC-BY licence does. My preference would be for the CC BY-NC-SA licence.”

Research Information Specialist – One question to ask here is whether traditional publishing models – such as signing over copyright itself – are really more beneficial to authors, and of course to weigh the risk of a negative CC experience against the benefits of positive ones.

Concerns raised in discussion with academics in the Humanities (reflected in two responses)

  1. A belief that CC BY encourages plagiarism
  2. That content licensed under CC BY is not monitored for copyright and other infringement to the same extent as content under more restrictive licences (a misguided belief that publishers actively monitor use and reuse of content, I think)
  3. I have also heard the more vague concern about ideas being manipulated or twisted in some way and then re-published under the author’s name
  4. That encouraging reuse, especially derivatives, means the author has no control over what people do with the information (and therefore is associated with something that they would rather not be)

Advice provided on Creative Commons and licensing

Published 3 March 2016
Written by Dr Danny Kingsley, with thanks to Dr Philip Boyes and Dr Joyce Heckman for their notes.

Creative Commons License

 

In conversation with Wellcome Trust and CRUK

On Friday 22 January Cambridge University invited our two main charity funders to discuss their views on data management and sharing with Cambridge researchers. David Carr from the Wellcome Trust and Jamie Enoch from Cancer Research UK came to the University to talk to our researchers.

The related blog ‘Charities’ perspective on research data management and sharing’ summarises the presentations Jamie and David gave. After this event, a group of researchers from the School of Biological Sciences and from the School of Clinical Medicine at the University of Cambridge were invited to put questions about the Wellcome Trust data management and sharing policy and the CRUK data sharing and preservation policy directly to David and Jamie.

This blog is a summary of the discussion, with questions thematically grouped. These questions will be added to the list of Frequently Asked Questions on the University’s Research Data Management Website.

In summary:

  • It is not recommended that researchers simply share a link and release the data when requested. Research data should be available, accessible and discoverable.
  • The first responsibility is to protect the study participants. The funders provide guidance documents on sharing of patient data. Ethics committees also provide advice and guidance on what data can be shared. In principle, patient data should be safeguarded, but this should not preclude sharing. There are models for managed access to data that allow personal/sensitive data to be shared for legitimate purposes in a safe and secure manner.
  • The funders do not want to prevent new collaborations. When sharing data they recommend data generators provide a statement in the description of the data that they are willing to collaborate.
  • It is recognised that it is often appropriate for researchers to have a defined period of exclusive access to the data they generate, but this should be determined by disciplinary norms. Any exemptions or delays have to be justified on a case by case basis, ideally at the outset of the project.
  • The funders expect research data that supports publications to be made accessible and publications should have a clear statement explaining how to access the underlying research data.
  • However, researchers need to decide what is useful to be shared, considering the effort of preparing the data for deposit and of sharing the data. If nobody is going to use the data, sharing is not a good use of researchers’ time.
  • Discipline-specific data repositories, where these exist, are recommended preferentially over general purpose or institutional repositories.
  • Biosharing is an excellent resource with references to discipline-specific metadata schemas.
  • The salary of a staff member whose role is to manage data is an eligible cost on a grant.
  • There are no funds for sharing data from old projects, although there are exceptions on a case by case basis.
  • The funders are considering monitoring data management plans but their current primary goal is to encourage people to think about data management and sharing from the very start of the project.

Access to research data

Q: Are funders benefiting from the expertise of organisations such as UK Data Service when providing advice on data access? UK Data Service has been managing controlled access to research data for a long time and it would be advantageous to benefit from their expertise.

A: Yes, we are in discussion with the UK Data Service. We are also working with the UK Data Service to consider whether it might be appropriate for hosting data from other disciplines beyond social science. We also believe there is significant scope to share lessons and best practices for data sharing between the social and biomedical sciences.

Q: Could we just share research data only when asked for it?

A: This is not a recommended solution: research data should be available, accessible and discoverable. Data access controls, and the criteria that must be met for access to be granted, have to be made clear in the metadata description.

Q: I have patient data which has to be stored in a secure space. I always say in my data management plan that I cannot share my data. I would like to get ethical guidance which will explain to me how to share these data. It is very easy to say that data cannot be shared. I would like to share my data, but I would like to do it properly. With patient data it is extremely difficult, especially with genomics data, where there is a risk that patients can be identified.

A: Sharing of clinical data is not easy. Both Wellcome Trust and Cancer Research UK are helping to drive a great deal of work which is considering access and governance models through which sensitive patient data can be made available for research in a safe, secure and trusted manner. They provide guidance documents on sharing of patient data. Safety of patients and patients’ data is important. Ethics committees also provide advice and guidance on what data can be shared.

Q: What about sharing of physical materials? I have received a request to share a culture derived from patient material, but the Ethics Committee did not approve sharing of this material. What shall I do?

A (Peter Hedges, Head of Research Office): If your ethical approval says that you cannot share that material, you cannot share it. Your first responsibility is to protect your study participants.

Q: If I share my data via a repository and people can simply download my data, I can no longer collaborate with them to work on the data and I have lost the possibility of getting credit for my data.

A: Nobody wants to prevent new collaborations from happening. A solution might be to add a statement that you are willing to collaborate in the description of your data. Your data requestor might be interested in collaborating, simply because you know your data the best. Funders also expect that the data re-used by others is appropriately acknowledged/cited, and they want to ensure that due credit results from the secondary use of data.

Quality control of research data

Q: If researchers start sharing unpublished research data via data repositories there is a risk that these data will not be of good quality as they will not be peer-reviewed.

A: Authors of unpublished data can simply state in the data description that the item was not peer-reviewed. If applicable, funders also encourage reciprocal links between publications and supporting research data.
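As a rough illustration of the statements suggested in the answers above (access conditions, willingness to collaborate, peer-review status and a link to any related publication), a data description could be drafted as a simple record. This is only a sketch: the field names and the `missing_fields` helper are hypothetical and are not taken from any funder's or repository's actual schema.

```python
# Hypothetical data description; field names are illustrative assumptions,
# not a real repository or funder metadata schema.
dataset_description = {
    "title": "Example dataset (placeholder)",
    "peer_reviewed": False,  # state openly that the data were not peer-reviewed
    "access": {
        "level": "managed",
        "conditions": "Placeholder: criteria a requester must meet",
    },
    "collaboration_statement": "The data generators are willing to collaborate.",
    "related_publication": None,  # add a link once a publication exists
}

# Fields this sketch treats as required (an assumption for illustration).
REQUIRED_FIELDS = ["title", "peer_reviewed", "access", "collaboration_statement"]


def missing_fields(record: dict) -> list:
    """Return the required field names that are absent from the record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


print(missing_fields(dataset_description))
```

A check like this could be run before deposit to confirm that none of the recommended statements has been left out.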

What data needs to be shared and when?

Q: If researchers start to share everything there will be a lot of useless data available in data repositories. How can we prevent a flood of useless data on the internet?

A: We would like researchers to decide what data is useful to be shared. If nobody is likely to use the data, sharing is not a good use of a researcher’s time. Repositories also need to make decisions over what is worth keeping over time.

Comment (Peter Hedges, Head of Research Office): Research Councils UK (RCUK) focuses on research data supporting publications and this is what we recommend to researchers: share research data which underpins publications.

Q: Are we expected to share large datasets resulting from bigger projects (databases, long-term datasets) or data supporting individual publications?

A: We expect research data that supports individual publications to be made available with a hyperlink to the data. We also want researchers to consider and plan more broadly how they can make data assets of value resulting from our funded research available to others in a timely and appropriate manner.

Q: What about images? Is it useful to share them? It involves a lot of time to organise images. Besides, a single confocal picture with multiple layers is 1 GB. In theory it is possible to share all raw data and all raw images, but who would want to look at them? 10 figures of 10 images is already 100 GB of data. Where would I store all these images, who is going to use these data and how am I going to pay for this?

A: The effort of preparing the data for deposit and of sharing the data should be proportionate to the potential benefits of data sharing. Researchers need to decide what is useful to be shared, following disciplinary best practices and norms (recognising that disciplines are in very different places in terms of defining these).
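The storage arithmetic behind the question can be made explicit with a short sketch. The figures used (1 GB per multi-layer confocal image, 10 figures of 10 images each) come from the question itself; the function name is an assumption for illustration only.

```python
def estimate_storage_gb(figures: int, images_per_figure: int, gb_per_image: float) -> float:
    """Return the total raw image volume, in gigabytes, for one publication."""
    return figures * images_per_figure * gb_per_image


# Numbers quoted in the question: 10 figures x 10 images x 1 GB each.
total_gb = estimate_storage_gb(figures=10, images_per_figure=10, gb_per_image=1.0)
print(f"Raw images for one publication: {total_gb:.0f} GB")
```

Changing the inputs to match a particular project gives a quick estimate to weigh against the likely benefit of sharing the raw images.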

Q: Is there a set amount of time for exclusive use of research data?

A: Researchers should adhere to disciplinary norms. For example, in genomics, research data is frequently shared before publication (sometimes under a publication moratorium which protects the data generator’s right to first publication). Any exemptions or delays have to be justified on a case by case basis.

Comment (Peter Hedges, Head of Research Office): Research is competitive. Sometimes it might be useful for researchers to know who wants access to the data and what they need it for.

Cost of data sharing

Q: Can I ask in my grant for a staff member to help me with data management?

A: Yes, this is an eligible cost on grant applications: you can request a salary to support a research data manager for your research project, as long as it is justified.

Q: According to CRUK policy, costs for data sharing can be budgeted in grant applications only from August 2015. What about research data from older projects, when these costs were not eligible in grant applications? Is there any transition fund available to pay for this?

A: Unfortunately, there are no additional funds to pay for these costs. Researchers who have older datasets that might be of significant value to the community should contact CRUK – all requests for support will be considered on a case by case basis.

Q: Wellcome Trust encourages data sharing and data re-use, but does not allow for costs of long-term data preservation to be budgeted in grant applications. This does not make sense to me.

A: We are still reviewing our policy on costs of data management and sharing and we might be revisiting this issue – however, it is problematic for us to consider estimated costs for preservation that extend beyond the lifetime of the grant. Our understanding is that costs of long-term data preservation are often less significant than costs of initial data ingestion by the repository (and we will cover ingestion costs).

Q: Who is then going to pay for the long-term data storage?

A: Wellcome Trust funds some discipline-specific repositories, but this is done jointly with other funders. We support bigger undertakings and we are also working with partners to develop platforms for data sharing and discoverability in some priority areas (notably clinical trials). Cancer Research UK pays for some long-term storage options, if these are justified for particular needs of the project. These decisions are made on a case by case basis, depending on how the costs are justified and whether these are directly related to the scientific value of the project.

Metadata standards

Q: At the moment there are many general purpose and institutional repositories, which are not well structured. To support efficient re-use of data it is important to use structured data repositories and adhere to metadata standards. What are funders’ opinions about this?

A: Wherever possible, discipline-specific data repositories should be used preferentially over general purpose or institutional repositories. Adherence to discipline-specific metadata standards is also encouraged. It has to be acknowledged that development of well-structured data repositories is very resource-intensive and not all disciplines have good quality repositories to support them. For example, it took over 30 years to adopt unified metadata standards at the Cambridge Crystallographic Data Centre. The time needed to properly solve problems should never be underestimated.

Q: Are funders planning to provide researchers with a list of recommended schemas for metadata?

A: Biosharing is an excellent resource with references to discipline-specific metadata schemas. It is a useful suggestion to include a reference to Biosharing on our website.

Policy implementation

Q: Are you planning to monitor researchers’ adherence to data management plans? For example, the BBSRC does not have the manpower to check all data management plans manually, but they are planning to create a system to check automatically whether data has been uploaded.

A: We are considering this. At the moment we require data management plans with the primary goal of encouraging people to think about data management and sharing from the very start of the project.

Published 5 February 2016
Written by Dr Marta Teperek, verified by David Carr and Jamie Enoch
Creative Commons License