All posts by Office of Scholarly Communication

Milestone – 10,000th article processed by OA Service

The Open Access Service at Cambridge has received its 10,000th Open Access submission – highlighting its commitment to making research freely available to anybody who wants to access it, without publisher paywalls or expensive journal subscriptions.

The 10,000th submission, reporting on the impact of eating a Mediterranean diet on the risk of developing cardiovascular disease in a UK population, was deposited by Signe Wulund at the MRC Epidemiology Unit, on behalf of Dr Nita Forouhi, Programme Leader in Nutritional Epidemiology at the MRC Epidemiology Unit, and several co-authors.

The Open Access movement has been growing in strength in academia for many years, and it is increasingly being mandated by funding bodies and government.

Dr Forouhi said: “Through open access our research can reach a worldwide audience. It would be a huge pity if interested researchers, practitioners or policy makers could not read about new research, such as our latest findings on the link between the Mediterranean diet and cardiovascular health in a non-Mediterranean setting, because of something as simple as lacking a journal subscription.

“Open access enables wider dissemination of research findings, and in turn, facilitates better research and evidence-based policy and clinical practice.”

The Cambridge Open Access Service was established within the University Library in 2013 in response to Research Councils UK (RCUK) making Open Access mandatory for anyone accepting their funding. Many other major funders, including the Wellcome Trust, Cancer Research UK and the British Heart Foundation, have similar policies.

In 2014, the Higher Education Funding Council for England announced that Open Access would be compulsory for any article included in the next Research Excellence Framework (REF) exercise. This policy came into force on April 1, 2016, effectively meaning that all research in UK institutions now has to be made freely available.

Since its inception in 2013, the Open Access service has processed 10,000 manuscripts across all University faculties and departments, and has worked with 3,000 different members of staff. Of these papers, 6,000 were covered by the HEFCE open access policy, 4,000 acknowledged RCUK funding and 1,900 acknowledged COAF funding (many papers fall into multiple categories, and some into none). More than £5.4 million of Open Access grants from funding bodies have also been distributed.

Meeting these requirements is a major task for the University, and one it has tried to make as simple as possible for researchers. Authors are simply required to upload their manuscript to www.openaccess.cam.ac.uk when it’s accepted for publication. The Open Access team then advise them on what they need to do to comply with funder requirements and on their eligibility for any funding body grants, and handle depositing the article into Apollo, the University’s institutional repository.

Ten thousand manuscripts have now been received in this way, and the vast majority of them have been able to be made Open Access, free for anyone who wants to read and benefit from them.

The 10,000th article was: ‘Prospective association of the Mediterranean diet with cardiovascular disease incidence and mortality and its population impact in a non-Mediterranean population: the EPIC-Norfolk Study’ in BMC Medicine. [DOI:10.1186/s12916-016-0677-4]

The Open Access team at the University of Cambridge is part of the Office of Scholarly Communication (OSC), within the University Library. As well as assisting researchers with Open Access and Open Data compliance, it advises on scholarly communication tools, techniques, policies and practices, and provides training.

This story originally appeared on the University of Cambridge Research news pages.

Published 05 October 2016
Written by Dr Philip Boyes
Creative Commons License

Taking a Principled stance – the Scholarly Commons

It only rains about 10 days a year in San Diego. And Tuesday was one of them. In a rooftop room on the UCSD campus, a group had gathered for the FORCE11 Scholarly Commons workshop. The workshop brought together members of the Scholarly Commons working group, who hail from around the world and come from the broad scholarly commons. The Scholarly Commons is an idea to help define the future of research communication. The goal is to promote the best research and scholarship possible through rapid and wide dissemination to all who need or want it.

FORCE stands for the Future of Research Communications and eScholarship and is an organisation (or community) open to anyone interested in these issues. The group consisted of researchers from multiple disciplines, communicators, programmers, and a couple of librarians. This is the unusual and powerful thing about FORCE11 – the diversity of its members. Someone actually remarked: ‘you know, there probably should be a few more librarians here’ which is something you don’t often hear at meetings about open access issues. Usually librarians are delighted if a real live researcher turns up.

We were meeting to discuss the draft of 18 Principles of the Commons – an attempt to define what the community considers the attributes and behaviours of a person who is fully participating in research. The Principles are broadly separated into four major themes of being Open, Equitable, Sustainable and Research & Culture Driven.

FORCE11 works openly and tries to be as accessible as possible so there were full and open notes being collaboratively taken and the Twitter hashtag was #futurecommons.

The workshop was very hands-on, and expertly moderated by Jeroen Bosman and Bianca Kramer, who are the power behind the excellent 101 Innovations in Scholarly Communication project. As their ‘wheel’ of tools available across the research life stages and through time demonstrates, it is becoming increasingly difficult to navigate the new research space. Indeed that is part of the rationale behind this Scholarly Commons project. It is an attempt to take stock and make sense of what we, the community, want to see in an open and accessible future.

Despite having fewer than 40 people we managed to have multiple activities running concurrently with several ‘unworkshops’. Everything was fed back into the group, and there was a very broad range of discussions, agreements and ideas. To prevent this blog being a tome, I am only going to cover here a couple of areas that were discussed.

Standing on the shoulders of giants

Due to a flight delay I was only able to catch the end of the Sunday evening welcome reception, where we were asked to reflect on the 18 draft Principles plastered on the walls and decide which we agreed with and which we did not (or had an issue with). As I scanned through I was struck by their overall similarity to Robert Merton’s 1942 publication, The Normative Structure of Science, where he proposed that science operated according to four ‘norms’: Universalism, Communalism, Disinterestedness and Organised Skepticism.

I mentioned this in an early discussion and was somewhat chuffed that the group really did take this on board – to the extent that one unworkshop group worked on updating the norms to reflect today’s situation.

As an aside – the challenge with having people coming together from multiple research areas is everyone brings their a priori biases with them. People tend to see the problem through their own lens, so have different ways of approaching the problem. For the group to agree that this perspective was a good one was personally very validating.

Considering outreach as part of the research lifecycle

In the first unworkshop I joined we discussed how research-centred the Principles are – they did not consider the importance of outreach. Given the impenetrable nature of the language in many academic papers, we agreed that making something Open Access facilitates outreach but is not outreach itself.

The discussion moved to ideas about what researchers could do to help with outreach – even if they themselves did not want (or were unable) to do it. These are fairly simple, and include providing supplementary material that is accessible in terms of the descriptive language used (no jargon), potentially providing the information in a language other than English, and ensuring the license under which it is made available is open.

We proposed that the Commons should facilitate outreach and have the outreach in mind even if the researcher themselves does not generate the outreach. There had been a comment earlier in the Equity discussion that noted “Each part of the research cycle is equally valid and none should not be preferenced over the other.” Our discussion concluded that outreach (for the lay public) should be considered to be part of the research process and equally valued.

It should be noted that we are not discussing paper-related activities here. Making the paper open access or tweeting a link to the paper doesn’t count. This is about sharing the information in an understandable manner outside the Academy.

Tool mapping

The workshop, as mentioned, was very hands-on. By that I mean we did several ‘craft activities’ involving dots, glue, sticky tape and scissors. One of these activities involved ranking various tools for research against the four themes of the Principles, deciding whether they were in alignment with them (green), in opposition with them (red) or in-between (yellow).

We then placed these assessments on the windows under the part of the research lifecycle they related to, and ordered them. The most Principle-friendly tools were up high, and the least down low.

We then did an activity where we tried to trace the path of our own discipline in terms of the tools our disciplines tended to use. This exercise was an attempt to see if there were any discernible patterns about where some disciplines tend to align or otherwise with the Principles. While the sample size for each discipline was too small to really come to any conclusions, this exercise did open up ideas for a way of disseminating the Principles.

The Principles as an Innovation

This is where another of my disciplinary perspectives comes into play. If we accept that the Principles are themselves an ‘innovation’ – in that they are “an idea, practice, or object that is perceived as new by an individual or other unit of adoption” – then we can look to Everett Rogers’ Diffusion of Innovations, first published in 1962 and now in its 5th edition. You might not have heard of him but you know about his work – Rogers was the person who coined the idea of ‘early adopters’, ‘late adopters’ and ‘laggards’.

Amongst lots of interesting insights about why people adopt new ideas, Rogers came up with five ways to evaluate an innovation which will determine the success or otherwise of its adoption. These are judged as a whole and are interrelated:

  • Relative advantage – the perceived efficiencies gained by the innovation relative to current tools or procedures
  • Compatibility with the pre-existing system
  • Complexity or difficulty to learn (it needs to be easy)
  • Trialability or testability without risking the current system
  • Observed effects.

It is the second point which is the interesting one here – ‘Compatibility with the pre-existing system’. The reason why this is relevant is we are not talking about one system when we discuss scholarship – there are a myriad of systems. There is no ‘one solution’. If we are to try and implement something like the Principles across the academy, we will need to do it along disciplinary lines. (Disclosure – this happens to be the conclusion of my 2008 PhD thesis on the adoption of open access across disciplines).

Disciplinary dissemination

This leads us to the question of audiences for the Principles. Ideally we would have institutions signing up to them, pledging that they will work with their research community to work in this manner. But this is unrealistic currently due to the diverse nature of research institutions. There might, however, be a way to have funders sign up, because often funding is given within disciplinary constraints. This is doubly the case because funders (in the UK, Australia and the US at least) are increasingly using an ‘Impact narrative’ and the Principles offer a way to practically identify and reward impact behaviour.

And we are not coming from a standing start. We can build on the work done by Jeroen Bosman and Bianca Kramer in their 101 Innovations in Scholarly Communication project. There were over 20,000 responses to their survey of innovation use and this allows a detailed mapping of disciplinary behaviours. If we then further map those findings against an assessment of the research tools being used at a disciplinary level and whether they are aligned with the Principles, we should be able to see which disciplinary areas are already working in the Principled way. It is the funders of these disciplines that we should approach first to try and gain early adoption of the Principles. This work would become a checklist that can reward people for the behaviours that they are already doing in this space.

A project like this would in turn open up some questions about what we need to do at a disciplinary level to help that community become more aligned with the Principles. These may require a number of approaches – Do they have the tools that work for them or do these need to be developed? Is there a cultural reason why this discipline is not engaging? In answering these questions we come up with the answer to the question: What does a Scholarly Commons researcher look like in this discipline? Until we have some evidence of where these areas are, we are effectively stabbing in the dark.

Making this happen now

In a different unworkshop we talked about how the nature of the Principles themselves went against the idea of being inclusive because we are potentially creating a binary situation – either you are following the Principles or not. What we really need to do, we agreed, is not to reject people for acting in ways that are not totally in line with the Principles, but rather to reward behaviour that supports the Principles.

In order to facilitate this, we designed a series of ‘Decision Trees’ to help researchers be as open as they can. This is a recognition that researchers are working within a complex ecosystem. With all the will in the world, if there is not an Open Access journal available to you in your field, you cannot publish in one.

The easiest part of the research lifecycle to tackle was publishing, in terms of choosing a publication outlet. The decision tree allows for people who can neither publish in an Open Access journal nor afford to pay for hybrid (not something I personally recommend anyway) to still be ‘Principled’ by putting a copy of their work in a repository.
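For readers who think in code, the publishing tree described above can be sketched roughly as follows. This is only an illustration: the function name, its parameters and the order of the branches are my assumptions, not a transcription of the tree we drew at the workshop.

```python
def publishing_route(oa_journal_available: bool, can_pay_for_hybrid: bool) -> str:
    """Suggest a 'Principled' publishing route (hypothetical helper).

    The branch order is an assumption made for illustration only.
    """
    if oa_journal_available:
        return "publish in an Open Access journal"
    if can_pay_for_hybrid:
        return "pay for the hybrid option"
    # No OA journal in the field and no funds for hybrid:
    # a repository copy still counts as being 'Principled'.
    return "deposit a copy in a repository"


print(publishing_route(oa_journal_available=False, can_pay_for_hybrid=False))
```

The point of the sketch is that every branch ends in an action the researcher can actually take, rather than in a pass or fail verdict.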

Our discussion about data was more complex. For a start, there is a question about whether the data is digital or not. As we discussed it, our draft tree became incredibly complex so we created two separate flows. The Data 1 decision tree says to someone who has analogue data and no funds to digitise that, as long as they put some information in their paper about how to contact them for the supporting data, they have met the spirit of the Principles to the best of their ability.

While we know the gold standard for data sharing is to have the data (with well-defined metadata) available openly in a non-proprietary repository with a DOI, for various reasons this is not always possible. We should not sanction a researcher because they are unable to meet that (very high) standard. The Data 2 tree shows that data that is in a repository under embargo without a DOI is discoverable in a way it would not be if it were in a desk drawer – so that is, again, within the spirit of the Principles. We need to consider the ‘close enough’ option as being a valid one, at least in the implementation stage of the Principles.
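The two data flows can be sketched in the same spirit. Again this is a rough illustration under stated assumptions: the Dataset fields and the function name are hypothetical, and the two outcomes marked as assumptions in the comments are not taken from the workshop trees.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a dataset, for illustration only."""
    digital: bool
    funds_to_digitise: bool = False
    in_repository: bool = False
    open_with_doi: bool = False


def data_route(d: Dataset) -> str:
    # Data 1: analogue data.
    if not d.digital:
        if d.funds_to_digitise:
            # Assumption: digitised data then follows the digital flow.
            return "digitise the data, then follow the digital flow"
        return "say in the paper how to contact you for the supporting data"
    # Data 2: digital data.
    if d.in_repository and d.open_with_doi:
        return "gold standard: openly available in a repository with a DOI"
    if d.in_repository:
        return "close enough: discoverable in a repository, even under embargo without a DOI"
    # Assumption: data in a desk drawer should be moved somewhere discoverable.
    return "deposit the data in a repository so it is discoverable"


print(data_route(Dataset(digital=False)))
```

Note that the ‘close enough’ branch returns a positive outcome, which is the behaviour we agreed the Principles should reward.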

We agreed that in some areas of the research lifecycle a list of tools that could help would be of more use than a decision tree. Time constraints meant there are a couple of areas of the lifecycle which still need consideration (and we need to do some decision tree design work!), but generally the group agreed that this was probably quite useful.

Conclusions

When it comes to the Principles themselves, we are still working on it. We did however agree that we thought the Principles were something worth doing, and that they were more or less something we can start working with (and on – they are likely to be dynamic). One suggestion was that we call them Scholarly Commons Principles 1.0 – a reference to this being the first version of possibly many. There are plans for several subgroups to pitch for funding to do some deeper work in some areas. So it is an ongoing project, but a substantial one.

There are some troopers in the Scholarly Communication community. Several people at our workshop had ‘done the double’ – attending the SciDataCon 2016 conference and associated meetings over eight days in Denver last week and then coming to this event. The gruelling pace was starting to show by the end of the last day of our workshop.

You know you have been on a very short visit when you fly back with the same in-flight crew as your outward bound journey. One of them even recognised me and commented on how quickly I was returning. So while the trip was an exhausting few days, it was productive and worthwhile. And it was really nice to smell eucalypt trees (rather bizarrely) and do laps in an outdoor pool – things I have not done since moving to the UK.

Published 22 September 2016
Written by Dr Danny Kingsley

Cambridge University spend on Open Access 2009-2016

Today is the deadline for those universities in receipt of an RCUK grant to submit their reports on the spend. We have just submitted the Cambridge University 2015-2016 report to the RCUK and have also made it available as a dataset in our repository.

Compliance

Cambridge had an estimated overall compliance rate of 76%, with 46% of all RCUK funded papers available through the gold route and 30% available through the green route.

The RCUK Open Access Policy indicates that at the end of the fifth transition year of the policy (March 2018) they expect 75% of Open Access papers from the research they fund will be delivered through immediate, unrestricted, on‐line access with maximum opportunities for re‐use (‘gold’). Because Cambridge takes the position that if there is a green option that is compliant we do not pay for gold, our gold compliance number is below this, although our overall compliance level is higher, at 76%.
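As a quick check on how these percentages fit together, here is a minimal sketch. The variable names are ours; the figures are the ones quoted above.

```python
# Shares of RCUK funded papers, as reported above (percent).
gold_share = 46
green_share = 30

# RCUK's expectation for the gold route by the end of the transition period (percent).
gold_expectation = 75

overall_compliance = gold_share + green_share
gold_gap = gold_expectation - gold_share

print(f"Overall compliance: {overall_compliance}%")
print(f"Gold route is {gold_gap} percentage points below the {gold_expectation}% expectation")
```

The overall figure comes from adding the two routes together, which is why it can exceed 75% while the gold share alone sits well below it.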

Compliance caveats

The total number of publications arising from research council funding was estimated by searching Web of Science for papers published by the University of Cambridge in 2015, and then filtering by funding acknowledgements made to the research councils. The number of papers (articles, reviews and proceedings papers) returned for 2015 was 2080. This is almost certainly an underestimate of the total number of publications produced by the University of Cambridge with research council funding. The analysis was performed on 15/09/2016.

Expenditure

The APC spend we have reported is only counting papers submitted to the University of Cambridge Open Access Team between 1 August 2015 and 31 July 2016. The ‘OA grant spent’ numbers provided are the actual spend out of the finance system. The delay between submission of an article, the commitment of the funds and the subsequent publication and payment of the invoice means that we have paid for invoices during the reporting period that were submitted outside the reporting period. This meant reconciliation of the amounts was impossible. This funding discrepancy was given in ‘Non-staff costs’, and represents unallocated APC payments not described in the report (i.e. they were received before or after the reporting period but incurred on the current 2015-16 OA grant).

The breakdown of costs indicates we have spent 4.6% of the year’s allocation on staff costs and 5.1% on systems support. We noted in the report that the staff time paid for out of this allocation also supports the processing of Wellcome Trust APCs for which no support is provided by Wellcome Trust.

Headline numbers

  • In total Cambridge spent £1,288,090 of RCUK funds on APCs
  • 1786 articles identified as being RCUK funded were submitted to the Open Access Service, of which 890 required payment for RCUK*
  • 785 articles have been invoiced and paid
  • The average article cost was ~£2008

Caveats

The average article cost can be established by adding the RCUK fund expenditure to the COAF fund expenditure on co-funded articles (£288,162.28), which gives a complete expenditure for these 785 articles of £1,576,252.42. The actual average cost is therefore £2007.96.

* The Open Access Service also received many COAF only funded and unfunded papers during this period. The number of articles paid for does not include those made gold OA due to the Springer Compact as this would throw out the average APC value.
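The average quoted in the caveats can be reproduced with a few lines of Python. This is a minimal sketch using the totals reported above; the variable names are ours, and Decimal is used only to avoid floating-point rounding on currency.

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures quoted above.
total_spend = Decimal("1576252.42")  # RCUK plus COAF spend on the 785 articles (GBP)
articles_paid = 785                  # articles invoiced and paid

# Divide and round to the nearest penny.
average_apc = (total_spend / articles_paid).quantize(
    Decimal("0.01"), rounding=ROUND_HALF_UP
)

print(f"Average APC: £{average_apc}")
```

Running this gives £2007.96, matching the figure reported in the caveats.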

Observations

In our report on expenditure for 2014 the average article APC was £1891. This means there has been a 6% increase in Cambridge University’s average spend on an APC since then. It should be noted that of the journals for which we most frequently process APCs, Nature Communications is the second most popular. This journal has an APC of £3,780 including VAT.
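The percentage increase follows directly from the two averages. A minimal sketch, with variable names of our own choosing:

```python
average_2014 = 1891.00     # average APC from the 2014 report (GBP)
average_2015_16 = 2007.96  # average APC for 2015-16 (GBP)

increase_percent = (average_2015_16 - average_2014) / average_2014 * 100

print(f"Increase in average APC: {increase_percent:.1f}%")
```

This works out at a little over 6%, which is the rounded figure given above.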

Datasets on Cambridge APC spend 2009-2016

Cambridge released the information about its 2014 APC spend for RCUK and COAF in March last year and intended to do a similar report for the spend in 2015. However, a recent FOI request has prompted us to simply upload all of our data on APC spend into our repository for complete transparency. The list of datasets now available is below.

1. Report presented to Research Councils UK for article processing charges managed by the University of Cambridge, 2014-2015

2. Report presented to the Charity Open Access Fund for article processing charges managed by the University of Cambridge, 2015-2016

3. Report presented to the Charity Open Access Fund for article processing charges managed by the University of Cambridge, 2014-2015

4. Report presented to Jisc for article processing charges managed by the University of Cambridge, 2014

5. Open access publication data for the management of the Higher Education Funding Council for England, Research Councils UK, Charities Open Access Fund and Wellcome Trust open access policies at the University of Cambridge, 2014-2016

Note: In October 2014 we started using a new system for recording submissions. This has allowed us to obtain more detailed information and allows multiple users to interact with the system. Until December 2015 our financial information was recorded in the spreadsheet below. There is overlap between datasets 5 and 6 for the period 24 October and 31 December 2015. As of January 2016, all data is being collected in the one place.

6. Open access publication data for the management of Research Councils UK, Charities Open Access Fund and Wellcome Trust article processing charges at the Office of Scholarly Communication, 2013-2015

Note: In 2013 the Open Access Service began and took responsibility for the new RCUK fund, and was transferred responsibility for the new Charities Open Access Fund (COAF). At this time the team were recording when an article was fully Wellcome Trust funded, even though the Wellcome Trust funding is a component of COAF.

7. Open access publication data for the management of Wellcome Trust article processing charges from the School of Biological Sciences, 2009-2014

Note: Management of the funds to support open access publishing has changed over the past seven years. Before the RCUK open access policy came into force in 2013, the Wellcome Trust funds were managed by the School of Biological Sciences.

Published 14 September 2016
Written by Dr Danny Kingsley & Dr Arthur Smith

Making the connection: research data network workshop

During International Data Week 2016, the Office of Scholarly Communication is celebrating with a series of blog posts about data. The first post was a summary of an event we held in July. This post looks at the challenges associated with financially supporting RDM training.

Following the success of hosting the Data Dialogue: Barriers to Sharing event in July we were delighted to welcome the Research Data Management (RDM) community to Cambridge for the second Jisc research data network workshop. The event was held in Corpus Christi College with meals held in the historical dining room. (Image: Corpus Christi)

RDM services in the UK are maturing and efforts are increasingly focused on connecting disparate systems, standardising practices and making platforms more usable for researchers. This is also reflected in the recent Concordat on Research Data which links the existing statements from funders and government, providing a more unified message for researchers.

The practical work of connecting the different systems involved in RDM is being led by the Jisc Research Data Shared Services project which aims to share the cost of developing services across the UK Higher Education sector. As one of the pilot institutions we were keen to see what progress has been made and find out how the first test systems will work. On a personal note it was great to see that the pilot will attempt to address much of the functionality researchers request but that we are currently unable to fully provide, including detailed reporting on research data, links between the repository and other systems, and a more dynamic data display.

Context for these attempts to link, standardise and improve RDM systems was provided in the excellent keynote by Dr Danny Kingsley, head of the Office of Scholarly Communication at Cambridge, reminding us about the broader need to overhaul the reward systems in scholarly communications. Danny drew on the Open Research blogposts published over the summer to highlight some of the key problems in scholarly communications: hyperauthorship, peer review, flawed reward systems, and, most relevantly for data, replication and retraction. Sharing data will alleviate some of these issues but, as Danny pointed out, this will frequently not be possible unless data has been appropriately managed across the research lifecycle. So whilst trying to standardise metadata profiles may seem irrelevant to many researchers it is all part of this wider movement to reform scholarly communication.

Making metadata work

Metadata models will underpin any attempts to connect repositories, preservation systems, Current Research Information Systems (CRIS), and any other systems dealing with research data. Metadata presents a major challenge both in terms of capturing the wide variety of disciplinary models and needs, and in persuading researchers to provide enough metadata to make preservation possible without putting them off sharing their research data. Dom Fripp and Nicky Ferguson are working on developing a core metadata profile for the UK Research Data Discovery Service. They spoke about their work on developing a community-driven metadata standard to address these problems. For those interested (and GitHub-literate) the project is available here.

They are drawing on national and international standards, such as the Portland Common Data Model, trying to build on existing work to create a standard which will work for the Shared Services model. The proposed standard will have gold, silver and bronze levels of metadata and will attempt to reward researchers for providing more metadata. This is particularly important as the evidence from Dom and Nicky’s discussion with researchers is that many researchers want others to provide lots of metadata but are reluctant to do the same themselves.

We have had some success with researchers filling in voluntary metadata fields for our repository, Apollo, but this seems to depend to a large extent on how aware researchers are of the role of metadata, something which chimes with Dom and Nicky’s findings. Those creating metadata are often unaware of the implications of how they fill in fields, so creating consistency across teams, let alone disciplines and institutions, can be a struggle. Any Cambridge researchers who wish to contribute to this metadata standard can sign up to a workshop with Jisc in Cambridge on 3rd October.

Planning for the long-term

A shared metadata standard will assist with connecting systems and reducing researchers’ workload, but if replicability, a key problem in scholarly communications, is going to be possible, digital preservation of research data needs to be addressed. Jenny Mitcham from the University of York presented the work she has been undertaking alongside colleagues from the University of Hull on using Archivematica for preserving research data and linking it to pre-existing systems (more information can be found on their blog).

Jenny highlighted the difficulties they encountered getting timely engagement from both internal stakeholders and external contractors, as well as linking multiple systems with different data models, again underlining the need for high quality and interoperable metadata. Despite these difficulties they have made progress on linking these systems and in the process have been able to look into the wide variety of file formats currently in use at York. This has led to conversations with The National Archives about improving the coverage of research file formats in PRONOM (a registry of file formats for preservation purposes), work which will be extremely useful for the Shared Services pilot.

In many ways the project at York and Hull felt like a precursor to the Shared Services pilot; highlighting both the potential problems in working with a wide range of stakeholders and systems, as well as the massive benefits possible from pooling our collective knowledge and resources to tackle the technical challenges which remain in RDM.

Published 14 September 2016
Written by Rosie Higman

Beyond compliance – dialogue on barriers to data sharing

Welcome to International Data Week. The Office of Scholarly Communication is celebrating with a series of blog posts about data, starting with a summary of an event we held in July.

On 29 July 2016 the Cambridge Research Data Team joined forces with the Science and Engineering South Consortium to organise a one day conference at Murray Edwards College to gather researchers and practitioners for a discussion about the existing barriers to data sharing. The whole aim of the event was to move beyond compliance with funders’ policies. We hoped that the community was ready to change the focus of data sharing discussions from whether it is worth sharing or not towards more mature discussions about the benefits and limitations of data sharing.

What are the barriers?

So what are the barriers to effective sharing of research data? There were three main barriers identified, all somewhat related to each other: poorly described data, insufficient data discoverability and difficulties with sharing personal/sensitive data. All of these problems arise from the fact that research data is not always shared in accordance with the FAIR principles: that data is Findable, Accessible, Interoperable and Re-usable.

Poorly described data

The event started with an inspiring keynote talk from Dr Nicole Janz from the Department of Sociology at the University of Cambridge: “Transparency in Social Science Research & Teaching”. Nicole regularly runs replication workshops at Cambridge, where students select published research papers and work hard for several weeks to reproduce the published findings. The purpose of these workshops is to allow students to learn by experience what is important in making their own work transparent and reproducible to others.

Very often students fail to reproduce the results. Frequently, the reasons for failures are insufficient methodology available, or simply the fact that key datasets were not made available. Students learn that in order to make research reproducible, one not only needs to make the raw data files available, but that the data needs to be shared with the source code used to transform it and with written down methodology of the process, ideally in a README file. While doing replication studies, students also learn about the five selfish benefits of good data management and sharing: data disasters are avoided, it is easier to write up papers from well-managed data, transparent approach to sharing makes the work more convincing to reviewers, the continuity of research is possible and researchers can build their reputation for being transparent. As a tip for researchers, Nicole suggested always asking a colleague to try to reproduce the findings before submitting a paper for peer-review.

The problem of insufficient data description/availability was also discussed during the first case study talk by Dr Kai Ruggeri from the Department of Psychology, University of Cambridge. Kai reflected on his work on the assessment of happiness and wellbeing across many European countries, which was part of the ESRC Secondary Data Analysis Initiative. Kai re-iterated that missing data make the analysis complicated and sometimes prevent one from being able to make effective policy recommendations. Kai also stressed that frequently the choice of baseline for data analysis can affect the final results. Therefore, proper description of methodology and approaches taken is key for making research reproducible.

Insufficient data discoverability

We also heard several speakers describing problems with data discoverability. Fiona Nielsen founded Repositive – a platform for finding human genomic data. Fiona founded the platform out of frustration that genomic data was so difficult to find and access. The proliferation of data repositories has made it very hard for researchers to actually find what they need.

Fiona started by doing a quick poll among the audience: how do researchers look for data? It turned out that most researchers find data by doing a literature search or by googling for it. This is not surprising – there is no search engine enabling looking for information simultaneously across the multiple repositories where the data is available. To make it even more complicated, Fiona reported that in 2015 80 PB of human genomic data was generated. Unfortunately, only 0.5 PB of human genomic data was made available in a data repository.

So how can researchers find the other datasets, which are not made available in public repositories? Repositive is a platform harvesting metadata from several repositories hosting human genomic data and providing a search engine allowing researchers to simultaneously look for datasets shared in all of them. Additionally, researchers who cannot share their research data via a public repository (for example, due to lack of participants’ consent for sharing), can at least create a metadata record about the data – to let others know that the data exist and to provide them with information on data access procedure.
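The idea of harvesting metadata from several repositories and searching it in one place can be illustrated with a minimal sketch. This is not Repositive's actual implementation: the record fields, the repository names and the example records below are all invented for illustration, and the search is a simple keyword match over titles and descriptions. Note that a metadata-only record (for data that cannot be deposited publicly) is searchable in exactly the same way as records for openly available data.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DatasetRecord:
    """A harvested metadata record (hypothetical fields)."""
    repository: str   # where the record was harvested from
    title: str
    description: str
    access: str       # how a researcher can get to the data


def search(records: Iterable[DatasetRecord], query: str) -> List[DatasetRecord]:
    """Return records whose title or description contains every query term."""
    terms = query.lower().split()
    matches = []
    for record in records:
        haystack = f"{record.title} {record.description}".lower()
        if all(term in haystack for term in terms):
            matches.append(record)
    return matches


# Invented example records, standing in for metadata harvested from
# several repositories plus one metadata-only entry.
HARVESTED = [
    DatasetRecord("Repository A", "Exome sequencing of breast tumour samples",
                  "Human genomic data, 120 participants", "open download"),
    DatasetRecord("Repository B", "Whole genome sequencing, healthy cohort",
                  "Human genomic data from volunteers", "registration required"),
    DatasetRecord("metadata-only", "Tumour genome panel, clinical cohort",
                  "Data cannot be deposited publicly; no consent for sharing",
                  "contact the data access committee"),
]

for hit in search(HARVESTED, "tumour"):
    print(f"{hit.title} [{hit.repository}]: {hit.access}")
```

A single query here returns both the openly available dataset and the metadata-only record, which tells the searcher that the data exist and how to ask for access.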

The problem of data discoverability is however not only related to people’s awareness that datasets exist. Sometimes, especially in the case of complex biological data with a vast number of variables, it can be difficult to find the right information inside the dataset. In an excellent lightning talk, Julie Sullivan from the University of Cambridge described InterMine – a platform to make biological data easily searchable (‘mineable’). Anyone can simply upload their data onto the platform to make it searchable and discoverable. One example of the platform’s use is FlyMine – a database where researchers looking for results of experiments conducted on the fruit fly can easily find and share information.

Difficulties with sharing personal/sensitive data

The last barrier to sharing that we discussed was related to sharing personal/sensitive research data. This barrier is perhaps the most difficult one to overcome, but here again the conference participants came up with some excellent solutions. The first came from the keynote speech by Louise Corti, a talk with a very uplifting title: “Personal not painful: Practical and Motivating Experiences in Data Sharing”.

Louise based her talk on the long experience of the UK Data Service with providing managed access to data containing some forms of confidential/restricted information. Apart from being able to host datasets which can be made openly available, the UKDS can also provide two other types of access: safeguarded access, where data requestors need to register before downloading the data, and controlled access, where requests for data are considered on a case by case basis.

At the outset of the research project, researchers discuss their research proposals with the UKDS, including any potential limitations to data sharing. It is at this stage, at the outset of the research project, that the decision is made on the type of access that will be required for the data to be successfully shared. All processes of project management and data handling, such as data anonymisation and collection of informed consent forms from study participants, are then carried out in adherence to that decision. The UKDS also offers protocols clarifying what is going to happen with research data once they are deposited with the repository. The use of standard licences for sharing makes the governance of data access much more transparent and easy to understand, both from the perspective of data depositors and data re-users.

Louise stressed that transparency and willingness to discuss problems are key for mutual respect and understanding between data producers, data re-users and data curators. Sometimes unnecessary misunderstandings make data sharing difficult when it does not need to be. Louise mentioned that researchers often confuse ‘sensitive topic’ with ‘sensitive data’ and referred to a success case study where, by working directly with researchers, UKDS managed to share a dataset about sedation at the end of life. The subject of study was sensitive, but because the data was collected and managed with a view to sharing at the end of the project, the dataset itself was not sensitive and was suitable for sharing.

As Louise said “data sharing relies on trust that data curators will treat it ethically and with respect” and open communication is key to build and maintain this trust.

So did it work?

The purpose of this event was to engage the community in discussions about the existing limitations to data sharing. Did we succeed? Did we manage to engage the community? Judging by the fact that we received twenty high quality abstract applications from researchers across various disciplines for only five available case study speaking slots (it was so difficult to shortlist the top five!) and that the venue was full, with around eighty attendees from Cambridge and other institutions, I think that the objective was pretty well met.

Additionally, the panel discussion was led by researchers and involved fifty-eight active users on the Sli.do platform for questions to panellists. There were also questions asked outside of the Sli.do platform. So overall I feel that the event was a great success and it was truly fantastic to be part of it and to see the degree of participant involvement in data sharing.

Another observation is the great progress of the research community in Cambridge in the area of sharing: we have successfully moved away from discussions of whether research data is worth sharing to how to make data sharing more FAIR.

It seems that our intense advocacy, and the effort of speaking with over 1,800 academics from across the campus since January 2015, have paid off and we have indeed managed to build an engaged research data management community.

Published 12 September 2016
Written by Dr Marta Teperek
Creative Commons License

Could Open Research benefit Cambridge University researchers?

This blog is part of the recent series about Open Research and reports on a discussion with Cambridge researchers held on 8 June 2016 in the Department of Engineering. Extended notes from the meeting and slides are available at the Cambridge University Research Repository. This report is written by Lauren Cadwallader, Joanna Jasiewicz and Marta Teperek (listed alphabetically by surname).

At the Office of Scholarly Communication we have been thinking for a while about Open Research ideas and about moving beyond mere compliance with funders’ policies on Open Access and research data sharing. We thought that the time has come to ask our researchers what they thought about opening up the research process and sharing more: not only publications and research data, but also protocols, methods, source code, theses and all the other elements of research. Would they consider this beneficial?

Working together with researchers – democratic approach to problem-solving

To get an initial idea of the expectations of the research community in Cambridge, we organised an open discussion hosted at the Department of Engineering. Anyone registering was asked three questions:

  • What frustrates you about the research process as it is?
  • Could you propose a solution that could solve that problem?
  • Would you be willing to speak about your ideas publicly?

Interestingly, around fifty people registered to take part in the discussion and almost all of them contributed very thought-provoking problems and appealing solutions. To our surprise, half of the people expressed their willingness to speak publicly about their ideas. This shaped our discussion on the day.

So what do researchers think about Open Research? Not surprisingly, we started from an animated discussion about unfair reward systems in academia.

Flawed metrics

A well-worn complaint: the only thing that counts in academia is publication in a high impact journal. As a result, early career researchers have no motivation to share their data and to publish their work in open access journals, which can sometimes have lower impact factors. Additionally, metrics based on the whole journal do not reflect the importance of the research described: what is needed is article-level impact measurements. But it is difficult to solve this systemic problem because any new journal which wishes to introduce a new metrics system has no journal-level impact factor to start with, and therefore researchers do not want to publish in it.

Reproducibility crisis: where quantity, not quality, matters

Researchers also complained that the volume of produced research is higher and higher in terms of quantity and science seems to have entered an ‘era of quantity’. They raised the concern that the quantity matters more than the quality of research. Only the fast and loud research gets rewarded (because it is published in high impact factor journals), and the slow and careful seems to be valued less. Additionally, researchers are under pressure to publish and they often report what they want to see, and not what the data really shows. This approach has led to the reproducibility crisis and lack of trust among researchers.

Funders should promote and reward reproducible research

The participants had some good ideas for how to solve these problems. One of the most compelling suggestions was that perhaps funding should go not only to novel research (as it seems to be at the moment), but also to people who want to reproduce existing research. Additionally, reproducible research itself should be rewarded. Funders could offer grant renewal schemes for researchers whose research is reproducible.

Institutions should hire academics committed to open research

Another suggestion was to incentivise reward systems other than journal impact factor metrics. Someone proposed that institutions should not only teach the next generation of researchers how to do reproducible research, but also embed reproducibility of research as an employment criterion. Commitment to Open Research could be an essential requirement in job descriptions. Applicants could be asked at the recruitment stage how they achieve the goals of Open Research. LMU University in Munich has recently included such a statement in a job description for a professor of social psychology (see the original job description here and a commentary here).

Academia feeding money to exploitative publishers

Researchers were also frustrated by exploitative publishers. The big four publishers (Elsevier, Wiley, Springer and Informa) have a typical annual profit margin of 37%. Articles are donated to the publishers for free by the academics, and reviewed by other academics, also free of charge. Additionally, noted one of the participants, academics also act as journal editors, which they also do for free.

[*A comment about this statement was made on 15 August 2017 noting that some editors do get paid. While the participant’s comment stands as a record of what was said, we acknowledge that this is not an entirely accurate statement.]

In addition to this, publishers take away the copyright from the authors. As a possible solution to the latter, someone suggested that universities should adopt institutional licences on scholarly publishing (similar to the Harvard licence) which could protect the rights of their authors.

Pre-print services – the future of publishing?

Could Open Research aid the publishing crisis? Novel and more open ways of publishing can certainly add value to the process. The researchers discussed the benefits of sharing pre-print papers on platforms like arXiv and bioRxiv. These services allow people to share manuscripts before publication (or acceptance by a journal). In physics, maths and computational sciences it is common to upload manuscripts even before submitting the manuscript to a journal in order to get feedback from the community and have the chance to improve the manuscript.

bioRxiv, the life sciences equivalent of arXiv, started relatively recently. One of our researchers mentioned that he was initially worried that uploading manuscripts into bioRxiv might jeopardise his career as a young researcher. However, he then saw a pre-print manuscript describing research similar to his published on bioRxiv. He was shocked when he saw how the community helped to change that manuscript and to improve it. He has since shared a lot of his manuscripts on bioRxiv and as his colleague pointed out, this has ‘never hurt him’. On the contrary, he suggested that using pre-print services promotes one’s research: it allows the author to get the work into the community very early and to get feedback. Peers will always value good quality research, and that recognition among colleagues will eventually pay off for the author.

Additionally, someone from the audience suggested that publishing work in pre-print services provides a time-stamp for researchers and helps to ensure that ideas will not be scooped by anyone – researchers are free to share their research whenever they wish and as fast as they wish.

Publishers should invest money in improving science – wishful thinking?

It was also proposed that instead of exploiting academics, publishers could play an important role in improving the research process. One participant proposed a couple of simple mechanisms that could be implemented by publishers to improve the quality of research data shared:

  • Employment of in-house data experts: bioinformaticians or data scientists, who could judge whether supporting data is of a good enough quality
  • Ensure that there is at least one bioinformatician/data scientist on the reviewing panel for a paper
  • Ask for the data to be deposited in a public, discipline-specific repository, which would ensure quality control of the data and adherence to data standards.
  • Ask for the source code and detailed methods to be made available as well.

Quick win: minimum requirements for making shared data useful

A requirement that, as a minimum, three key elements should be made available with publications – the raw data, the source code and the methods – seems to be a quick win solution to make research data more re-usable. Raw data is necessary as it allows users to check if the data is of a good quality overall, while publishing code is important to re-run the analysis and methods need to be detailed enough to allow other researchers to understand all the processes involved in data processing. An excellent case study example comes from Daniel MacArthur who has described how to reproduce all the figures in his paper and has shared the supporting code as well.
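As a rough illustration of how this minimum requirement could be checked before a deposit is shared, the sketch below looks at the top-level entries of a deposit and reports which of the three elements appear to be missing. The accepted folder and file names (`data`, `code`, `README` and so on) are assumptions made for this example, not a standard proposed at the meeting, and the check says nothing about the quality of what is inside.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical naming conventions: each required element is considered
# present if any one of the listed top-level names exists in the deposit.
REQUIRED_ELEMENTS: Dict[str, Tuple[str, ...]] = {
    "raw data": ("data", "raw_data"),
    "source code": ("code", "src"),
    "methods": ("README", "README.md", "README.txt", "METHODS.md"),
}


def missing_elements(entry_names: Iterable[str]) -> List[str]:
    """Return the required elements not found among the deposit's entries."""
    present = set(entry_names)
    return [
        label
        for label, accepted_names in REQUIRED_ELEMENTS.items()
        if present.isdisjoint(accepted_names)
    ]


# Example: a deposit with data and a README but no code.
print(missing_elements(["data", "README.md"]))
```

For a deposit stored on disk, the entry names could be collected with something like `[p.name for p in pathlib.Path("my_deposit").iterdir()]`, where `my_deposit` is a placeholder folder name.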

It was also suggested that the Office of Scholarly Communication could implement some simple quality control measures to ensure that research data supporting publications is shared. As a minimum the Office could check the following:

  • Is there a data statement in the publication?
  • If there is a statement – is there a link to the data?
  • Does the link work?

This is definitely a very useful suggestion from our research community and in fact we have already taken this feedback on board and started checking for data citations in Cambridge publications.
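The three checks suggested by participants could be sketched in code along the following lines. This is an illustration only and not a description of the checks the Office actually runs: the phrases used to spot a data statement are guesses, the link is simply the first URL found from the statement onwards, and the link test is passed in as a function so that the logic can be tried without network access.

```python
import http.client
import re
from typing import Callable, Dict, Optional
from urllib.request import Request, urlopen

# Assumed wording of data statements; real publications vary widely.
STATEMENT_PATTERN = re.compile(
    r"data (availability|access) statement|supporting data|data (are|is) available",
    re.IGNORECASE,
)
LINK_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def link_resolves(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to the URL succeeds."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        return False


def check_data_sharing(
    text: str, link_checker: Callable[[str], bool] = link_resolves
) -> Dict[str, object]:
    """Answer the three questions: statement? link? does the link work?"""
    statement = STATEMENT_PATTERN.search(text)
    link: Optional[str] = None
    if statement:
        # Simplification: take the first URL at or after the statement.
        match = LINK_PATTERN.search(text, statement.start())
        if match:
            link = match.group(0).rstrip(".,;")
    return {
        "has_statement": statement is not None,
        "link": link,
        "link_works": link_checker(link) if link else False,
    }


example = "Data availability statement: data are at https://example.org/d1."
print(check_data_sharing(example, link_checker=lambda url: True))
```

A working link only shows that something is at the address; whether the data behind it is complete and usable still needs a human check.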

Shortage of skills: effective data sharing is not easy

The discussion about the importance of data sharing led to reflections that effective data sharing is not always easy. A bioinformatician complained that datasets that she had tried to re-use did not satisfy the criteria of reproducibility, nor re-usability. Most of the time there was not enough metadata available to successfully use the data. There is some data shared, there is the publication, but the description is insufficient to understand the whole research process: the miracle, or the big discovery, happens somewhere in the middle.

Open Research in practice: training required

Attendees agreed that it requires effort and skills to make research open, re-usable and discoverable by others. More training is needed to ensure that researchers are equipped with skills to allow them to properly use the internet to disseminate their research, as well as with skills allowing them to effectively manage their research data. It is clear that discipline-specific training and guidance around how to manage research data effectively and how to practise open research is desired by Cambridge researchers.

Nudging researchers towards better data management practice

Many researchers have heard or experienced first-hand horror stories of having to follow up on somebody else’s project, where it was not possible to make any sense of the research data due to lack of documentation and processes. This leads to a lot of time wasted in every research group. Research data need to be properly documented and maintained to ensure research integrity and research continuity. One easy way to nudge researchers towards better research data management practice could be formalised data management requirements. Perhaps as a minimum, every researcher should have a lab book to document research procedures.

The time is now: stop hypocrisy

Finally, there was a suggestion that everyone should take the lead in encouraging Open Research. The simplest way to start is to stop being what has been described as a hypocrite and submit articles to journals which are fully Open Access. This should be accompanied by making one’s reviews openly available whenever possible. All publications should be accompanied by supporting research data and researchers should ensure that they evaluate individual research papers and that their judgement is not biased by the impact factor of the journal.

Need for greater awareness and interest in publishing

One of the Open Access advocates present at the meeting stated that most researchers are completely unaware of who are the exploitative and ethical publishers and the differences between them. Researchers typically do not directly pay the exploitative publishers and are therefore not interested in looking at the bigger picture of sustainability of scholarly publishing. This is clearly an area where more training and advocacy can help and the Office of Scholarly Communication is actively involved in raising awareness of Open Access. However, while it is nice to preach in a room of converts, how do we get other researchers involved in Open Access? How should we reach out to those who can’t be bothered to come to a discussion like the one we had? This is the area where anyone who understands the benefits of Open Access has a job to do.

Next steps

We are extremely grateful to everyone who came to the event and shared their frustrations and ideas on how to solve some problems. We noted all the ideas on Post-it notes – the number of notes at the end of the discussion was impressive, an indication of how creative the participants were in just 90 minutes. It was a very productive meeting and we wish to thank all the participants for their time and effort.

We think that by acting collaboratively and supporting good ideas we can achieve a lot. As an inspiration, McGill University’s Montreal Neurological Institute and Hospital (the Neuro) in Canada have recently adopted a policy on Open Research: over the next five years all results, publications and data will be free to access by everyone.

Follow up

If you would like to host similar discussions directly in your departments/institutes, please get in touch with us at info@osc.cam.ac.uk – we would be delighted to come over and hear from researchers in your discipline.

In the meantime, if you have any additional ideas that you wish to contribute, please send them to us. Everyone who is interested in being informed about the progress here is encouraged to sign up for a mailing distribution list here.

Extended notes from the meeting and slides are available at the Cambridge University Research Repository. We are particularly grateful to Avazeh Ghanbarian, Corina Logan, Ralitsa Madsen, Jenny Molloy, Ross Mounce and Alasdair Russell (listed alphabetically by surname) for agreeing to publicly speak at the event.

Published 3 August 2016
Written by Lauren Cadwallader, Joanna Jasiewicz and Marta Teperek
Creative Commons License

The case for Open Research: solutions?

This series arguing the case for Open Research has to date looked at some of the issues in scholarly communication today. Hyperauthorship, HARKing, the reproducibility crisis and a surge in retractions all stem from the requirement that researchers publish in high impact journals. The series has also looked at the invalidity of the impact factor and issues with peer review.

This series is one of an increasing cacophony of calls to move away from this method of rewarding researchers. Richard Smith noted in a recent BMJ blog criticising the current publication in journal system: “The whole outdated enterprise is kept alive for one main reason: the fact that employers and funders of researchers assess researchers primarily by where they publish. It’s extraordinary to me and many others that the employers, mainly universities, outsource such an important function to an arbitrary and corrupt system.”

Universities need to open research to ensure academic integrity and adjust to support modern collaboration and scholarship tools, and begin rewarding people who have engaged in certain types of process rather than relying on traditional assessment schemes. This was the thrust of a talk in October last year, “Openness, integrity & supporting researchers”. If nothing else, this approach makes ‘nightmare scenarios’ less likely. As Prof Tom Cochrane said in the talk, the last thing an institution needs is to be on the front page because of a big fraud case.

What would happen if we started valuing and rewarding other parts of the research process? This final blog in the series looks at opening up research to increase transparency. The argument suggests we need to move beyond rewarding only the journal article, and reward not only other research outputs, such as data sets, but research productivity itself.

So, let’s look at how opening up research can address some of the issues raised in this series.

Rewarding study inception

In his presentation about HARKing (Hypothesising After the Results are Known) at FORCE2016, Eric Turner, Associate Professor at OHSU, suggested that what matters is the scientific question and methodological rigour. We should be emphasising not the study completion but study inception, before we can be biased by the results. It is already a requirement to post results of industry sponsored research in ClinicalTrials.gov – a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world. Turner argues we should be using it to see the existence of studies. He suggested reviews of protocols should happen without the results (but not include the methods section because this is written after the results are known).

There are some attempts to do this already. In 2013 Registered Reports was launched: “The philosophy of this approach is as old as the scientific method itself: If our aim is to advance knowledge then editorial decisions must be based on the rigour of the experimental design and likely replicability of the findings – and never on how the results looked in the end.” The proposal and process is described here. The guidelines for reviewers and authors are here, including the requirement to “upload their raw data and laboratory log to a free and publicly accessible file-sharing service.”

This approach has been met with praise by a group of scientists with positions on more than 100 journal editorial boards, who are “calling for all empirical journals in the life sciences – including those journals that we serve – to offer pre-registered articles at the earliest opportunity”. The signatories noted “The aim here isn’t to punish the academic community for playing the game that we created; rather, we seek to change the rules of the game itself.” And that really is the crux of the argument. We need to move away from the one point of reward.

Getting data out there

There is definite movement towards opening research. In the UK there is now a requirement from most funders that the data underpinning research publications are made available. Down under, the Research Data Australia project is a register of data from over 100 institutions, providing a single point to search, find and reuse data. The European Union has an Open Data Portal.

Resistance to sharing data amongst the research community is often due to the idea that if data is released with the first publication then there is a risk that the researcher will be ‘scooped’ before they can get those all-important journal articles out. In response to this query during a discussion with the EPSRC it was pointed out that the RCUK Common Principles state that those who undertake Research Council funded work may be entitled to a limited period of privileged use of the data they have collected to enable them to publish the results of their research. However, the length of this period varies by research discipline.

If the publication of data itself were rewarded as a ‘research output’ (which of course is what it is), then the issue of being scooped becomes moot. There have been small steps towards this goal, such as a standard method of citing data.

A new publication option is Sciencematters, which allows researchers to submit observations which are subjected to triple-blind peer review, so that the data is evaluated solely on its merits, rather than on the researcher’s name or organisation. As they indicate “Standard data, orphan data, negative data, confirmatory data and contradictory data are all published. What emerges is an honest view of the science that is done, rather than just the science that sells a story”.

Despite the benefits of having data available there are some vocal objectors to the idea of sharing data. In January this year a scathing editorial in the New England Journal of Medicine suggested that researchers who used other people’s data were ‘research parasites’. Unsurprisingly this position raised a small storm of protest (an example is here). This was so sustained that four days later a clarification was issued, which did not include the word ‘parasites’.

Evaluating & rewarding data

Ironically, one benefit of sharing data could be an improvement to the quality of the data itself. A 2011 study into why some researchers were reluctant to share their data found this to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.

Professor Marcus Munafo in his presentation at the Research Libraries UK conference held earlier this year suggested that we need to introduce quality control methods implicitly into our daily practice. Open data is a very good step in that direction. There is evidence that researchers who know their data is going to be made open are more thorough in their checking of it. Maybe it is time for an update in the way we do science – we have statistical software that can run hundreds of analyses, and we can do text and data mining of lots of papers. We need to build in new processes and systems that refine science and think about new ways of rewarding science.

So should researchers be rewarded simply for making their data available? Probably not; some kind of evaluation is necessary. In a public discussion about data sharing held at Cambridge University last year, there was the suggestion that rather than having the formal peer review of data, it would be better to have an evaluation structure based on the re-use of data – for example, valuing data which was downloadable, well-labelled and re-usable.

Need to publish null results

Generally, this series looking at the case for Open Research has argued that the big problem is the only thing that ‘counts’ is publication in high impact journals. So what happens to all the results that don’t ‘find’ anything?

Most null results are never published: a study in 2014 found that of 221 sociological studies conducted between 2002 and 2012, only 48% of the completed studies had been published. This is a problem because not only is the scientific record inaccurate, it means the publication bias “may cause others to waste time repeating the work, or conceal failed attempts to replicate published research”.

But it is not just the academic reward system that is preventing the widespread publication of null results – the interference of commercial interests on the publication record is another factor. A recent study looked into the issue of publication agreements – and whether a research group had signed one prior to conducting randomised clinical trials for a commercial entity. The research found that 70% of protocols mentioned an agreement on publication rights between industry and academic investigators; in 86% of those agreements, industry retained the right to disapprove or at least review manuscripts before publication. Even more concerning was that journal articles seldom report on publication agreements, and, if they do, statements can be discrepant with the trial protocol.

There are serious issues with the research record due to selected results and selected publication which would be ameliorated by the requirement to publish all results – including null results.

There are some attempts to address this issue. Since June 2002 the Journal of Articles in Support of the Null Hypothesis has been published bi-annually. The World Health Organisation has a Statement on the Public Disclosure of Clinical Trial Results, saying: “Negative and inconclusive as well as positive results must be published or otherwise made publicly available”. A project launched in February last year by PLOS ONE is a collection focusing on negative, null and inconclusive results. The Missing Pieces collection had 20 articles in it as of today.

In January this year there were reports that a group of ten editors of management, organisational behaviour and work psychology research had pledged they would publish the results of well-conceived, designed, and conducted research even if the result was null. Under this approach the paper is first presented without results or discussion, and it is assessed on theory, methodology, measurement information, and analysis plan.

Movement away from using the impact factor

As discussed in the first of this series of blogs ‘The mis-measurement problem‘, we have an obsession with high impact journals. These blogs have been timely, falling as they have within what seems to be a plethora of similarly focused commentary. An example is a recent Nature news story by Mario Biagioli, who argued the focus on impact of published research has created new opportunities for misconduct and fraudsters. The piece concludes that “The audit culture of universities — their love affair with metrics, impact factors, citation statistics and rankings — does not just incentivize this new form of bad behaviour. It enables it.”

In a recent discussion amongst the Scholarly Communication community about this mis-measurement, it was suggested that we can address the problem by limiting the number of articles that can be submitted for promotion. This ideally reduces the volume of papers produced overall, or so the thinking goes. Harvard Medical School and the Computing Research Association “Best Practices Memo” were cited as examples by different people.

This is also the approach that has been taken by the Research Excellence Framework in the UK – researchers put forward their best four works from the previous period (typically about five years). But it does not prevent poor practice. Researchers are constantly evaluated for all manner of reasons. Promotion, competitive grants, tenure and admittance to fellowships are just a few of the many environments in which a researcher’s publication history will be considered.

Are altmetrics a solution? There is a risk that any alternative indicator becomes an end in itself. The European Commission now has an Open Science Policy Platform, which, amongst other activities, has recently established an expert group to advise on the role of metrics and altmetrics in the development of its agenda for open science and research.

Peer review experiments

Open peer review is where peer review reports identify the reviewers and are published with the papers.  One of the more recent publishers to use this method of review is the University of California Press’ open access mega journal called Collabra, launched last year. In an interview published by Richard Poynder, UC Press Director Alison Mudditt notes that there are many people who would like to see more transparency in the peer review process. There is some evidence to show that identifying reviewers results in more courteous reviews.

PLOS One publishes work after an editorial review process which does not include potentially subjective assessments of significance or scope, focusing instead on technical, ethical and scientific rigor. Once an article is published, readers are able to comment on the work in an open fashion.

One solution could be that used by CUP journal JFM Rapids, which has a ‘fast-track’ section of the journal offering fast publication for short, high-quality papers. This also operates a policy whereby no paper is reviewed twice, thus authors must ensure that their paper is as strong as possible in the first instance. The benefit is it offers a fast turnaround time while reducing reviewer fatigue.

There are calls for post-publication peer review. Although some attempts to do this have been unsuccessful, there are arguments that it is simply a matter of time – particularly if reviewers are incentivised. One publisher that uses this system is the platform F1000Research, which publishes work immediately and invites open post-publication review. And, just recently, Wellcome Open Research was launched using services developed by F1000Research. It will make research outputs available faster and in ways that support reproducibility and transparency. It uses an open access model of immediate publication followed by transparent, invited peer review and inclusion of supporting data.

Open ways of conducting research

All of these initiatives demonstrate a definite movement towards an open way of doing research by addressing aspects of the research and publication process. But there are some research groups that are taking a holistic approach to open research.

Last month Marcus Munafo published a description of the experience of the UK Center for Tobacco and Alcohol Studies and the MRC Integrative Epidemiology Unit at the University of Bristol, which over the past few years have attempted to work within an Open Science Model focused on three core areas: study protocols, data, and publications.

Another example is the Open Source Malaria project, in which researchers and students from around the world, including Australia, Europe and North America, use open online laboratory notebooks. Experimental data is posted online each day, enabling instant sharing and the ability to build on others’ findings in almost real time. Indeed, according to their site ‘anyone can contribute’. They have just announced that undergraduate classes are synthesising molecules for the project. This example fulfils all of the five basic principles of open research suggested here.

The Netherlands Organisation for Scientific Research (NWO) has just announced that it is making 3 million euros available for a Replication Studies pilot programme. The pilot will concentrate on the replication of social sciences, health research and healthcare innovation studies that have a large impact on science, government policy or the public debate. The intention after this study will be to “include replication research in an effective manner in all of its research programmes”.

A review of literature published this week has demonstrated that open research is associated with increases in citations, media attention, potential collaborators, job opportunities and funding opportunities. These findings are evidence, the authors say,  “that open research practices bring significant benefits to researchers relative to more traditional closed practices”.

This series has been arguing that we should move to Open Research as a way of changing the reward system that bastardises so much of the scientific endeavour. However there may be other benefits according to a recently published opinion piece which argues that Open Science can serve a different purpose to “help improve the lot of individual working scientists”.

Conclusion

There are clearly defined problems within the research process that in the main stem from the need to publish in  high impact journals. Throughout this blog there are multiple examples of initiatives and attempts to provide alternative ways of working and publishing.

However, all of this effort will only succeed if those doing the assessing change the rules of the game. This is tricky. Often the people who have succeeded have some investment in the status quo remaining. We need strong and bold leadership to move us out of this mess and towards a more robust and fairer future. I will finish with a quote that has been attributed to Mark Twain, Einstein and Henry Ford. “If you always do what you’ve always done, you’ll always get what you’ve always got”. It really is up to us.

Published 2 August 2016
Written by Dr Danny Kingsley
Creative Commons License

The case for Open Research: does peer review work?

This is the fourth in a series of blog posts on the Case for Open Research, this time looking at issues with peer review. The previous three have looked at the mis-measurement problem, the authorship problem and the accuracy of the scientific record. This blog follows on from the last and asks – if peer review is working why are we facing issues like increased retractions and the inability to reproduce a considerable proportion of the literature? (Spoiler alert – peer review only works sometimes.)

Again, there is an entire corpus of research behind peer review, and this blog post merely scratches the surface. As a small indicator, there has been a Peer Review Congress held every four years for the past thirty years (see here for an overview). Readers might also be interested in some work I did on this published as The peer review paradox – An Australian case study.

There is a second, related post published with this one today. Last year Cambridge University Press invited a group of researchers to discuss the topic of peer review – the write-up is here.

An explainer

What is peer review? Generally, peer review is the process by which research submitted for publication is overseen by colleagues who have expertise in the same or similar field before publication. Peer review is defined as having several purposes:

  • Checking the work for ‘soundness’
  • Checking the work for originality and significance
  • Determining whether the work ‘fits’ the journal
  • Improving the paper

Last year, during peer review week the Royal Society hosted a debate on whether peer review was fit for purpose. The debate found that in principle peer review is seen as a good thing, but the implementation is sometimes concerning. A major concern was the lack of evidence of the effectiveness of the various forms of peer review.

Robert Merton in his seminal 1942 work The Normative Structure of Science described four norms of science*. ‘Organised scepticism’ is the norm that scientific claims should be exposed to critical scrutiny before being accepted. How this has manifested has changed over the years. Refereeing in its current form, as an activity that symbolises objective judgement of research, is a relatively new phenomenon – something that has only taken hold since the 1960s. Indeed, Nature was still publishing some unrefereed articles until 1973.

(*The other three norms are ‘Universalism’ – that anyone can participate, ‘Communism’ – that there is common ownership of research findings and ‘Disinterestedness’ – that research is done for the common good, not private benefit. These are an interesting framework with which to look at the Open Access debate, but that is another discussion.)

Crediting hidden work

The authorship blog in this series looked at credit for contribution to a research project, but the academic community contributes to the scholarly ecosystem in many ways. One of the criticisms of peer review is that it is ‘hidden’ work that researchers do. Most peer review is ‘double blind’ – where the reviewer does not know the name of the author and the author does not know who is reviewing the work. This makes it very difficult to quantify who is doing this work. Peer review and journal editing is a huge tranche of unpaid work that academics contribute to research.

One of the issues with peer review is the sheer volume of articles being submitted for publication each year. A 2008 study  ‘Activities, costs and funding flows in the scholarly communications system‘ estimated the global unpaid non-cash cost of peer review as £1.9 billion annually.

There has been some call to try and recognise peer review in some way as part of the academic workflow. In January 2015 a group of over 40 Australian Wiley editors sent an open letter, Recognition for peer review and editing in Australia – and beyond?, to their universities, funders, and other research institutions and organisations in Australia, calling for a way to reward the work. In September that year in Australia, Mark Robertson, publishing director for Wiley Research Asia-Pacific, said “there was a bit of a crisis” with peer reviewing, with new approaches needed to give peer reviewers appropriate recognition and encourage institutions to allow staff to put time aside to review.

There are some attempts to do something about this problem. A service called Publons is a way to ‘register’ the peer review a researcher is undertaking. There have also been calls for an ‘R index’ which would give citable recognition to reviewers. The idea is to improve the system by both encouraging more participation and providing higher quality, constructive input, without the need for a loss of anonymity.

Peer review fails

The secret nature of peer review means it is also potentially open to manipulation. An example of problematic practices is peer review fraud. A recurrent theme throughout discussions on peer review at this year’s Researcher 2 Reader conference (see the blog summary here) was that finding and retaining peer reviewers was a challenge that was getting worse. As the process of obtaining willing peer reviewers becomes more challenging, it is not uncommon for the journal to ask the author to nominate possible reviewers. However, this can lead to peer review ‘fraud’, where the nominated reviewer is not who they are meant to be, which means the articles make their way into the literature without actual review.

In August 2015 Springer was forced to retract 64 articles from 10 journals, ‘after editorial checks spotted fake email addresses, and subsequent internal investigations uncovered fabricated peer review reports’.  They concluded the peer review process had been ‘compromised’.

In November 2014, BioMed Central uncovered a scam where they were forced to retract close to 50 papers because of fake peer review issues. This prompted BioMed Central to produce the blog ‘Who reviews the reviewers?’ and Nature to write a story on Publishing: the peer review scam.

In May 2015 Science retracted a paper because the supporting data was entirely fabricated. The paper got through peer review because it had a big name researcher on it. There is a lengthy (but worthwhile) discussion of the scandal here. The final clue was getting hold of a closed data set that ‘wasn’t a publicly accessible dataset, but Kalla had figured out a way to download a copy’. This is why we need open data, by the way …

But is peer review itself the problem here? Is this all not simply the result of the pressure on the research community to publish in high impact journals for their careers?

Conclusion

So at the end of all of this, is peer review ‘broken’? Yes, according to a study of 270 scientists worldwide published last week. But a considerably larger study published last year by Taylor and Francis showed an enthusiasm for peer review. The white paper, Peer review in 2015: a global view, gathered “opinions from those who author research articles, those who review them, and the journal editors who oversee the process”. It found that researchers value the peer review process. Most respondents agreed that peer review greatly helps scholarly communication by testing the academic rigour of outputs. The majority also reported that they felt the peer review process had improved the quality of their own most recent published article.

Peer review is the ‘least worst’ process we have for ensuring that work is sound. Generally the research community requires some sort of review of research, but there are plenty of examples showing that our current peer review process is not delivering the consistent verification it should. This system is relatively new and it is perhaps time to look at shifting the nature of peer review once more. One option is to open up peer review, and this can take many forms. Identifying reviewers, publishing reviews with a DOI so they can be cited, publishing the original submitted article with all the reviews and the final work, and allowing previous reviews to be attached to the resubmitted article are all possibilities.

Adopting  one or all of these practices benefits the reviewers because it exposes the hidden work involved in reviewing. It can also reduce the burden on reviewers by minimising the number of times a paper is re-reviewed (remember the rejection rate of some journals is up to 95% meaning papers can get cascaded and re-reviewed multiple times).

This is the last of the ‘issues’ blogs in the case for Open Research series. The series will turn its attention to some of the solutions now available.

Published 19 July 2016
Written by Dr Danny Kingsley
Creative Commons License

Lifting the lid on peer review

This blog describes some of the insights that emerged from two sets of discussions with academics at Cambridge University organised by Cambridge University Press last year. The topic was peer review, and the two sessions were held with different groups: one with editors in the Humanities and Social Sciences, the other with editors in the Science, Technical, Medical and Engineering areas.

The themes that emerged echoed many of the issues that were raised in the associated blog ‘The case for Open Research: does peer review work?‘. If anything, the discussion paints a darker picture of the peer review landscape.

Themes included the challenges of finding and retaining reviewers, the reviewing demand on some people, the reality that many reviews are done by inexperienced researchers, that peer reviewing can lead to collaboration, that blinding review can lead to terrible behaviour, but opening it may lead to an exodus of reviewers. There were no real solutions decided at these discussions, but the conversation was rich and full of insights.

Very uneven workload

It is generally known that finding and retaining reviewers is a challenge for editors. One of the first discussion points for the group was the issue of being asked to review work. Some people in the room said that they get asked about twice a week, but the requests are so great that they are only able to do about one in 10 of what is asked. At any given time researchers can be  doing at least one review.

Researchers working in different fields get asked by different journals. However, some colleagues never get asked and complain about this. In reality, most people are never asked to undertake reviewing but people in top research universities are asked all the time.

CUP suggested that we could have a shared database that lots of editors look at. However, this idea was met with concern from at least one person: “You don’t want to reveal your good reviewers in case they get stolen”. (Note that some journals publish the list of reviewers.)

When the option of payment and credit for reviewing was raised, the general consensus was that reviewers decline to review not because they don’t get paid, but because they don’t have time.

Who is actually doing the reviewing?

It was freely admitted around the table that peer reviews are mostly done by PhD students and PostDocs. One of the reasons there are bad reviews is simply because they are being done by very inexperienced people. Many reviewers have not seen very many reviews before they review papers themselves. There is no formal training or assessment in peer review. And there is no incentive for editors to do something about the quality of reviews.

The question that then arises from this issue is: how do we get people into the reviewing pool and how do we give them some training? One solution offered in the STEM discussion was reviewer training. Encouraging scientists to recommend their post-docs as reviewers under their supervision would allow a new generation of reviewers to gain supervised experience.

Another problem with junior researchers reviewing is that people who are early in their careers don’t feel they can say things, or are able to publish negative reviews. The problem is not the scandal, it is the hierarchy of power.

An observation in the STEM discussion was that the assumption that ‘senior = good’ sometimes does not stand up, as often early-career scientists will be excellent reviewers. Senior researchers may best recognise how a paper fits into the field, whereas more junior scientists may be more adept in the technical details of a paper.

Discussions in the STEM group moved to the role of the Editor, where an observation was made that authors must understand that the final decision rests with the Editor, who is provided guidance by referees.

In STEM there is a practice of sharing reviews among all reviewers of a paper. Several of those present gave examples where reviews are shared mid-stream (e.g. after a ‘revise’ decision), at the end of the process, and even prior to a first decision – which gives reviewers a chance to cross-comment on each other’s reviews.

There was the comment that in STEM, editors must act pro-actively in cases of conflicting reviews, where it is the Editor’s responsibility to focus on the important points and give an informed decision and guidance to authors.

What works

The main reason peer review is essential is that you have to filter out the ‘bad stuff’. It is already very difficult to keep up with the literature; without that filter it would be impossible. When peer review happens, the end result is high quality. It is not just that articles are being rejected, but that the work that comes out is better. A STEM editor noted that authors have written in praise of reviewing when their papers have been rejected, “So it does add quality”.

The thing you value most in a journal is the quality of reviewing and the editorial steer, observed a STEM participant. They said this was noticeable in Biology “where the editorial guidance is getting better”.

An observation in the Humanities discussion was that many of the models in the sciences don’t work for the Humanities. In early History most journal articles are published by early career people so peer review in this instance is an educational job teaching historians about how to write journal articles.

A STEM observation was that sometimes peer reviewing leads to collaboration. One editor noted that in their journal, over the last 10-15 years, there have been quite a number of papers where the reviewer has provided a helpful and detailed review of the paper and the authors have asked whether the reviewer can be added as an author of the paper.

What doesn’t work

The discussions about what doesn’t work in peer review included the comment that “Peer review for monographs is ‘broken irretrievably’”. One attendee noted that peer review for edited books has never really happened.

One STEM participant said the thing they liked least about peer review was that from an author’s perspective it is pretty random – picking two or three people. “If you get one or two bad reviews it won’t get published – this is up to luck”. They made the comment that peer review is not really reproducible. Another issue is that because it is so closed there is no incentive for people to improve the quality of their peer reviewers – there are a small number of good and lots of average reviewers.

One humanities person noted that reviewers put the work they are reviewing “through an idea about what a journal article should look like” so while there “used to be all kinds of writing in the 1970s now they are all similar”. This reduces work to the lowest common denominator. It is not just a minimal positive impact on work but a negative impact on work. Another person agreed on the homogenisation issue – but thought this was an editorial problem: “A good editor should be prepared to go out on a limb”.

Long delays over review

For some journals the average time for review is 6-7 months. One participant noted “I review book manuscripts shorter than that. The main problem is it is too slow”.

A post doc noted that the delay for peer review is a serious problem at that level of an academic career. It is necessary to have publications on a CV: “It is not good enough to say it is being considered by a journal (for the past year)”.

The cursory nature of many reviews arose a few times. One person asked whether as an editor you take the review or go to another reviewer and slow the whole process down. Some journals ask for up to six reviews, which drags the whole thing down. Another said the problem meant ‘you endlessly go through the ABC of the topic’.

Blaming peer review for something else?

One participant raised the question of whether we were blaming peer review for things it is not responsible for. There probably is a problem which is more to do with the changing nature of the academic endeavour. More academics are out there and everyone is being pressured to publish in top-tier journals. These are issues in the profession.

The group noted academia has too many people trying to get to too few positions. The ‘cascade’ [of publications being sent to lower tier journals after rejection] is connected to this – you have a hierarchy of quality.

The conversation moved to the pressure to publish in high-impact journals. One STEM participant noted the problem has got substantially worse than it was 30 years ago. It is to do with the amount of expectation put upon everyone in the STM system. People feel the need to publish material that 20-30 years ago no-one would have bothered with. The data that is sitting at the bottom of the drawer – usually when you retire. Now they are digging it out – so the rejection rate is going up because more rubbish is going in.

The free labour/payment debate

A social anthropologist noted that a major problem with peer review is we are asking people to do a whole load of free labour, “It is not just credit but we should find a way to pay people for what they do”. Some journals have a large editorial board who do a lot of the reviewing. One person noted this was not completely free labour as they get a subscription to the journal.

The idea of paying for peer review is an economic question. Does paying for things alter the relationship between the person who is paying and the person doing the work? In this discussion the participants had a concern that paying people makes authors into consumers. Does it change the system by introducing an economic transaction?

There was some debate over the payment question. One researcher said they would be ‘happy to receive’ payment, but noted if they are offered payment for manuscripts they always collect books. There is ‘something exciting about which book I should go for’. Others suggested that it did not necessarily have to be a cash payment but some sort of quid pro quo, “it would be nice if there was an offer of that”.

There was some resistance to the idea of offering cash payment with the suggestion that there are people who are on a single salary and this would be a real incentive to review so they get burnt out and put poor reviews out. However, payment for timely reviews was considered a great idea by some.

A STEM participant noted that reviewers usually do so out of a sense of moral obligation, as a part of the academic world, and that it is difficult to feel morally obliged to do anything for which you are offered money, thus care must be exercised when thinking of bringing in payment or reward.

Portable reviews?

The idea of portable reviews was discussed by both groups. In principle it sounds good because a lot of work is being done twice, and second reviews could happen much more quickly if earlier reviews were attached. In addition, with a small pool of reviewers, it is possible and likely that a paper rejected after review by one journal will then be sent to the same reviewer when re-submitted to another journal.

However, the humanities group noted there was “danger in importing the model from the hard sciences into humanities”. The STEM group noted this would require a re-programming of the culture of reviewing.

There would be some issues with implementation – for example a journal has to admit it is a second tier journal because it takes the ‘slops’, given top journals only take 4% of the papers. And there are some potential problems with re-using reviews. One participant said “I write different kinds of reviews for the top journals compared to the lower ones – so the reviews are not transferable – they could disadvantage the authors.”

There are some examples of this type of thing happening now. Antarctic Science requests authors to provide details of prior journals submitted to and reviews. But it is not universally accepted. Examples were given by the STEM group of times where authors decide to send prior reviews when submitting to a new journal, but the publishers will not accept these as they did not commission them.

Overall the STEM group broadly agreed that sharing reviews in this way would save a significant amount of time and work, although the logistics of sharing reviews, especially between publishers, are obviously very difficult. They also noted that such procedures would greatly reduce wasted effort, and presumably also increase the sample of reviews / opinions used when making a decision on a paper.

Open peer review

The opinions in the discussion around open peer review ranged widely. The arguments against included: “Open peer review sounds like recipe for academia becoming diffused with hostility even more than already”. And: “The publication of reviews idea is absolutely terrible, you need the person to feel they can be open.” There was also some concern that people could be ingratiating if they were reviewing a researcher ‘higher up’.

A STEM participant noted that some authors had said that ‘if you publish all of the reviews at the end of the year we won’t review any more’. They noted that when you have a small pool of reviewers that is a problem. The reviewers’ concerns include that they won’t get another job.

In one case a participant said they had been involved with a journal that was doing the “absolute opposite” with triple blind review, where the editors don’t know who the author is – dealing with issues of implicit bias, particularly gender bias. The conversation then noted that even in double blind it is possible to tell who the reviewer is. Most people also don’t know how to de-identify the document.

However on the positive side, there was support for a dialogue between the author and the reviewer – involved in a three way discussion.  There is a problem in that it can be very prolonged. A STEM participant noted that sometimes the reviewer debate surrounding an article is more interesting or useful than the original paper itself.

One STEM participant observed they had been involved in open review and “was sceptical at first”. However they noted it makes people behave better. “In anonymous reviews I have seen really shocking things said“.

Conclusion

This was an interesting exercise – providing an opportunity for editors to talk amongst themselves and with a publisher about issues relating to peer review. It will be instructive to see what happens.

Published 19 July 2016
Written by Dr Danny Kingsley
Creative Commons License