The case for Open Research: solutions?

This series arguing the case for Open Research has to date looked at some of the issues in scholarly communication today. Hyperauthorship, HARKing, the reproducibility crisis and a surge in retractions all stem from the requirement that researchers publish in high impact journals. The series has also looked at the invalidity of the impact factor and issues with peer review.

This series is one of an increasing cacophony of calls to move away from this method of rewarding researchers. Richard Smith noted in a recent BMJ blog criticising the current publication-in-journal system: “The whole outdated enterprise is kept alive for one main reason: the fact that employers and funders of researchers assess researchers primarily by where they publish. It’s extraordinary to me and many others that the employers, mainly universities, outsource such an important function to an arbitrary and corrupt system.”

Universities need to open research to ensure academic integrity and adjust to support modern collaboration and scholarship tools, and begin rewarding people who have engaged in certain types of process rather than relying on traditional assessment schemes. This was the thrust of a talk in October last year, “Openness, integrity & supporting researchers”. If nothing else, this approach makes ‘nightmare scenarios’ less likely. As Prof Tom Cochrane said in the talk, the last thing an institution needs is to be on the front page because of a big fraud case.

What would happen if we started valuing and rewarding other parts of the research process? This final blog in the series looks at opening up research to increase transparency. The argument suggests we need to move beyond rewarding only the journal article, and to reward not only other research outputs, such as data sets, but research productivity itself.

So, let’s look at how opening up research can address some of the issues raised in this series.

Rewarding study inception

In his presentation about HARKing (Hypothesising After the Results are Known) at FORCE2016, Eric Turner, Associate Professor at OHSU, suggested that what matters is the scientific question and methodological rigour. We should be emphasising not study completion but study inception, before we can be biased by the results. It is already a requirement to post results of industry sponsored research in ClinicalTrials.gov – a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world. Turner argues we should be using it to see the existence of studies. He suggested reviews of protocols should happen without the results (but not include the methods section because this is written after the results are known).

There are some attempts to do this already. In 2013 Registered Reports was launched: “The philosophy of this approach is as old as the scientific method itself: If our aim is to advance knowledge then editorial decisions must be based on the rigour of the experimental design and likely replicability of the findings – and never on how the results looked in the end.” The proposal and process are described here. The guidelines for reviewers and authors are here, including the requirement to “upload their raw data and laboratory log to a free and publicly accessible file-sharing service.”

This approach has been met with praise by a group of scientists with positions on more than 100 journal editorial boards, who are “calling for all empirical journals in the life sciences – including those journals that we serve – to offer pre-registered articles at the earliest opportunity”. The signatories noted “The aim here isn’t to punish the academic community for playing the game that we created; rather, we seek to change the rules of the game itself.” And that really is the crux of the argument. We need to move away from the one point of reward.

Getting data out there

There is definite movement towards opening research. In the UK there is now a requirement from most funders that the data underpinning research publications are made available. Down under, the Research Data Australia project is a register of data from over 100 institutions, providing a single point to search, find and reuse data. The European Union has an Open Data Portal.

Resistance to sharing data amongst the research community is often due to the idea that if data is released with the first publication then there is a risk that the researcher will be ‘scooped’ before they can get those all-important journal articles out. In response to this query during a discussion with the EPSRC it was pointed out that the RCUK Common Principles state that those who undertake Research Council funded work may be entitled to a limited period of privileged use of the data they have collected to enable them to publish the results of their research. However, the length of this period varies by research discipline.

If the publication of data itself were rewarded as a ‘research output’ (which of course is what it is), then the issue of being scooped becomes moot. There have been small steps towards this goal, such as a standard method of citing data.

A new publication option is Sciencematters, which allows researchers to submit observations which are subjected to triple-blind peer review, so that the data is evaluated solely on its merits, rather than on the researcher’s name or organisation. As they indicate “Standard data, orphan data, negative data, confirmatory data and contradictory data are all published. What emerges is an honest view of the science that is done, rather than just the science that sells a story”.

Despite the benefits of having data available there are some vocal objectors to the idea of sharing data. In January this year a scathing editorial in the New England Journal of Medicine suggested that researchers who used other people’s data were ‘research parasites’. Unsurprisingly this position raised a small storm of protest (an example is here). This was so sustained that four days later a clarification was issued, which did not include the word ‘parasites’.

Evaluating & rewarding data

Ironically, one benefit of sharing data could be an improvement to the quality of the data itself. A 2011 study into why some researchers were reluctant to share their data found this to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.

Professor Marcus Munafo, in his presentation at the Research Libraries UK conference held earlier this year, suggested that we need to introduce quality control methods implicitly into our daily practice. Open data is a very good step in that direction. There is evidence that researchers who know their data is going to be made open are more thorough in their checking of it. Maybe it is time for an update in the way we do science – we have statistical software that can run hundreds of analyses, and we can do text and data mining of lots of papers. We need to build in new processes and systems that refine science and think about new ways of rewarding science.

So should researchers be rewarded simply for making their data available? Probably not; some kind of evaluation is necessary. In a public discussion about data sharing held at Cambridge University last year, there was the suggestion that rather than having formal peer review of data, it would be better to have an evaluation structure based on the re-use of data – for example, valuing data which was downloadable, well-labelled and re-usable.

Need to publish null results

Generally, this series looking at the case for Open Research has argued that the big problem is that the only thing that ‘counts’ is publication in high impact journals. So what happens to all the results that don’t ‘find’ anything?

Most null results are never published: a study in 2014 found that of 221 sociological studies conducted between 2002 and 2012, only 48% of the completed studies had been published. This is a problem because not only is the scientific record inaccurate, but the publication bias “may cause others to waste time repeating the work, or conceal failed attempts to replicate published research”.

But it is not just the academic reward system that is preventing the widespread publication of null results – the interference of commercial interests in the publication record is another factor. A recent study looked into the issue of publication agreements – and whether a research group had signed one prior to conducting randomised clinical trials for a commercial entity. The research found that 70% of protocols mentioned an agreement on publication rights between industry and academic investigators; in 86% of those agreements, industry retained the right to disapprove or at least review manuscripts before publication. Even more concerning was that journal articles seldom report on publication agreements, and, if they do, statements can be discrepant with the trial protocol.

There are serious issues with the research record due to selected results and selected publication which would be ameliorated by the requirement to publish all results – including null results.

There are some attempts to address this issue. Since June 2002 the Journal of Articles in Support of the Null Hypothesis has been published bi-annually. The World Health Organisation has a Statement on the Public Disclosure of Clinical Trial Results, saying: “Negative and inconclusive as well as positive results must be published or otherwise made publicly available”. A project launched in February last year by PLOS ONE is a collection focusing on negative, null and inconclusive results. The Missing Pieces collection had 20 articles in it as of today.

In January this year there were reports that a group of ten editors of management, organisational behaviour and work psychology research had pledged they would publish the results of well-conceived, designed, and conducted research even if the result was null. The way this will work is that the paper is first presented without results or discussion, and it is assessed on theory, methodology, measurement information, and analysis plan.

Movement away from using the impact factor

As discussed in the first of this series of blogs, ‘The mis-measurement problem’, we have an obsession with high impact journals. These blogs have been timely, falling as they have within what seems to be a plethora of similarly focused commentary. An example is a recent Nature news story by Mario Biagioli, who argued the focus on impact of published research has created new opportunities for misconduct and fraudsters. The piece concludes that “The audit culture of universities — their love affair with metrics, impact factors, citation statistics and rankings — does not just incentivize this new form of bad behaviour. It enables it.”

In recent discussion amongst the Scholarly Communication community about this mis-measurement, it was suggested that we can address the problem by limiting the number of articles that can be submitted for promotion. This ideally reduces the volume of papers produced overall, or so the thinking goes. Harvard Medical School and the Computing Research Association “Best Practices Memo” were cited as examples by different people.

This is also the approach that has been taken by the Research Excellence Framework in the UK – researchers put forward their best four works from the previous period (typically about five years). But it does not prevent poor practice. Researchers are constantly evaluated for all manner of reasons. Promotion, competitive grants, tenure and admittance to fellowships are just a few of the many environments in which a researcher’s publication history will be considered.

Are altmetrics a solution? There is a risk that any alternative indicator becomes an end in itself. The European Commission now has an Open Science Policy Platform, which, amongst other activities, has recently established an expert group to advise on the role of metrics and altmetrics in the development of its agenda for open science and research.

Peer review experiments

Open peer review is where peer review reports identify the reviewers and are published with the papers.  One of the more recent publishers to use this method of review is the University of California Press’ open access mega journal called Collabra, launched last year. In an interview published by Richard Poynder, UC Press Director Alison Mudditt notes that there are many people who would like to see more transparency in the peer review process. There is some evidence to show that identifying reviewers results in more courteous reviews.

PLOS ONE publishes work after an editorial review process which does not include potentially subjective assessments of significance or scope, so as to focus on technical, ethical and scientific rigor. Once an article is published, readers are able to comment on the work in an open fashion.

One solution could be that used by CUP journal JFM Rapids, which has a ‘fast-track’ section of the journal offering fast publication for short, high-quality papers. This also operates a policy whereby no paper is reviewed twice, thus authors must ensure that their paper is as strong as possible in the first instance. The benefit is it offers a fast turnaround time while reducing reviewer fatigue.

There are calls for post-publication peer review. Although some attempts to do this have been unsuccessful, there are arguments that it is simply a matter of time – particularly if reviewers are incentivised. One publisher that uses this system is the platform F1000Research, which publishes work immediately and invites open post-publication review. And, just recently, Wellcome Open Research was launched using services developed by F1000Research. It will make research outputs available faster and in ways that support reproducibility and transparency. It uses an open access model of immediate publication followed by transparent, invited peer review and inclusion of supporting data.

Open ways of conducting research

All of these initiatives demonstrate a definite movement towards an open way of doing research by addressing aspects of the research and publication process. But there are some research groups that are taking a holistic approach to open research.

Marcus Munafo published last month a description of the experience of the UK Center for Tobacco and Alcohol Studies and the MRC Integrative Epidemiology Unit at the University of Bristol, which over the past few years have attempted to work within an Open Science Model focused on three core areas: study protocols, data, and publications.

Another example is the Open Source Malaria project which includes researchers and students using open online laboratory notebooks from around the world including Australia, Europe and North America. Experimental data is posted online each day, enabling instant sharing and the ability to build on others’ findings in almost real time. Indeed, according to their site ‘anyone can contribute’. They have just announced that undergraduate classes are synthesising molecules for the project. This example fulfils all of the five basic principles of open research suggested here.

The Netherlands Organisation for Scientific Research (NWO) has just announced that it is making 3 million euros available for a Replication Studies pilot programme. The pilot will concentrate on the replication of social sciences, health research and healthcare innovation studies that have a large impact on science, government policy or the public debate. The intention after this study will be to “include replication research in an effective manner in all of its research programmes”.

A review of literature published this week has demonstrated that open research is associated with increases in citations, media attention, potential collaborators, job opportunities and funding opportunities. These findings are evidence, the authors say,  “that open research practices bring significant benefits to researchers relative to more traditional closed practices”.

This series has been arguing that we should move to Open Research as a way of changing the reward system that bastardises so much of the scientific endeavour. However, there may be other benefits, according to a recently published opinion piece which argues that Open Science can serve a different purpose: to “help improve the lot of individual working scientists”.

Conclusion

There are clearly defined problems within the research process that in the main stem from the need to publish in high impact journals. Throughout this blog there are multiple examples of initiatives and attempts to provide alternative ways of working and publishing.

However, all of this effort will only succeed if those doing the assessing change the rules of the game. This is tricky. Often the people who have succeeded have some investment in the status quo remaining. We need strong and bold leadership to move us out of this mess and towards a more robust and fairer future. I will finish with a quote that has been attributed to Mark Twain, Einstein and Henry Ford. “If you always do what you’ve always done, you’ll always get what you’ve always got”. It really is up to us.

Published 2 August 2016
Written by Dr Danny Kingsley
Creative Commons License

The case for Open Research: does peer review work?

This is the fourth in a series of blog posts on the Case for Open Research, this time looking at issues with peer review. The previous three have looked at the mis-measurement problem, the authorship problem and the accuracy of the scientific record. This blog follows on from the last and asks – if peer review is working, why are we facing issues like increased retractions and the inability to reproduce a considerable proportion of the literature? (Spoiler alert – peer review only works sometimes.)

Again, there is an entire corpus of research behind peer review; this blog post merely scratches the surface. As a small indicator, there has been a Peer Review Congress held every four years for the past thirty years (see here for an overview). Readers might also be interested in some work I did on this, published as The peer review paradox – An Australian case study.

There is a second, related post published with this one today. Last year Cambridge University Press invited a group of researchers to discuss the topic of peer review – the write-up is here.

An explainer

What is peer review? Generally, peer review is the process by which research submitted for publication is overseen by colleagues who have expertise in the same or similar field before publication. Peer review is defined as having several purposes:

  • Checking the work for ‘soundness’
  • Checking the work for originality and significance
  • Determining whether the work ‘fits’ the journal
  • Improving the paper

Last year, during peer review week the Royal Society hosted a debate on whether peer review was fit for purpose. The debate found that in principle peer review is seen as a good thing, but the implementation is sometimes concerning. A major concern was the lack of evidence of the effectiveness of the various forms of peer review.

Robert Merton, in his seminal 1942 work The Normative Structure of Science, described four norms of science*. ‘Organised scepticism’ is the norm that scientific claims should be exposed to critical scrutiny before being accepted. How this has manifested has changed over the years. Refereeing in its current form, as an activity that symbolises objective judgement of research, is a relatively new phenomenon – something that has only taken hold since the 1960s. Indeed, Nature was still publishing some unrefereed articles until 1973.

(*The other three norms are ‘Universalism’ – that anyone can participate, ‘Communism’ – that there is common ownership of research findings and ‘Disinterestedness’ – that research is done for the common good, not private benefit. These are an interesting framework with which to look at the Open Access debate, but that is another discussion.)

Crediting hidden work

The authorship blog in this series looked at credit for contribution to a research project, but the academic community contributes to the scholarly ecosystem in many ways. One of the criticisms of peer review is that it is ‘hidden’ work that researchers do. Most peer review is ‘double blind’ – where the reviewer does not know the name of the author and the author does not know who is reviewing the work. This makes it very difficult to quantify who is doing this work. Peer review and journal editing are a huge tranche of unpaid work that academics contribute to research.

One of the issues with peer review is the sheer volume of articles being submitted for publication each year. A 2008 study, ‘Activities, costs and funding flows in the scholarly communications system’, estimated the global unpaid non-cash cost of peer review as £1.9 billion annually.

There has been some call to try and recognise peer review in some way as part of the academic workflow. In January 2015 a group of over 40 Australian Wiley editors sent an open letter, ‘Recognition for peer review and editing in Australia – and beyond?’, to their universities, funders, and other research institutions and organisations in Australia, calling for a way to reward the work. In September that year in Australia, Mark Robertson, publishing director for Wiley Research Asia-Pacific, said “there was a bit of a crisis” with peer reviewing, with new approaches needed to give peer reviewers appropriate recognition and encourage institutions to allow staff to put time aside to review.

There are some attempts to do something about this problem. A service called Publons is a way to ‘register’ the peer review a researcher is undertaking. There have also been calls for an ‘R index’ which would give citable recognition to reviewers. The idea is to improve the system by both encouraging more participation and providing higher quality, constructive input, without the need for a loss of anonymity.

Peer review fails

The secret nature of peer review means it is also potentially open to manipulation. An example of problematic practices is peer review fraud. A recurrent theme throughout discussions on peer review at this year’s Researcher 2 Reader conference (see the blog summary here) was that finding and retaining peer reviewers was a challenge that was getting worse. As the process of obtaining willing peer reviewers becomes more challenging, it is not uncommon for the journal to ask the author to nominate possible reviewers. However, this can lead to peer review ‘fraud’, where the nominated reviewer is not who they are meant to be, which means the articles make their way into the literature without actual review.

In August 2015 Springer was forced to retract 64 articles from 10 journals, ‘after editorial checks spotted fake email addresses, and subsequent internal investigations uncovered fabricated peer review reports’.  They concluded the peer review process had been ‘compromised’.

In November 2014, BioMed Central uncovered a scam where they were forced to retract close to 50 papers because of fake peer review issues. This prompted BioMed Central to produce the blog ‘Who reviews the reviewers?’ and Nature to write a story, ‘Publishing: the peer review scam’.

In May 2015 Science retracted a paper because the supporting data was entirely fabricated. The paper got through peer review because it had a big name researcher on it. There is a lengthy (but worthwhile) discussion of the scandal here. The final clue was getting hold of a closed data set that ‘wasn’t a publicly accessible dataset, but Kalla had figured out a way to download a copy’. This is why we need open data, by the way …

But is peer review itself the problem here? Is this all not simply the result of the pressure on the research community to publish in high impact journals for their careers?

Conclusion

So at the end of all of this, is peer review ‘broken’? Yes, according to a study of 270 scientists worldwide published last week. But a considerably larger study published last year by Taylor and Francis showed an enthusiasm for peer review. The white paper, Peer review in 2015: a global view, gathered “opinions from those who author research articles, those who review them, and the journal editors who oversee the process”. It found that researchers value the peer review process. Most respondents agreed that peer review greatly helps scholarly communication by testing the academic rigour of outputs. The majority also reported that they felt the peer review process had improved the quality of their own most recent published article.

Peer review is the ‘least worst’ process we have for ensuring that work is sound. Generally the research community requires some sort of review of research, but there are plenty of examples showing that our current peer review process is not delivering the consistent verification it should. This system is relatively new and it is perhaps time to look at shifting the nature of peer review once more. One option is to open up peer review, and this can take many forms. Identifying reviewers, publishing reviews with a DOI so they can be cited, publishing the original submitted article with all the reviews and the final work, and allowing previous reviews to be attached to the resubmitted article are all possibilities.

Adopting  one or all of these practices benefits the reviewers because it exposes the hidden work involved in reviewing. It can also reduce the burden on reviewers by minimising the number of times a paper is re-reviewed (remember the rejection rate of some journals is up to 95% meaning papers can get cascaded and re-reviewed multiple times).

This is the last of the ‘issues’ blogs in the case for Open Research series. The series will turn its attention to some of the solutions now available.

Published 19 July 2016
Written by Dr Danny Kingsley
Creative Commons License

Lifting the lid on peer review

This blog describes some of the insights that emerged from two sets of discussions with academics at Cambridge University organised by Cambridge University Press last year. The topic was peer review. One session involved a group of editors in the Humanities and Social Sciences, the other a group of editors in the Science, Technical, Medical and Engineering areas.

The themes that emerged echoed many of the issues that were raised in the associated blog ‘The case for Open Research: does peer review work?’. If anything, the discussion paints a darker picture of the peer review landscape.

Themes included the challenges of finding and retaining reviewers, the reviewing demand on some people, the reality that many reviews are done by inexperienced researchers, that peer reviewing can lead to collaboration, that blinding review can lead to terrible behaviour, but opening it may lead to an exodus of reviewers. There were no real solutions decided at these discussions, but the conversation was rich and full of insights.

Very uneven workload

It is generally known that finding and retaining reviewers is a challenge for editors. One of the first discussion points for the group was the issue of being asked to review work. Some people in the room said that they get asked about twice a week, but the requests are so great that they are only able to do about one in 10 of what is asked. At any given time researchers can be  doing at least one review.

Researchers working in different fields get asked by different journals; however, some colleagues never get asked and complain about this. In reality, most people are never asked to undertake reviewing, but people in top research universities are asked all the time.

CUP suggested that we could have a shared database that lots of editors look at; however, this idea was met with concern from at least one person: “You don’t want to reveal your good reviewers in case they get stolen”. (Note that some journals publish the list of reviewers.)

When the option of payment and credit for reviewing was raised, the general consensus was that the reason reviewers don’t review is not that they don’t get paid; it is that they don’t have time.

Who is actually doing the reviewing?

It was freely admitted around the table that peer reviews are mostly done by PhD students and PostDocs. One of the reasons there are bad reviews is simply because they are being done by very inexperienced people. Many reviewers have not seen very many reviews before they review papers themselves. There is no formal training or assessment in peer review. And there is no incentive for editors to do something about the quality of reviews.

The question that then arises from this issue is: how do we get people into the reviewing pool, and how do we give them some training? One solution offered in the STEM discussion was reviewer training. The option of encouraging scientists to recommend their post-docs as reviewers under their supervision would allow a new generation of reviewers to gain supervised experience.

Another problem with junior researchers reviewing is that people who are early in their careers don’t feel they can speak freely, or that they are able to publish negative reviews. The problem is not the scandal; it is the hierarchy of power.

An observation in the STEM discussion was that the assumption that ‘senior = good’ sometimes does not stand up, as often early-career scientists will be excellent reviewers. Senior researchers may best recognise how a paper fits into the field, whereas more junior scientists may be more adept in the technical details of a paper.

Discussions in the STEM group moved to the role of the Editor, where an observation was made that authors must understand that the final decision rests with the Editor, who is provided guidance by referees.

In STEM there is a practice of sharing reviews among all reviewers of a paper. Several of those present gave examples where reviews are shared mid-stream (e.g. after a ‘revise’ decision), at the end of the process, and even prior to a first decision – which gives reviewers a chance to cross-comment on each other’s reviews.

There was the comment that in STEM, editors must act pro-actively in cases of conflicting reviews, where it is the Editor’s responsibility to focus on the important points and give an informed decision and guidance to authors.

What works

The main reason peer review is essential is that you have to filter out the ‘bad stuff’. It is already very difficult to keep up with the literature; without that filter it would be impossible. When peer review happens, the end result is high quality. It is not just that articles are being rejected, but that the work that comes out is better. A STEM editor noted that authors have written in praise of reviewing when their papers have been rejected, “So it does add quality”.

The thing you value most in a journal is the quality of reviewing and the editorial steer, observed a STEM participant. They said this was noticeable in Biology “where the editorial guidance is getting better”.

An observation in the Humanities discussion was that many of the models in the sciences don’t work for the Humanities. In early History most journal articles are published by early career people so peer review in this instance is an educational job teaching historians about how to write journal articles.

A STEM observation was that sometimes peer reviewing leads to collaboration. One editor noted that in their journal, over the last 10-15 years, there have been quite a number of papers where the reviewer has provided a helpful and detailed review of the paper and the authors have asked whether the reviewer can be added as an author of the paper.

What doesn’t work

The discussions about what doesn’t work in peer review included the comment that “Peer review for monographs is ‘broken irretrievably’”. One attendee noted that peer review for edited books has never really happened.

One STEM participant said the thing they liked least about peer review was that, from an author’s perspective, it is pretty random – picking two or three people. “If you get one or two bad reviews it won’t get published – this is up to luck”. They made the comment that peer review is not really reproducible. Another issue is that because it is so closed, there is no incentive for people to improve the quality of their peer reviewing – there are a small number of good reviewers and lots of average ones.

One humanities person noted that reviewers put the work they are reviewing “through an idea about what a journal article should look like” so while there “used to be all kinds of writing in the 1970s now they are all similar”. This reduces work to the lowest common denominator. It is not just a minimal positive impact on work but a negative impact on work. Another person agreed on the homogenisation issue – but thought this was an editorial problem: “A good editor should be prepared to go out on a limb”.

Long delays over review

For some journals the average time for review is 6-7 months. One participant noted “I review book manuscripts shorter than that. The main problem is it is too slow”.

A post doc noted that the delay for peer review is a serious problem at that level of an academic career. It is necessary to have publications on a CV: “It is not good enough to say it is being considered by a journal (for the past year)”.

The cursory nature of many reviews arose a few times. One person asked whether, as an editor, you take the review or go to another reviewer and slow the whole process down. Some journals ask for up to six reviews, which drags the whole thing down. Another said the problem meant ‘you endlessly go through the ABC of the topic’.

Blaming peer review for something else?

One participant raised the question of whether we were blaming peer review for things it is not responsible for. There probably is a problem which is more to do with the changing nature of the academic endeavour. More academics are out there and everyone is being pressured to publish in top-tier journals. These are issues in the profession.

The group noted academia has too many people trying to get to too few positions. The ‘cascade’ [of publications being sent to lower tier journals after rejection] is connected to this – you have a hierarchy of quality.

The conversation moved to the pressure to publish in high-impact journals. One STEM participant noted the problem has got substantially worse than 30 years ago. It is to do with the amount of expectation put upon everyone in the STM system. People now feel the need to publish material that 20-30 years ago no-one would have bothered with. The data that is sitting at the bottom of the drawer – usually when you retire. Now they are digging it out – so the rejection rate is going up because more rubbish is going in.

The free labour/payment debate

A social anthropologist noted that a major problem with peer review is we are asking people to do a whole load of free labour, “It is not just credit but we should find a way to pay people for what they do”. Some journals have a large editorial board who do a lot of the reviewing. One person noted this was not completely free labour as they get a subscription to the journal.

The idea of paying for peer review is an economic question. Does paying for things alter the relationship between the person who is paying and the person doing the work? In this discussion the participants had a concern that paying people makes authors into consumers. Does it change the system by introducing an economic transaction?

There was some debate over the payment question. One researcher said they would be ‘happy to receive’ payment, but noted if they are offered payment for manuscripts they always collect books. There is ‘something exciting about which book I should go for’. Others suggested that it did not necessarily have to be a cash payment but some sort of quid pro quo, “it would be nice if there was an offer of that”.

There was some resistance to the idea of offering cash payment with the suggestion that there are people who are on a single salary and this would be a real incentive to review so they get burnt out and put poor reviews out. However, payment for timely reviews was considered a great idea by some.

A STEM participant noted that reviewers usually do so out of a sense of moral obligation, as a part of the academic world, and that it is difficult to feel morally obliged to do anything for which you are offered money, thus care must be exercised when thinking of bringing in payment or reward.

Portable reviews?

The idea of portable reviews was discussed by both groups. In principle it sounds good because a lot of work is being done twice, and second reviews could happen much more quickly if the earlier reviews were attached. In addition, with a small pool of reviewers, it is possible and likely that a paper rejected after review by one journal will then be sent to the same reviewer when re-submitted to another journal.

However, the humanities group noted there was “danger in importing the model from the hard sciences into humanities”. The STEM group noted this would require a re-programming of the culture of reviewing.

There would be some issues with implementation – for example a journal has to admit it is a second tier journal because it takes the ‘slops’, given top journals only take 4% of the papers. And there are some potential problems with re-using reviews. One participant said “I write different kinds of reviews for the top journals compared to the lower ones – so the reviews are not transferable – they could disadvantage the authors.”

There are some examples of this type of thing happening now. Antarctic Science requests authors to provide details of prior journals submitted to and reviews. But it is not universally accepted. Examples were given by the STEM group of times where authors decide to send prior reviews when submitting to a new journal, but the publishers will not accept these as they did not commission them.

Overall the STEM group broadly agreed that sharing reviews in this way would save a significant amount of time and work, although the logistics of sharing reviews, especially between publishers, are obviously very difficult. They also noted that such procedures would greatly reduce wasted effort, and presumably also increase the sample of reviews / opinions used when making a decision on a paper.

Open peer review

The opinions in the discussion around open peer review ranged widely. The arguments against included: “Open peer review sounds like recipe for academia becoming diffused with hostility even more than already”. And: “The publication of reviews idea is absolutely terrible, you need the person to feel they can be open.” There was also some concern that people could be ingratiating if they were reviewing a researcher ‘higher up’.

A STEM participant noted that some authors had said that ‘if you publish all of the reviews at the end of the year we won’t review any more’. They noted that when you have a small pool of reviewers that is a problem. The reviewers’ concerns include that they won’t get another job.

In one case a participant said they had been involved with a journal that was doing the “absolute opposite” with triple blind review – dealing with issues of implicit bias, particularly gender bias – where the editors don’t know who the author is. The conversation then noted that even in double blind it is possible to tell who the reviewer is. Most people don’t know how to de-identify the document as well.

However on the positive side, there was support for a dialogue between the author and the reviewer – involved in a three way discussion.  There is a problem in that it can be very prolonged. A STEM participant noted that sometimes the reviewer debate surrounding an article is more interesting or useful than the original paper itself.

One STEM participant observed they had been involved in open review and “was sceptical at first”. However they noted it makes people behave better. “In anonymous reviews I have seen really shocking things said“.

Conclusion

This was an interesting exercise – providing an opportunity for editors to talk amongst themselves and with a publisher about issues relating to peer review. It will be instructive to see what happens.

Published 19 July 2016
Written by Dr Danny Kingsley
Creative Commons License

The case for Open Research: reproducibility, retractions & retrospective hypotheses

This is the third instalment of ‘The case for Open Research’ series of blogs exploring the problems with Scholarly Communication caused by having a single value point in research – publication in a high impact journal. The first post explored the mis-measurement of researchers and the second looked at issues with authorship.

This blog will explore the accuracy of the research record, including the ability (or otherwise) to reproduce research that has been published, what happens if research is retracted, and a concerning trend towards altering hypotheses in light of the data that is produced.

Science is thought to progress by building knowledge through questioning, testing and checking work. The idea of ‘standing on the shoulders of giants’ summarises this – we discover truth by building on previous discoveries. But scientists are very rarely rewarded for being right; they are rewarded for publishing in certain journals and for getting grants. This can result in distortion of the science.

How does this manifest? The Nine Circles of Scientific Hell describes questionable research practices that occur, ranging from Overselling, Post-Hoc storytelling, p-value Fishing, Creative use of Outliers to Non or Partial Publication of Data. We will explore some of these below. (Note this article appears in a special issue of Perspectives on Psychological Science on the Replicability in Psychological Science – which contains many other interesting articles).

Much as we like to think of science as an objective activity, it is not. Scientists are supposed to be impartial observers, but in reality they need to get grants, and publish papers to get promoted to more ‘glamorous institutions’. This was the observation of Professor Marcus Munafo in his presentation ‘Scientific Ecosystems and Research Reproducibility’ at the Research Libraries UK conference held earlier this year (the link will take you to videos of the presentations). Munafo observed that scientists are rarely rewarded for being right, so the scientific record is being distorted by the scientific ecosystem.

Munafo, a Biological Psychologist at Bristol University, noted that research, particularly in the biomedical sciences, ‘might not be as robust as we might have hoped‘.

The reproducibility crisis

A recent survey of over 1500 scientists by Nature tried to answer the question “Is there a reproducibility crisis?” The answer is yes, but whether that matters appears to be debatable: “Although 52% of those surveyed agree that there is a significant ‘crisis’ of reproducibility, less than 31% think that failure to reproduce published results means that the result is probably wrong, and most say that they still trust the published literature.”

There are certainly plenty of examples of the inability to reproduce findings. Pharmaceutical research can be fraught. Some research into potential drug targets found that in almost two-thirds of the projects looked at, there were inconsistencies between published data and the data resulting from attempts to reproduce the findings. 

There are implications for medical research as well. A study published last month looked at functional MRI (fMRI), noting that when analysing data using different experimental designs, researchers should in theory find a false-positive rate of 5% for a significance threshold of 5% (a p-value of less than 0.05, which is conventionally described as statistically significant). However they found “the most common software packages for fMRI analysis (SPM, FSL, AFNI) can result in false-positive rates of up to 70%. These results question the validity of some 40,000 fMRI studies and may have a large impact on the interpretation of neuroimaging results.”

A 2013 survey of cancer researchers found that approximately half of respondents had experienced at least one episode of the inability to reproduce published data. Of those people who followed this up with the original authors, most were unable to determine why the work was not reproducible. Some of those original authors were (politely) described as ‘less than “collegial”’.

So what factors are at play here? Partly it is due to the personal investment in a particular field. A 2012 study of authors of significant medical studies concluded that: “Researchers are influenced by their own investment in the field, when interpreting a meta-analysis that includes their own study. Authors who published significant results are more likely to believe that a strong association exists compared with methodologists.”

This was also a factor in a study Why Most Published Research Findings Are False that considered the way research studies are constructed. This work found that “for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias.”
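The core of that argument is a short piece of arithmetic. As a rough sketch, the function below computes the chance that a statistically significant finding is actually true, using the simplified form of the paper's positive predictive value calculation (without its additional bias term). The function name and the example numbers are hypothetical, chosen only for illustration.

```python
def positive_predictive_value(prior_odds: float, power: float, alpha: float = 0.05) -> float:
    """Share of 'significant' findings that reflect a real effect.

    prior_odds: odds that a tested relationship is real before the study
                (e.g. 0.1 means one real effect per ten null ones).
    power:      probability the study detects a real effect.
    alpha:      significance threshold (false-positive rate under the null).
    """
    true_positives = power * prior_odds   # real effects that reach significance
    false_positives = alpha               # null effects that reach significance
    return true_positives / (true_positives + false_positives)


# Illustrative numbers only: a long-shot hypothesis tested in an underpowered study,
# compared with an even-odds hypothesis tested in a well-powered study.
long_shot = positive_predictive_value(prior_odds=0.1, power=0.2)
well_powered = positive_predictive_value(prior_odds=1.0, power=0.8)
print(f"long shot, low power: {long_shot:.0%}")
print(f"even odds, high power: {well_powered:.0%}")
```

Under these assumed inputs, the long-shot case comes out well below 50%, which is the sense in which a claimed finding can be more likely false than true.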

Psychology is a discipline where there is a strong emphasis on novelty, discovery and finding something that has a p-value of less than 0.05. There is such an issue with reproducibility in psychology that there are large efforts to try and reproduce psychological studies to estimate the reproducibility of the research. The Association for Psychological Science has launched a new article type of Registered Replication Reports which consists of “multi-lab, high-quality replications of important experiments in psychological science along with comments by the authors of the original studies”.

This is a good initiative, although there might be some resistance to this type of scrutiny. Something that was interesting from the Nature survey on reproducibility was the question of what happened when researchers attempted to publish a replication study. Note that only a few respondents had done this, possibly because incentives to publish positive replications are low and journals can be reluctant to publish negative findings. The study found that “several respondents who had published a failed replication said that editors and reviewers demanded that they play down comparisons with the original study”.

What is causing this distortion of the research? It is the emphasis on publication of novel results in high impact journals. There is no reward for publishing null results or negative findings.

HARKing problem

The p-value came up again in a discussion about HARKing at this year’s FORCE2016 conference (HARK stands for Hypothesising After the Results are Known – a term coined in 1998).

In his presentation at FORCE2016 Eric Turner, Associate Professor at OHSU, spoke about HARKing (see this video 37 minutes onward). The process is that the researcher conceives the study, writes the protocol up for their eyes only, with a hypothesis, and then collects lots of other data – ‘the more the merrier’ according to Turner. Then the researcher runs the study and analyses the data. If there is enough data, the researcher can try alternative methods and can play with statistics. ‘You can torture the data and it will confess to anything’ noted Turner. At some point the p-value will come out below 0.05. Only then does the research get written up.
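The arithmetic behind ‘at some point the p-value will come out below 0.05’ is simple. As a rough sketch, assume each extra measure is an independent test of an effect that does not exist (an idealisation, since real outcome measures are usually correlated). The function name here is hypothetical.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests of a
    true null hypothesis yields p < alpha purely by chance."""
    return 1 - (1 - alpha) ** num_tests


# How the chance of a spurious 'significant' result grows with the number of tests.
for k in (1, 5, 10, 20, 50):
    print(f"{k:>2} tests: {chance_of_false_positive(k):.0%}")
```

Under these assumptions, twenty independent measures already give roughly a 64% chance of at least one result below 0.05, even when nothing real is going on.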

Turner noted that he was talking about the kind of research where the work is trying to confirm a hypothesis (like clinical trials). This is different to hypothesis-generating research.

In the US, clinical trials with human participants must be registered with the Food and Drug Administration (FDA) so it is possible to see the results of all trials. Turner talked about his 2008 study looking at antidepressant trials, where the journal version of the results supported the general view that antidepressants always beat placebo. However when they looked at the FDA version of all of the studies of the same drugs, it turned out that half of the studies were positive and half were not. The published record does not reflect the reality.

The majority of the negative studies were simply not published, but 11 of the papers had been ‘spun’ from negative to positive. These papers had a median impact factor of 5 and median citations of 68 – these were highly influential articles. As Turner noted ‘HARKing is deceptively easy’.

This perspective is supported by the finding that a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. Indeed Munafo noted that over 90% of the psychological literature finds what it purports to set out to do. Either the research being undertaken is extraordinarily mundane, or something is wrong.

Increase in retractions

So what happens when it is discovered that something that is published is incorrect? Journals do have a system which allows for the retraction of papers, and this is a practice which has been increasing over the past few years. Research looking at why the number of retractions has increased found that it was partly due to lower barriers to publication of flawed articles. In addition papers are now being retracted for issues like plagiarism and retractions are now happening more quickly.

Retraction Watch is a service which tracks retractions ‘as a window into the scientific process’. It is enlightening reading with several stories published every day.

An analysis of correction rates in the chemical literature found that the correction rate averaged about 1.4 percent for the journals examined. While there were numerous types of corrections, chemical structures, omission of relevant references, and data errors were some of the most frequent types of published corrections. Corrections are not the same as retractions, but they are significant.

There is some evidence to show that the higher the impact factor of the journal a work is published in, the higher the chance it will be retracted. A 2011 study showed a direct correlation between impact factor and the number of retractions, with New England Journal of Medicine topping the list. This situation has led to claims that the top ranking journals publish the least reliable science.

A study conducted earlier this year demonstrated that there are no commonly agreed definitions of academic integrity and malpractice. (I should note that amongst other findings the study found 17.9% (± 6.1%) of respondents reported having fabricated research data. This is almost 1 in 5 researchers. However there have been some strong criticisms of the methodology.)

There are questions about how retractions should be managed. In the print era it was not unheard of for library staff to put stickers into printed journals notifying a retraction. But in the ‘electronic age’, when the record can be erased, one author asked in 2002 whether this is the right thing to do, because erasing the article entirely is amending history. The Committee on Publication Ethics (COPE) do have some guidelines for managing retractions which suggest the retraction be linked to the retracted article wherever possible.

However, from a reader’s perspective, even if an article is retracted this might not be obvious. In 2003* a survey of 43 online journals found 17 had no links between the original articles and later corrections. When present, hyperlinks between articles and errata showed patterns in presentation style, but lacked consistency. There are some good examples – such as Science Citation Index but there was a lack of indexing in INSPEC, and a lack of retrieval with SciFinder Scholar.

[*Note this originally said 2013, amended 2 September 2016]

Conclusion

All of this paints a pretty bleak picture. In some disciplines the pressure to publish novel results in high impact journals results in the academic record being ‘selectively curated’ at best. At worst it results in deliberate manipulation of results. And if mistakes are picked up there is no guarantee that this will be made obvious to the reader.

This all stems from the need to publish novel results in high impact journals for career progression. And when those high impact journals can be shown to be publishing a significant amount of subsequently debunked work, then the value of them as a goal for publication comes into serious question.

The next instalment in this series will look at gatekeeping in research – peer review.

Published 14 July 2016
Written by Dr Danny Kingsley
Creative Commons License

The case for Open Research: the authorship problem

This is the second in a blog series about why we need to move towards Open Research. The first post about the mis-measurement problem considered issues with assessment. We now turn our attention to problems with authorship. Note that as before this is a topic of research in itself – and there is a rich vein of literature to be mined here for the interested observer.

Hyperauthorship

In May last year a high energy physics paper was published with over 5,000 authors. Of the 33 pages in this article, the paper occupied nine with the remainder listing the authors. This paper caused something of a storm of protest about ‘hyperauthorship’ (a term coined in 2001 by Blaise Cronin).

Nature published a news story on it, which was followed a week later by similar stories decrying the problem. The Independent published  a story with the angle that many people are just coasting along without contributing. The Conversation’s take on the story looked at the challenge of effectively rewarding researchers. The Times Higher Education was a bit slower off the mark, in August publishing a story questioning whether mass authorship was destroying the credibility of papers.

This paper was featured in a keynote talk given at this year’s FORCE2016 conference. Associate Professor Cassidy Sugimoto from the School of Informatics and Computing, Indiana University Bloomington spoke about ‘Structural Disruptions in the Reward System of Science’ (video here). She noted that authorship is the coin of the realm, the pivot point of the whole of the scientific system, and this has resulted in growth in the number of authors listed on a paper.

Sugimoto asked: What does ‘authorship’ mean when there are more authors than words in a document? This type of mass authorship raises concerns about fraud and attribution. Who is responsible if something goes wrong?

The authorship ‘proxy for credit’ problem

Of course not all of those 5,000 people actually contributed to the writing of the article – the activity we would normally associate with the word ‘authorship’. Scientific authorship does not follow the logic of literary authorship because of the nature of what is being written about.

In 1998 Biagioli (who has literally written the book on Scientific Authorship or at least edited it) in a paper called ‘The Instability of Authorship: Credit and Responsibility in Contemporary Biomedicine’ said that “the kind of credit held by a scientific author cannot be exchanged for money because nature (or claims about it) cannot be a form of private property, but belongs in the public domain”.

Facts cannot be copyrighted. The inability to write for direct financial remuneration in academia has implications for responsibility (addressed further down), but first let’s look at the issue of academic credit.

When we say ‘author’ what do we mean in this context? Often people are named as ‘authors’ on a paper because their inclusion will help to have the paper accepted, or it is a token thanks for providing the grant funding for the work. These are practices referred to as ‘gift authorship‘, where co-authorship is awarded to a person who has not contributed significantly to the study.

In an attempt to stop some of the more questionable practices above, the International Committee of Medical Journal Editors (ICMJE) has defined what it means to be an author, saying authorship should be based on:

  • a substantial contribution
  • drafting the work
  • giving final approval and
  • agreeing to be accountable for the integrity of the work.

The problem, as we keep seeing, is that authorship on a publication is the only thing that counts for reward. This means that ‘authorship’ is used as a proxy for crediting people’s contribution to the study.

Identifying contributions

Listing all of the people who had something to do with a research project as ‘authors’ on the final publication fails to credit different aspects of the labour involved in the research. In an attempt to address this, PLOS asks for the different contributions by those named on a paper to be defined on articles, with their guidelines suggesting categories such as Data Curation, Methodology, Software, Formal Analysis and Supervision (amongst many).

Sugimoto has conducted some research to find what this reveals about what people are contributing to scientific labour. In an analysis of PLOS data on contributorship, her team showed that in most disciplines the labour was distributed. This means that often the person doing the experiment is not the person who is writing up the work. (I should note that I was rather taken aback by this when it arose in interviews I conducted for my PhD).

It is not particularly surprising that in the Arts, Humanities and Social Sciences the listed ‘author’ is most often the person who wrote the paper. However in Clinical Medicine, Biomedicine or Biology very few authors are associated with the task of writing. (As an aside, the analysis found women are disproportionately likely to be doing the experimentation, and men are more likely to be authoring, conceiving experimentation or obtaining resources.)

So, would it not be better if rather than placing the only emphasis on authorship of journal articles in high impact journals, we were able to reward people for different contributions to the research?

And while everyone takes credit, not all people take responsibility.

Authorship – taking responsibility

It is not just the issue of the inability to copyright ‘facts of nature’ that makes copyright unusual in academia. The academic reward system works on the ‘academic gift principle’ – academics provide the writing, the editing and the peer review for free and do not expect payment. The ‘reward’ is academic esteem.

This arrangement can seem very odd to an outsider who is used to the idea of work for hire. But there are broader implications than what is perceived to be ‘fair’ – and these relate to accountability. It is much more difficult to sue a researcher for making incorrect statements than it is to sue a person who writes for money (like a journalist).

Let us take a short meander into the world of academic fraud. Possibly the biggest and certainly highly contentious case was Andrew Wakefield and the discredited (and retracted) claim that the MMR vaccine was associated with autism in children. This has been discussed at great length elsewhere – the latest study debunking the claim was published last year. Partly because of the way science is credited and copyright is handled, there were minimal repercussions for Wakefield. He is barred from practising medicine in the UK, but enjoys a career on the talkback circuit in the US. Recently a film about the MMR claims, directed by Wakefield, was briefly shown at the Tribeca film festival before protests saw it removed from the programme.

Another high profile case is Diederik Stapel, a Dutch social psychologist who entirely fabricated his data over many years. Despite several doctoral students’ work being based on this data and over 70 articles having to be retracted, there were no charges laid. The only consequence he faced was having his professorship stripped.

Sometimes the consequences of fraud are tragic. A Japanese stem cell researcher, Haruko Obokata, who fabricated her results had her PhD stripped from her. There were no criminal charges laid but her supervisor committed suicide and the funding for the centre she was working in was cut.  The work had been published in Nature which then retracted the work and wrote some editorial about the situation.

The question of scientific accountability is so urgent that there was a call last year to criminalise scientific misconduct in this paper. Indeed things do seem to be changing slowly and there have been some high profile cases where scientific fraud has resulted in criminal charges being laid. A former University of Queensland academic is currently facing fraud related charges over his fabricated results from a study into Parkinson’s disease and multiple sclerosis. This time last year, Dong-Pyou Han, a former biomedical scientist at Iowa State University in Ames, was sentenced to 57 months for fabricating and falsifying data in HIV vaccine trials. Han has also been fined US$7.2 million. In both cases the issue is the misuse of grant funding rather than publication of false results.

The combination of great ‘reward’ from publication in high profile journals and little repercussion (other than having that ‘esteem’ taken away) has proven to be too great a temptation for some.

Conclusion

The need to publish in high impact journals has caused serious authorship issues –  resulting in huge numbers of authors on some papers because it is the only way to allocate credit. And there is very little in the way we reward researchers that adequately allows for calling researchers to take responsibility when something goes wrong, in some cases resulting in serious fraud.

The next instalment in this series will look at ‘reproducibility, retractions and retrospective hypotheses’.

Published 12 July 2016
Written by Dr Danny Kingsley
Creative Commons License

The case for Open Research: the mis-measurement problem

Let’s face it. The biggest blockage we have to widespread Open Access is not researcher apathy, a lack of interoperable systems, or an unwillingness of publishers to engage (although these do each play some part) – it is the problem that the only thing that counts in academia is publication in a high impact journal.

This situation is causing multiple problems, from huge numbers of authors on papers, researchers cherry picking results and retrospectively applying hypotheses, to the reproducibility crisis and a surge in retractions.

This blog was intended to be an exploration of some solutions prefaced by a short overview of the issues. Rather depressingly, there was so much material the blog has had to be split up, with several parts describing the problem(s) before getting to the solutions.

Prepare yourself, this will be a bumpy ride. This first instalment looks at the reward system. The second instalment will consider authorship and credit. The third will look at reproducibility, retractions and retrospective hypotheses. The fourth asks if peer review is working. And the final blog will discuss some options for solving at least part of the problem.

I should note that this is not a comprehensive literature review. Every subheading of this blog series is a topic of considerable research on its own and there are many further examples available to the interested reader. I welcome debate, suggestions and links in the comments section of the blog(s).

Measurement for reward

The Journal Impact Factor

Let’s start with how researchers are measured. For decades academia has lived with the ‘Publish or Perish’ mantra which has spawned problems with poor publication practices. Today the pressure to be published in a high impact journal is stronger than ever.

A journal’s Impact Factor (JIF) is the average number of citations received in a given year by the articles the journal published in the previous two years. For example, a journal’s JIF for a given year is calculated by taking the number of citations made that year to the articles published in the journal in the previous two years and then dividing by the number of ‘citable’ items (typically research articles and reviews) published in that journal in those two years.
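The calculation described above reduces to a single ratio. A minimal sketch in Python, assuming made-up figures for a hypothetical journal (the function name and all numbers are illustrative and are not taken from Journal Citation Reports):

```python
def journal_impact_factor(citations_this_year, items_prev_two_years):
    """Citations received this year to the previous two years' items,
    divided by the number of items published in those two years."""
    if items_prev_two_years <= 0:
        raise ValueError("the journal must have published at least one item")
    return citations_this_year / items_prev_two_years


# Hypothetical journal: 150 items published in 2014 and 130 in 2015,
# which together were cited 840 times during 2016.
jif_2016 = journal_impact_factor(840, 150 + 130)
print(f"JIF 2016: {jif_2016:.1f}")  # prints "JIF 2016: 3.0"
```

Note that the result is a property of the journal as a whole; nothing in the ratio says how the 840 citations are spread across individual articles.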

The JIF is compiled by Journal Citation Reports – which is owned by a commercial company, Thomson Reuters. The company announced its sale for $3.5 billion today.

This blog will not dig in any depth into the issues with the way the JIF is calculated, although there are some serious ones (see a 2006 paper I coauthored on this topic). Neither will it explore the problem of how much the JIF is gamed – from self-citations to journals insisting on a certain number of citations to publications within the same journal. Suffice to say that each year a number of journals are removed from the index due to this type of behaviour. The record to date was in 2013, a year which saw 66 journals struck from the list. By comparison only 18 were suppressed in the most recent report.

There have been many, many criticisms of the Journal Impact Factor and its effects on scholarship. But the criticisms put forward a decade ago to the month by PLOS still ring true. One of the issues, PLOS argued, was that because Thomson Reuters does not make public the process for choosing ‘citable’ article types, “science is currently rated by a process that is itself unscientific, subjective, and secretive”.

Indeed last week a news article in Science and a related news article in Nature put forward exactly the same criticism. The stories referred to a paper: “A simple proposal for the publication of journal citation distributions” posted on bioRxiv. This described some comparative research undertaken to look at whether a reanalysis of the data would provide the same results as Thomson Reuters. It didn’t. The work found the citation distributions were “so skewed that up to 75% of the articles in any given journal had lower citation counts than the journal’s average number”. The authors likened using the JIF to determine the impact of a given article to ‘guesswork’.
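To see how a handful of highly cited papers can pull a journal’s average well above what most of its articles achieve, here is a small Python sketch. The citation counts are invented for illustration (they are not data from the bioRxiv paper) and were chosen so that three quarters of the articles fall below the mean:

```python
from statistics import mean, median

# Hypothetical citation counts for 20 articles in one journal:
# most are cited a few times, a few are cited very heavily.
citations = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3,
             3, 4, 4, 5, 6, 20, 35, 50, 80, 120]

average = mean(citations)      # the figure a JIF-style average reports
typical = median(citations)    # what a middle-of-the-pack article gets
share_below = sum(c < average for c in citations) / len(citations)

print(f"mean citations:   {average:.1f}")
print(f"median citations: {typical:.1f}")
print(f"share of articles below the mean: {share_below:.0%}")
```

In this made-up journal the mean is 17.1 citations while the median article has 3, so the average tells you very little about any individual paper.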

Jon Tennant, in a 2015 blog, stated that “The impact factor is one of the most mis-used metrics in the history of academia” and proposed an Open Letter template for researchers to “send to people in positions of power at different institutions, co-signed by as many academics as possible who believe in fairer and evidence-based assessment”. Tennant in turn references Stephen Curry’s 2012 blog which opened with the statement “The impact factor might have started out as a good idea, but its time has come and gone”.

There are many more, but I am sure you get the idea.

This is recognised as such a big problem that in 2012 the San Francisco Declaration on Research Assessment (DORA) was conceived with the intent to: ‘Put science into the assessment of research’. Over 12,000 individuals and over 700 organisations have signed the declaration to date supporting the call for a “need to assess research on its own merits rather than on the basis of the journal in which the research is published”.

If nothing else, there is clearly a problem with measuring the worth of something by considering the packaging and not the item itself. But the academy continues to use the JIF and criticisms continue to come thick and fast.

Clearly something is rotten in the state of Denmark.

Ditching the JIF

In his seminal book The Mismeasure of Man, in which he debunks the science behind biological determinism, Stephen Jay Gould criticises “the myth that science itself is an objective enterprise, done properly only when scientists can shuck the constraints of their culture and view the world as it really is”. This observation is true of any metrics we apply to the valuing of research outputs. They are not objective, and not an accurate view. Any measurement tool causes its own problems.

An example of a non-JIF type of measurement is the increased emphasis on ‘excellence’ by funders and governments (the Research Excellence Framework in the UK and Excellence in Research for Australia being two examples). But ‘excellence rhetoric’ is counterproductive to good research, according to one argument which concludes that ‘excellence’ is a “pernicious and dangerous rhetoric that undermines the very foundations of good research and scholarship”.

The insistence on excellence, it can be argued, has spawned problems with reproducibility and fraud. In other words, the same problems that the JIF has caused.

There have been many other suggestions for ways to measure researchers, such as the h-index, which has its own set of issues, and the Eigenfactor Score – these are only two of a myriad of options. But as the system changes, so does researcher behaviour. A clear example was in Australia when the funding mechanism moved to a simple count of research papers rather than any assessment of the value of those papers. This resulted in a marked increase in the number of papers being produced and a concurrent decrease in the overall quality, as described in ‘Modifying publication practices in response to funding formulas’.

Clifford Lynch, the Executive Director of CNI, noted in his welcome talk at the JISC-CNI event held at Wadham College, Oxford last week that using alternative metrics means we start running into issues about vendor lock-in and data confidentiality.

While alternative metrics might solve the ‘valuing the article rather than journal’ issue, they bring up problems of their own. HEFCE’s 2015 report on the future use of metrics in assessment noted that some indicators can be misused or ‘gamed’ – with journal impact factors, university rankings and citation counts put forward as three prominent examples. The report recommended that metrics should be updated in response to their potential effects. In deciding what metrics to use, the report recommended using the best possible data in terms of accuracy and scope, and that the data collection and analytical processes should be open and transparent to allow verification. It also suggested using a range of indicators.

Financial implications

What does this emphasis on particular publication outlets have to do with Open Access? Well, a great deal as it happens. It is the big blocker to widespread change. As long as we continue with this emphasis we will not get any real traction with Open Access because it locks us into an old print paradigm of academia.

Much ink has been spilt over the cost of publication and the added cost of open access (some of it mine) which includes not just the cost of the article processing charges but the burden of administering multiple micropayments.

As I have said on numerous occasions (see here and here), funders paying for hybrid open access is expensive and has not resulted in journals flipping to gold (as a transition to a fully Open Access environment) despite this being a stated aim of the process. It makes sense from a publisher’s perspective not to flip journals – why, when researchers are under pressure to publish in high impact journals, and there is a new revenue stream associated with that publishing, would you kill the proverbial goose?

Indeed, a paper earlier this year argued that “Open Access has the potential to become unsustainable for research communities if high-cost options are allowed to continue to prevail in a widely unregulated scholarly publishing market.”

The problem, it can be argued, is that the infrastructure underpinning open access is ‘path dependent’, a concept proposed in 1985 which explains how the set of decisions in the present is limited by the decisions one has made in the past, even though the contextual factors shaping the past decision no longer apply. Scholarly publishing is path-dependent, some authors argue, “because it still heavily depends on a few players that occupy crucial nodes in the scientific information infrastructure. In the past, these players were scientific associations, but now these players are commercial publishing companies”.

As long as the current reward system remains, the crucial nodes will not change and we are stuck.

Conclusion

So that covers some of the problems with the way we measure our researchers, and some of the financial implications of this. The next blog in this series will cover some of the issues with authorship.

Published 11 July 2016
Written by Dr Danny Kingsley
Creative Commons License

Show me the money – the path to a sustainable Research Data Facility

Like many institutions in the UK, Cambridge University has responded to research funders’ requirements for data management and sharing with a concerted effort to support our research community in good data management and sharing practice through our Research Data Facility. We have written a few times on this blog and presented to describe our services. This blog is a description of the process we have undertaken to support these services in the long term.

Funders expect that researchers make the data underpinning their research available and provide a link to this data in the paper itself. The EPSRC started checking compliance with their data sharing requirement on 1 May 2015. When we first created the Research Data Facility we spoke to many researchers across the institution and two things became very clear. One was that there was considerable confusion about what actually counts as data, and the second was that sharing data on publication is not something that can be easily done as an afterthought if the data was not properly managed in the first place.

We have approached these issues separately. To try and determine what is actually required from funders beyond the written policies we have invited representatives from our funders to come to discussions and forums with our researchers to work out the details. So far we have hosted Ben Ryan from the EPSRC, Michael Ball from the BBSRC and most recently David Carr and Jamie Enoch from the Wellcome Trust and CRUK respectively.

Dealing with the need for awareness of research data management has been more complex. To raise awareness of good practice in data management and sharing we embarked on an intense advocacy programme and in the past 15 months have organised 71 information sessions about data sharing (speaking with over 1,700 researchers). But we also needed to ensure the research community was managing its data from the beginning of the research process. To assist this we have developed workshops on various aspects of data management (hosting 32 workshops in the past year), a comprehensive website, a service to support researchers with their development of their research data management plans and a data management consultancy service.

So far, so good. We have had a huge response to our work, and while we encourage researchers to use the data repository that best suits their material, we do offer our institutional repository Apollo as an option. We are, as of today, hosting 499 datasets in the repository. The message is clearly getting through.

Sustainability

The word sustainability (particularly in the scholarly communication world) is code for ‘money’. And money has become quite a sticking point in the area of data management. The way Cambridge started the Research Data Facility was by employing a single person, Dr Marta Teperek, for one year, supported by the remnants of the RCUK Transition Fund. It quickly became obvious that we needed more staff to manage the workload and now the Facility employs half an Events and Outreach Coordinator and half a Repository Manager plus a Research Data Adviser who looks after the bulk of the uploading of data sets into the repository.

Clearly there was a need to work out the longer term support for staffing the Facility – a service for which there are no signs of demand slowing. Early last year we started scouting around for options.  In April 2013 the RCUK released some guidance that said it was permissible to recover costs from grants through direct charges or overheads – but noted institutions could not charge twice. This guidance also mentioned that it was permissible for institutions to recover costs of RDM Facilities as other Small Research Facilities, “provided that such facilities are transparently charged to all projects that use them”.

Transparency

On the basis of that advice we established a Research Data Facility as a Small Research Facility according to the Transparent Approach to Costing (TRAC) methodology. Our proposal was that the Facility’s costs would be recovered from grants as directly allocated costs. We chose this option rather than overheads because of the advantage of transparency to the funder of our activities. Charging grants this way meant a bigger advocacy and education role for the Facility. But the advantage is that it would make researchers aware that they need to consider research data management seriously, that this involves both time and money, and that it is an integral part of a grant proposal.

Dr Danny Kingsley has argued before (for example in a paper ‘Paying for publication: issues and challenges for research support services’) that by centralising payments for article processing charges, the researchers remain ignorant of the true economics of the open access system in the way that they are generally unaware of the amounts spent on subscriptions. If we charged the costs of the Facility into overheads, it would become yet another hidden cost and another service that ‘magically’ happens behind the scenes from the researcher’s point of view.

In terms of the actual numbers, direct costs of the Research Data Facility included salaries for 3.2 FTEs (a Research Data Facility Manager, Research Data Adviser, 0.5 Outreach and Engagement Coordinator, 0.5 Repository Manager, 0.2 Senior Management time), hardware and hardware maintenance costs, software licences, costs of organising events as well as the costs of staff training and conference attendance. The total direct annual cost of our Facility was less than £200,000. These are the people costs of the Facility and are not to be confused with the repository costs (for which we do charge directly).

Determining how much to charge

Throughout this process we have explored many options for trying to assess a way of graduating the costing in relation to what support might be required. Ideally, we would want to ensure that the Facility costs can be accurately measured based on what the applicant indicated in their data management plan. However, not all funders require data management plans. Additionally, while data management plans provide some indication of the quantity of data (storage) to be generated, they do not allow a direct estimate of the amount of data management assistance required during the lifetime of the grant. Because we could not assess the level of support required for a particular research project from a data management plan, we looked at an alternative charging strategy.

We investigated charging according to the number of people on a team, given that the training component of the Facility is measurable by attendees to workshops. However, after investigation we were unable to easily extract that type of information about grants and this also created a problem for charging for collaborative grants. We then looked at charging a small flat charge on every grant requiring the assistance of the Facility and at charging proportionally to the size (percentage of value) of the grant. Since we did not have any compelling evidence that bigger grants require more Facility assistance, we proposed a model of flat charging on all grants that require Facility assistance. This model was also the most cost-effective from an administrative point of view.
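The difference between the two charging models can be shown with a few lines of arithmetic. This is only a sketch: the annual cost, the number of grants and the grant values below are all hypothetical and are not figures from our Business Case.

```python
def flat_charges(annual_cost, grant_values):
    """Every grant using the Facility pays the same amount."""
    share = annual_cost / len(grant_values)
    return [share for _ in grant_values]


def proportional_charges(annual_cost, grant_values):
    """Each grant pays in proportion to its share of total grant value."""
    total_value = sum(grant_values)
    return [annual_cost * value / total_value for value in grant_values]


# Hypothetical example: recover 8,000 from four grants of different sizes.
cost_to_recover = 8_000
grants = [50_000, 150_000, 300_000, 500_000]

flat = flat_charges(cost_to_recover, grants)
proportional = proportional_charges(cost_to_recover, grants)

for value, f, p in zip(grants, flat, proportional):
    print(f"grant {value:>7,}: flat {f:>7,.0f}   proportional {p:>7,.0f}")
```

Both models recover the same total. The flat model needs only a count of the grants using the Facility, whereas the proportional model needs the value of every grant, which is part of why the flat charge is simpler to administer.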

As an indicator of the amount of work involved in the development of the Business Case, and the level of work and input that we have received relating to it, the document is now up to version 18 – each version representing a recalculation of the costings.

Collaborative process

A proposal such as we were suggesting – that we charge the costs of the Facility as a direct charge against grants – is reasonably radical. It was important that we ensure the charges would be seen as fair and reasonable by the research community and the funders. To that end we have spent the best part of a year in conversation with both communities.

Within the University we had useful feedback from the Open Access Project Board (OAPB) when we first discussed the option in July last year. We are also grateful to the members of our community who subsequently met with us in one-on-one meetings to discuss the merits of the Facility and the options for supporting it. At the November 2015 OAPB meeting, we presented a mature Business Case. We have also had to clear the Business Case through meetings of the Resource Management Committee (RMC).

Clearly we needed to ensure that our funders were prepared to support our proposal. Once we were in a position to share a Business Case with the funders we started a series of meetings and conversations with them.

The Wellcome Trust was immediate in its response – they would not allow direct charging to grants as they consider this to be an overhead cost, which they do not pay. We met with Cancer Research UK (CRUK) in January 2016 and there was a positive response about our transparent approach to costing and the comprehensiveness of services that the Facility provides to researchers at Cambridge. These issues are now being discussed with senior management at CRUK and discussions with CRUK are still ongoing at the time of writing this report (May 2016). [Update 24 May: CRUK agreed to consider research data management costs as direct costs on grant applications on a case by case basis, if justified appropriately in the context of the proposed research].

We encourage open dialogue with the RCUK funders about data management. In May 2015 we invited Ben Ryan to come to the University to talk about the EPSRC expectations on data management and how Cambridge meets these requirements. In August 2015 Michael Ball from the BBSRC came to talk to our community. We had an indication from the RCUK that our proposal was reasonable in principle. Once we were in a position to show our Business Case to the RCUK we invited Mark Thorley to discuss the issue and he has been in discussion with the individual councils for their input to give us a final answer.

Administrative issues

Timing in a decision like this is challenging because of the large number of systems within the institution that would be affected if a change were to occur. In anticipation of a positive response we started the process of ensuring our management and financial systems were prepared and able to manage the costing into grants – to ensure that if a green light were given we would be prepared. To that end we have held many discussions with the Research Office on the practicalities of building the costing into our systems to make sure the charge is easy to add in our grant costing tool. We also had numerous discussions on how to embed these procedures in their workflows for validating whether the Facility services are needed and what to do if researchers forget to add them. The development has now been done.

A second consideration is the necessity to ensure all of the administrative staff involved in managing research grants (at Cambridge this is a group of over 100 people) are aware of the change and how to manage both the change to the grant management system and also manage the questions from their research community. Simultaneously we were also involved in numerous discussions with our invaluable TRAC team at the Finance Division at the University who helped us validate all the Facility costs (to ensure that none of the costs are charged twice) and establish cost centres and workflows for recovering money from grants.

Meanwhile we have had to keep our Facility staff on temporary contracts until we are in a position to advertise the roles. There is a huge opportunity cost in training people up in this area.

Conclusion

As it happened, the RCUK has come back to us to say that we can charge this cost to grants but as an overhead rather than direct cost. Having this decision means we can advertise the positions and secure our staffing situation. But we won’t be needing the administrative amendments to the system, nor the advocacy programme.

It has been a long process given we began preparing the Business Case in March 2015. The consultation throughout the University and the engagement of our community (both research and funder) has given us an opportunity to discuss the issues of research data management more widely. It is a shame – from our perspective – that we will not be able to be transparent about the costs of managing data effectively.

The funders and the University are all working towards a shared goal – we want a culture change towards more open research, including the sharing of research data. To achieve this we need a more aware and engaged research community on these matters. There is much advocacy to do.

Published 8 May 2016
Written by Dr Danny Kingsley and Dr Marta Teperek
Creative Commons License

Watch this space – the first OSI workshop

It was always an ambitious project – trying to gather 250 high level delegates from all aspects of the scholarly communication process with the goal of better communication and idea sharing between sectors of the ecosystem. The first meeting of the Open Scholarship Initiative (OSI) happened in Fairfax, Virginia last week. Kudos to the National Science Communication Institute for managing the astonishing logistics of an exercise like this – and basically pulling it off.

This was billed as a ‘meeting between global, high-level stakeholders in research’ with a goal to ‘lay the groundwork for creating a global collaborative framework to manage the future of scholarly publishing and everything these practices impact’. The OSI is being supported by UNESCO who have committed to the full 10 year life of the project. As things currently stand, the plan is to repeat the meeting annually for a decade.

Structure of the event

The process began in July last year with emailed invitations from Glenn Hampson, the project director. For those who accepted the invitation, a series of emails from Glenn started with tutorials attached to try and ensure the delegates were prepared and up to speed. The emails gathered momentum with online discussions between participants. Indeed much was made of the (many) hundreds of emails the event had generated.

The overall areas the Open Scholarship Initiative hopes to cover include research funding policies, interdisciplinary collaboration efforts, library budgets, tenure evaluation criteria, global institutional repository efforts, open access plans, peer review practices, postdoc workload, public policy formulation, global research access and participation, information visibility, and others. Before arriving delegates had chosen their workgroup topic from the following list:

  • Embargos
  • Evolving open solutions (1)
  • Evolving open solutions (2)
  • Information overload & underload
  • Open impacts
  • Peer review
  • Usage dimensions of open
  • What is publishing? (1)
  • What is publishing? (2)
  • Impact factors
  • Moral dimensions of open
  • Participation in the current system
  • Repositories & preservation
  • What is open?
  • Who decides?

The 190+ delegates from 180+ institutions, 11 countries and 15 stakeholder groups gathered together at George Mason University (GMU), and after preliminary introductions and welcomes the work began immediately with everyone splitting into their workgroups. We spent the first day and a half working through our topics and preparing a short presentation for feedback on the second afternoon. There was then another working session to finalise the presentations before the live-streamed final presentations on the Friday morning. These presentations are all available in Figshare (thanks to Micah Vandegrift).

The event is trying to address some heady and complex questions and it was clear from the first set of presentations that in some instances it had been difficult to come to a consensus, let alone a plan for action. My group had the relative luxury of a topic that is fairly well defined – embargoes. It might be useful for the next event to focus on specific topics and move from the esoteric to the practical.

In addition the meeting had a team of ‘at large’ people who floated between groups to try and identify themes. Unsurprisingly, the ‘Primacy of Promotion and Tenure’ was a recurring theme throughout many of the presentations. It has been clear for some time that until we can achieve some reform of the promotion and tenure process, many of the ideas and innovations in scholarly communication won’t take hold. I would suggest that the different aspects of the reward/incentive system would be a rich vein to mine at OSI2017.

Closed versus open

In terms of outcomes there was some disquiet beforehand, by people who were not attending, about the workshop effectively being ‘closed’. This was because there was a Chatham House Rule for the workgroups to allow people to speak freely about their own experiences.

There was also some disquiet among those people who were attending about a request that the workgroups remain device-free. This was to try and discourage people checking emails and not participating. However people revert to type – in our group we all used our devices to collaborate on our documents. In the end we didn’t have much of a choice: the incredibly high tech room we were using in the modern GMU library flummoxed us and we were unable to get the projector to work.

That all said, there is every intention to disseminate the findings of the workshops widely and openly. During the feedback and presentations sessions there was considerable Twitter discussion at #OSI2016 – there is a downloadable list of all tweets in figshare – note there were enough to make the conference trend on Twitter at one point. This networked graphic shows the interrelationships across Twitter (thanks to Micah and his colleague). In addition there will be a report published by George Mason University Press incorporating the summary reports from each of the groups.

Team Embargo

Our workgroup, like all of them, represented a wide mix of interest groups. We were:

  • Ann Riley – President, Association of College and Research Libraries
  • Audrey McCulloch, Chief Executive, Association of Learned and Professional Society Publishers
  • Danny Kingsley – Head of Scholarly Communication, Cambridge University
  • Eric Massant, Senior Director of Government and Industry Affairs, RELX Group
  • Gail McMillan, Director of Scholarly Communication, Virginia Tech
  • Glenorchy Campbell, Managing Director, British Medical Journal North America
  • Gregg Gordon, President, Social Science Research Network
  • Keith Webster, Dean of Libraries, Carnegie Mellon University
  • Laura Helmuth, incoming president, National Association of Science Writers
  • Tony Peatfield, Director of Corporate Affairs, Medical Research Council, Research Councils UK
  • Will Schweitzer, Director of Product Development, AAAS/Science

It might be worth noting here that our workgroup was naughty and did not agree beforehand on who would facilitate, so therefore no-one had attended the facilitation pre-workshop webinar. This meant our group was gloriously facilitator and post-it note free – we just got on with it.

Banishing ghosts

We began with some definitions about what embargoes are, noting that press embargoes, publication embargoes and what we called ‘security’ embargoes (like classified documents) all serve different purposes.

Embargoes are not ‘all bad’. In the instance of press embargoes they allow journalists early access to the publication in order for them to be able to investigate and write/present informed pieces in the media. This benefits society because it allows for stronger press coverage. In terms of security embargoes they protect information that is not meant to be in the public domain. However embargoes on Author’s Accepted Manuscripts in repositories are more contentious, with qualified acceptance that these are a transitional mechanism in a shift to full open access.

The causal link of green open access resulting in subscription loss is not yet proven. The September 2013 UK Business, Innovation and Skills Committee Fifth Report: Open Access stated “There is no available evidence base to indicate that short or even zero embargoes cause cancellation of subscriptions”. In 2012 the Committee for Economic Development Digital Connections Council in The Future of Taxpayer-Funded Research: Who Will Control Access to the Results? concluded that “No persuasive evidence exists that greater public access as provided by the NIH policy has substantially harmed subscription-supported STM publishers over the last four years or threatens the sustainability of their journals”.

However there is no argument that traffic on websites for journals that rely on advertising dollars (such as medical journals) suffers when the attention is pulled to another place. This clearly potentially affects advertising revenue, which in turn can impact on the financial model of those publications.

During our discussions about the differences between press embargoes and publication embargoes I mentioned some recent experiences in Cambridge. The HEFCE Open Access Policy requires us to collect Author’s Accepted Manuscripts at the time of acceptance and make the metadata about them available, ideally before publication. We respect publishers’ embargoes and keep the document itself locked down until these have passed post-publication. However we have been managing calls from sometimes distressed members of our research community who are worried that making the metadata available prior to publication will result in the paper being ‘pulled’ by the journal. Whether this has ever actually happened I do not know – and indeed would be happy to hear from anyone who has a concrete example so we can start managing reality instead of rumour. The problem in these instances is the researchers are confusing the press embargo with the publication embargo.

And that is what this whole embargo discussion comes down to. Much of the discourse and arguments about embargoes are not evidence based. There is precious little evidence to support the tenet that sits behind embargoes – which is that if publishers allow researchers to make copies of their work available open access then they will lose subscriptions. The lack of evidence does not prevent the possibility it is true however – and that is why we need to settle the situation once and for all. If there is a sustainability issue for journals because of wider green open access then we need to put some longer term management in place and work towards full open access.

It is possible the problem is not repositories, institutional or subject-based. Many authors are making the final version of their published work available in contravention of their Copyright Transfer Agreement in ResearchGate or Academia.edu. It might be that this availability of work is having an impact on researchers’ usage of work on the publishers’ sites. Given that in institutional repositories repository managers make huge efforts to comply with complicated embargoes it is quite possible that repositories are not the problem. Indeed, only a small proportion of work is made available through repositories according to the August 2015 Monitoring the Transition to Open Access report (look at ‘Figure 9. Location of online postings (including illicit postings)’ on page 38). If this is the case, requiring institutions to embargo the Author’s Accepted Manuscripts they hold in their repositories for long periods will not make any difference. They are not the solution.

Our conclusion from our preliminary discussions was that there needs to be some concrete, rigorous research into the rationale behind embargoes to inform publishers, researchers and funders.

Our proposal – research questions

In response to this the Embargo workgroup decided that the most effective solution was to collaborate on an agreed research process that will have the buy-in of all stakeholders. The overarching question that we want to try and answer is ‘What are the impacts of embargoes on scholarly communication?’ with the goal to create an evidence base for informed discussion on embargoes.

In order to answer that question we have broken the big issue into a series of smaller questions:

  • How are embargoes determined?
  • How do researchers/students find research articles?
  • Who needs access?
  • Impact of embargoes on researchers/students?
  • Effect of embargoes on other stakeholders?

We decided that if the research found there was a case for publication embargoes then agreement on the metrics that should be used to determine the length of an embargo would be helpful. We are hoping that this research will allow standards to be introduced in the area of embargoes.

Discoverability and the issue of searching behaviour is extremely relevant in this space. Our hypothesis is that if people are following publishers’ journal pages to find material, then the fact that some of the same information is dispersed amongst lots of repositories means that the publisher arguments that embargoes protect their finances are weakened. However, if people are primarily using centralised search engines such as Google Scholar (which favours open versions of articles over paid ones) then that strengthens the publisher argument that they need embargoes to protect revenue.

The other question is whether access really is an issue for researchers. The March 2015 STM Report looked at the research in this area, which indicates that well over 90% of researchers surveyed in separate studies said research papers were easy or fairly easy to access. On the face of it, this appears to suggest there is little problem in the way of access (look for the ‘Researchers’ access to journals’ section starting p83). Rather than repeating these surveys, indicators for how much embargoes restrict access to researchers could include:

  • The usage of Request a Copy buttons in repositories
  • The number of ‘turn-aways’ from publishers’ platforms
  • The take-up level of Pay Per View options on publisher sites
  • The level of usage of ‘Get it Now’ – where the library obtains a copy through interlibrary loan or document delivery and absorbs the cost.

Our proposal – Research structure

The project will begin with a Literature Review and an investigation into the feasibility of running some Case Studies.

Two clear Case Studies could provide direct evidence if the publishers were willing to share what they have learned. In both cases, there has been a move from an embargo period for green OA to removing embargoes completely. In the first instance, Taylor and Francis began a trial in 2011 to allow immediate green OA for their library and information science journals, meaning that authors published in 35 library and information science journals have the right to deposit their Accepted Manuscript into their institutional repository and make it immediately available. Authors who choose to publish in these journals are no longer asked to assign copyright. They now sign a license to publish, which allows Taylor & Francis to publish the Version of Record. Additionally, authors can choose to make their work green open access with no embargoes applied. In 2014 the pilot was extended for ‘at least a further year’.

As part of the pilot, Taylor and Francis say a survey was conducted by Routledge to canvass opinions on the Library & Information Science Author Rights initiative and also investigated author and researcher behaviour and views on author rights policies, embargoes and posting work to repositories. The survey elicited over 500 responses, including: “Having the option to upload their work to a repository directly after publication is very important to these authors: more than 2/3 of respondents rated the ability to upload their work to repositories at 8, 9, or 10 out of 10, with the vast majority saying they feel strongly that authors should have this right”. There are no links to this survey that I have been able to uncover. It would be useful to include this survey in the Literature Review and possibly build on it for other stakeholders.

The second Case Study is Sage, which in 2013 decided to move to an immediate green policy. Both examples would have enough data by now to indicate if these decisions have resulted in subscription cancellations. I have proposed this type of study before, to no avail. Hopefully we might now have more traction.

The Literature Review and Case Studies will then inform the development of a Survey of different stakeholders – which may have to be slightly altered depending on the audience being surveyed.  This is an ambitious goal – because the intention is to have at least preliminary findings available for discussion at the next OSI in 2017.

There was some lively Twitter discussion in the room about our proposal to do the study. Some were saying that the issue is resolved. I would argue that anyone who is negotiating the embargo landscape at the moment (such as repository managers) would strongly disagree with that position. Others referred to research already done in this space, for example the Publishing and the Ecology of European Research (PEER) project. This study does discuss embargoes but approaches the question from the position that embargoes are valid. The study we are proposing is asking specifically if there is any evidence base for embargoes.

Next steps

We will be preparing a project brief and our report for the OSI publication over the next couple of weeks.

The biggest issue for the project will be for us to gather funding. We have done a preliminary assessment of the time required to do the work so we could work out a ballpark figure for the fundraising goal. Note that our estimation of the number of workdays required for the project was deemed as ‘ludicrously low’ by a consultant in discussion later.

It was noted by a funder in casual discussions that because publishers have a vested interest in embargoes they should fund research that investigates their validity. Indeed Elsevier have already offered to assist financially for which we are grateful, but for this work to be considered robust and for it to be widely accepted it will need to be funded from a variety of sources. To that end we intend to ‘crowd fund’ the research in batches of $5000. The number of those batches will depend on the level of our underestimation of the time required to undertake the work (!).

In terms of governance, Team Embargo (perhaps we might need a better name…) will be working together as the steering committee to develop the brief, organise funding and choose the research team to do the work. We will need to engage an independent researcher or research group to ensure impartiality.

Wrap up summary of the workshop

There were a few issues relating to the organisation of the workshop. Much was made of the many hundreds of emails that were sent both from the organising group and also amongst the delegates beforehand. This level of preliminary discussion was beneficial but using another tool might help. It was noted that the volume of email was potentially the reason why some of the delegates who were invited did not attend.

There was a logistic issue in having 190+ delegates staying in a hotel situated in the middle of a set of highways that was a 30 minute bus ride away from the conference location at George Mason University (also situated in an isolated location). The solution was a series of buses to ferry us each way each day, and to and from the airport. We ate breakfast, lunch and dinner together at the workshop location. This combined with the lack of alcohol because we were at an undergraduate American campus (where the legal drinking age is 21) gave the experience something of a school camp feel. Coming from another planned capital city (Canberra, Australia) I am sure that Washington is a beautiful and interesting place. This was not the visit to find that out.

These minor gripes aside, as is often the case, the opportunity to meet people face to face was fantastic. Because there was a heavy American flavour to the attendees, I have now met in person many of the people I ‘know’ well through virtual exchanges. It was also a very good process to work directly with a group of experienced and knowledgeable people who all contributed to a tangible outcome.

OSI is an ambitious project, with plans for annual meetings over the next decade. It will be interesting to see if we really can achieve change.

Published 24 April 2016
Written by Dr Danny Kingsley
Creative Commons License

Consider yourself disrupted – notes from RLUK2016

The 2016 Research Libraries UK conference was held at the British Library from 9-11 March on the theme of disruptive innovation. This blog pulls out some of the highlights personally gained from the conference:

  • If librarians are to be considered important – we as a community need to be strong in our grasp of understanding scholarly communication issues
  • We need to know the facts about our subscriptions to, usage of and contributions to scholarly publishing
  • We need high level support in institutions to back libraries in advocacy and negotiation with publishers
  • Scientists are rarely rewarded for being right, so the scientific record is being distorted by the scientific ecosystem
  • Society needs more open research to ensure reproducibility and robust research
  • The library of the future will have to be exponentially more customisable than the current offering
  • The information seeking behaviour of researchers is iterative and messy and does not match library search services
  • Libraries need to ‘create change to triumph’ – to be inventors rather than imitators
  • Management of open access issues needs to be shared across institutions with positive outcomes when research offices and libraries collaborate.

I should note this is not a comprehensive overview of the conference, and I have blogged separately about my own contribution ‘The value of embracing unknown unknowns’. Some talks were looking at the broader picture, others specifically at library practice.

Stand your ground – tips for successful publisher negotiations

The opening keynote presentation was by Professor Gerard Meijer, President of Radboud University who conducted the recent Dutch negotiations with Elsevier.

The Dutch position has been articulated by Sander Dekker, the State Secretary of Education, who said that while the way forward was gold Open Access, the government would not provide any extra money. Meijer noted this was sensible because every extra cent going into the system goes into the pocket of publishers – something that has been amply demonstrated in the UK.

All universities in the Netherlands are in the top 200 universities in the world. This means all research is good quality – so even if it is only 2% of the world output, the Netherlands has some clout.

Meijer gave some salient advice about these types of negotiations. He said this work needs to be undertaken at the highest level at the universities. There are several reasons for this. He noted that 1.5 to 2 percent of university budget goes to subscriptions – and this is growing as budgets are being cut – so senior leadership in institutions should take an active position.

In addition, if you are not willing to completely opt out of licensing their material then you can’t negotiate, and if you are going to opt out you will need the support of the researchers. To that end communication is crucial – during their negotiations, they would send a regular newsletter to researchers letting them know how things were going.

Meijer also stressed the importance of knowing the facts, and the need to communicate and inform the researchers about these facts and the numbers. He noted that most researchers don’t know how much subscriptions cost. They do know however about article processing charges – creating a misconception that Open Access is more expensive.

Institutions in the Netherlands spent €9.2 million on Elsevier publications in 2009, which rose to €11 million* in 2014. Meijer noted that he was ‘not allowed’ to tell us this information due to confidentiality clauses. He drolly observed “It will be an interesting court case to be sued for telling the taxpayers how their money is being spent”. He also noted that because Elsevier is a public company their finances are available, and while their revenue goes up, their costs stay the same.

Apparently Wiley and Springer are willing to go into agreements. However Elsevier are saying that a global business model doesn’t match with a local business requirement. The Netherlands has not yet signed the contract with Elsevier as they are working out the detail.

Broadly the deal is for three years, from 2016 to 2018. The plan is to grow the Open Access output from nothing to 10% in 2016, 20% in 2017 and 30% in 2018, and to do that without having to pay APCs. To achieve this they have to identify the journals to make Open Access, by defining domains in which all journals will be made open access.

Meijer concluded this was a big struggle – he would have liked to have seen more – but what we have is good for science. Dutch research will be open in fields where most Open Access is happening and researchers are paying APCs. Researchers can look at the long list of journals that are OA and then publish there.

*CORRECTION: Apologies for my mistyping. Thanks to @WvSchaik for pointing out this error on Twitter. The slide is captured in this tweet.

The future of the research library

Nancy Fried Foster from Ithaka S+R and Kornelia Tancheva from Cornell University Library spoke about research practices and the disruption of the research library. They started by noting that researchers work differently now, using different tools. The objective of their ‘A day in the life of a serious researcher’ work was exploring research practices to inform the vision of the library of the future and identify improvements we could make now.

They developed a very fine-grained method of seeing what people do which focuses on what people really do in the workplace. This used a participatory design approach. Participants (who were mainly post graduates) were asked to map or log their movements in one single day where at least some of their time was engaged in research. The team then sat with the person the following day to ask them to narrate their day – and talk about seeking, finding and using information. There was no distinction between academic and non-academic activity.

The team looked at the things that people were doing and the things that the library could and will be. The analysis took a lot of time, organising into several big categories:

  • Seeking information
  • Academic activities
  • Library resources
  • Space, self management and
  • Circum-academic activities – activities allied to the researcher’s academic line but not central.

They also coded for ‘obstacles’ and ‘brainwork’.

The participants described their information seeking as fluid and constant – ‘you can just assume I am kind of checking my email all the time’. They also distinguished between search and research. One quote was ‘I know the library science is very systematic and organised and human behaviour is not like that’.

Information seeking is an iterative process, it is constant and not systematic. The search process is highly idiosyncratic – our subjects have developed ways of searching for information that worked for them. It doesn’t matter if it is efficient or not. They are self conscious that it is messy. ‘I feel like the librarians must be like “this is the worst thing I have ever heard”’.

Information evaluation is multi-tiered – eg: ‘If an article is talking about people I have heard of it is worth reading’. Researchers often use a mash up of systems that will work for that project. For example email is used as an information management tool.

Connectivity is important to researchers, it means you can work anywhere and switch rapidly between tasks. It has a big impact on collaboration – working with others was continuously mentioned in the context of writing. However sometimes researchers need to eliminate technology to focus.

Libraries have traditionally focused too much on search and not enough on brain work – this is a potential role for libraries. References to the library occurred throughout the process. Libraries are often thought of as a place of refuge – especially for the much needed brain work. Researchers also need self management, to enable them to manage their time and prioritise the demands on their attention. Strategies depended on a complicated relationship with technology.

The major themes emerging from the work are that search is idiosyncratic and not important, research has no closure, experts rule and research is collaboration. The implication is that the future library is a hub, not just focusing on a discovery system but connecting people with knowledge and technologies.

If we were building a library from scratch today what would it look like? There will need to be a huge amount of customisation to adjust tools to suit researchers’ personal preferences. The library of the future will have to be exponentially more customisable than the current offering. Libraries will have to make available their resources on customisable platforms. We need to shift from non-interoperable tools to customisation.

So if the future were here today we would think of the future library as both an academic hub (improving current library services) and an application store. We should take on even more of a social media aspect. Think of a virtual ‘app store’ on an open source platform that provides the option for people to suggest short cuts, with developers employed to develop these modules quickly. Libraries should take a leadership role in ensuring vendor platforms can be integrated, so that all library resources speak easily to the systems our users are using. We need to provide individualised services rather than one size fits all.

Scientific Ecosystems and Research Reproducibility

The scientific reward structure determines the behaviour of researchers and this has spawned the reproducibility crisis, according to Marcus Munafo from the University of Bristol.

Marcus started by talking about the P value, where the conventional threshold for statistical significance is 0.05 – that is, if there were no real effect, a result at least this strong would turn up by chance fewer than five times in 100. Generally, studies need to cross this threshold to get published, so there is evidence to show that original studies often suggest a large effect – however, when replication is attempted, these effects cannot be reproduced.

Scientists are supposed to be impartial observers, but in reality they need to get grants, and publish papers to get promoted to more ‘glamorous institutions’ (Marcus’ words). Scientists are rarely rewarded for being right, so the scientific record is being distorted by the scientific ecosystem.

Marcus noted it is common to overstate your data or error check your data if your first analysis doesn’t tell you what you are looking for. This ‘flexible analysis’ is quite commonplace, if we look at literature as a whole. Often there is not enough detail in the paper to allow the reproducibility of the work. There are nearly as many unique analysis pipelines as there were studies in the sample – so this flexibility in the joint analysis tool gets leveraged to get the result you want.
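To see why this kind of flexibility matters, here is a rough back-of-the-envelope sketch. It was not part of Marcus's talk, and it rests on a simplifying assumption of mine: that each alternative analysis behaves like an independent test at the usual 0.05 threshold, and that there is no real effect to find. In practice analyses of the same data are correlated, so the real inflation would be smaller, but the direction is the same. The function name is purely illustrative.

```python
def chance_of_false_positive(num_analyses: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_analyses` tests comes out
    'significant' when there is no real effect.

    Assumes the tests are independent (a simplification for illustration).
    """
    return 1 - (1 - alpha) ** num_analyses


# Print how the risk grows as more analysis variants are tried.
for k in (1, 5, 10, 20):
    risk = chance_of_false_positive(k)
    print(f"{k:>2} analyses tried: {risk:.0%} chance of at least one spurious result")
```

Under these assumptions, trying ten different analyses takes the chance of a spurious 'significant' result from 5% to roughly 40%.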

There is also evidence that journal impact factors are a very poor indicator of quality, indeed they are a stronger indicator of retraction than of quality. The idea is that science as a whole will self correct. But science won’t sort itself out in a reasonable timeframe. If you look at the literature you see that replication is the exception rather than the norm.

One study showed among 83 articles recommending effective interventions, 40 had not been replicated, and of those that had been replicated many showed the works had stronger findings in the first paper than in the replication, and some were contradicted in the replication.

Your personal investment in the field shapes your position – unconscious biases affect all of us. If you come in as an early career scientist you get an impression that the field is more robust than it is in reality. There is hidden literature that is not citable – only by looking at this do you get a balanced sense of how robust the literature is. There are many studies that make a claim in the abstract that is not supported by a more impartial reading. Others are ‘optimistic’ in the abstract. The articles that describe bad news receive far fewer citations than would be expected. People don’t want to cite bad news. So is science self correcting?

We can introduce measures to help science self correct. In 2000 the requirement to register the outcome of clinical trials began. Once they had to pre-specify what the outcome would be then most of the findings were null. That is why it is a scientific ecosystem – the way we are incentivised has become distorted over the years.

Researchers are incentivised to produce a small number of papers that are eye catching. It is understandable why you would want to focus on quality over quantity. We can give more weight to confirmatory studies and try to move away from the focus on publishing certain types of studies. We shouldn’t be putting all our effort into high risk, high return.

What do we do about this? There can be top down measures, but individual groups can work in ways to improve the ways we work, such as adopting the open science way of working. This is not trivial – for example we can’t make data available without the consent of participants. Possible solutions include pre-registering all the plans, setting up studies so the data can be made open, and ensuring publications are gold OA. These measures serve as a quality control method because everything gets checked when people know it is going to be made available. We come down hard on academics who make conscious mistakes – but we should be encouraging people to identify their own errors.

We need to introduce quality control methods implicitly into our daily practice. Open data is a very good step in that direction. There is evidence that researchers who know their data is going to be made open are more thorough in their checking of it. Maybe it is time for an update in the way we do science – we have statistical software that can run hundreds of analyses, and we can do text and data mining of lots of papers. We need to build in new processes and systems that refine science and think about new ways of rewarding science.

Marcus noted that these are not new problems, quoting from Reflections on the Decline of Science in England written by Babbage in 1830.

Marcus referred to many different studies and articles in his talk, some of which I have linked out to here:

Creating change to triumph: A view from Australia

The idea of creating change to triumph was the message of Jill Benn, the Librarian at the University of Western Australia. She discussed Cambietics, the science of managing change. This was a theory developed in 1985 by Barrett, with three stages:

  • Coping with change to survive
  • Capitalising on change
  • Creating change to triumph.

This last is the true challenge – to be an inventor rather than an imitator. Jill gave the Australian context. The country is 32 times bigger than the UK, but has a third of the population, with 40 universities around the country. She noted that one of the reasons libraries in Australia have collaborated is the isolation.

Research from Australia accounts for 4% of the world’s research output, it is the third largest export after energy, and out-performs tourism. The political landscape really affects higher education. There has been a series of five prime ministers in five years.

Australia has invested heavily in research infrastructure – mostly telescopes and boats. The Australian National Data Service was created and this has built the Research Data Australia interface – an amazing system full of data. The libraries have worked with researchers to populate the repository. There has been a large amount of capacity building. ANDS worked with libraries to build the capacities – the ’23 things’ training programme. You self register – on 1 March, 840 people had signed up for the programme.

The most recent element of the government’s agenda has been innovation. Prime Minister Turnbull has said he wanted to end the ‘publish or perish’ culture of research to increase the impact on community. There is a national innovation and science agenda and the government would no longer take into account publications for research. It is likely the next ERA (Australia’s equivalent of the REF) will involve impact in the community. The latest call is “innovation is the new black”.

There is financial pressure on the University sector – which pays in US dollars which is a problem. The emphasis on efficiency means the libraries have to show value and impact to the research sector.

Many well-developed services exist in university libraries to support research. Australian institutional repositories now have over 650K full text items, which are downloaded over 1 million times annually, there are data librarians and scholarly communication librarians. Some of the ways in which libraries have been asked to deliver capacity is CAUL and its Research Advisory Committee – to engage in the government’s agenda. There are three pillars – capacity building, engagement and advocacy, to promote the work of our libraries to bodies like Universities Australia.

Jill also mentioned the Australasian Open Access Strategy Group which has had a green rather than a gold approach. Australians are interested in open access. It is not yet clear what the role of institutional repositories will be into the future, in an environment where the government wants us to share our research.

How can we benchmark the Australian context? It is difficult. We can look at our associations and at what data we might be able to share. Jill quoted Ross Wilkinson: yes, there are individuals, but because of the collective way Australia has managed data we are better able to engage internationally. Despite the investment in repositories in Australia, the UK outperforms Australia.

Australian libraries see themselves as genuine partners for research and we have a healthy self confidence (!). Libraries must demonstrate value and impact and provide leadership. Australian libraries have created change to triumph.

Open access mega-journals and the future of scholarly communication

This talk was given by Professor Stephen Pinfield from Sheffield University. He talked about the Open Access Mega Journal project he is working on with potentially disruptive open access journals (the Twitter handle is @oamj_project).

He began where it all began – with PLOS ONE, which is now the biggest journal in the world. Stephen noted that mega journals are full of controversy, listing comments ranging from them being the future of academic publishing, a disruptive innovation to the best possible future system.

Critics, however, see them variously as a dumping ground, career suicide for early career researchers publishing in them and a cynical money making venture. Pinfield noted that, despite considerable searching, acknowledging what ‘people say’ is different from being able to provide attributed negative statements about mega-journals.

The open access and wide scope nature of mega-journals reverses the trend over the past few years where journals have been further specialising. They are identifiable by their approach to quality control, with an emphasis on scientific soundness only rather than subjective assessments of novelty, and also by their post publication metrics.

Pinfield noted that there are economies of scale for mega journals – this means that we have a single set of processes and technologies. This enables a tiered scholarly publishing system. Mega-journals potentially allow highly selective journals (which often argue that they reject so much they couldn’t afford to go OA) to go open access. Pinfield hypothesised that a business model could be where a layer of highly selective titles sits above a layer of moderately selective mega journals. The moderately selective journals provide the financial subsidy but the highly selective ones provide the reputational subsidy. PLOS is a good example of this symbiotic relationship.

The emphasis on ‘soundness’ in the quality control process reduces the subjectivity of judgements of novelty and importance and potentially shifts the role and the power of the gatekeepers. Traditionally the editors and editorial board members have been the arbiters of what is novel.

However this opens up some questions. If it is only a ‘soundness’ judgement then the question is whether power is shifted for good or ill? Also does the idea of ‘soundness’ translate to the Humanities? There is also the problem of an overreliance on metrics. Are the citation values of journals driven by the credibility or the visibility of the journals?

Pinfield emphasised the need for librarians to be informed and credible about their understanding of these topics. If librarians are to be considered important – we as a community need to be strong in our grasp of understanding these issues. There is an ongoing need to keep up to date and remain credible.

Working together to encourage researcher engagement and support

There were several talks about how institutions have been engaging researchers, and many of them emphasised the need to federate the workload across the institution. Chris Awre from the University of Hull discussed some work he was doing with Valerie McCutcheon on the current interaction between the library and other parts of the institution in supporting OA, and on understanding how OA is and could be embedded.

The survey revealed a desire for the management of Open Access to be more spread across the institution into the future. Libraries should be more involved in the management of the research information system and managing the REF. However Library involvement in getting Open Access into grant applications is lower – this is a research role, but it is worth asking how much this underpins subsequent activity.

As an aside Chris noted a way of demonstrating the value of something is to call it an ‘office’ – this is something the Americans do. (Indeed it is something Cambridge has done with the Office of Scholarly Communication).

Chris noted that if researchers don’t think about open access as part of the scholarly communications workflow then they won’t do it. Libraries play a key role in advocating and managing OA – so how can they work with other institutional stakeholders in supporting research?

Valerie later spoke about blurring and blending the borders between the Library and the Research Office. She noted that when she was working for Research and Enterprise (RSEO) she thought library people were nice, but she was not sure what the people do there. When she transferred to working in the Library, the perception back the other way was the same.

But the Research Office and the Library need to cooperate on shared strategic priorities. They are both looking out for changes in the policy landscape, so they need to share information and collaborate on policy development and dissemination. They need better data quality in the research process to find solutions and create agile systems to support researchers.

At Glasgow the Library and RSEO were a good match because they had similar end uses and the same data. This began a close collaboration between the two offices, which worked together on the REF using Enlighten. They also linked their systems (Enlighten and Research Systems) in 2010, so users can browse the repository by funder name. Glasgow has had a publications policy rather than an open access policy since 2008.

Valerie also noted that it was crucial to have high-level support and showed a video of Glasgow’s PVC-R singing the praises of the work the Library was doing.

The Glasgow Open Access model has been ‘Act on acceptance’ since 2013 – a simple message with minimal bureaucracy. It is a centralised service with ‘no fancy meetings’. Valerie also noted that when they put events on they don’t say it is a Library event, and the sessions are subject based, not department based.

Torsten Reimer and Ruth Harrison discussed the support offered at Imperial College, where Torsten said he was originally employed to develop the College’s OA mandate, but then the RCUK and HEFCE policies came into place and changed everything. At Imperial, scholarly communications is seen as an overall concern for the College rather than specifically a Library issue.

Torsten noted the Library already had a good relationship with the departments. The Research Office is seen by researchers as a distraction from their research, but the Library is seen as helping research. However, because the two areas have been able to approach everything with one single aim, open access and scholarly support have been able to happen across the institution, and the library has been able to expand.

Imperial have one workflow and one system for open access, all managed through Symplectic (there had been separate systems before). Researchers fill in a simple form, and a ticketing-style customer workflow system plugged into Symplectic pulls the information out at the back end. This system has replaced four workflows, lots of spreadsheets and much cutting and pasting.

Sally Rumsey talked about how Oxford have successfully managed to engage their research community with their recently launched ‘Act on Acceptance’ communication programme.

Summary

This is a rundown of a few of the presentations that spoke to me. There were also excellent speed presentations; Lord David Willetts, the former Minister for Universities and Science, spoke; we split up into workshops; and there was a panel of library organisations from around the world who discussed working together.

The personal outcomes from the conference include:

  • An invitation to give a talk at Cornell University
  • An invitation to collaborate with some people at CILIP about ensuring scholarly communication is included in some of the training offered
  • Discussion about forming some kind of learned society for Scholarly Communication
  • Discussion about setting up a couple of webinars – ‘how to start up an office of scholarly communication’ and ‘successful library training programmes’
  • Also lots of ideas about what to do next – the issue of language and the challenges we are facing in scholarly communication because of language deserves some investigation.

I look forward to next year.

Published 14 March 2016
Written by Dr Danny Kingsley
Creative Commons License