
The case for Open Research: solutions?

This series arguing the case for Open Research has to date looked at some of the issues in scholarly communication today. Hyperauthorship, HARKing, the reproducibility crisis and a surge in retractions all stem from the requirement that researchers publish in high impact journals. The series has also looked at the invalidity of the impact factor and issues with peer review.

This series is one of an increasing cacophony of calls to move away from this method of rewarding researchers. Richard Smith noted in a recent BMJ blog criticising the current journal publication system: “The whole outdated enterprise is kept alive for one main reason: the fact that employers and funders of researchers assess researchers primarily by where they publish. It’s extraordinary to me and many others that the employers, mainly universities, outsource such an important function to an arbitrary and corrupt system.”

Universities need to open research to ensure academic integrity, adjust to support modern collaboration and scholarship tools, and begin rewarding people who have engaged in certain types of process rather than relying on traditional assessment schemes. This was the thrust of a talk in October last year, “Openness, integrity & supporting researchers“. If nothing else, this approach makes ‘nightmare scenarios’ less likely. As Prof Tom Cochrane said in the talk, the last thing an institution needs is to be on the front page because of a big fraud case.

What would happen if we started valuing and rewarding other parts of the research process? This final blog in the series looks at opening up research to increase transparency. The argument is that we need to move beyond rewarding only the journal article, and begin valuing other research outputs, such as data sets, as well as research productivity itself.

So, let’s look at how opening up research can address some of the issues raised in this series.

Rewarding study inception

In his presentation about HARKing (Hypothesising After the Results are Known) at FORCE2016, Eric Turner, Associate Professor at OHSU, suggested that what matters is the scientific question and methodological rigour. We should be emphasising not study completion but study inception, before we can be biased by the results. It is already a requirement to post results of industry-sponsored research in ClinicalTrials.gov – a registry and results database of publicly and privately supported clinical studies of human participants conducted around the world. Turner argues we should also be using it to verify the existence of studies. He suggested that protocols should be reviewed without the results (and without the manuscript's methods section, because this is written after the results are known).

There are some attempts to do this already. In 2013 the Registered Reports format was launched: “The philosophy of this approach is as old as the scientific method itself: If our aim is to advance knowledge then editorial decisions must be based on the rigour of the experimental design and likely replicability of the findings – and never on how the results looked in the end.” The proposal and process are described here. The guidelines for reviewers and authors are here, including the requirement to “upload their raw data and laboratory log to a free and publicly accessible file-sharing service.”

This approach has been met with praise by a group of scientists with positions on more than 100 journal editorial boards, who are “calling for all empirical journals in the life sciences – including those journals that we serve – to offer pre-registered articles at the earliest opportunity”. The signatories noted: “The aim here isn’t to punish the academic community for playing the game that we created; rather, we seek to change the rules of the game itself.” And that really is the crux of the argument: we need to move away from a single point of reward.

Getting data out there

There is definite movement towards opening research. In the UK there is now a requirement from most funders that the data underpinning research publications are made available. Down under, the Research Data Australia project is a register of data from over 100 institutions, providing a single point to search, find and reuse data. The European Union has an Open Data Portal.

Resistance to sharing data amongst the research community is often due to the idea that if data is released with the first publication, there is a risk that the researcher will be ‘scooped’ before they can get those all-important journal articles out. In response to this concern during a discussion with the EPSRC, it was pointed out that the RCUK Common Principles state that those who undertake Research Council funded work may be entitled to a limited period of privileged use of the data they have collected, to enable them to publish the results of their research. However, the length of this period varies by research discipline.

If the publication of data itself were rewarded as a ‘research output’ (which of course is what it is), then the issue of being scooped becomes moot. There have been small steps towards this goal, such as a standard method of citing data.

A new publication option is Sciencematters, which allows researchers to submit observations which are subjected to triple-blind peer review, so that the data is evaluated solely on its merits, rather than on the researcher’s name or organisation. As they indicate “Standard data, orphan data, negative data, confirmatory data and contradictory data are all published. What emerges is an honest view of the science that is done, rather than just the science that sells a story”.

Despite the benefits of having data available there are some vocal objectors to the idea of sharing data. In January this year a scathing editorial in the New England Journal of Medicine suggested that researchers who used other people’s data were ‘research parasites’. Unsurprisingly this position raised a small storm of protest (an example is here). This was so sustained that four days later a clarification was issued, which did not include the word ‘parasites’.

Evaluating & rewarding data

Ironically, one benefit of sharing data could be an improvement in the quality of the data itself. A 2011 study into why some researchers were reluctant to share their data found that reluctance was associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.

Professor Marcus Munafo, in his presentation at the Research Libraries UK conference held earlier this year, suggested that we need to build quality control methods implicitly into our daily practice. Open data is a very good step in that direction. There is evidence that researchers who know their data is going to be made open are more thorough in checking it. Maybe it is time for an update in the way we do science – we have statistical software that can run hundreds of analyses, and we can text and data mine large numbers of papers. We need to build in new processes and systems that refine science, and think about new ways of rewarding it.
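As an illustration of the kind of routine check such systems could automate, the sketch below recomputes a two-sided p-value from a reported t statistic and its degrees of freedom, and flags mismatches – the kind of reporting inconsistency the 2011 study above describes. This is a minimal sketch in Python, not any particular tool's implementation; the ‘reported’ values are invented for illustration and the tolerance is arbitrary.

```python
# A hedged sketch of a routine quality-control check: recompute a two-sided
# p-value from a reported t statistic and degrees of freedom, then flag
# results whose reported p-value does not match. The 'reported' numbers below
# are invented for illustration only.
from scipy import stats

reported_results = [
    # (label, t statistic, degrees of freedom, reported p-value)
    ("Study A, Table 2", 2.31, 28, 0.028),
    ("Study B, Table 1", 1.95, 60, 0.030),   # claimed to be significant
]

for label, t, df, reported_p in reported_results:
    recomputed_p = 2 * stats.t.sf(abs(t), df)   # two-sided p from the t distribution
    status = "consistent" if abs(recomputed_p - reported_p) < 0.005 else "check"
    print(f"{label}: reported p = {reported_p:.3f}, "
          f"recomputed p = {recomputed_p:.3f} -> {status}")
```

In this toy example the second entry is flagged because the recomputed p-value sits above 0.05 while the reported one sits below it – precisely the kind of discrepancy that bears on statistical significance.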

So should researchers be rewarded simply for making their data available? Probably not; some kind of evaluation is necessary. In a public discussion about data sharing held at Cambridge University last year, it was suggested that rather than formal peer review of data, it would be better to have an evaluation structure based on the re-use of data – for example, valuing data which is downloadable, well labelled and re-usable.

Need to publish null results

Generally, this series looking at the case for Open Research has argued that the big problem is the only thing that ‘counts’ is publication in high impact journals. So what happens to all the results that don’t ‘find’ anything?

Most null results are never published: a 2014 study found that of 221 sociological studies conducted between 2002 and 2012, only 48% of the completed studies had been published. This is a problem because not only is the scientific record inaccurate, but the resulting publication bias “may cause others to waste time repeating the work, or conceal failed attempts to replicate published research”.

But it is not just the academic reward system that is preventing the widespread publication of null results – the interference of commercial interests in the publication record is another factor. A recent study looked into the issue of publication agreements, and whether a research group had signed one prior to conducting randomised clinical trials for a commercial entity. The research found that 70% of protocols mentioned an agreement on publication rights between industry and academic investigators; in 86% of those agreements, industry retained the right to disapprove or at least review manuscripts before publication. Even more concerning was that journal articles seldom report on publication agreements, and, when they do, statements can be discrepant with the trial protocol.

There are serious issues with the research record due to selective reporting and selective publication, which would be ameliorated by a requirement to publish all results – including null results.

There are some attempts to address this issue. Since June 2002 the Journal of Articles in Support of the Null Hypothesis has been published twice a year. The World Health Organisation has a Statement on the Public Disclosure of Clinical Trial Results, saying: “Negative and inconclusive as well as positive results must be published or otherwise made publicly available”. In February last year PLOS ONE launched a collection focusing on negative, null and inconclusive results; the Missing Pieces collection had 20 articles in it as of today.

In January this year there were reports that a group of ten editors in management, organisational behaviour and work psychology had pledged to publish the results of well-conceived, designed and conducted research even if the result was null. The way this works is that the paper is first submitted without results or discussion, and is assessed on theory, methodology, measurement information and analysis plan.

Movement away from using the impact factor

As discussed in the first of this series of blogs, ‘The mis-measurement problem‘, we have an obsession with high impact journals. These blogs have been timely, falling as they have amid what seems to be a plethora of similarly focused commentary. An example is a recent Nature news story by Mario Biagioli, who argued that the focus on the impact of published research has created new opportunities for misconduct and fraudsters. The piece concludes that “The audit culture of universities — their love affair with metrics, impact factors, citation statistics and rankings — does not just incentivize this new form of bad behaviour. It enables it.”

In recent discussions amongst the Scholarly Communication community about this mis-measurement, it was suggested that we could address the problem by limiting the number of articles that can be submitted for promotion. This ideally reduces the volume of papers produced overall, or so the thinking goes. Harvard Medical School and the Computing Research Association “Best Practices Memo” were cited as examples by different people.

This is also the approach taken by the Research Excellence Framework in the UK – researchers put forward their best four works from the previous period (typically about five years). But it does not prevent poor practice. Researchers are constantly evaluated for all manner of reasons: promotion, competitive grants, tenure and admission to fellowships are just a few of the many contexts in which a researcher’s publication history will be considered.

Are altmetrics a solution? There is a risk that any alternative indicator becomes an end in itself. The European Commission now has an Open Science Policy Platform, which, amongst other activities, has recently established an expert group to advise on the role of metrics and altmetrics in the development of its agenda for open science and research.

Peer review experiments

Open peer review is where peer review reports identify the reviewers and are published with the papers. One of the more recent publishers to use this method of review is the University of California Press, whose open access megajournal Collabra launched last year. In an interview published by Richard Poynder, UC Press Director Alison Mudditt notes that there are many people who would like to see more transparency in the peer review process. There is some evidence that identifying reviewers results in more courteous reviews.

PLOS ONE publishes work after an editorial review process that excludes potentially subjective assessments of significance or scope in order to focus on technical, ethical and scientific rigour. Once an article is published, readers are able to comment on the work in an open fashion.

One solution could be that used by the CUP journal JFM Rapids, a ‘fast-track’ section offering rapid publication for short, high-quality papers. It also operates a policy whereby no paper is reviewed twice, so authors must ensure that their paper is as strong as possible in the first instance. The benefit is a fast turnaround time with reduced reviewer fatigue.

There are calls for post-publication peer review. Although some attempts to do this have been unsuccessful, there are arguments that success is simply a matter of time – particularly if reviewers are incentivised. One publisher that uses this system is the platform F1000Research, which publishes work immediately and invites open post-publication review. And, just recently, Wellcome Open Research was launched using services developed by F1000Research. It will make research outputs available faster and in ways that support reproducibility and transparency, using an open access model of immediate publication followed by transparent, invited peer review and inclusion of supporting data.

Open ways of conducting research

All of these initiatives demonstrate a definite movement towards an open way of doing research by addressing aspects of the research and publication process. But there are some research groups that are taking a holistic approach to open research.

Marcus Munafo published last month a description of the experience of the UK Center for Tobacco and Alcohol Studies and the MRC Integrative Epidemiology Unit at the University of Bristol over the past few years in attempting to work within an Open Science model focused on three core areas: study protocols, data, and publications.

Another example is the Open Source Malaria project, which includes researchers and students from around the world – Australia, Europe and North America – using open online laboratory notebooks. Experimental data is posted online each day, enabling instant sharing and the ability to build on others’ findings in almost real time. Indeed, according to their site, ‘anyone can contribute’. They have just announced that undergraduate classes are synthesising molecules for the project. This example fulfils all of the five basic principles of open research suggested here.

The Netherlands Organisation for Scientific Research (NWO) has just announced that it is making 3 million euros available for a Replication Studies pilot programme. The pilot will concentrate on the replication of social sciences, health research and healthcare innovation studies that have a large impact on science, government policy or the public debate. The intention after this study will be to “include replication research in an effective manner in all of its research programmes”.

A review of the literature published this week has demonstrated that open research is associated with increases in citations, media attention, potential collaborators, job opportunities and funding opportunities. These findings are evidence, the authors say, “that open research practices bring significant benefits to researchers relative to more traditional closed practices”.

This series has been arguing that we should move to Open Research as a way of changing the reward system that bastardises so much of the scientific endeavour. However, there may be other benefits, according to a recently published opinion piece which argues that Open Science can serve a different purpose: to “help improve the lot of individual working scientists”.

Conclusion

There are clearly defined problems within the research process that in the main stem from the need to publish in high impact journals. Throughout this blog there are multiple examples of initiatives and attempts to provide alternative ways of working and publishing.

However, all of this effort will only succeed if those doing the assessing change the rules of the game. This is tricky: often the people who have succeeded have some investment in maintaining the status quo. We need strong and bold leadership to move us out of this mess and towards a more robust and fairer future. I will finish with a quote that has been attributed to Mark Twain, Einstein and Henry Ford: “If you always do what you’ve always done, you’ll always get what you’ve always got”. It really is up to us.

Published 2 August 2016
Written by Dr Danny Kingsley

The case for Open Research: reproducibility, retractions & retrospective hypotheses

This is the third instalment of ‘The case for Open Research’ series of blogs exploring the problems with Scholarly Communication caused by having a single value point in research – publication in a high impact journal. The first post explored the mis-measurement of researchers and the second looked at issues with authorship.

This blog will explore the accuracy of the research record, including the ability (or otherwise) to reproduce research that has been published, what happens if research is retracted, and a concerning trend towards altering hypotheses in light of the data that is produced.

Science is thought to progress through the building of knowledge by questioning, testing and checking work. The idea of ‘standing on the shoulders of giants’ summarises this – we discover truth by building on previous discoveries. But scientists are very rarely rewarded for being right; they are rewarded for publishing in certain journals and for getting grants. This can result in distortion of the science.

How does this manifest? The Nine Circles of Scientific Hell describes questionable research practices, ranging from Overselling, Post-Hoc Storytelling, p-value Fishing and Creative Use of Outliers to Non- or Partial Publication of Data. We will explore some of these below. (Note this article appears in a special issue of Perspectives on Psychological Science on Replicability in Psychological Science, which contains many other interesting articles.)

Much as we like to think of science as an objective activity, it is not. Scientists are supposed to be impartial observers, but in reality they need to get grants and publish papers to get promoted to more ‘glamorous institutions’. This was the observation of Professor Marcus Munafo in his presentation ‘Scientific Ecosystems and Research Reproducibility’ at the Research Libraries UK conference held earlier this year (the link will take you to videos of the presentations). Munafo observed that scientists are rarely rewarded for being right, so the scientific record is being distorted by the scientific ecosystem.

Munafo, a biological psychologist at Bristol University, noted that research, particularly in the biomedical sciences, ‘might not be as robust as we might have hoped‘.

The reproducibility crisis

A recent survey of over 1500 scientists by Nature tried to answer the question “Is there a reproducibility crisis?” The answer is yes, but whether that matters appears to be debatable: “Although 52% of those surveyed agree that there is a significant ‘crisis’ of reproducibility, less than 31% think that failure to reproduce published results means that the result is probably wrong, and most say that they still trust the published literature.”

There are certainly plenty of examples of the inability to reproduce findings. Pharmaceutical research can be fraught. Some research into potential drug targets found that in almost two-thirds of the projects looked at, there were inconsistencies between published data and the data resulting from attempts to reproduce the findings. 

There are implications for medical research as well. A study published last month looked at functional MRI (fMRI), noting that analyses run at a significance threshold of 5% (a p-value of less than 0.05, which is conventionally described as statistically significant) should in theory produce false positives at around that rate. However, the authors found that “the most common software packages for fMRI analysis (SPM, FSL, AFNI) can result in false-positive rates of up to 70%. These results question the validity of some 40,000 fMRI studies and may have a large impact on the interpretation of neuroimaging results.”
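The underlying statistical issue – many tests, each run at a nominal 5% threshold – is easy to demonstrate. The toy simulation below is not a reproduction of the fMRI analysis (the numbers of tests and subjects are arbitrary, and fMRI inference is considerably more involved); it simply shows how the chance of at least one false positive climbs well above 5% when many uncorrected tests are run on pure noise.

```python
# A toy illustration of the multiple comparisons problem, not a reproduction
# of the fMRI study: each individual test uses the conventional 5% threshold,
# but running many tests on pure noise makes at least one false positive
# likely. All numbers (20 tests, 30 subjects) are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_simulations, n_tests, n_subjects = 5_000, 20, 30

runs_with_false_positive = 0
for _ in range(n_simulations):
    # Pure noise: the true effect is zero for every one of the n_tests measures.
    data = rng.normal(size=(n_tests, n_subjects))
    p_values = stats.ttest_1samp(data, popmean=0.0, axis=1).pvalue
    if np.any(p_values < 0.05):
        runs_with_false_positive += 1

print(f"Chance of at least one 'significant' result: "
      f"{runs_with_false_positive / n_simulations:.0%}")
print(f"Theory for 20 independent tests: {1 - 0.95 ** n_tests:.0%}")  # ~64%
```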

A 2013 survey of cancer researchers found that approximately half of respondents had experienced at least one episode of being unable to reproduce published data. Of those who followed this up with the original authors, most were unable to determine why the work was not reproducible. Some of those original authors were (politely) described as ‘less than “collegial”’.

So what factors are at play here? Partly it is due to the personal investment in a particular field. A 2012 study of authors of significant medical studies concluded that: “Researchers are influenced by their own investment in the field, when interpreting a meta-analysis that includes their own study. Authors who published significant results are more likely to believe that a strong association exists compared with methodologists.”

This was also a factor in a study, Why Most Published Research Findings Are False, which considered the way research studies are constructed. This work found that “for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias.”

Psychology is a discipline with a strong emphasis on novelty, discovery and finding something with a p-value of less than 0.05. Reproducibility is such an issue in psychology that there are large-scale efforts to replicate published studies and estimate the reproducibility of the field. The Association for Psychological Science has launched a new article type, Registered Replication Reports, which consists of “multi-lab, high-quality replications of important experiments in psychological science along with comments by the authors of the original studies”.

This is a good initiative, although there might be some resistance to this type of scrutiny. Something that was interesting from the Nature survey on reproducibility was the question of what happened when researchers attempted to publish a replication study. Note that only a few respondents had done this, possibly because incentives to publish positive replications are low and journals can be reluctant to publish negative findings. The survey found that “several respondents who had published a failed replication said that editors and reviewers demanded that they play down comparisons with the original study”.

What is causing this distortion of the research? It is the emphasis on publication of novel results in high impact journals. There is no reward for publishing null results or negative findings.

The HARKing problem

The p-value came up again in a discussion about HARKing at this year’s FORCE2016 conference (HARK stands for Hypothesising After the Results are Known – a term coined in 1998).

In his presentation at FORCE2016, Eric Turner, Associate Professor at OHSU, spoke about HARKing (see this video from 37 minutes onward). The process is that the researcher conceives the study, writes the protocol up for their eyes only with a hypothesis, and then collects lots of other data – ‘the more the merrier’ according to Turner. Then the researcher runs the study and analyses the data. If there is enough data, the researcher can try alternative methods and play with the statistics. ‘You can torture the data and it will confess to anything’, noted Turner. At some point the p-value will come out below 0.05. Only then does the research get written up.
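To see how quickly this analytic flexibility inflates the false-positive rate, here is a minimal sketch: one dataset with no true effect, several defensible-looking analysis choices, and the analyst reports whichever crosses p < 0.05. The specific analysis variants are invented for illustration, not taken from Turner's talk.

```python
# A minimal sketch of the flexibility Turner describes: one null dataset,
# several plausible analysis choices, and the analyst stops at the first
# p < 0.05. The specific choices below are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_simulations, n = 5_000, 40
hits = 0  # datasets where at least one analysis "worked"

for _ in range(n_simulations):
    # Both groups drawn from the same distribution: the null hypothesis is true.
    treatment = rng.normal(size=n)
    control = rng.normal(size=n)

    analyses = [
        stats.ttest_ind(treatment, control),                     # the planned analysis
        stats.ttest_ind(treatment[:n // 2], control[:n // 2]),   # peek at half the data
        stats.ttest_ind(treatment[np.abs(treatment) < 2],
                        control[np.abs(control) < 2]),           # drop 'outliers'
        stats.mannwhitneyu(treatment, control),                  # switch to a different test
    ]
    if any(result.pvalue < 0.05 for result in analyses):
        hits += 1

# Each analysis alone has a ~5% false-positive rate; reporting whichever one
# crossed the threshold pushes the effective rate well above that.
print(f"{hits / n_simulations:.0%} of null datasets yield at least one p < 0.05")
```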

Turner noted that he was talking about the kind of research where the work is trying to confirm a hypothesis (like clinical trials). This is different to hypothesis-generating research.

In the US, clinical trials with human participants must be registered with the Food and Drug Administration (FDA), so it is possible to see the results of all trials. Turner talked about his 2008 study looking at antidepressant trials, where the journal versions of the results supported the general view that antidepressants always beat placebo. However, when they looked at the FDA versions of all of the studies of the same drugs, it turned out that half of the studies were positive and half were not. The published record does not reflect the reality.

The majority of the negative studies were simply not published, but 11 of the papers had been ‘spun’ from negative to positive. These papers had a median impact factor of 5 and median citations of 68 – these were highly influential articles. As Turner noted ‘HARKing is deceptively easy’.

This perspective is supported by the finding that a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. Indeed, Munafo noted that over 90% of the psychological literature reports finding what it set out to find. Either the research being undertaken is extraordinarily mundane, or something is wrong.

Increase in retractions

So what happens when it is discovered that something that has been published is incorrect? Journals do have a system which allows for the retraction of papers, and this is a practice which has been increasing over the past few years. Research looking at why the number of retractions has increased found that it was partly due to lower barriers to the publication of flawed articles. In addition, papers are now being retracted for issues such as plagiarism, and retractions are happening more quickly.

Retraction Watch is a service which tracks retractions ‘as a window into the scientific process’. It is enlightening reading with several stories published every day.

An analysis of correction rates in the chemical literature found that the correction rate averaged about 1.4 percent for the journals examined. While there were numerous types of corrections, errors in chemical structures, omission of relevant references and data errors were among the most frequent. Corrections are not the same as retractions, but they are significant.

There is some evidence that the higher the impact factor of the journal a work is published in, the higher the chance it will be retracted. A 2011 study showed a direct correlation between impact factor and the number of retractions, with the New England Journal of Medicine topping the list. This situation has led to claims that the top-ranking journals publish the least reliable science.

A study conducted earlier this year demonstrated that there are no commonly agreed definitions of academic integrity and malpractice. (I should note that, amongst other findings, the study found 17.9% (± 6.1%) of respondents reported having fabricated research data – almost 1 in 5 researchers. However, there have been some strong criticisms of the methodology.)

There are questions about how retractions should be managed. In the print era it was not unheard of for library staff to put stickers into printed journals notifying readers of a retraction. But in the ‘electronic age’, when the record can be erased, one author asked in 2002 whether this is the right thing to do, because erasing the article entirely is amending history. The Committee on Publication Ethics (COPE) does have guidelines for managing retractions, which suggest that the retraction be linked to the retracted article wherever possible.

However, from a reader’s perspective, even if an article is retracted this might not be obvious. In 2003* a survey of 43 online journals found that 17 had no links between the original articles and later corrections. When present, hyperlinks between articles and errata showed patterns in presentation style but lacked consistency. There are some good examples, such as Science Citation Index, but there was a lack of indexing in INSPEC and a lack of retrieval with SciFinder Scholar.

[*Note this originally said 2013, amended 2 September 2016]

Conclusion

All of this paints a pretty bleak picture. In some disciplines the pressure to publish novel results in high impact journals results in the academic record being ‘selectively curated’ at best. At worst it results in deliberate manipulation of results. And if mistakes are picked up there is no guarantee that this will be made obvious to the reader.

This all stems from the need to publish novel results in high impact journals for career progression. And when those high impact journals can be shown to be publishing a significant amount of subsequently debunked work, their value as a publication goal comes into serious question.

The next instalment in this series will look at gatekeeping in research – peer review.

Published 14 July 2016
Written by Dr Danny Kingsley