Tag Archives: Open Access

Data Diversity Podcast #3 – Dr Nick H. Wise (1/4)

In our third instalment of the Data Diversity Podcast, we are joined by Dr Nick H. Wise, Research Associate in Architectural Fluid Mechanics at the Department of Engineering, University of Cambridge. As is the theme of the podcast, we spoke to Nick about his experience as a researcher, but this is a special edition of the podcast. Besides being a scientist and an engineer, Nick has made his name as a scientific sleuth who, according to a 2022 article on the blog Retraction Watch, is responsible for more than 850 retractions, leading Times Higher Education to dub him a research fraudbuster. Since then, through his X account @Nickwizzo, he has continued his investigations, tracking cases of fraud and, in some cases, naming and shaming the charlatans. Nick was kind enough to share many great insights with us over a 90-minute conversation, and we have therefore decided to release a four-part series dedicated to the topic of research integrity.

In this four-part series, we will learn from Nick about some of the shady activities that taint the scientific publishing industry today. In part one, we learn how Nick was introduced to the world of publication fraud and how that led him to investigate the industry behind it. Below are some excerpts from the conversation, which can be listened to in full here.


I have found evidence of a paper mill bribing some editors, and there have been many editors, at least tens if not hundreds, that have been let go or told to stop being editors by journals in the last year because they have been found to be compromised. This could be because of bribery or some other way of being compromised. This is what I try to uncover. – Dr Nick H. Wise


Tortured Phrases and PubPeer: Nick’s beginnings as a Scientific Sleuth  

My background is in fluid dynamics, where I mostly think about fluid dynamics within buildings. For instance, I think about the air flows generated by different heating systems, and about pollutant transport, such as smells or COVID, which can travel with the air and interact with each other. That was my PhD and my post-doc in the Engineering department.

About three years ago, whilst trying to avoid writing my thesis, I saw a tweet from the great Elisabeth Bik, who is possibly the most famous research fraud investigator. She mostly looks at biomedical images, and her great skill is being able to look through a paper containing photos of Western blots or microscopy slides and see if parts of an image are identical to other parts, or if the image overlaps with images from different papers. She has an incredible memory and an ability to spot these duplications. She's been doing this for over 10 years and has caused many retractions. I was aware of her work, but there was no way for me to assist with it because it is not my area of research. I don't have an appreciation of what these images should look like.

But about three years ago she shared on her Twitter account a preprint written by three computer scientists about a phenomenon they called 'tortured phrases'. In doing their research and reading the literature, these computer scientists noticed that there were papers with very weird language in them. What they surmised was that, to overcome plagiarism checks by software like Turnitin, people would run text through paraphrasing software. This software was very crude in that it would go word by word: it would look at a word and replace it with the first synonym it found in a thesaurus. Doing this word for word makes the text barely readable. However, the text is novel, and so it will not be flagged by plagiarism-checking software. If a publisher has outsourced its plagiarism checks to software, and neither the editor nor the peer reviewers read the text to check that it makes sense, then the paper will get through the peer review process without any problem and be published.
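The word-by-word substitution Nick describes can be sketched in a few lines. This is purely an illustration of the mechanism, not any real paraphrasing tool, and the tiny thesaurus here is a hypothetical stand-in:

```python
# Illustration of crude word-by-word paraphrasing: each word is swapped for
# the first synonym found in a thesaurus, with no regard for context.
# (Hypothetical thesaurus entries, chosen to reproduce known tortured phrases.)
THESAURUS = {
    "artificial": "man-made",
    "intelligence": "consciousness",
    "deep": "profound",
    "learning": "erudition",
    "signal": "flag",
    "noise": "commotion",
}

def crude_paraphrase(text: str) -> str:
    """Replace each word with its first thesaurus synonym, word by word."""
    return " ".join(THESAURUS.get(word.lower(), word) for word in text.split())

print(crude_paraphrase("artificial intelligence"))  # man-made consciousness
print(crude_paraphrase("signal to noise ratio"))    # flag to commotion ratio
```

Because the substitution ignores context entirely, fixed technical phrases come out mangled in highly predictable ways, which is exactly what makes them detectable.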

For an example of a tortured phrase: sometimes there is more than one acceptable way to say something. Particularly if English is not someone's first language, you don't want to be too harsh on anyone who has simply chosen a word that isn't what a native speaker would pick. But there are some phrases where there's only one right way to say it. For instance, 'artificial intelligence' is the phrase for the phenomenon you want to talk about, and if instead you use 'man-made consciousness', that's not the phrase you need to use, particularly if the original text said 'artificial intelligence (AI)' and your text says 'man-made consciousness (AI)'. It's going to be very clear what has happened.

The three computer scientists highlighted this phenomenon of 'tortured phrases', but entirely from within the computer science field. I wondered whether something similar was happening in my own field of fluid dynamics. Samples of these paraphrasing tools are freely available online as little widgets, so I took some standard phrases from fluid dynamics, of the kind that would not make sense if you swapped the words around, and generated a few of these tortured phrases. I googled them, and up popped hundreds of papers featuring these phrases. That was the beginning for me.

I started reporting papers with these phrases on a website called PubPeer, which is a website for post-publication peer review. I commented on these papers and began corresponding with the computer scientists who wrote the paper on 'tortured phrases', because they had built a tool to scrape the literature and automatically tabulate the papers featuring these phrases. They basically had a dictionary of phrases which they knew would be spat out by the software, because some of this paraphrasing software is so crude that if you put in 'artificial intelligence', you will always get out 'man-made consciousness' or one of a handful of variants; it didn't come up with many different outputs. So if you search for 'man-made consciousness' and it brings up many papers, you know what has been going on. I contributed a lot of new 'fingerprints', which is what they call the entries in the dictionary they search the literature for. That is my origin story.
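The fingerprint dictionary Nick describes can be sketched as a simple lookup: a map from known tortured phrases to the phrases they replaced. This is a minimal sketch of the idea only; the real screening tool works at the scale of the whole literature, and the phrase list here is illustrative:

```python
# Minimal sketch of fingerprint-based screening: a dictionary mapping known
# 'tortured phrases' to the standard phrases they were generated from.
# (Illustrative entries; the real dictionary is far larger.)
FINGERPRINTS = {
    "man-made consciousness": "artificial intelligence",
    "profound learning": "deep learning",
    "flag to commotion": "signal to noise",
}

def screen(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected phrase) pairs found in the text."""
    lowered = text.lower()
    return [(bad, good) for bad, good in FINGERPRINTS.items() if bad in lowered]

hits = screen("We apply man-made consciousness to improve the flag to commotion ratio.")
print(hits)
# [('man-made consciousness', 'artificial intelligence'), ('flag to commotion', 'signal to noise')]
```

A single hit is strong evidence of paraphrased text, because these phrases essentially never occur in honestly written prose.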

On Paper Mills and the Sale of Authorships 

There is also the issue of meta-science, which has nothing to do with the text of the paper or with the data itself, but with how someone may add a load of references throughout the paper which are not relevant, or which are all references to one person or a colleague. In that way you can game the system to boost profiles, careers, and things like the h-index. Because having more publications and more citations is so desirable, there is a market for this. It is easy to find online advertisements for authorship of scientific papers ranging from $100 to over $1,000, depending on the impact factor of the journal and the position of authorship you want: first authorship, seventh authorship, or corresponding authorship, these sorts of factors. Likewise, you can buy citations.

There are also organizations known as paper mills. For example, as an author I might have written a paper and want, or need, to make some money, so I go to a broker and say: I'll be author number six, but I can sell the first five authorships. Can you put me in touch with someone buying authorships? At the same time, there are people who go to them saying, I want to buy an authorship, and the broker puts two and two together, acting as a middleman. Some of these paper mills do not want to wait for someone to come to them with a paper, so they will write papers to order. They have an in-house team of scientific writers who produce papers. This does not necessarily mean that the paper is bad; depending on where they want it published, the paper might have to be good to get accepted. So they will employ qualified people with degrees, or PhD students who need to earn some money, and then they will sell the authorships and get the papers published. This is a big business.

There is a whole industry behind it, and something I have moved on to investigating quite a lot is where these papers are going. When I identify these papers, I try to find out where they are being published, how they're being published, who is behind them, who is running these paper mills, and who is collaborating with them. Something I found out, which resulted in an article in Science, was that paper mills want to guarantee acceptance as much as they can. If a paper is not accepted, it creates a lot of work for them and means a longer time before their customers get what they paid for. For example, if a paper that they wrote and sold authorships for gets rejected, they're going to have to resubmit it to another journal. So something paper mills will do is submit a paper to 10 journals at once and publish with whichever journal gives them the easiest time. But still, they want to guarantee acceptance where they can, and one way to do that is simply to bribe the editor. I have found evidence of a paper mill bribing some editors, and there have been many editors, at least tens if not hundreds, that have been let go or told to stop being editors by journals in the last year because they have been found to be compromised. This could be because of bribery or some other way of being compromised. This is what I try to uncover.

Although I'm not fighting this alone, it can feel like it. Publishers are doing things to some extent, and they're doing things that they can't tell you about as well. And then there are other people like me investigating this in their free time or as a side project. Not enough of us are doing it, because it is a multi-million-dollar industry that is generating these papers. More papers are being published than ever before, so it is a big fight.


Stay tuned as we release the rest of the conversation with Nick over the next month. In the next post, we get Nick’s take on the peer review process and fake research data, and I ask his opinion on where the fault lies in the publication of fraudulent research. 

Blood: in short supply?

Two years ago (almost to the day) we called out Blood for the misleading open access options they offered to Research Council and Charity Open Access Fund (COAF) authors. Unfortunately, little has changed since then:

Neither of these routes is sufficient to comply with either Research Councils’ or COAF’s open access policies which require that the accepted text be made available in PMC within 6 months of publication, or that the published paper is available immediately under a CC BY licence.

At the time, we called on Blood to change their offerings or we would advise Research Councils and COAF funded authors to publish elsewhere. And that’s exactly what’s happened:

Figure 1. All articles published in Blood since 2007 which acknowledge MRC, Wellcome, CRUK or BHF funding. Data obtained from Web of Science.

Over the last two years we've seen a dramatic decline in the number of papers being published in Blood by Medical Research Council (MRC), Wellcome Trust, Cancer Research UK (CRUK) and British Heart Foundation (BHF) researchers. The number of papers published in Blood that acknowledge these funders is now at its lowest point in over a decade.

It’s important to remember that the 23 papers published in Blood in 2017 are all non-compliant with the open access policies of Research Councils and COAF, and if these papers acknowledge Wellcome Trust funding then those researchers may also be at risk of losing 10% of their total grant. If you are funded by Research Councils or one of the COAF members, please consider publishing elsewhere. SHERPA/FACT confirms our assessment:

Sign the open letter

We’re still collecting signatures for our open letter to the editor of Blood in the hope that they’ll reconsider their open access options. Please join us by adding your name.

Cambridge Open Access spend 2013-2018

Since 2013, the Open Access Team has been helping Cambridge researchers, funded by Research Councils UK (RCUK) and the consortium of biomedical funders which make up the Charity Open Access Fund (COAF), to meet their Open Access obligations. Both RCUK (now part of UKRI) and COAF have Open Access policies which have a preference for ‘gold’, i.e. the published work should be Open Access immediately at the time of publication. Implementing these policies has come at a significant cost. In this time, Cambridge has been awarded just over £10 million from RCUK and COAF to implement their Open Access policies, and the Open Access Team has diligently used this funding to maximum effect.

Figure 1. Comparison of combined RCUK/COAF grant spend and available funds, April 2013 – March 2018.

Initially, expenditure was slow, which allowed the Open Access Team to maintain a healthy balance that could guarantee funding for almost any paper which met a few basic requirements. However, since January 2016 expenditure has gradually been catching up with the available funds, which has made funding decisions more difficult (particularly around Open Access deals tied to multi-year publisher subscriptions). In the first three months of 2018, average monthly expenditure on the RCUK block grant alone exceeded £160,000. We are quickly reaching the point where expenditure will outstrip the available grants.

One technical change which has particularly affected our management of the block grants was RCUK’s decision last year to move away from a direct cash award (which could be rolled over year to year) to a more tightly managed research grant. In the past, carrying over underspend has given us some flexibility in the management of the RCUK funds, whereas the more restrictive style of research grant will mean that any underspend will need to be returned at the end of the grant period, while any overspend cannot be deferred into the next grant period. As we are now dealing with a fixed budget, the Open Access Team will need to ensure that expenditure is kept within the limits of the grant. This is difficult when we have no control over where or when our researchers publish.

Funding from COAF (which is also managed as though it is a research grant) has generally matched our total annual spend quite closely, but the strict grant management rules have caused some problems, especially in the transition period between one grant and another. However, unlike RCUK, the Wellcome Trust will provide supplementary funding in addition to the main COAF award if it is exhausted, and the other COAF partners have similar procedures in place to manage Open Access payments beyond the end of the grant.

Where does it all go?

Most of our expenditure (91%) goes on article processing charges (APCs), as perhaps one might expect, but the block grants are also used to support the staff of the Open Access Team (3%), helpdesk and repository systems (2%), page and colour charges (2%), and publisher memberships (1%) (where this results in a reduced APC). The majority of APCs we’ve paid go towards hybrid journals, which represent approximately 80% of total APC spend.

So let's take a look at which publishers have received the most funds. We've tried to match as much of our raw financial information as possible to specific papers, although some of our data is incomplete, and in some cases we can't easily link a payment back to a specific article, particularly in 2013-2015 when our processes were still developing. Nonetheless, the average APC paid over the last 5 years was £2,291 (inc. 20% VAT), but as can be seen from Table 1, average APCs have been rising year on year at a rate of 7% p.a., significantly higher than inflation. Price increases at this rate are not sustainable in the long term: by 2022 we could be paying on average £3,000 per article.
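As a quick sanity check on that projection, compounding the 2018 average from Table 1 forward at 7% p.a. gives roughly the figure quoted (a sketch; the starting value and growth rate are the ones reported above):

```python
# Compound the 2018 average APC (Table 1) forward at 7% per annum to 2022.
avg_2018 = 2336       # average APC paid in 2018, in GBP
growth = 1.07         # 7% p.a. year-on-year increase
avg_2022 = avg_2018 * growth ** (2022 - 2018)
print(round(avg_2022))  # 3062, i.e. roughly £3,000 per article
```

Even from the lower five-year average of £2,291, four years of 7% growth lands just above £3,000, so the projection holds either way.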

Table 1. Average APC by publication year of article (where known).

Year of publication Average APC paid (£)
2013  £1,794
2014  £1,935
2015  £2,044
2017  £2,187
2018  £2,336

Elsevier has been by far the largest recipient of block grant funds, receiving 29.4% of all APC expenditure from the RCUK and COAF awards (over £2.5 million), though accounting for only 25.5% of articles. Over the same period, SpringerNature also received in excess of £1 million (which, as we'll see below, has mostly been spent on two titles). With such a substantial set of data we can now begin to explore the relative value that each publisher offers. Take for example Taylor & Francis (£107,778 for 120 articles, an average of £898 per article) compared to Wolters Kluwer (£119,551 for 35 articles, an average of £3,416). Both publishers operate mostly hybrid OA journals, and yet the relative value is significantly different. What is so fundamentally different between publishers that such extreme examples should exist?

Table 2. Top 20 publishers by combined total RCUK/COAF APC spend 2013-2018.

Publisher | Value of APCs paid (£) | % | Number of APCs paid (N) | % | Avg. APC paid (£)
Elsevier £2,559,736 29.4% 971 25.5% £2,636
SpringerNature £1,050,774 12.1% 402 10.6% £2,614
Wiley £808,847 9.3% 279 7.3% £2,899
American Chemical Society £411,027 4.7% 251 6.6% £1,638
Oxford University Press £379,647 4.4% 169 4.4% £2,246
PLOS £267,940 3.1% 168 4.4% £1,595
BioMed Central £245,006 2.8% 153 4.0% £1,601
Institute of Physics £189,434 2.2% 98 2.6% £1,933
Royal Society of Chemistry £156,018 1.8% 106 2.8% £1,472
BMJ Publishing £144,001 1.7% 68 1.8% £2,118
Company of Biologists £140,609 1.6% 50 1.3% £2,812
Wolters Kluwer £119,551 1.4% 35 0.9% £3,416
Taylor & Francis £107,778 1.2% 120 3.2% £898
Frontiers £103,011 1.2% 61 1.6% £1,689
Cambridge University Press £77,139 0.9% 38 1.0% £2,030
Royal Society £73,890 0.8% 52 1.4% £1,421
Society for Neuroscience £69,943 0.8% 26 0.7% £2,690
American Society for Microbiology £63,056 0.7% 36 0.9% £1,752
American Heart Association £53,696 0.6% 14 0.4% £3,835
Optical Society of America £39,463 0.5% 17 0.4% £2,321
All other articles £1,654,228 19.0% 690 18.1% £2,397
Grand Total £8,714,794 100.0% 3,804 100.0% £2,291

Next, let's look at journal-level metrics. The most popular journal that we pay APCs for is Nature Communications, followed closely by Scientific Reports. Both of these are SpringerNature titles, and indeed these two titles make up the bulk of our total APC spend with SpringerNature. Yet these two journals represent significantly different approaches to Open Access. Nature Communications, along with Cell and Cell Reports, is among the most expensive routes to making research publications Open Access, whereas Scientific Reports and PLOS One sit at the lower end of the spectrum. It is interesting that no particularly popular Open Access journal has filled the niche between Nature Communications and Scientific Reports.

Figure 2. APC number and total spend by journal. In the last five years, nearly £450,000 has been spent on articles published in Nature Communications.


Managing the future

While the OA block grants have kept pace with overall expenditure so far, continuing monthly expenditure of £160,000 would risk overspending on the RCUK grant for 2018/19. To counter this possible outcome the University has agreed a set of funding guidelines to manage the RCUK (from now on known as Research Councils) and COAF awards. For Research Councils’ funded papers the new guidelines place an emphasis on fully Open Access journals and hybrid journals where the publisher is taking a sustainable approach to managing the transition to Open Access. We’ve spent a lot of money over the last five years, yet it’s not clear that the influx of cash from RCUK and COAF has had any meaningful impact on the overall publishing landscape. Many publishers continue to reap huge windfalls via hybrid APCs, yet they are not serious about their commitment to Open Access.

In the future, we’ll be demanding better deals from publishers before we support payments to hybrid journals so that we can effect a faster transition to a fully Open Access world.

Published 22 October 2018
Written by Dr Arthur Smith
Creative Commons License