Tag Archives: publishing

Data Diversity Podcast #3 – Dr Nick H. Wise (1/4)

August 19, 2024 | Uncategorized | journals, Open Access, open data, Open Research, publishing, publishing houses, research integrity, scholarly communication | Lutfi Bin Othman

In our third instalment of the Data Diversity Podcast, we are joined by Dr Nick H. Wise, Research Associate in Architectural Fluid Mechanics at the Department of Engineering, University of Cambridge. As is the theme of the podcast, we spoke to Nick about his experience as a researcher, but this is a special edition of the podcast. Besides being a scientist and an engineer, Nick has made his name as a scientific sleuth who, according to a 2022 article on the blog Retraction Watch, is responsible for more than 850 retractions, leading Times Higher Education to dub him a research fraudbuster. Since then, through his X account @Nickwizzo, he has continued his investigations, tracking cases of fraud and, in some cases, naming and shaming the charlatans. Nick was kind enough to share with us many great insights over a 90-minute conversation, and as such we have decided to release a four-part series dedicated to the topic of research integrity.

In this four-part series, we will learn from Nick about some of the shady activities that taint the scientific publishing industry today. In part one, we learn how Nick was introduced into the world of publication fraud and how that led him to investigate the industry behind it. Below are some excerpts from the conversation, which can be listened to in full here.

I have found evidence of a paper mill bribing some editors and there have been many, at least tens, if not hundreds, of editors that have been let go or told to stop being editors by journals in the last year because they have been found to be compromised. This could be because of bribery or some other way of being compromised. This is what I try to uncover. – Dr Nick H. Wise

Tortured Phrases and PubPeer: Nick’s beginnings as a Scientific Sleuth

My background is in fluid dynamics, where I mostly think about fluid dynamics within buildings. For instance, I think about the air flows generated by different heating systems and things like pollutant transport, such as smells or COVID, which can travel with the air and interact with each other. That was my PhD and the post-doc in the Engineering department.

About three years ago, whilst trying to avoid writing my thesis, I saw a tweet from the great Elizabeth Bik, who is possibly the most famous research fraud investigator. She mostly looks at biomedical images and her great skill is she would be able to look through a paper and see photos of Western blots or microscopy slides and see if parts of an image are identical to other parts, or if the image overlaps with images from different papers. She has an incredible memory and ability to spot these images. She’s been doing this for over 10 years and has caused many retractions. I was aware of her work but there was no way for me to assist with that because it is not my area of research. I don’t have an appreciation of what these images should look like.

But about three years ago she shared a preprint written by three computer scientists on her Twitter account about a phenomenon they called ‘tortured phrases’. In doing their research and reading the literature, these computer scientists noticed that there were papers with very weird language in them. What they surmised was that to overcome plagiarism checks by software like Turnitin, people would run text through paraphrasing software. This software was very crude in that it would go word by word. For instance, it would look at a word and replace it with the first synonym it found in a thesaurus. It would do this word for word, which makes the text barely readable. However, the text is novel and so it will not be flagged by any plagiarism-checking software. Eventually, if you as a publisher have outsourced the plagiarism checks to some software, and neither your editor nor your peer reviewer reads the text to check if it makes sense, then this will get through the peer review process without any problem and the paper will get published.
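To make the mechanism concrete, here is a minimal sketch of the kind of word-by-word substitution described above. It is not the actual paraphrasing software: the tiny thesaurus and the function name `crude_paraphrase` are assumptions made up for illustration, and only the “man-made consciousness” example comes from the conversation.

```python
# Toy thesaurus, hypothetical and for illustration only.
# The first synonym in each list is the one that always gets picked.
THESAURUS = {
    "artificial": ["man-made", "synthetic"],
    "intelligence": ["consciousness", "insight"],
}


def crude_paraphrase(text: str) -> str:
    """Replace each word with the first synonym listed, one word at a time."""
    out = []
    for word in text.split():
        synonyms = THESAURUS.get(word.lower())
        # Words with no thesaurus entry (including "(AI)") pass through unchanged.
        out.append(synonyms[0] if synonyms else word)
    return " ".join(out)


print(crude_paraphrase("artificial intelligence (AI)"))
# man-made consciousness (AI)
```

Because the substitution ignores context and always takes the first synonym, the same input phrase always produces the same odd output, while abbreviations such as “(AI)” survive untouched.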

For an example of tortured phrases: sometimes there’s not only one way to say something. Particularly if English is not someone’s first language, you don’t want to be too harsh on anyone who’s just chosen a word which just isn’t what a native speaker would pick. But there are some phrases where there’s only one right way to say it. For instance, artificial intelligence is the phrase for the phenomenon you want to talk about, and if instead you use “man-made consciousness”, that’s not the phrase you need to use, particularly if the original text said artificial intelligence brackets AI, and your text says “man-made consciousness” brackets AI. It’s going to be very clear what has happened.

The three computer scientists highlighted this phenomenon of ‘tortured phrases’, but entirely from within the computer science field. I wondered if a similar phenomenon was happening in my own field of fluid dynamics. Samples of this paraphrasing software are freely available online as little widgets, so I took some standard phrases from fluid dynamics, the kind that would not make sense if you swapped the words around, and generated a few of these tortured phrases. I googled them and up popped hundreds of papers featuring these phrases. That was the beginning for me.

I started reporting papers with these phrases on a website called PubPeer, which is a website for post-publication peer review. I commented on these papers and started being in conversation with the computer scientists who wrote the paper on ‘tortured phrases’ because they built a tool to scrape the literature and automatically tabulate these papers featuring these phrases. They basically had a dictionary of phrases which they knew would be spat out by the software, because some of this paraphrasing software is so crude that if you put in “artificial intelligence”, you are always going to get out “man-made consciousness” or a handful of variants. It didn’t come up with a lot of different things. If you just search for “man-made consciousness” and it brings up many papers, you know what has been going on. I contributed a lot of new ‘fingerprints’, which is what they call the entries in their dictionary that they search the literature for. That is my origin story.
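The fingerprint idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the computer scientists’ actual tool: the dictionary `FINGERPRINTS`, the function `find_fingerprints` and the sample abstract are hypothetical, and the single entry is the example given in the conversation.

```python
import re

# Hypothetical fingerprint dictionary: tortured phrase -> phrase it likely replaced.
FINGERPRINTS = {
    "man-made consciousness": "artificial intelligence",
}


def find_fingerprints(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected phrase) pairs found in the text."""
    # Lower-case and collapse whitespace so line breaks do not hide a phrase.
    normalised = re.sub(r"\s+", " ", text.lower())
    return [(t, e) for t, e in FINGERPRINTS.items() if t in normalised]


# Made-up abstract for illustration.
abstract = "We apply man-made\nconsciousness (AI) to model indoor air flow."
print(find_fingerprints(abstract))
# [('man-made consciousness', 'artificial intelligence')]
```

A hit is only a prompt for a human to read the paper. As noted earlier, an unusual word choice on its own is not evidence of anything.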

On Paper Mills and the Sale of Authorships

There is also the issue of meta-science, which has nothing to do with the text of the paper or with the data itself, but more to do with how someone may add a load of references through the paper which are not relevant, or they are all references to one person or a colleague. In that way you would be gaming the system to boost profiles, careers, and things like H-index. Because having more publications and more citations is so desirable, there is a market for this. It is easy to find online advertisements for authorship of scientific papers ranging from $100 to over $1000, depending on the impact factor of the journal, and the position of authorship you want: first authorship, seventh authorship, or whether you want to be the corresponding author, these sorts of factors. Likewise, you can buy citations.

There are also organizations known as paper mills. For example, as an author I might have written the paper and want, or need, to make some money and so I go to this broker and say: I want to sell authorships, I’ll be author number six, but I can sell the first five authorships. Can you put me in touch with someone buying authorships? At the same time, there are people who go to them saying I want to buy an authorship, and they put two and two together acting as a middleman. Also, some of these paper mills do not want to wait for someone to come to them with a paper – they will write papers to order. They have an in-house team of scientific writers who produce papers. This does not necessarily mean that the paper is bad. Depending on where they want the paper to be published, the paper might have to be good if it has to get published. So, they will employ people with degrees, qualified people or PhD students who need to earn some money, and then they will sell the authorships and get the papers published. This is a big business.

There is a whole industry behind it, and something I have moved onto investigating quite a lot is where these papers are going. When I identify these papers, I try to find out where they are being published, how they’re being published, who is behind them, who is running these paper mills, who is collaborating with them. Something I found out which resulted in an article in Science was that paper mills want to guarantee acceptance as much as they can. If a paper is not accepted, it creates a lot of work for them and it means a longer time before their customers get what they paid for. For example, if a paper that they wrote and sold authorships for gets rejected, they’re going to have to resubmit it to another journal. So something paper mills will do is they will submit a paper to 10 journals at once and publish with whichever journal gave them the easiest time. But still, they want to try and guarantee acceptance and one way to do that is to simply bribe the editor. I have found evidence of a paper mill bribing some editors and there have been many, at least tens, if not hundreds, of editors that have been let go or told to stop being editors by journals in the last year because they have been found to be compromised. This could be because of bribery or some other way of being compromised. This is what I try to uncover.

Although I’m not fighting this alone, it can feel like that. Publishers are doing things to some extent and they’re doing things that they can’t tell you about as well. And then there are other people like me investigating this in their free time or as a side project. Not enough of us are doing it because it is a multi-million-dollar industry that is generating these papers. More papers are being published than ever before so it is a big fight.

Stay tuned as we release the rest of the conversation with Nick over the next month. In the next post, we get Nick’s take on the peer review process and fake research data, and I ask his opinion on where the fault lies in the publication of fraudulent research.

Rights retention built into Cambridge Self-Archiving Policy

April 3, 2023 | Publishing | publishing, rights retention, self-archiving policy | Niamh Tumelty

We’re delighted to announce that the University of Cambridge has a new Self-Archiving Policy, which took effect from 1 April 2023. The policy gives researchers a route to make the accepted version of their papers open access without embargo under a licence of their choosing (subject to funder requirements). We believe that researchers should have more control over what happens to their own work and are determined to do what we can to help them to do that.

This policy has been developed after a year-long rights retention pilot in which more than 400 researchers voluntarily participated. The pilot helped us understand the implications of this approach across a wide range of disciplines so we could make an informed decision. We are also not alone in introducing a policy like this – Harvard has been doing it since 2008, cOAlition S have been a catalyst for development of similar policies, and we owe a debt of gratitude to the University of Edinburgh for sharing their approach with us.

Some of the issues that cropped up during the pilot were outlined by Samuel Moore, our Scholarly Communications Specialist, in an earlier post on the Unlocking Research blog. The patterns we saw at that stage continued throughout the year-long pilot – there was no issue for most articles, but some publishers caused confusion through misinformation or by presenting conflicting licences for the researchers to sign. We do recognise that there are costs involved in high quality publishing, and we are willing to cover reasonable costs (while noting our concerns around inequities in scholarly publishing). The fact is that some publishers are trying to charge the sector multiple times for the same content – subscription fees, OA fees, other admin fees – all while receiving free content courtesy of researchers who are usually funded by the taxpayer and charity funders.

Many researchers and funders are understandably becoming firmer in their convictions that publicly funded research should be openly and publicly available. We are fortunate that at Cambridge we are in a position to support this through our support for diamond publishing initiatives (in which the costs of publishing are absorbed for example by universities and no fees are charged to the reader or the author), through read and publish agreements negotiated on behalf of the UK higher education sector and through payment of costs associated with publishing in fully open access venues. Rights retention gives researchers a back-up plan for when other routes are not available to them, e.g. when a journal moves unexpectedly out of a read and publish agreement or a publisher does not offer any publishing route that meets their funder requirements.

This is not the end goal: we have work to do to reach an equitable approach to global scholarly publishing, and we can learn a lot, especially from how South America approaches these issues. We welcome opportunities to work together with others around the world to create a more sustainable and equitable future for scholarly communications.

Read more about the new Cambridge Self-Archiving Policy on the Cambridge Open Access website.

Open Research in the Humanities: The Future of Scholarly Communication

July 18, 2022 | Humanities, Open Research at Cambridge Conference | copyright, Humanities, journals, monographs, open access, publishing | admin

Authors: Emma Gilby, Matthias Ammon, Rachel Leow and Sam Moore

This is the second of a series of blog posts, presenting the reflections of the Working Group on Open Research in the Humanities. Read the opening post here. The working group aimed to reframe open research in a way that was more meaningful to humanities disciplines, and their work will inform the University of Cambridge approach to open research. This post considers the future of scholarly communication from a humanities perspective.

PILLAR ONE: THE FUTURE OF SCHOLARLY COMMUNICATION

This first pillar deals with ‘open access’ narrowly understood: the future of the publication landscape, and the question of the sustainability and viability of different publication models in an open access world.

Opportunities

The open access initiative in general values a wide range of contributions to academic life. The arts and humanities thrive on long-term, multi-scale, conversational, collaborative, interdisciplinary projects; all cultural work can be so defined. Any move towards research diversity therefore works in the favour of the arts and humanities.

Open Research aims first at opening out ‘traditional’ research content, such as that published in journals and monographs. Thus it aims also to demystify the existing publication process. In general, it prioritizes the wide dissemination of public-facing research. Further, it allows us to envisage new forms of publication, such as the use of dynamic images and data visualisation as already undertaken in investigative journalism.¹ Other examples of new Open Access formats include semi-public peer-to-peer review and the opportunity for readers to highlight passages and contribute to a crowd-sourced index of terms.²

Support required

In the immediate and short term, A&H colleagues require institutional support to understand and get to grips with the current routes to open access within academic publishing, which present various advantages and challenges. For more detail see Plan S and the History Journal Landscape, A Royal Historical Society Guidance Paper https://royalhistsoc.org/policy/publication-open-access/plan-s-and-history-journals/

Current routes to OA in scholarly publishing include:

Paying directly for article or book processing charges levied by publishers. This is easy if one’s research falls among the very small percentage of A&H research that is funded by the research councils, who allow for such fees, but otherwise challenging.

Taking advantage of a ‘read and publish’ deal set up between a publisher and an institution. This is easy if one is at the right institution at the right time, but otherwise challenging. There is also confusion amongst colleagues about what happens when these time-limited, transitional deals expire: will publishers revert to simple processing charges (see above)? Or will all published material by then be fully OA (see below)?

The self-deposit in an OA institutional repository of a manuscript that is accepted for publication and peer reviewed but that has not been edited or typeset by the publisher in any way. This is easy with the right systems in place, but problematic because it neglects the import of the editing process in A&H research. Without undergoing this process, ‘accepted manuscripts’ are very vulnerable to errors, especially in the case of the very many scholars who regularly work in languages that are not their first, or in the case of early career scholars who are less familiar with critical processes and how to evidence them, or in the case of colleagues with various kinds of disabilities such as dyslexia. Other issues also abound with the deposit of manuscripts in repositories. In cases where scholars receive an acceptance that is subject to improvement, the final ‘date of acceptance’ is ambiguous for legal purposes. And in cases where the work in question uses copyrighted material, further legal issues emerge about when and how it may be possible to circulate this. In all these senses, then, many A&H colleagues simply dislike the thought of their ‘accepted manuscript’ circulating. In the case of institutional repositories, there seems to be a direct and obvious tension between the goals of open research and quality control.

Publishing with a fully OA journal or academic publisher that does not require a processing charge. This is obviously the most straightforward and therefore best route to OA, but raises the fundamental question of how such work is conducted and funded. The notion of the ‘scholar-led’ press, established and monitored by scholars themselves, presupposes that academics can somehow fit the work of the professional editor, copy editor, translator or typesetter etc. into their spare time. In addition, many OA journals rely on charitable donations. Fundraising is also a skilled business: will universities’ development directors and offices be diverted to do the work of seeking these charitable donations? Is it possible for existing publishing houses and presses to construct a sustainable business model that allows for free and open publishing, while overlaying their own professional services onto the scholarly work provided by academics? Can already successful enterprises such as Open Book Publishers in Cambridge³ be ‘scaled up’? The members of the working group have not seen any impact assessments or pilot studies considering which of the current forms of scholarly communication will simply die out in the absence of subscription and royalty income. We would like to see evidence-based impact assessments as a matter of priority. In general, it is unclear whether even the largest and most prestigious scholarly societies will survive the loss of income that will result from a move to OA. As one member of our group put it, ‘the research is not open if it is dead’.

Many questions remain, above and beyond those already evoked:

The situation with respect to the goal of publishing all academic monographs freely and openly remains extremely fluid, and all the enquiries we were able to make in the working group confirmed that this is an area of great uncertainty. Academic books require considerable up-front investment by publishers, and it is vital that this labour and expertise is properly supported in an open access model. How can we ensure that open access books do not entail a race to the bottom in terms of editorial and production standards?
Researchers and publishers will also have to think carefully about content such as book reviews, notices, short discussion pieces, author interviews and so on: content that is useful to the discipline, but peripheral to the article form and that would not generally appear in a repository, for example.
The place of UK debates in the global publishing industry is unclear. Like all scholarly publishing, A&H publishing is international in nature and most journals and presses will draw from as wide an international field as possible. How will the editor of a UK-based journal, responding to the OA requirements of UK decision-making bodies, deal with international authors who are not subject to the same requirements or set of priorities? How will an international editor deal with UK academics?⁴ These questions come up repeatedly in conversations with colleagues.
Scholarly societies in the arts and humanities do not charge a fortune for their journals, and also offer conferences, communities and support (financial and otherwise) for early-career scholars. To analyse the costs and benefits of access to their publications, it will be necessary to look across cost centres within any given institution. To offer a worked example of library costs from 2019, ‘the bundled UK cost for 2020 the RHS’s Transactions and its Camden book series is £205 (this is a maximum figure, excluding all discounts). In the financial year 1 July 2018-30 June 2019, RHS awarded (for example) £2,781.56 to support ECR researchers at York University and £3,177.16 to support ECR researchers at Oxford.’⁵ So it would be useful to see studies of the rate of institutional return on investment in publications by university libraries.
Concerns about licensing were already well documented and summarized by Peter Mandler in 2014: ‘For one thing, we do not have full ownership of our texts ourselves – we use others’ words and images, often by permission. For another, we have our own norms of how best to incorporate one work within another – e.g. by quotation – which derivative use denies. Most important is our moral right (long acknowledged in law and ethics) to protect the integrity of our work. By all means read and disseminate our work free of charge, but do not change it as you are doing so – write your own work.’⁶
Concerns about distortions allowed by CC BY in the reuse of oral history interviews and other sensitive/polemical content are important for many A&H colleagues as they are for our colleagues in the social sciences.
Evidence of predatory publishers simply reusing content from repositories is starting to emerge, seemingly justifying concerns about CC BY as opposed to CC BY-NC-ND or CC BY-ND.⁷

Footnotes

¹See for instance a project on the takeover of real estate by the Church of Scientology in Clearwater, Florida: https://projects.tampabay.com/projects/2019/investigations/scientology-clearwater-real-estate, or a series of investigative articles on the post-9/11 burgeoning of the US intelligence services collected here: https://www.washingtonpost.com/people/william-m-arkin/

²Matthew Gold & Lauren Klein, eds. Debates in the Digital Humanities (2012), https://dhdebates.gc.cuny.edu

³ ‘We are a nonprofit independent publisher with no institutional backing. Open Book relies on sales and donations to continue publishing high-quality and free to read titles. We gratefully acknowledge the generous support of The Polonsky Foundation, the Thriplow Charitable Trust, the Jessica E. Smith and Kevin R. Brine Charitable Trust, The Progress Foundation and the Dutch Research Council (NWO).’ https://www.openbookpublishers.com

⁴ See the following testimony: ‘The bi-lingual, topic-specific journal I edit…draws articles from authors across the world and is published in Switzerland. Hence, specific OA requirements pertaining to UK-based authors will be considered in setting OA policy but will probably not be a determining factor. Hence, if strict requirements are introduced around OA in relation to UK funders, this may serve to reduce the possibility for UK-based authors to submit articles to my journal. This would obviously be an issue for the journal but would also be one for UK academics also, as it would result a more limited range of potential publication outlets.’ Margot Finn, Plan S and the History Journal Landscape, A Royal Historical Society Guidance Paper, pp. 47-8.

⁵ Plan S and the History Journal Landscape, A Royal Historical Society Guidance Paper, p. 69, n. 110.

⁶ Peter Mandler, ‘Open Access: a Perspective from the Humanities’, Insights 27 (2), 2014, http://doi.org/10.1629/2048-7754.89

⁷ Guy Lavender, Jane Secker and Chris Morrison, ‘What happens when you find your open access PhD thesis for sale on Amazon?’, 8th July 2021, https://blogs.lse.ac.uk/impactofsocialsciences/2021/07/08/what-happens-when-you-find-your-open-access-phd-thesis-for-sale-on-amazon/