
Manuscript detectives – submitted, accepted or published?

In the blog post “It’s hard getting a date (of publication)”, Maria Angelaki discussed how a seemingly straightforward task may turn into a complicated and time-consuming affair for our Open Access Team. As it turns out, it isn’t the only one. The process of identifying the version of a manuscript (whether it is the submitted, accepted or published version) can also require observation and deduction skills on par with Sherlock Holmes’.

Unfortunately, it is something we need to do all the time. We need to make sure that the manuscript we're processing isn't the submitted version, as only published or accepted versions are deposited in Apollo. And we need to differentiate between published and accepted manuscripts, as many publishers – including the biggest players Elsevier, Taylor & Francis, Springer Nature and Wiley – only allow self-archiving of accepted manuscripts in institutional repositories, unless the published version has been made Open Access with a Creative Commons licence.

So it’s kind of important to get that right… 

Explaining manuscript versions

Manuscripts (of journal articles, conference papers, book chapters, etc.) come in various shapes and sizes throughout the publication lifecycle. At the outset, a manuscript is prepared and submitted for publication in a journal. It then normally goes through one or more rounds of peer-review, leading to more or less substantial revisions of the original text, until the editor is satisfied with the revised manuscript and formally accepts it for publication. The accepted manuscript then goes through proofreading, formatting, typesetting and copy-editing by the publisher, the outcome of which is the final published version (also called the version of record). The whole process is illustrated below.

Identifying published versions

So the published version of a manuscript is the version… that is published? Yes and no, as sometimes manuscripts are published online in their accepted version. What we usually mean by published version is the final version of the manuscript which includes the publisher's copy-editing, typesetting and copyright statement. It also typically shows citation details such as the DOI, volume and page numbers, and downloadable files will almost invariably be in PDF format. Below are two snapshots of published articles, with citation details and copyright information zoomed in. On the left is an article from the journal Applied Linguistics published by Oxford University Press, and on the right an article from the journal Cell Discovery published by Springer Nature.


Published versions are usually obvious to the eye and the easiest to recognise. In a way the published version of a manuscript is a bit like love: you may mistake other things for it but when you find it you just know. In order to decide if we can deposit it in our institutional repository, we need to find out whether the final version was made Open Access with a Creative Commons (CC) licence (or in rarer cases with the publisher’s own licence). This isn’t always straightforward, as we will now see.

Published Open Access with a CC licence?

When an article has been published Open Access with a CC licence, a statement usually appears at the bottom of the article on the journal website. However, as we deposit a PDF file in the repository, we are concerned with the Open Access statement within the PDF document itself. Quite a few articles are described as Open Access/CC BY in their HTML version but not in the PDF. This is problematic, as it means we can't simply assume from the webpage that we can go ahead with the deposit – we need to systematically search the PDF for the Open Access statement. We also need to make sure that the CC licence is clearly mentioned, as it is sometimes omitted even though it was chosen at the time of paying Open Access charges.

The Open Access statement appears in various places in the file depending on the publisher and journal, though usually either at the very end of the article or in the footer of the first page, as in the following examples from Elsevier (left) and Springer Nature (right).


A common practice among the Open Access team is to search the file for various terms including "creative", "cc", "open access", "license" and "common", quite often in combination. But even this isn't foolproof, as the search may retrieve no results even when the terms do appear in the document (for instance when the PDF has no searchable text layer). The most common publishers tend to put Open Access statements in consistent places, but others might put them in unusual places such as a footnote in the middle of the paper. That means we may have to scroll through a whole 30- or 40-page document to find them – quite a time-consuming process.
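In principle this manual check can be scripted. Below is a minimal sketch of that term search, assuming the open-source pypdf library; the file name and term list are illustrative, not a description of our team's actual tooling.

```python
# Minimal sketch: flag pages of a PDF that contain likely Open Access
# wording. Assumes the open-source pypdf library; "article.pdf" is a
# hypothetical file name.
from pypdf import PdfReader

TERMS = ["creative", "cc by", "open access", "license", "licence", "common"]

def find_oa_statements(path: str) -> list[tuple[int, str]]:
    """Return (page number, matched term) pairs for likely OA wording."""
    hits = []
    for page_no, page in enumerate(PdfReader(path).pages, start=1):
        text = (page.extract_text() or "").lower()
        hits.extend((page_no, term) for term in TERMS if term in text)
    return hits

for page_no, term in find_oa_statements("article.pdf"):
    print(f"page {page_no}: found '{term}'")
```

Note that a script like this inherits the same limitation as a manual search: if the PDF has no extractable text, no terms will be found however carefully they are listed.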

Identifying accepted versions

The accepted manuscript is the version that has gone through peer-review. The content should be the same as the final published version, but it shouldn’t include any copy-editing, typesetting or copyright marking from the publisher. The file can be either a PDF or a Word document. The most easily recognisable accepted versions are files that are essentially just plain text, without any layout features, as shown below. The majority of accepted manuscripts look like this.

However, accepted manuscripts may sometimes appear at first glance to be published versions. This is because authors may be required to use publisher templates at the submission stage of their paper. But whilst looking like published versions, accepted manuscripts will not show the journal/publisher logo, citation details or copyright statement (or they might show incomplete details, e.g. a copyright statement such as "© 20xx [publisher name]"). Compare the published version (left) and accepted manuscript (right) of the same paper below.


As we can see the accepted manuscript is formatted like the published version, but doesn’t show the journal and publisher logo, the page numbers, issue/volume numbers, DOI or the copyright statement.

So when trying to establish whether a given file is the published or accepted version, looking out for the above details is a fairly reliable method.

Identifying submitted versions

This is where things get rather tricky. Because the difference between an accepted and submitted manuscript lies in the actual content of the paper, it is often impossible to tell them apart based on visual clues. There are usually two ways to find out:

  • Getting confirmation from the author
  • Going through a process of finding and comparing the submission date and acceptance date of the paper (if available), mostly relevant in the case of arXiv files

Getting confirmation from the author of the manuscript is obviously the preferable, time-saving option. Unfortunately, many researchers mislabel their files when uploading them to the system, describing an accepted or published version as the submitted one (the fact that they do so while submitting the paper to us may partly explain the confusion). So rather than relying on file descriptions, it is better to have an explicit statement from the author that the file is the submitted version – although in an ideal world this would never happen, as everyone would know that only accepted and published versions should be sent to us.

Submitted manuscripts often reach us in the form of arXiv files. These are files that have been deposited in arXiv, an online repository of pre-prints widely used by scientists, especially mathematicians and physicists. An example is shown below.

Clicking on the arXiv reference on the left-hand side of the document (circled) leads to the arXiv record page as shown below.

The ‘comments’ and ‘submission history’ sections may give clues as to whether the file is the submitted or accepted manuscript. In the above example the comments indicate that the manuscript was accepted for publication by the MNRAS journal (Monthly Notices of the Royal Astronomical Society). So this arXiv file is probably the accepted manuscript.

The submission history lists the dates on which the file, and any subsequent versions of it, were deposited in arXiv. By comparing these dates with the formal acceptance date of the manuscript, which can be found on the journal website (if published), we can infer whether the arXiv file is the submitted or accepted version. If the manuscript hasn't been published and there is no way of comparing dates, then in the absence of any other information we assume that the arXiv file is the submitted version.
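For arXiv files, that date comparison can also be sketched in code. The snippet below uses arXiv's public Atom API (a real endpoint), but the arXiv ID and acceptance date are invented for the example; it illustrates the reasoning rather than being a production tool.

```python
# Sketch: infer whether an arXiv file predates formal acceptance.
# Uses arXiv's public Atom API; the ID and acceptance date are made up.
from datetime import date, datetime
from urllib.request import urlopen
from xml.etree import ElementTree

ATOM = "{http://www.w3.org/2005/Atom}"

def arxiv_dates(arxiv_id: str) -> tuple[date, date]:
    """Return (first submitted, last updated) dates for an arXiv record."""
    url = f"http://export.arxiv.org/api/query?id_list={arxiv_id}"
    entry = ElementTree.parse(urlopen(url)).getroot().find(f"{ATOM}entry")
    first = datetime.fromisoformat(entry.find(f"{ATOM}published").text.rstrip("Z")).date()
    last = datetime.fromisoformat(entry.find(f"{ATOM}updated").text.rstrip("Z")).date()
    return first, last

submitted, last_updated = arxiv_dates("1501.00001")  # hypothetical ID
accepted_on = date(2015, 3, 1)                       # from the journal website
if last_updated < accepted_on:
    print("Every arXiv version predates acceptance: likely the submitted version.")
else:
    print("A version postdates acceptance: it may be the accepted manuscript.")
```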

Conclusion

Distinguishing between different manuscript versions is by no means straightforward. The fact that even our experienced Open Access Team still encounters cases where they are unsure which version they are looking at shows how confusing it can be. The process of comparing dates can itself be time-consuming, as not all publishers show acceptance dates for papers (ring a bell?).

Depositing a published (not Open Access) version instead of an accepted manuscript may infringe publisher copyright. Depositing a submitted version instead of an accepted manuscript may mean that research that hasn't been vetted and scrutinised becomes publicly available through our repository, where it could be mistaken for peer-reviewed work. When processing a manuscript we need to be sure about which version we are dealing with, and ideally we shouldn't need to go out of our way to find out.

Published 27 March 2018
Written by Dr Melodie Garnier
Creative Commons License

Half-life is half the story

This week the STM Frankfurt Conference was told that a shift away from gold Open Access towards green would mean some publishers would not be 'viable', according to a story in The Bookseller. The argument was that support for green OA in the US and China would mean some publishers will collapse and the community will 'regret it'.

It is not surprising that the publishing industry is worried about a move away from gold OA policies. These policies have proved extraordinarily lucrative in the UK, with Wiley and Elsevier each pocketing an extra £2 million thanks to the RCUK block grant funds that support the RCUK policy on Open Access.

But let’s get something straight. There is no evidence that permitting researchers to make a copy of their work available in a repository results in journal subscriptions being cancelled. None.

The September 2013 UK Business, Innovation and Skills Committee Fifth Report: Open Access stated that "There is no available evidence base to indicate that short or even zero embargoes cause cancellation of subscriptions". In 2012 the Committee for Economic Development Digital Connections Council, in The Future of Taxpayer-Funded Research: Who Will Control Access to the Results?, concluded that "No persuasive evidence exists that greater public access as provided by the NIH policy has substantially harmed subscription-supported STM publishers over the last four years or threatens the sustainability of their journals".

I am the first to say that we should address questions about how the scholarly publishing landscape is shifting with systematic data gathering, analysis and discussion. We need to look at trends over time and establish what they mean for the ongoing stability of the scholarly literary corpus. But consistently invoking the 'green open access equals cancellation so we should have longer embargoes' argument is not the solution.

Let’s put this myth to bed once and for all.

The half-life argument

Publishers have been trying to use the half-life argument for some time to justify extending their embargo periods on the author's accepted manuscript. An embargo is the period after publication during which the manuscript (the author's Word or LaTeX document, usually saved as a PDF) cannot be made available in the author's institutional repository or a subject-based repository.

The half-life of an article is the time it takes for the article to reach half of its total number of downloads.
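To make the definition concrete, here is an illustrative calculation; the download figures are invented, not real journal data.

```python
# Illustration only: compute a usage half-life from monthly download
# counts (invented numbers, not real journal data).
def usage_half_life(monthly_downloads: list[int]) -> int:
    """Return the month (1-based) by which half of all downloads occurred."""
    total = sum(monthly_downloads)
    running = 0
    for month, count in enumerate(monthly_downloads, start=1):
        running += count
        if running >= total / 2:
            return month
    raise ValueError("empty download history")

print(usage_half_life([500, 300, 200, 100, 50, 50]))  # front-loaded usage: 2
print(usage_half_life([100] * 48))                    # long-tail usage: 24
```

The two invented examples show why the metric varies so much between fields: usage concentrated just after publication gives a short half-life, while a long, steady tail of downloads gives a long one.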

The argument goes along the lines of 'if articles have a longer half-life then they should be kept under embargo for longer' because, according to a blog post published at the beginning of this year by Alice Meadows, "Open access at Elsevier 2014 in retrospect and a look at 2015": "If an embargo period falls too far below the period it takes for a journal to recoup its costs, then the journal's survival will be jeopardized."

The problem with this argument is that there has been, and continues to be, no evidence that permitting authors to make work available in a repository leads to journal cancellations. It is ironic that the consistent line from publishers has been that the half-life argument is helping 'set evidence-based policy settings of embargo periods'.

The half-life spectre was raised again at this week's STM meeting by Philip Carpenter, executive vice-president of research at Wiley, who noted that only 20% of Wiley journal usage occurs in the first 12 months after publication and, according to The Bookseller, referred to a 12-month embargo as offering only 'limited protection'.

Evidence for the green = cancellation argument

The need for longer embargoes – 1

The way the 'evidence' for this argument has been presented is telling. There is a particular paragraph in Meadows' blog that is worth republishing in full:

How long those embargo periods should be before manuscripts become publicly accessible is a key issue. To help set evidence-based policy settings of embargo periods, we have contributed to growing industry data. Findings of a recent usage study demonstrated that there is variation in usage half-lives both within and between disciplines. This finding aligned with a study by the British Academy, which also found variation in half-lives between disciplines – and half-lives longer than those previously suggested.

Despite looking like links to two separate items (which gives the impression of more 'evidence'), the first two links in the paragraph above – to 'industry data' and to a 'recent usage study' – both lead to the SAME 25 November 2013 study by Phil Davis into journal usage half-lives, the study that started the whole shebang off. It looked at the usage patterns of over 2,800 journals and found that only 3% of them had half-lives of 12 months or less. The proportion of journals with this short half-life was lowest in the Life Sciences (1%) and highest in Engineering (6%).

The findings of that study are not in dispute, but it should be pointed out that the author clearly states the study was funded by the Professional & Scholarly Publishing (PSP) division of the Association of American Publishers (AAP). The work has not been peer reviewed or published in the literature.

The British Academy report Open Access Journals in the Humanities and Social Sciences does not appear to be available online any longer.

Now, there is no dispute that there are differences in usage patterns of articles between disciplines. This is a reflection of differing communication norms and behaviours. But it is a huge logical jump to conclude that we therefore need to increase embargo periods. Peter Suber went into some detail on 11 January 2014 (yes, we have been swinging around on this one for a while now) explaining the logical flaw in the argument. At the time Kevin Smith also noted in a blog post, "Half-lives, policies and embargoes", that "we should not accept anything that is presented as evidence just because it looks like data; some connection to the topic at hand must be proved".

The need for longer embargoes – 2

Meadows' blog went on to say:

There are real-world examples where embargo periods have been set too low and the journal has become unviable. For example, as published in The Scholarly Kitchen, the Journal of Clinical Investigation lost about 40 percent of its institutional subscriptions after adopting a 0-month embargo period in 1996, so it was forced to return to a subscription model in 2009. Similar patterns have been seen with other journals.

The issue referred to here has nothing to do with the half-life of research papers being made available Open Access through a repository. It concerns a journal that moved to a GOLD Open Access model in 1996 (publishing Open Access and relying on non-subscription revenue sources) but eventually decided it needed to reintroduce subscriptions in 2009. Not only is this example entirely unrelated to the embargo issue for green Open Access, it happened six years ago. Note that the blog does not link to the other 'similar patterns'. They do not exist.

Green policies mean cancellations

The half-life argument has replaced an even less substantial piece of 'evidence' provided by the publishing industry in 2012: a study cited in support of the argument that "short embargo periods are likely to lead to significant cancellations" by Wiley in a 2013 blog post, "Open Access – Keeping it Real", and by Springer in an interview published as "Open Access – Springer tightens rules on self archiving".

The study was conducted by the Association of Learned and Professional Society Publishers (ALPSP). However, the study, which was written up and published online, had some major methodological issues. It consisted of a single, poorly worded question:

“If the (majority of) content of research journals was freely available within 6 months of publication, would you continue to subscribe? Please give a separate answer for a) Scientific, Technical and Medical journals and b) Humanities, Arts and Social Sciences Journals if your library has holdings in both of these categories.”

An analysis of the study highlighted these methodological criticisms. The work was not peer reviewed. But there are deeper questions about the motivation behind the survey. The researcher was the Chair of the ALPSP Research Committee and sat on the steering committee of the Publishers Research Coalition, raising questions about her objectivity and that of the study. There are several other issues bearing on the validity of the research.

What is the real problem?

There is no doubt that Open Access policies are causing disruption to publishers' funding models. That is hardly surprising, and in some cases may well be the intent of the policy. But presenting spurious arguments to try to maintain the status quo is not moving this discussion forward.

The point is we do need evidence. If green OA is causing cancellations then let’s collect some numbers and talk about the issues:

  • How does this affect the scholarly communication system?
  • What are the implications?
  • Does this mean publishers will fold (unlikely in the short term)?
  • Will some journals close (possibly)?
  • Is that a problem?
  • Perhaps we need to consider issues relating to the reward system and what is valued?

But I will give the last word to the person who caused me to write this blog in the first place – Philip Carpenter, executive vice-president of research at Wiley, who, according to The Bookseller, said at the STM meeting: "We'll need to think hard about what factors influence library purchasing decisions; we don't know enough [about that]".

Hear, hear.

Published 16 October 2015
Written by Dr Danny Kingsley
Creative Commons License