
Manuscript detectives – submitted, accepted or published?

In the blog post “It’s hard getting a date (of publication)”, Maria Angelaki discussed how a seemingly straightforward task may turn into a complicated and time-consuming affair for our Open Access Team. As it turns out, it isn’t the only one. The process of identifying the version of a manuscript (whether it is the submitted, accepted or published version) can also require observation and deduction skills on par with Sherlock Holmes’.

Unfortunately, it is something we need to do all the time. We need to make sure that the manuscript we’re processing isn’t the submitted version, as only published or accepted versions are deposited in Apollo. And we need to differentiate between published and accepted manuscripts, as many publishers (including the biggest players Elsevier, Taylor & Francis, Springer Nature and Wiley) only allow self-archiving of accepted manuscripts in institutional repositories, unless the published version has been made Open Access with a Creative Commons licence.

So it’s kind of important to get that right… 
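The deposit rule described above can be sketched as a small function. This is only an illustration: the function name and its arguments are hypothetical, and it assumes the strictest case, where the publisher allows the published version to be deposited only if it carries a Creative Commons licence.

```python
def can_deposit(version, published_with_cc_licence=False):
    """Simplified deposit rule for a manuscript version.

    version is one of "submitted", "accepted" or "published".
    """
    if version == "submitted":
        # Submitted versions are not deposited in the repository.
        return False
    if version == "accepted":
        return True
    if version == "published":
        # Assumption: every publisher is treated as one that restricts
        # self-archiving of the published version unless it is Open Access
        # with a CC licence.
        return published_with_cc_licence
    raise ValueError(f"unknown version: {version!r}")
```

In practice the answer for a published version depends on the individual publisher's policy, so a real check would need that information as well.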

Explaining manuscript versions

Manuscripts (of journal articles, conference papers, book chapters, etc.) come in various shapes and sizes throughout the publication lifecycle. At the outset a manuscript is prepared and submitted for publication in a journal. It then normally goes through one or more rounds of peer review leading to more or less substantial revisions of the original text, until the editor is satisfied with the revised manuscript and formally accepts it for publication. Following this, the accepted manuscript goes through proofreading, formatting, typesetting and copy-editing by the publisher. The final published version (also called the version of record) is the outcome of this. The whole process is illustrated below.

Identifying published versions

So the published version of a manuscript is the version… that is published? Yes and no, as sometimes manuscripts are published online in their accepted version. What we usually mean by published version is the final version of the manuscript which includes the publisher’s copy-editing, typesetting and copyright statement. It also typically shows citation details such as the DOI, volume and page numbers, and downloadable files will almost invariably be in PDF format. Below are two snapshots of published articles, with citation details and copyright information zoomed in. On the left is an article from the journal Applied Linguistics published by Oxford University Press and on the right an article from the journal Cell Discovery published by Springer Nature.


Published versions are usually obvious to the eye and the easiest to recognise. In a way the published version of a manuscript is a bit like love: you may mistake other things for it but when you find it you just know. In order to decide if we can deposit it in our institutional repository, we need to find out whether the final version was made Open Access with a Creative Commons (CC) licence (or in rarer cases with the publisher’s own licence). This isn’t always straightforward, as we will now see.

Published Open Access with a CC licence?

When an article has been published Open Access with a CC licence, a statement usually appears at the bottom of the article on the journal website. However as we want to deposit a PDF file in the repository, we are concerned with the Open Access statement that is within the PDF document itself. Quite a few articles are said to be Open Access/CC BY on their HTML version but not on the PDF. This is problematic as it means we can’t always assume that we can go ahead with the deposit from the webpage – we need to systematically search the PDF for the Open Access statement. We also need to make sure that the CC licence is clearly mentioned, as it’s sometimes omitted even though it was chosen at the time of paying Open Access charges.

The Open Access statement will appear at various places on the file depending on the publisher and journal, though usually either at the very end of the article or in the footer of the first page as in the following examples from Elsevier (left) and Springer Nature (right).


A common practice among the Open Access team is to search the file for various terms including “creative”, “cc”, “open access”, “license”, “common” and quite often a combination of these. But even this isn’t a foolproof method as the search may retrieve no result despite the search terms appearing within the document. The most common publishers tend to put Open Access statements in consistent places, but others might put them in unusual places such as in a footnote in the middle of a paper. That means we may have to scroll through a whole 30- or 40-page document to find them – quite a time-consuming process.
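As a rough illustration, the term search could be automated along these lines. This is a minimal sketch, not the team's actual tooling: it assumes the text has already been extracted from the PDF by some other means, and the function name is hypothetical. The search terms are the ones listed above.

```python
import re

# Search terms quoted in the post.
SEARCH_TERMS = ["creative", "cc", "open access", "license", "common"]


def find_licence_terms(text, terms=SEARCH_TERMS):
    """Return the search terms that occur in text, ignoring case and line breaks."""
    # Collapse runs of whitespace so a phrase split across lines still matches.
    normalised = re.sub(r"\s+", " ", text).lower()
    found = []
    for term in terms:
        # The leading word boundary stops "cc" matching inside words like "accept".
        if re.search(r"\b" + re.escape(term), normalised):
            found.append(term)
    return found
```

Note that "license" will not match the British spelling "licence", so a fuller list of terms would need both. A hit only shows where to look; someone still has to read the statement to confirm which licence applies.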

Identifying accepted versions

The accepted manuscript is the version that has gone through peer review. The content should be the same as the final published version, but it shouldn’t include any copy-editing, typesetting or copyright marking from the publisher. The file can be either a PDF or a Word document. The most easily recognisable accepted versions are files that are essentially just plain text, without any layout features, as shown below. The majority of accepted manuscripts look like this.

However sometimes accepted manuscripts may at first glance appear to be published versions. This is because authors may be required to use publisher templates at the submission stage of their paper. But whilst looking like published versions, accepted manuscripts will not show the journal/publisher logo, citation details or copyright statement (or they might show incomplete details, e.g. a copyright statement such as © 20xx *publisher name*). Compare the published version (left) and accepted manuscript (right) of the same paper below.


As we can see the accepted manuscript is formatted like the published version, but doesn’t show the journal and publisher logo, the page numbers, issue/volume numbers, DOI or the copyright statement.

So when trying to establish whether a given file is the published or accepted version, looking out for the above is a fairly foolproof method.

Identifying submitted versions

This is where things get rather tricky. Because the difference between an accepted and submitted manuscript lies in the actual content of the paper, it is often impossible to tell them apart based on visual clues. There are usually two ways to find out:

  • Getting confirmation from the author
  • Going through a process of finding and comparing the submission date and acceptance date of the paper (if available), mostly relevant in the case of arXiv files

Getting confirmation from the author of the manuscript is obviously the preferable and time-saving option. Unfortunately many researchers mislabel their files when uploading them to the system, describing their accepted/published version file as submitted (the fact that they do so when submitting the paper to us may partly explain this). So rather than relying on file descriptions, it is better to have an actual statement from the author that the file is the submitted version. In an ideal world, of course, this would never happen, as everyone would know that only accepted and published versions should be sent to us.

A common incarnation of submitted manuscripts we receive is arXiv files. These are files that have been deposited in arXiv, an online repository of pre-prints that is widely used by scientists, especially mathematicians and physicists. An example is shown below.

Clicking on the arXiv reference on the left-hand side of the document (circled) leads to the arXiv record page as shown below.

The ‘comments’ and ‘submission history’ sections may give clues as to whether the file is the submitted or accepted manuscript. In the above example the comments indicate that the manuscript was accepted for publication by the MNRAS journal (Monthly Notices of the Royal Astronomical Society). So this arXiv file is probably the accepted manuscript.

The submission history lists the date(s) on which the file (and possible subsequent versions of it) was/were deposited in arXiv. By comparing these dates with the formal acceptance date of the manuscript which can be found on the journal website (if published), we can infer whether the arXiv file is the submitted or accepted version. If the manuscript hasn’t been published and there is no way of comparing dates, in the absence of any other information, we assume that the arXiv file is the submitted version.
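The date comparison can be sketched as follows. This is an illustration under stated assumptions rather than a definitive rule: the function name is hypothetical, and treating a file deposited on or after the acceptance date as the accepted manuscript is an assumption, since an author could still upload an older draft after acceptance.

```python
from datetime import date


def infer_arxiv_version(deposit_date, acceptance_date=None):
    """Guess whether an arXiv file is the submitted or accepted manuscript."""
    if acceptance_date is None:
        # No acceptance date to compare against: assume the submitted
        # version, as described in the post.
        return "submitted"
    # Assumption: a file deposited on or after the formal acceptance date
    # is taken to be the accepted manuscript.
    if deposit_date >= acceptance_date:
        return "accepted"
    return "submitted"
```

For example, with made-up dates, `infer_arxiv_version(date(2017, 5, 2), date(2017, 4, 20))` returns `"accepted"`. Where an arXiv record has several versions, the check would be applied to the deposit date of the specific version received.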


Distinguishing between different manuscript versions is by no means straightforward. The fact that even our experienced Open Access Team may still encounter cases where they are unsure which version they are looking at shows how confusing it can be. The process of comparing dates can be time-consuming itself, as not all publishers show acceptance dates for papers (ring a bell?).

Depositing a published (not OA) version instead of an accepted manuscript may infringe publisher copyright. Depositing a submitted version instead of an accepted manuscript may mean that research that hasn’t been vetted and scrutinised becomes publicly available through our repository and may be mistaken for peer-reviewed work. When processing a manuscript we need to be sure about what version we are dealing with, and ideally we shouldn’t need to go out of our way to find out.

Published 27 March 2018
Written by Dr Melodie Garnier
Creative Commons License

Is CC-BY really a problem or are we boxing shadows?

Comments from researchers and colleagues have indicated some disquiet about the Creative Commons (CC-BY) licence in some areas of the academic community. However, in conversation with some legal people and contemporaries at other institutions (some of these exchanges are replicated at the end of the blog), one of the observations was that academics are generally not fully cognizant of what the licences offer, or indeed of what protections are available under regular copyright.

To try and determine whether this was an education and advocacy problem or if there are real issues we had a roundtable discussion on 29 February at Cambridge University attended by about 35 people who were a mixture of academics, administrators, publishers and legal practitioners. The discussion centred on some of the objections raised in the information circulated before the meeting (which is summarised at the end of this blog). For ease of description each objection is addressed in turn.


Creative Commons provides a series of licences that people who create work can attach to it to tell users what they can or cannot do with it. The licences range from no restrictions at all (CC-0) to the fairly restrictive CC-BY-NC-ND-SA*, where the user must attribute the author, cannot amend the work, cannot make any financial gain from it and must put the same licence on anything they produce using this work.

There are increasing requirements from funders such as the Wellcome Trust and RCUK in the UK that any work published open access must have a Creative Commons Attribution (CC-BY) licence attached to it. The rationale behind this is that research needs to be available for other researchers to both read and reuse, but also to text and data mine without fear of copyright breaches. Work that is available under a CC-BY licence can be easily incorporated into course reading lists without copyright complications.

* Note added 8 March – a comment has been sent through that CC-BY-NC-ND-SA is impossible to apply, because the share-alike and no-derivatives clauses are mutually exclusive and cannot be applied together. See this explanation.
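The point in the note can be made concrete with a small check. This sketch assumes the standard set of six Creative Commons licences built from the BY, NC, ND and SA elements; the function name is hypothetical, and CC0 (a separate public-domain dedication) is deliberately left out.

```python
# The six combinations of the BY, NC, ND and SA elements that form the
# standard CC licences. CC0 is a separate dedication and is not covered here.
VALID_COMBINATIONS = {
    frozenset({"BY"}),
    frozenset({"BY", "SA"}),
    frozenset({"BY", "NC"}),
    frozenset({"BY", "ND"}),
    frozenset({"BY", "NC", "SA"}),
    frozenset({"BY", "NC", "ND"}),
}


def is_valid_cc_licence(name):
    """Return True if a name such as 'CC-BY-NC-ND' is one of the six standard licences."""
    # Accept both "CC BY-NC-SA" and "CC-BY-NC-SA" spellings.
    parts = name.upper().replace(" ", "-").split("-")
    if parts[0] != "CC":
        return False
    return frozenset(parts[1:]) in VALID_COMBINATIONS
```

Here `is_valid_cc_licence("CC-BY-NC-ND-SA")` returns `False`, because no standard licence combines the no-derivatives and share-alike elements, while `is_valid_cc_licence("CC-BY-NC-ND")` returns `True`.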

Summary of the discussion

The general feeling in the discussions was that academics do want to share their work but they don’t want things to be used incorrectly. The outcome of the discussion was that while there is some confusion in this area, and we could do some work on advocacy and educational materials, there are also some specific cases where CC-BY has the potential to cause issues. In a small number of cases issues have actually occurred.

Is CC-BY a problem? For whom?

We should note here that CC-BY only affects a proportion of research published in the UK. While all research is potentially affected by the HEFCE requirement to make work available, the route preferred is through placing a copy in a repository. So this discussion affects only those researchers who have a specific grant from the Charities Open Access Fund (Wellcome Trust) or the RCUK. Humanities researchers tend not to hold grants, and for those that do, it is their articles, not their monographs that are affected by this requirement.

While there are some actual concrete examples of issues for researchers in the Arts and Humanities, many of the problems discussed here are what could happen. There was a comment from a scientific publisher that the sciences also had some concerns about CC-BY when it was first introduced, but none of the concerns have actually come to fruition. Another person noted there have been hundreds of thousands of pieces of content published under CC-BY licences, with very few known problem cases or harm. This is telling. The question was raised: Are we just repeating myths?

On the other hand, just because issues haven’t happened yet does not mean that it would not be a serious problem should they occur. One of the questions at the end of the discussion was: “Are the ethical norms of society strong enough to stop these concerns happening?” It would appear that to date they have been in the sciences.

Moral rights

CC-BY is an attribution licence. This means the moral right for the originator of the work to be identified is retained. However the moral right for the integrity of the research is not protected. The discussion centred around this.

If someone uses work under a CC-BY licence and makes alterations to it, they do need to indicate they have changed a work but not how they have altered it. The concern in the group was that the work could be altered so the meaning is entirely changed and it would still be attributed to the original author.

Authors can object to the derogatory treatment of their work. The recourse of being able to ask to have the originator’s name taken off the work was not seen as satisfactory because then the person who has adapted the work is potentially able to publish the work, which is based substantially on someone else’s work, as their own.

That said, one comment was that academic works are always open to interpretation, whether quoted or not and whether available under a CC-BY licence or not.


The area of translations does appear to have some concrete examples of problems caused by CC-BY for Humanities & Social Science authors. One of the issues is it is very difficult to check a translation unless the original author can read the language into which their work has been translated.


Of all of the areas of discussion, plagiarism raised the most opinions. The accusation that CC-BY somehow ‘encourages’ plagiarism is often levelled. Some argue that making work available under a Creative Commons licence protects authors against plagiarism rather than encouraging it. Works that are openly available online are far more easily identified as the original work than something published on paper and held on a library shelf, for example.

There was a debate about what actually constitutes plagiarism. One opinion was that ‘It’s plagiarism unless it’s in quotes’. However while the use of quote marks would protect the integrity of the work, there is nothing legally wrong with a derivative use of a work that is available under CC-BY – legally this is not plagiarism.

Nothing about the CC-BY licence overrides UK law about fair dealing. One of the lawyers present noted that academics don’t understand the details of copyright. Academics want full protection but also full sharing. In the world of the internet there’s a free-for-all – people copy-and-paste from wherever they want. No-one respects licences, so an academic work is not necessarily protected under current rules.

It was noted that plagiarism occurs all the time, even when articles are all rights reserved and under traditional copyright. And while Open Access publishing does make plagiarism easier (regardless of the licence), it doesn’t change the underlying principle that it’s unethical. Ethical behaviour in academia sits separately from copyright law.

Sensitive information

The area of sensitive information seems to have the strongest case for not using a CC-BY licence. Researchers working in areas that might contain sensitive information – such as medical or criminal areas – spend a great deal of time ensuring that their findings are presented sensitively and ensuring their distribution is appropriate. The concern with CC-BY licences is that these findings could be misconstrued, which would be damaging to the researcher and could come back on the participants and affect them. If presented in the wrong way, altered research outputs could affect not just the research but also the participants.

There is an issue about the dialogue between the people that are being studied and if they have any moral rights about how the information is being used.

An example that was given was in anthropology, working with a community of Native Americans in northern California, who released sensitive data and stories from their cultural past which they want to be accessed. However because they have been exploited in the past they wanted some form of restriction on how these things can be reused. This is an example where a CC-BY licence would not be appropriate.

An oral historian discussed the type of work they do with subjects talking about traumatic periods of their life. In these cases the researcher enters into a covenant with them about how their work can be used. This could not be dealt with ethically under a CC-BY licence. The issue is about subsequent control over reuse of research, with concern about it being co-opted and used in another context.

The question about ethical use of material was raised again, with someone noting that no matter what licence it is available under you can’t control what people do with your work if they disagree with you.

Items containing third party copyright

Being required to publish work under a CC-BY licence does cause problems for people whose work contains a large amount of third-party material. This is because the burden on the author to obtain permissions for all of the works would be both time-consuming and expensive. Many researchers have raised questions about whether they can even do their work if they’re required to publish under CC-BY.

That said, if researchers are themselves using CC-BY works this issue is mitigated because they automatically have permission to use the material. This raises the question: does CC-BY make it more difficult or easier?


There were some examples raised where a series of works that were freely available had been packaged up and sold. This raised the question: Who is being harmed in commercial exploitation of academic works?

Academics do not publish in journals for money, so the originator of a work that is subsequently sold on is not personally losing a revenue stream. There was a distinction between the academic and non-academic publishing environment. It was agreed that the people buying these works are being scammed. The concern is that people are being exploited by being made to pay for things that should be freely available.

The discussion moved to whether a Non Commercial licence would solve this problem. The issue here is the confusion over the definition of ‘commercial’ in this context. An institution that has a revenue stream from student fees could be seen to be commercial and therefore unable to include CC-BY-NC items on their reading lists.

It was noted that CC-BY-NC-ND is extremely restrictive about ways works can be used.

Academic freedom

The discussion several times touched on the broader issue of the government placing an increasing number of requirements on researchers. The questions raised were: “Does someone who is fronting up with the money have the rights to enforce a particular licence? What about the subjects of a study?”

There is supposed to be an arm’s-length relationship between funders and universities, but a concern is that funding bodies want to have more power to tell academics what to work on.

Next steps

In summary, the discussion indicated that CC-BY licences do not encourage plagiarism or cause problems with commercial exploitation within academia (although there is a broader ethical issue). However in some cases CC-BY licences could pose problems for the moral integrity of the work and cause issues with translations. CC-BY licences do create challenges for works containing sensitive information and for works containing third party copyright.

There is an expectation amongst the academic community that people behave ethically and within cultural norms.

As agreed with the group we have published this blog post which summarises the discussions held this week. In discussions about the Open Access Policy Framework for the University it would be helpful to include a statement that there is concern about CC-BY licences for some disciplines and types of research.

Background information sent to participants prior to the discussion

Commentary on CC-BY in published reports

The issue of CC-BY licences was a recurrent theme in A review of the RCUK review of implementation of its OA policy (March 2015). Many arts, humanities and social science disciplines hold ‘principled and practical objections to the use of CC-BY licences’ (p18). This is partly because work under a CC-BY licence ‘could be both used commercially in ways of which the author does not approve and also might not be properly acknowledged as their work’ (pp19-20).

The Royal Historical Society evidence to the RCUK review noted that humanities scholars have particular objections to certain kinds of ‘derivative use’ that amount to the encouragement of plagiarism. Because the ‘attribution’ requirement in CC BY is very loose, it is possible for a reuser of a humanities article to alter it and reissue it under their own name, specifying only that it is an adaptation of the original, but without specifying how it has been adapted. In this way reusers may adopt the style, argument and ‘personality’ of the original work under their own name (and even copyright it). This represents a violation of the specific moral right of the author to the integrity of the work, and the only recourse offered to the author by CC BY is to have their name removed from the attribution (which makes the violation worse). This kind of re-use is as likely to degrade as to enhance the public benefit of the research.

The British Academy’s response to the Commons Select Committee (2013) noted that many articles in HSS subjects are the product of single-author scholarship, where there is more of a claim on ‘moral rights’ that are not adequately protected under an unrestricted CC-BY licence. There were also concerns about commercial reuse of work that contains third party copyright, involving complicated permissions. The response suggests that it should be possible to vary Creative Commons licences according to the usages and requirements of different subject areas – and that an ‘Attribution-NonCommercial-NoDerivs’ licence (CC-BY-NC-ND) may very often be more appropriate.

Notes on an April 2013 Royal Historical Society position changing workshop on CC-BY and Humanities (chaired by Peter Mandler) noted that the editors of a number of history journals have suggested that the CC-BY licence facilitates and promotes commercial re-use and uses akin to plagiarism; that the licence therefore amounts to an infringement of authors’ moral and intellectual property rights; and that it is likely to damage the quality of education.

The HistoryUK Submission to the 2013 Business, Innovation and Skills Committee Enquiry on Open Access Publishing raised issues about the loss of protection of intellectual property, the dangers associated with allowing derivative works in sensitive areas of research, and the possible increased costs or embargoes that publishers may feel are needed to compensate for the transfer of a commercial asset to a third party.

Comments from researchers and administrators

In preparation for the round table, Danny Kingsley asked her community across the sector what kinds of objections different people in an administrative or library role had heard from researchers. These are summarised below.

English researcher at Cambridge – “I would prefer not to make my work, produced with the benefit of public funding, available in a form that would allow others to exploit it commercially, as the simple CC-BY licence does. My preference would be for the CC BY-NC-SA licence.”

Research Information Specialist – One question to ask here is whether traditional publishing models – such as signing over copyright itself – are really more beneficial to authors, and of course to weigh the risk of a negative CC experience against the benefits of positive ones.

Concerns raised in discussion with academics in the Humanities (reflected in two responses)

  1. A belief that CC BY encourages plagiarism
  2. That content licensed under CC BY is not monitored for copyright and other infringement to the same extent as more restrictive licences (a misguided belief that publishers actively monitor use and reuse of content I think)
  3. I have also heard the more vague concern about ideas being manipulated or twisted in some way and then re-published under the author’s name
  4. That encouraging reuse, especially derivatives, means the author has no control over what people do with the information (and therefore is associated with something that they would rather not be)

Advice provided on Creative Commons and licensing

Published 3 March 2016
Written by Dr Danny Kingsley, with thanks to Dr Philip Boyes and Dr Joyce Heckman for their notes.

Creative Commons License