Monthly Archives: March 2018

Manuscript detectives – submitted, accepted or published?

March 27, 2018 | Uncategorized | accepted manuscript, licences, open access, publishers, scholarly communication | Arthur Smith

In the blog post “It’s hard getting a date (of publication)”, Maria Angelaki discussed how a seemingly straightforward task may turn into a complicated and time-consuming affair for our Open Access Team. As it turns out, it isn’t the only one. The process of identifying the version of a manuscript (whether it is the submitted, accepted or published version) can also require observation and deduction skills on par with Sherlock Holmes’.

Unfortunately, it is something we need to do all the time. We need to make sure that the manuscript we’re processing isn’t the submitted version, as only published or accepted versions are deposited in Apollo. And we need to differentiate between published and accepted manuscripts, as many publishers – including the biggest players Elsevier, Taylor & Francis, Springer Nature and Wiley – only allow self-archiving of accepted manuscripts in institutional repositories, unless the published version has been made Open Access with a Creative Commons licence.

So it’s kind of important to get that right…

Explaining manuscript versions

Manuscripts (of journal articles, conference papers, book chapters, etc.) come in various shapes and sizes throughout the publication lifecycle. At the outset a manuscript is prepared and submitted for publication in a journal. It then normally goes through one or more rounds of peer review leading to more or less substantial revisions of the original text, until the editor is satisfied with the revised manuscript and formally accepts it for publication. Following this, the accepted manuscript goes through proofreading, formatting, typesetting and copy-editing by the publisher. The final published version (also called the version of record) is the outcome of this. The whole process is illustrated below.

Identifying published versions

So the published version of a manuscript is the version… that is published? Yes and no, as sometimes manuscripts are published online in their accepted version. What we usually mean by published version is the final version of the manuscript which includes the publisher’s copy-editing, typesetting and copyright statement. It also typically shows citation details such as the DOI, volume and page numbers, and downloadable files will almost invariably be in PDF format. Below are two snapshots of published articles, with citation details and copyright information zoomed in. On the left is an article from the journal Applied Linguistics published by Oxford University Press and on the right an article from the journal Cell Discovery published by Springer Nature.

Published versions are usually obvious to the eye and the easiest to recognise. In a way the published version of a manuscript is a bit like love: you may mistake other things for it but when you find it you just know. In order to decide if we can deposit it in our institutional repository, we need to find out whether the final version was made Open Access with a Creative Commons (CC) licence (or in rarer cases with the publisher’s own licence). This isn’t always straightforward, as we will now see.

Published Open Access with a CC licence?

When an article has been published Open Access with a CC licence, a statement usually appears at the bottom of the article on the journal website. However as we want to deposit a PDF file in the repository, we are concerned with the Open Access statement that is within the PDF document itself. Quite a few articles are said to be Open Access/CC BY on their HTML version but not on the PDF. This is problematic as it means we can’t always assume that we can go ahead with the deposit from the webpage – we need to systematically search the PDF for the Open Access statement. We also need to make sure that the CC licence is clearly mentioned, as it’s sometimes omitted even though it was chosen at the time of paying Open Access charges.

The Open Access statement will appear at various places on the file depending on the publisher and journal, though usually either at the very end of the article or in the footer of the first page as in the following examples from Elsevier (left) and Springer Nature (right).

A common practice among the Open Access team is to search the file for various terms including “creative”, “cc”, “open access”, “license”, “common” and quite often a combination of these. But even this isn’t a foolproof method as the search may retrieve no result despite the search terms appearing within the document. The most common publishers tend to put Open Access statements in consistent places, but others might put them in unusual places such as in a footnote in the middle of a paper. That means we may have to scroll through a whole 30- or 40-page document to find them – quite a time-consuming process.
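The search described above can be sketched in a few lines of Python. This is only an illustration and not the tool the Open Access Team uses: the function name `find_licence_mentions` is hypothetical, and the sketch assumes the text of each page has already been extracted from the PDF by some other means. It reports the page on which each term appears, which saves some scrolling when the statement is tucked away in a footnote.

```python
import re

# Search terms taken from the post; matching is case-insensitive.
SEARCH_TERMS = ["creative", "cc", "open access", "license", "common"]


def find_licence_mentions(pages):
    """Return (page_number, term) pairs for every search term found.

    `pages` is a list of strings, one per page, assumed to have been
    extracted from the PDF beforehand (hypothetical input format).
    """
    hits = []
    for page_number, text in enumerate(pages, start=1):
        # Collapse whitespace so a phrase split over a line break still matches.
        normalised = re.sub(r"\s+", " ", text).lower()
        for term in SEARCH_TERMS:
            # A leading word boundary keeps "cc" from matching inside "accepted".
            if re.search(r"\b" + re.escape(term), normalised):
                hits.append((page_number, term))
    return hits
```

An empty result does not prove that the statement is absent. As noted above, a search can come back empty even when the terms are in the document, so a manual check is still needed in that case.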

Identifying accepted versions

The accepted manuscript is the version that has gone through peer-review. The content should be the same as the final published version, but it shouldn’t include any copy-editing, typesetting or copyright marking from the publisher. The file can be either a PDF or a Word document. The most easily recognisable accepted versions are files that are essentially just plain text, without any layout features, as shown below. The majority of accepted manuscripts look like this.

However, accepted manuscripts may sometimes appear at first glance to be published versions. This is because authors may be required to use publisher templates at the submission stage of their paper. But whilst looking like published versions, accepted manuscripts will not show the journal/publisher logo, citation details or copyright statement (or they might show incomplete details, e.g. a copyright statement such as © 20xx *publisher name*). Compare the published version (left) and accepted manuscript (right) of the same paper below.

As we can see the accepted manuscript is formatted like the published version, but doesn’t show the journal and publisher logo, the page numbers, issue/volume numbers, DOI or the copyright statement.

So when trying to establish whether a given file is the published or accepted version, looking out for the above is a fairly foolproof method.
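The visual checks above can be written down as a small checklist. The sketch below is an illustration only: the names `VisualClues` and `guess_version` are hypothetical, and the decision rule (all clues present means published, none present means accepted or earlier, anything in between needs a human look) is an assumption rather than a rule stated by the team.

```python
from dataclasses import dataclass


@dataclass
class VisualClues:
    """Clues a person records after looking at the file (hypothetical structure)."""
    has_publisher_logo: bool
    has_citation_details: bool    # DOI, volume/issue and page numbers
    has_complete_copyright: bool  # a full statement, not a placeholder like "© 20xx"


def guess_version(clues):
    """Guess the manuscript version from visual clues alone.

    Assumed rule: every clue present -> "published"; no clue present ->
    "accepted or earlier"; a mixture -> "check manually".
    """
    present = sum([
        clues.has_publisher_logo,
        clues.has_citation_details,
        clues.has_complete_copyright,
    ])
    if present == 3:
        return "published"
    if present == 0:
        return "accepted or earlier"
    return "check manually"
```

Note that visual clues cannot separate an accepted manuscript from a submitted one, which is why the second outcome is labelled "accepted or earlier".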

Identifying submitted versions

This is where things get rather tricky. Because the difference between an accepted and submitted manuscript lies in the actual content of the paper, it is often impossible to tell them apart based on visual clues. There are usually two ways to find out:

Getting confirmation from the author
Going through a process of finding and comparing the submission date and acceptance date of the paper (if available), mostly relevant in the case of arXiv files

Getting confirmation from the author of the manuscript is obviously the preferable and time-saving option. Unfortunately many researchers mislabel their files when uploading them to the system, describing their accepted/published version file as submitted (the fact that they do so when submitting the paper to us may partly explain this). So rather than relying on file descriptions, it is better to have an actual statement from the author that the file is the submitted version. In an ideal world, of course, this would never happen, as everyone would know that only accepted and published versions should be sent to us.

A common incarnation of submitted manuscripts we receive is arXiv files. These are files that have been deposited in arXiv, an online repository of pre-prints that is widely used by scientists, especially mathematicians and physicists. An example is shown below.

Clicking on the arXiv reference on the left-hand side of the document (circled) leads to the arXiv record page as shown below.

The ‘comments’ and ‘submission history’ sections may give clues as to whether the file is the submitted or accepted manuscript. In the above example the comments indicate that the manuscript was accepted for publication by the MNRAS journal (Monthly Notices of the Royal Astronomical Society). So this arXiv file is probably the accepted manuscript.

The submission history lists the date(s) on which the file (and possible subsequent versions of it) was/were deposited in arXiv. By comparing these dates with the formal acceptance date of the manuscript which can be found on the journal website (if published), we can infer whether the arXiv file is the submitted or accepted version. If the manuscript hasn’t been published and there is no way of comparing dates, in the absence of any other information, we assume that the arXiv file is the submitted version.
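The date comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `infer_arxiv_version` is hypothetical, and the rule that a file deposited on or after the acceptance date is treated as the accepted manuscript is a simplification of the inference described above. The fallback to "submitted" when no acceptance date is available follows the default mentioned in the post.

```python
from datetime import date


def infer_arxiv_version(file_deposit_date, acceptance_date=None):
    """Infer whether an arXiv file is the submitted or accepted manuscript.

    file_deposit_date: date the arXiv version in hand was deposited
        (from the 'submission history' section).
    acceptance_date: formal acceptance date from the journal website,
        or None if the paper is unpublished or the date is not shown.
    """
    if acceptance_date is None:
        # No way of comparing dates: assume the submitted version.
        return "submitted"
    if file_deposit_date >= acceptance_date:
        # Assumption: a file deposited on or after acceptance reflects
        # the accepted text.
        return "accepted"
    return "submitted"
```

For example, `infer_arxiv_version(date(2017, 5, 2), date(2017, 4, 20))` returns `"accepted"`, while the same call without an acceptance date returns `"submitted"`. The arXiv 'comments' field can override either result, so this is a starting point for the check and not a verdict.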

Conclusion

Distinguishing between different manuscript versions is by no means straightforward. The fact that even our experienced Open Access Team may still encounter cases where they are unsure which version they are looking at shows how confusing it can be. The process of comparing dates can be time-consuming itself, as not all publishers show acceptance dates for papers (ring a bell?).

Depositing a published (not OA) version instead of an accepted manuscript may infringe publisher copyright. Depositing a submitted version instead of an accepted manuscript may mean that research that hasn’t been vetted and scrutinised becomes publicly available through our repository and may be mistaken for peer-reviewed work. When processing a manuscript we need to be sure about what version we are dealing with, and ideally we shouldn’t need to go out of our way to find out.

Published 27 March 2018
Written by Dr Melodie Garnier

Skills in scholarly communication – needs & development

March 23, 2018 | Uncategorized | Office of Scholarly Communication

This blog post is part of the write-up of an investigation into the background of people working in scholarly communication, with a specific focus on skills.

Introduction

Library staff need to have a wide range of skills in order to undertake their roles. Whatever type of library they work in and whatever their individual role there is a range of both generic and specialist skills which staff need to acquire over the course of their career. In the Office of Scholarly Communication our focus is on making sure library staff are equipped to work in research support roles but we also have a wider interest in who makes up the global scholarly communication workforce.

In late 2016 we conducted a survey to find out more about this issue. We were slightly overwhelmed by the popularity of the survey, which gathered over 500 responses from people who self-identified as working in scholarly communication, which we defined as:

The process by which academics, scholars and researchers share and publish their research findings with the wider academic community and beyond. This includes, but is not limited to, areas such as open access and open data, copyright, institutional repositories and research data management.

You can read a summary of some of the findings from this research here but we wanted to delve a little deeper and look at which skills scholarly communication staff felt they needed and how they developed them. This blog post looks at that question.

Which skills?

Rather than come up with yet another list of skills that staff should or could have we made the decision to use an existing list from UKeIG – the UK eInformation Group of CILIP. This list is comprehensive in its coverage and we felt that it would provide a good basis for future comparisons as well as providing a list with which the community would be familiar. The list is of course not exhaustive and respondents were invited to add any additional skills which they felt were relevant to their roles.

Skills for current roles

Respondents were asked to highlight the skills which they used in their current roles. Their responses are summarised in Figure 1.
Figure 1 Skills used in current roles

Institutional repository (management/curation) (72%) and Copyright (63%) were the skills most used, closely followed by Open Access – content discovery (59%) and Understanding metrics (55%).

Some skills were used much less frequently, such as Resource Description and Access (RDA) (10%), Post-cancellation access and archiving (9%) and Mobile technology (8%). Under the option Other, skills specified by respondents included knowledge of open educational resources, educating faculty and students about how to get published, and electronic theses.

Skills for future roles

Respondents were also asked to select the skills they felt would be important for the future of the profession. The results are summarised in Figure 2:
Figure 2 Future skills

The top four selections had a similar number of responses: Innovations in academic publishing (51%), Research data management (50%), Understanding the user experience (47%) and Copyright (46%). It is interesting to note that Copyright is the only skill to appear in the top five of both current and future skills.

The other end of the scale again included RDA (6%) and Post-cancellation access (7%) as well as working with standards (6%). Under the option Other, skills included instruction and education, developing strategic partnerships and gumption!

Developing these skills

What we really wanted to know was how people working in scholarly communication developed these skills: through their formal education, on-the-job training or self-directed learning. Survey respondents were asked how they had developed the skills included on the UKeIG list, and their responses can be seen in Figure 3 below:
Figure 3 Complete skill list

Almost all of the respondents had some level of either undergraduate or postgraduate education, with 71% either holding or working towards a postgraduate qualification in library and information science. Given this, it is surprising to note that so few felt that they had developed the skills they needed for their role through formal education. This gap could perhaps be attributed to the fact that 74% of respondents have held their qualifications for a significant amount of time and so these subjects were not offered at the time. They would have had little choice but to learn these skills on the job or in their own time as it was unlikely to be practical to return to formal education.

Generic skills on the list scored much higher with participants for formal education, perhaps because library school courses are designed to produce well-rounded information professionals able to work in a variety of sectors and so cover the skills that are most likely to be of use in a broad career.

Looking at the results in more detail we can see that a potential skills gap is being created. For the top five skills respondents have identified as using in their current role, the levels of formal learning are low (Figure 4).
Figure 4 How are current skills developed?

There is evidence that this skills gap could continue into the future. Figure 5 shows the top five skills respondents think will be of most importance to the future of the profession. Again the numbers developing these skills through formal education are low, showing that those working in scholarly communication are having to rely on either on-the-job or self-directed learning to develop the skills they identify as being important to the future of the profession.

Figure 5 How are future skills developed?

The results of this analysis seem to tie in with previously shared results which showed that just over half of respondents with an LIS qualification (56%) felt that this did not equip them with knowledge of the scholarly communication process.

Next steps

We will continue to analyse the results of the survey to find out more about how those working in scholarly communication have developed their skill sets and how they see future offerings being delivered. In the meantime the OSC is part of a group which is looking to tackle the provision of dedicated scholarly communication training in the UK. As well as sharing our discussions on this blog, we will be available to talk at various events. We have already visited RLUK and are scheduled to present at LILAC and CILIP Careers Day so do come and chat to us if you have a chance!

Published 23 March 2018
Written by Claire Sewell

Scare campaigns, we have seen a few

March 15, 2018 | Uncategorized | AAUP, academic freedom, copyright, embargoes, Merton, Norms of Science, Press embargoes | Office of Scholarly Communication

In a sister post, I identified the latest scare offensive in the ongoing discussions around open access as: ‘restricting choice of publication’. In this, there is an implied threat from editorial boards and publishers that if the UK Scholarly Communication Licence (UKSCL) were to be in place, then these journals would refuse to publish articles from affected researchers.

In this post I want to look at other threats that have been or are lurking in the shadows in the open access debate. The first is tied fairly closely to the ‘restricting choice of publication’ threat.

The new scare – threats to ‘Academic Freedom’

The term ‘Academic Freedom’ comes up a fair bit in discussions about open access. In a tweet sent during the Researcher to Reader conference*, one of my Advisory Board colleagues, Rick Anderson, made this comment:

“Most startling thing said to me in conversation at the #R2RConf:
‘I wonder how much longer academic freedom will be tolerated in IHEs.’ (Specific context: authors being allowed to choose where they publish.)”

In this blog I’d like to pick up on the ‘Academic Freedom’ part of the comment (which is not Rick’s, he was quoting).

According to a summary in the Times Higher Education, “Academic freedom means that both faculty members and students can engage in intellectual debate without fear of censorship or retaliation”.

This definition was based on the American Association of University Professors’ (AAUP) Statement on Academic Freedom which includes, quite specifically, “full freedom in research and in the publication of results”.

Personally I read that as meaning academics should be allowed to publish, not that they have full freedom in choosing where.

Rick has since contacted the AAUP to ask for clarification on this topic. Last Friday, he tweeted that the AAUP has declined to revisit the 1940 statement to clarify the ‘freedom in publication’ wording in light of the evolution of scholarly communication since 1940.

The reason why the Academic Freedom/ ‘restricting choice of publication’ threat(s) is so concerning to the research community has changed over time. In the past it was essential to be able to publish in specific outlets because colleagues would only read certain publications. Those publications were effectively the academic ‘voice’. However today, with online publication and search engines this argument no longer holds.

What does matter, however, is that publication in certain journals is necessary because of the way people are valued and rewarded. The problem is not open access; the problem is the reward system to which we are beholden. And the commercial publishing industry is fully aware of this.

So let’s be clear. Academic Freedom is about freedom of expression rather than freedom of publication outlet and ties into Robert Merton’s 1942 norms of science which are:

“communalism”: all scientists should have common ownership of scientific goods (intellectual property), to promote collective collaboration; secrecy is the opposite of this norm.
universalism: scientific validity is independent of the sociopolitical status/personal attributes of its participants
disinterestedness: scientific institutions act for the benefit of a common scientific enterprise, rather than for the personal gain of individuals within them
organized scepticism: scientific claims should be exposed to critical scrutiny before being accepted: both in methodology and institutional codes of conduct.

If a publisher is preventing a researcher from publishing in a journal based on their funding or institutional policy rather than the content of the work being submitted then this is entirely in contravention of all of Robert Merton’s norms of science. But the publisher is not, as it happens, threatening the Academic Freedom of that author.

While we are here, let’s have a quick look at some of the other threats to researchers invoked in the last few years.

Historic scare 1 – Embargoes are necessary for sustainability

In the past the publishing industry has tried to claim research on half-life usage of research articles as ‘evidence’ for the “green open access = cancellations” argument. This sounds plausible except for the lack of any causal link between green open access policies and library subscriptions. The argument here is that embargoes are necessary for the ‘sustainability’ (read profit) of commercial publishers.

We should note the British Academy’s own 2014 finding that “libraries for the most part thought that embargoes for author-accepted manuscripts had little effect on their acquisition policies” and that any real cancellation issue was “the rising cost of journals at a time of budgetary constraint for libraries. If that continues, journals will be cancelled anyway, whether posted manuscripts are available or not.”

My debunking of this claim dates back to 2015 although it did raise its head again loudly in 2017 during discussions around the UKSCL. It is not uncommon for a researcher to express concern about their chosen journal’s viability because of open access. The message has been successfully pushed through to the research community.

Historic scare 2 – The need for full copyright

Copyright is supposed to protect the content creator. The argument I hear repeatedly for why publishers need authors to sign over their copyright is that it allows them to ‘protect the author’s rights’. But when people sign their copyright away to another entity, copyright becomes a purely economic tool for financial exploitation by that entity.

There is no doubt publishers protect their own copyright. Indeed owning it allows maximum freedom to make money from the content (and prevent anyone else from doing so). But strangely whenever I have asked for examples of publishers stepping in to protect an author’s rights as the result of a copyright transfer agreement, there has been no response.

However it is not uncommon for a researcher to tell you that this is one of the protections that publishers offer them. I defer to Lizzie Gadd here who has published thoughts around the distinctions between copyright culture and scholarly culture. She notes how many academics have been led by publishers to believe that the current copyright culture supports scholarly culture to a far greater extent than it actually does.

Historic scare 3 – Press embargoes

The HEFCE open access policy requires the collection and deposit of work within three months of acceptance (although for the first two years of the policy this deadline was relaxed to three months from publication). This means that work is deposited into repositories, and the metadata that exists (the title, the authors, the intended journal and the abstract) is made available before publication. The work itself (and we are talking about the Author’s Accepted Manuscript, not the final Version of Record) is under an indefinite embargo, with the actual embargo period set once the work is published. This process has its own problems, discussed elsewhere.

In 2016 there was a blow-up about article metadata being publicly available before publication. Our office received multiple calls from concerned researchers asking us to remove records from the repository until publication, for fear that having that metadata available contravened the embargo rules. They were concerned the journal would refuse to publish their paper. When we investigated, not only was this not publisher policy, but the publishers we contacted asked us to forward details of any such threats so they could follow up.

It demonstrates how spooked academics can be by their editors/journals/publishers.

Exhausting

This latest ‘restricting choice of publication’ threat is just another in a long line of implied threats that the scholarly communication community is having to manage. Each time a new one looms we need to identify the source, develop evidence and information to counter the threat and try and work with our research community to reassure them.

Between this, and the huge amount of time we have to spend identifying dates of publication or managing publisher and funder policies or keeping track of the funds that are being spent in this space, we are exhausted.

But perhaps that’s the point?

Published 15 March 2018
Written by Dr Danny Kingsley

* Note: In the past two years I have written up a precis of the Researcher to Reader event with summaries, see: ‘It is all a bit of a mess’ Observations from Researcher to Reader conference and ‘Be nice to each other’ – the second Researcher to Reader conference. Time pressure means I may not be able to do that this year, but see the Twitter hashtag for the event.

Unlocking Research

Open Research at Cambridge