A Fast-Track Route to Open Access

In the last two years, since the REF 2021 open access policy came into force, the Open Access Team has received an ever-increasing number of manuscript submissions for archiving in Apollo, Cambridge’s institutional open access repository.

We have been thinking long and hard about ways to cope with the workload, by scrutinising existing practices and streamlining workflows, because we want to provide the best possible service to our researchers, commensurate with the University’s world-leading research.

This blog introduces what is perhaps the greatest overhaul of our workflows since the service began: a new ‘Fast Track’ deposit system.

Work it harder

Before the start of the REF OA policy (2014-2016), the Open Access Team would process and manually curate every manuscript submission we received. Authors could expect an initial response within 1-2 working days, after which (usually within a month) we would archive their manuscript in Apollo.

A simplified workflow for a typical manuscript was:

  1. Manuscript uploaded by submitter in Symplectic Elements.
  2. Item created in Apollo (DSpace) workflow.
  3. Helpdesk ticket created (Zendesk).
  4. Open Access Team reviews manuscript, advises submitter and makes a decision.
  5. Open Access Team archives the manuscript in Apollo and informs submitter.

Both the decision (4) and archive (5) steps take time. For each manuscript we would need to decide whether the files we received could be archived, what funder open access policies were at play and the open access options available from the publisher. We could then advise authors about their open access choices.

To archive a manuscript the process was broadly the following:

  1. Review the helpdesk ticket (Zendesk) for the open access decision.
  2. Enter as many publication details as possible in Symplectic Elements.
  3. Retrieve the submission from the Apollo (DSpace) deposit workflow.
  4. Add licence and metadata to the record.
  5. Review the submission and approve for archiving.
  6. Move the item to the relevant departmental collection and apply an appropriate embargo (if required).
  7. Finally, update the helpdesk ticket and send the original submitter a link to their Apollo record.

Each manuscript took on average 18 minutes to archive, a process that was tedious, prone to error and extremely time-consuming. Add to this the time required to make the initial decision, and each manuscript submission could easily take 30 minutes for the Open Access Team to fully process from start to finish, especially if an open access fee had to be paid.

Fast-forward two years and with the rate of new manuscript submissions now peaking at over 1,300 per month, simply processing manuscripts for the REF would require more than four full-time staff members. Whilst these manual processes were viable for a handful of submissions a day, they became unwieldy at scale.
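The staffing estimate above can be checked with some rough arithmetic. The sketch below is only illustrative: the 37.5-hour working week is our assumption, not a figure from the team, and it ignores leave, training and other duties, all of which would push the real requirement higher.

```python
# Rough workload check using the figures quoted in the text.
SUBMISSIONS_PER_MONTH = 1300   # peak rate of new submissions (from the text)
MINUTES_PER_SUBMISSION = 30    # full manual processing time (from the text)

# Assumption: a full-time post is 37.5 hours per week, with no allowance
# for leave or other duties.
HOURS_PER_WEEK = 37.5
WEEKS_PER_MONTH = 52 / 12


def staff_required(submissions, minutes_each, hours_per_week=HOURS_PER_WEEK):
    """Return the number of full-time staff needed to process the submissions."""
    workload_hours = submissions * minutes_each / 60
    capacity_hours = hours_per_week * WEEKS_PER_MONTH
    return workload_hours / capacity_hours


fte = staff_required(SUBMISSIONS_PER_MONTH, MINUTES_PER_SUBMISSION)
print(f"Workload: {SUBMISSIONS_PER_MONTH * MINUTES_PER_SUBMISSION / 60:.0f} hours per month")
print(f"Full-time staff needed: {fte:.1f}")
```

Under these assumptions the workload comes to 650 hours per month, or about four full-time posts before any allowance for leave, which is consistent with the "more than four" figure once real-world availability is taken into account.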

Make it better

Our first attempt at speeding up our open access system began in August 2017. To start we made a number of operational changes to reduce the time spent processing manuscript submissions:

  • We would rely entirely on the metadata present in Symplectic Elements to populate the Apollo records (i.e. we would not manually curate records).
  • The Open Access Team would no longer update the helpdesk records; instead, internal record keeping would be automated as much as possible.

Unfortunately, the number of steps in the Apollo workflow was still roughly the same as the previous process, but with one key difference: a new field to record what we call the ‘Fast Track’ decision. There were seven Fast Track options:

  • Submitted
  • Proof
  • Published (not open access)
  • Published (open access)
  • Accepted (published)
  • Accepted (not published)
  • Other

The first six options represent the vast bulk of all manuscripts received by the Open Access Team, and the ‘Other’ option simply acts as a catch-all for anything else. By simply knowing what sort of manuscript has been uploaded, much of the decision and archiving process can be automated. However, the agent still needed to retrieve the item from the Apollo workflow, check the version of the file and publication status of the paper, add some metadata fields, approve the item, and move it to an appropriate collection.

Figure 1. The Apollo workflow page of a typical manuscript submission, with the addition of the new ‘Fast Track’ field.

The choice of Fast Track decision leads to four possible outcomes which would ‘trigger’ actions in our Zendesk helpdesk:

  • Submitted, proof, published (not open access)
    • Email submitter, ask for accepted manuscript
  • Published (open access)
    • Archive in Apollo (no embargo) ⇒ Email submitter Apollo link
  • Accepted (published), accepted (not published)
    • Archive in Apollo (embargoed) ⇒ Email submitter Apollo link
  • Other
    • Refer to Open Access Team
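The mapping from the seven Fast Track options to the four outcomes listed above is essentially a lookup table. The sketch below shows one way it could be expressed; the class names, member names and the outcome_for function are hypothetical labels chosen for illustration, and this is not the code that drives our Zendesk triggers.

```python
from enum import Enum


class Decision(Enum):
    """The seven Fast Track options (labels as listed in the text)."""
    SUBMITTED = "Submitted"
    PROOF = "Proof"
    PUBLISHED_NOT_OA = "Published (not open access)"
    PUBLISHED_OA = "Published (open access)"
    ACCEPTED_PUBLISHED = "Accepted (published)"
    ACCEPTED_NOT_PUBLISHED = "Accepted (not published)"
    OTHER = "Other"


class Outcome(Enum):
    """The four outcomes; member names are hypothetical."""
    REQUEST_ACCEPTED_MANUSCRIPT = "Email submitter, ask for accepted manuscript"
    ARCHIVE_NO_EMBARGO = "Archive in Apollo (no embargo), email submitter Apollo link"
    ARCHIVE_EMBARGOED = "Archive in Apollo (embargoed), email submitter Apollo link"
    REFER_TO_TEAM = "Refer to Open Access Team"


# Lookup table: every decision leads to exactly one outcome.
OUTCOMES = {
    Decision.SUBMITTED: Outcome.REQUEST_ACCEPTED_MANUSCRIPT,
    Decision.PROOF: Outcome.REQUEST_ACCEPTED_MANUSCRIPT,
    Decision.PUBLISHED_NOT_OA: Outcome.REQUEST_ACCEPTED_MANUSCRIPT,
    Decision.PUBLISHED_OA: Outcome.ARCHIVE_NO_EMBARGO,
    Decision.ACCEPTED_PUBLISHED: Outcome.ARCHIVE_EMBARGOED,
    Decision.ACCEPTED_NOT_PUBLISHED: Outcome.ARCHIVE_EMBARGOED,
    Decision.OTHER: Outcome.REFER_TO_TEAM,
}


def outcome_for(decision):
    """Return the outcome triggered by a Fast Track decision."""
    return OUTCOMES[decision]


for decision in Decision:
    print(f"{decision.value} -> {outcome_for(decision).value}")
```

Keeping the mapping in a single table means a new option can be added in one place, and anything unrecognised can simply be routed to ‘Other’.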

Despite being a much faster process, it was still tedious. It could also require up to 33 actions from agents (29 mouse clicks) and 14 web pages to be loaded, which was still not very user-friendly. However, the time to archive had decreased from 18 to 9 minutes – a 50% reduction from the previous fully manual system.

Do it faster

So what if all the steps involved in processing a manuscript submission could be reduced to the absolute minimum, and be actionable within a single webpage? After a short development sprint, the Open Access Team launched the ‘Fast Track Deposits’ interface last September. A snapshot of the user interface is shown below.

Figure 2. The Fast Track interface. Choosing one of the options in blue is enough to fully archive a manuscript, or process it for further action by the submitter or the Open Access Team.

At the top of the page, the agent can see a ‘publication summary’ including the item title, the journal title, and publisher DOI if available. Both the item title and publisher DOI are hyperlinked, so that the agent can Google-search the item or land on the publisher’s webpage with a single mouse click.

The agent must first inspect the file and check that it is a suitable version (i.e. either the accepted version or the open access published version). If wrongly labelled, they must relabel the file via a dropdown menu, and add/delete files as appropriate. The agent then ‘describes’ the manuscript (i.e. decides whether it is the accepted, published, submitted or proof version) and submits their decision. The decision determines the trigger behaviour in the automatically populated helpdesk ticket. The agent is then free to move on to the next item.

If the decision is ‘accepted’ or ‘published open access’, the item is deposited and the submitter is automatically notified via email. For submitted, proof, and non-OA published versions, the author receives an automatic email asking for the accepted manuscript. Items are archived in the repository under a generic collection, and any forthcoming publication details are added to the record via external source information in Elements.

To see just how efficient Fast Track is we’ve prepared a short demonstration video which captures some of the key features:

Video 1. Real-time demonstration of the Fast Track system.

Makes us stronger

Agents therefore need only make one decision: identify the file version. But the real ingenuity of the Fast Track system is that embargoes can be set automatically by:

  1. Taking into account the decision made by the agent (e.g. no embargo if published open access);
  2. Detecting publication status and publication dates from Elements; and
  3. Retrieving journals’ embargo policies via Orpheus (you can learn more about Orpheus in our previous blog post).

In some cases, usually because we don’t know the publication date, we can’t determine the embargo length of an accepted manuscript. In such cases we apply a 36-month embargo from the date of the Fast Track decision. We know that this embargo won’t always be correct; however, we routinely check manuscripts in Apollo and update embargoes accordingly.
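The embargo rules described above can be sketched in a few lines. This is a simplified illustration rather than the Fast Track code itself: the function names, the decision labels and the idea that a journal’s embargo runs from the publication date are all assumptions made for the example, and the lookups against Elements and Orpheus are replaced by plain arguments.

```python
import calendar
from datetime import date

FALLBACK_EMBARGO_MONTHS = 36  # fallback length, from the text


def add_months(start, months):
    """Return the date `months` after `start`, clamping to the month's last day."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def embargo_end(decision, decision_date, publication_date=None,
                journal_embargo_months=None):
    """Return the embargo end date, or None if no embargo applies.

    `decision` is a hypothetical label: "published_oa" or "accepted".
    `publication_date` stands in for what Elements would supply and
    `journal_embargo_months` for what Orpheus would supply.
    """
    if decision == "published_oa":
        return None  # published open access: no embargo
    if decision != "accepted":
        raise ValueError(f"no embargo rule for decision {decision!r}")
    if publication_date is not None and journal_embargo_months is not None:
        # Assumption: the journal's embargo is counted from the publication date.
        return add_months(publication_date, journal_embargo_months)
    # Not enough information: 36 months from the date of the decision.
    return add_months(decision_date, FALLBACK_EMBARGO_MONTHS)


print(embargo_end("accepted", date(2019, 4, 23)))
print(embargo_end("accepted", date(2019, 4, 23), date(2018, 6, 15), 12))
```

The fallback branch is deliberately conservative: an over-long embargo can be shortened once the publication date is known, which is what the routine checks in Apollo are for.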

Figure 3. Simplified overview of the Fast Track process. The key decision is to determine the type of manuscript that has been submitted. Everything else is handled automatically.

Since launching Fast Track the average time to process a manuscript is 1-2 minutes. More than 8,000 items have been processed since launching the phase two Fast Track interface. If items processed under the phase one effort are included, the number goes up to just over 14,000. And since a picture speaks a thousand words, Figure 4 below shows the effect produced by the new interface launched in September on our backlog of unprocessed submissions.

Figure 4. Historical change in the number of unprocessed open access manuscript submissions. The total number of outstanding manuscript submissions peaked at nearly 2,400 in September 2018. Immediately after launching the Fast Track website the backlog dropped dramatically and was completely eliminated by March 2019.

We will continue to develop Fast Track to further streamline our processing of manuscripts. We have already started to partner with librarians and administrators across the University to leverage the collective knowledge about open access which now exists within the University’s professional academic services.

Get in contact: If you are running a DSpace repository and would like to implement Fast Track to work alongside your existing workflows, email us at support@repository.cam.ac.uk

Published 23 April 2019
Written by Dr Mélodie Garnier and Dr Arthur Smith
Creative Commons License

Multiplicity, the unofficial theme of Researcher to Reader 2019

For the past four years at the end of February, publishers, librarians, agents, researchers, technologists and consultants have gathered in London for two days of discussions around the concept of ‘Researcher to Reader’. This blog is my take on what I found the most inspiring, challenging and interesting at the 2019 event. There wasn’t a theme this year per se, but something that did repeatedly arise from where I was standing was the diversity of our perspectives. This is a word that has taken a specific meaning recently, so I am using ‘multiplicity’ instead:

  • The principles of Plan S are calling for multiple business models for open access publishing, according to Dr Mark Schiltz
  • There is now great range in the approaches researchers take to the writing process, as described by Dr Christine Tulley
  • Professor Siva Umpathy described the disparity of standards of living in India which has a profound effect on whether students can engage with research regardless of talent
  • In order to ensure reproducibility of research, we need multiplicity in the research landscape with a larger number of smaller research groups working on a wide array of questions, argued Professor James Evans
  • Cambridge University Press is trying to break away from the Book/Journal dichotomy, diversifying with a long-form publication called Cambridge Elements
  • SpringerNature and Elsevier are expanding their business models to encroach into data management and training (although the analogy starts to fall apart here – what this actually represents is a concentration of the market overall).

Anyway, that gives you an idea of the kinds of issues covered. The conference programme is available online and you can read the Twitter conversation from the event (#R2Rconf). Read on for more detail.

The 2019 meeting was, once again, a great programme. (I say that as a member of the Advisory Board, I admit, but it really was).

The Plan S-shaped elephant in the room

Both days began with a bang. The meeting opened with a keynote from Dr Mark Schiltz – President at Science Europe and Secretary General & Executive Head at the Luxembourg National Research Fund – talking about “Plan S and European Research”.

Schiltz explained he felt the current publishing system is a barrier to ensuring the outcomes of research are freely available, noting that hiding results is the antithesis of the essence of science. There was a ‘duty of care’ for funders to invest public funds well to support research. He suggested that there has been little progress in increasing open access to publications since 2009. In terms of the mechanisms of Plan S, he emphasised there are many compliant routes to publication and Plan S “is not about gold OA as the only publication model, it is about principles”. He also noted that there are plans to align Plan S principles with those of OA2020.

As is mentioned in the Plan S principles, Schiltz ended by arguing for the need to revise the incentivisation system in scholarly communications through mechanisms such as DORA. This is the “next big project” for funders, he said.

Catriona MacCallum from Hindawi noted DORA is the most vital component for Plan S to work and therefore we need a proper roadmap. She asked if there was a timeline for how funders will make changes to their own systems for evaluating research and grant applications, as this is an area where societies and funders should work together. Schiltz responded that this process is about making concrete changes to practice, not just policy. There is no timeline but there has been more attention on this than ever before. He noted that Dutch universities are meeting next year to redefine tenure/promotion standards, which will be interesting to follow. MacCallum observed it could take decades if there is no timeline upfront.

One of the early questions from the audience was from a publisher asking why mirror journals were not permitted under Plan S, given that they are not hybrid journals. Schiltz disagreed, saying if the journals have the same editorial board then it is effectively hybrid because readers will still need to subscribe to the other half, as they would for hybrid. Needless to say, the publisher disagreed.

The question about why Plan S architects didn’t consult with learned societies before going public was not particularly well answered. Schiltz talked about the numbers of hybrid journals being greater than pure subscription journals now and said there was concern that hybrid becomes the dominant business model. He said we need an actual transition to gold OA, which is all very well but doesn’t actually answer the question. He did note that: “We do not want learned societies to become collateral damage of Plan S”. He acknowledged that many learned societies use surpluses from their publishing businesses to fund good work. But he did ask: “Is the use of thinly spread library budget to subsidise learned societies’ philanthropic activities appropriate, and to what degree? This is not sustainable”.

So, how do researchers approach the writing process?

Professor Christine Tulley, Professor of English at the University of Findlay, Ohio, spoke about “How Faculty Write for Publication, Examining the academic lifecycle of faculty research using interview and survey data”. Tulley is involved with training researchers in writing and publishing among other roles. She has published a book called How Writing Faculty Write, Strategies for Process, Product and Productivity based on her research with top researchers who research about writing. She is also collaborating on a De Gruyter survey of researchers on writing (with whom she co-facilitated a workshop on this topic, discussed later in this blog).

Tulley’s first observation is that academics think ‘rhetorically’. Regardless of discipline, her findings in the US show that thinking about where you want to publish and the community you want to reach is more important to academics than coming up with an idea. Tulley noted that in the past, the process was that academics wrote first then decided where to publish. But this is not the case now: instead, authors consider readership in the first instance, asking themselves what is the best medium to reach that audience. This is a focus on what can be a narrow audience that an author wants to hit – it is not a matter of ‘reach the world’ but can be as few as five important people. This can limit end publication options.

She also observed that after the top two or three journals, rank matters less. Because of this, newer journals and open access publications can attract readers and submissions, particularly through early release, which is more important than ‘official publication’, she observed. This speaks to the recent increase in general interest in preprints.

In a statement that set the hearts of the librarians in the audience aflutter, Tulley spoke about librarians as “tip-off providers”, being especially useful for early online release of research before the indexing kicks in. She noted that academics view librarians as scholarly research ‘Partners’ rather than ‘Support’. We have also had this discussion within the UK library community.

Equity of access to education

It is always really interesting to hear perspectives from elsewhere – be that across the library/researcher/publisher divides, or across global ones. Two talks at the event were very interesting as they described the situation in India and Bangladesh, highlighting how some issues are shared worldwide and others are truly unique.

Prof Siva Umpathy, Director of the Indian Institute of Science Education, Bhopal, spoke first, emphasising that he was giving his personal opinion, not that of the Indian government. He noted that taxpayers pay for higher education in India and this is the case for most of the global south – fees to students are much less common. This means education is seen as a social responsibility of government.

Umpathy noted that 40% of the population in India is currently under 35 years old. Infrastructure and opportunities vary significantly within India, let alone across the whole ‘global south’. In some areas of India, the standard of living is equivalent to London. In other areas there is no internet connection. This affects who can engage with research: some very bright students from small villages are at a disadvantage. Even the kind of information that might be available to students in India about where to study and how to apply can be uneven, affecting ambitions regardless of how talented the student might be. He described the incredibly competitive process to gain a place in a university, consisting of applications, exams and interviews.

In India, when someone is paying to publish a paper it gives the impression that the work is of lower quality; after all, if you have good science you shouldn’t have to pay for publication. I should note this is not unique to India – witness an article that was published in The Times Literary Supplement the day after this talk that entirely confuses what open access monograph publishing is about (“Vain publishing – The restrictions of ‘open access’”).

Beyond impressions there are practical issues – bureaucrats don’t understand why an academic would pay for open access publication, or why they wouldn’t publish in the ‘best’ mainstream journals, and therefore funding in India does not allow for any payment for publishing. This is despite India being a big consumer of open access research. This has practical implications. If India were to join Plan S and mandate OA, it would likely reduce the number of papers he is able to publish by half, because there’s no government funding available to cover APCs.

He called for the need to train editors and peer reviewers, stressed the importance of educating governments, funders and evaluators, and suggested that peer reviewers be given APC discounts to encourage them to review more for journals. This, of course, is an issue in the Global North too. Indeed, when we ran some workshops on Peer Review late last year, they were doubly subscribed immediately.

Global reading, local publishing – Bangladesh

Dr Haseeb Irfanullah, a self-described ‘research communications enthusiast’, spoke about what Bangladesh can tell us about research communications. He began by noting how access to scientific publications has been improved by the Research 4 Life Partnership and INASP. These innovations for increasing access to research literature in the global south over the past few years have been a ‘revolution’. He also discussed how the Bangladesh Journals Online project has helped get Bangladeshi journals online, including his journal, Bangladesh Journal of Plant Taxonomy. This helps journals get journal impact factors (JIF).

However, Bangladesh journal publishing is relatively isolated, and is ‘self sustaining’. Locally sourced content fulfils the need. Because promotion, increments and recognition needs are met with the current situation (universities don’t require indexed journals for promotion), there is little incentive to change or improve the process. This seems to be an example of how a local journal culture can thrive when researchers are subject to different incentives, although perversely the downside is that they and their research are isolated from international research. A Twitter observation about the JIF was “damned if you do or damned if you don’t”.

He also noted that it is ‘very cheap to publish a journal as everyone is a volunteer’, prompting one person on Twitter to ask: “Is it just me or is this the #elephantintheroom we need to address globally?” Irfanullah has been involved in providing training for editors, workshops and dialogues on standards, mentorship to help researchers get their work published, as well as improving access to research in Bangladesh. He concluded that these challenges can be addressed; for example, through dialogue with policymakers and a national system for standards.

Big is not best when it comes to reproducibility

Professor James Evans, from the Department of Sociology at the University of Chicago (who was a guest of Researcher to Reader in 2016), spoke on why centralised “big science” communities are more likely to generate non-replicable results by describing the differences between small and large teams. His talk was a whirlwind of slides (often containing a dizzying array of graphics) at breath-taking speed.

The research Evans and his team undertake looks at large numbers of papers to determine patterns that identify replicability and whether the increase in the size of research teams and the rise of meta research has any impact. For those interested, published papers include “Centralized “big science” communities more likely generate non-replicable results” and “Large Teams Have Developed Science and Technology; Small Teams Have Disrupted It”.

Evans described some of the consequences when a single mistake is reused and appears in multiple subsequent papers, ‘contaminating’ them. He used an example of the HeLa cell* in relation to drug gene interactions. Misidentified cells resulted in ‘indirect contamination’ of the 32,755 articles based on them, plus the estimated half a million other papers which cited these cells. This can represent a huge cost where millions of dollars’ worth of research has been contaminated by a mistake.

The problem is that scientific communities use the same techniques and methods, which reduces the robustness of research. With increasingly overlapping research networks exposed to similar methodologies and prior knowledge, research claims are not being independently replicated. Claims that are highly centralised on star scientists, repeat collaborations & overlapping methods are far less robust and lead to huge distortion in the literature. The larger the team, the more likely their output will support and amplify rather than disrupt prior work. If there is an overlap, e.g. between authors or methodologies, there is more likely to be agreement.

Making the analogy of the difference between Slumdog Millionaire vs Marvel movies, Evans noted that independent, decentralised, non-overlapping claims are far more likely to be robust, replicable & of more benefit to society. It is effectively a form of triangulation. Smaller, decentralised communities are more likely to conduct independent experiments to reproduce results, producing more robust results. Small teams reach further into the past and look to more obscure and independent work. Bigger is not better – smaller teams are more productive, innovative & disruptive because they have more to gain & less to lose than larger teams.

Large overlapping teams increase agglomeration around the same topics. The research landscape is seeing a decrease in small teams, and therefore a decrease in independence. Small teams receive less funding & are ‘more risky’ because they are not part of the centralised network.

Evans described how a disruption to the scientific narrative, as opposed to building incrementally on what has happened before, is effectively what Thomas Kuhn described in The Structure of Scientific Revolutions in the 1960s. But “disruption delays impact” – there is a tendency of research teams to keep building on previous successes (which come with an existing audience) rather than risking disruption and the consequent need for new audiences etc. In addition, the size of the team matters: one of their findings has been that each additional person on a team reduces the likelihood of research being disruptive. But disruption requires different funding models, with a taste for risk.

Evans noted that you need small teams simultaneously climbing different hills to find the best solution, rather than everyone trying to climb the same hill. This analogy was picked up by Catriona MacCallum who noted that publishers are actually all on the big hill which means they are in the same boat and trying to achieve the same end goal (hence the mess we are now in). So how do publishers move across to the disruptive landscape with lots of higher hills?

*The HeLa cell is an immortal cell line used in scientific research. It is the oldest and most commonly used human cell line. It is called HeLa because it came from a woman called Henrietta Lacks.

Sci Hub – harm or good?

The second day opened with a debate about Sci-Hub on the question of “Is Sci-Hub doing more good than harm to scholarly communication?”.

The audience was asked to vote whether they ‘agreed’ or ‘disagreed’ with the statement. In this first vote 60% of the audience disagreed and 40% agreed. Note this could possibly reflect attendance at the conference of publishers as the largest cohort of 51% of the attendees, or alternatively be a reflection of the slightly problematic wording of the question. More than one person observed on Twitter that they would have appreciated a ‘don’t know’ or ‘neither good nor bad’ option.

The debate itself was held between Dr Daniel Himmelstein, Postdoctoral Fellow at the University of Pennsylvania (in the affirmative – that Sci-Hub is doing good) and Justin Spence, Partner and Co-Founder at Publisher Solutions International (in the negative – that Sci-Hub is doing harm). I have it on good authority the debate will be written up separately, so I won’t do so here. One observation I noted was that the question did not define to whom or what the ‘harm’ was being done. The argument against appeared focused on harm to the market but the argument for was discussing benefit to society.

The discussion was opened up to the room but the comment that elicited a clap from the audience was from Jennifer Smith at St George’s University in London who asked if Elsevier’s profits are defensible when there are people on fun runs raising money for charities who are not anticipating their fundraising cash is going to publisher shareholders rather than supporting research. The question she asked is: “who is stealing from whom?”.

At the end of the debate the audience was asked to vote again, at which point 55% disagreed and 45% agreed, meaning Himmelstein won over 5% of the audience. This seems surprising given that it seems very rare to actually change anyone’s mind.

But is it a book or a journal?

Nisha Doshi spoke about Cambridge Elements – a publication format that straddles the Book and Journal formats. It was interesting to hear about some of the challenges Cambridge University Press has faced. The practical ones included which systems to use for production, since these seem to be very clearly delineated as either journal systems or book systems. CUP is using several book systems, plus ISBNs, but also using ScholarOne for peer review for this project. Other issues have been philosophical. Authors and many others continue to ask “is it a journal or a book?”. CUP have encouraged authors to embed audio and video in their Cambridge Elements, but are not seeing much take-up so far, which is interesting given the success of Open Book Publishers.

Doshi listed the lessons CUP has learned through the process of trying to get this new publication form off the ground. It was interesting to see how far Cambridge Elements has come. In October 2017 as part of our Open Access Week events, the OSC hosted CUP to talk about what was described at this point as their “hybrid books and journals initiative”.

What’s the time Mr Wolf?

In 2016, Sally Rumsey and I spoke to the library communities at our institutions (Oxford and Cambridge, respectively) with a presentation: “Watch out, it’s behind you: publishers’ tactics and the challenge they pose for librarians”. Our warnings have increasingly been supported with publisher activity in the sector over the past three years. Two presentations at Researcher to Reader were along these lines.

In the first instance, Springer Nature presented on their Data Support Services which are a commercial offering in direct competition to the services offered by Scholarly Communication departments in libraries. I should note here that Elsevier also charge for a similar service through their Mendeley Data platform for institutions.

Representing an even further encroachment, the second presentation by Jean Shipman from Elsevier was about a new initiative which is training librarians to train researchers about data management. The new Elsevier Research Data Management Librarian Academy (RDMLA) has an emphasis on peer to peer teaching. Elsevier developed a needs assessment for RDM training, assessed library competencies, and library education curriculum before developing the RDMLA curriculum for RDM training. Example units include research data culture, marketing the program to administrators, and an overview of tools, such as those for coding. Elsevier moving into the training/teaching space is not new; they have had the ‘Elsevier Publishing Campus’ and ‘Researcher Academy’ for some time. But those are aimed at the research community. This new initiative is formally stepping directly into the library space.

Empathy mapping as a workshop structure

One of the features of Researcher to Reader is the workshops, which are run in several sessions over the two-day period. In all there is not much more time available than a traditional 2.5 – 3 hour workshop prior to the main event, but this format means there is more reflection time between sessions and does focus the thinking when you are all together.

I attended a workshop on “Supporting Early-Career Scholarship” asking: How can librarians, technologists and publishers better support early career scholars as they write and publish their work?

Ably facilitated by Bec Evans, Founder at Prolifiko with Dee Watchorn, Product Engagement Manager at De Gruyter and Christine Tulley, the workshop used a process called Empathy Mapping. Participants were given handouts with comments made by early career researchers during interviews about the writing process as part of a research programme by Prolifiko. This helped us map out the experience of ECRs from their perspective rather than guessing and imposing our own biases.

We were asked to come up with a problem – for my group it was “How can we help an ECR disseminate their first paper beyond the publication process?” And we were then asked to find a solution. Our group identified that these people need to understand the narrative of their work that they can then take through blogs, presentations, Twitter and other outlets. Our proposal was to create an online programme that only allowed 5 minutes for recording (in the way Screencastify only allows 10 minutes) an understandable explanation of their research that they can then upload for commentary by peers in a safe space before going public.

And so, to end

It is helpful to have different players together in a room. This is really the only way we can start to understand one another. As an indicator of where we are at, we cannot even agree on a common language for what we do – in a Twitter discussion about how SciHub is meeting an ‘ease of access’ need that has not been met by publishers or libraries, it became clear that while in the library space we talk about the scholarly publishing *ecosystem*, publishers consider libraries to be part of the scholarly publishing *industry*.

One tweet from a publisher was: “Good to hear Christine Tulley talk about why academics write and what it is important to them at #R2RConf . We don’t want to, but publishers too often think generically about authors as they do about content”. While slightly confronting (authors are not only their clients, but also provide the content for *free*, so should perhaps be treated with some respect), it does underline why it is so essential that we get researchers, librarians and publishers into the same room to understand one another better.

All the more reason to attend Researcher to Reader 2020!

Published 4 March 2019
Written by Dr Danny Kingsley
Creative Commons License

Plan S – links, commentary and news items

The discussions around Plan S are voluminous. On 8 February 2019, the opportunity to provide feedback on Plan S closed.

We were attempting to maintain a list of commentary and news stories on Plan S at the end of one of our blogs: Most Plan S principles are not contentious. This grew so large that we moved the list into this dedicated blog.

As of 01 April, new links have not been added due to resourcing issues – however, let us know at info@osc.cam.ac.uk if we have missed anything from the period 10 February – 01 April that should be added.

Please note that there is a list on the Open Access Tracking Project using the tag “oa.plan_s”  which is crowd sourced and updated in real time, so is more comprehensive than this effort. There is also a comprehensive Reddit list curated by Jon Tennant available. A smaller list (but with different links) is also available.

Relevant documents from Science Europe

Commentary, news stories & press releases

These are presented here in reverse order of publication (most recent first).

Commentary in 2018

Published  10 February 2019
Compiled by Dr Danny Kingsley
Creative Commons License

2018 That Was The Year That Was

In what has now become a tradition, we are sending out our annual summary of the activities of the Office of Scholarly Communication. Our first year, in 2015, the summary was a stock take of where we were at. By the following year, 2016, we were implementing a strategy. What followed in 2017 was a year of numbers.  Last year was really a year of consolidation allowing us now for the first time in four years to take a step back and breathe.

REF, what REF?

It is impossible to be in this space in the UK and not be highly focused on the Research Excellence Framework. While our team has been working towards the REF for four years, suddenly during 2018 the community really took notice. This is fantastic from an advocacy perspective but it has posed some other issues.

We are facing a tsunami, where deposits to the repository have more than doubled (and in some months quadrupled) what we were receiving a couple of years ago, with an almost stable staff number during that time. It is now not uncommon for us to receive well over 1000 deposits in a given month. This has meant that for the first half of 2018 the ever-present backlog of articles to be uploaded to the repository rose to 4,000, of which 60% were potentially claimable for the REF. We also were holding over 1000 records that needed to be updated with publication details.

To address this we have worked hard to streamline our processing. In addition to stretching our helpdesk system’s ability to classify and sort to the limit, we have developed two systems to almost automate our deposit process.

The first is Orpheus, a solution for automating the lookup of journal policies, which was released as an Open Source resource in January. The second is a web application we are calling ‘Fast-Track’, which reduces the processing time of deposits dramatically by presenting the decisions that need to be made on each Open Access deposit in a simple interface. We launched the application internally (for the Open Access Service) late last year. Average processing time has dropped from 20 minutes per record to one to two minutes. As a result, since September we have processed more than 3000 items from the backlog, reducing the number pending processing to 600 at the time of this blog going live, and falling fast.

Question time

We plan to engage the wider library and administrative community with Fast-Track in early 2019, with the double benefit of exposing others to at least one aspect of open access in a simple and accessible manner, and allowing the staff from the Open Access Service to focus on supporting researchers with REF and Open Access related queries. These queries are relentless by the way, with 400 to 600 per month on the Open Access queue alone. Even with the automation and system management of repeat queries, we spend close to 100 person hours a week managing the Open Access helpdesk alone. A project for 2019 will be an analysis of the queries to see if there is a solution to reducing the number of queries before they come to us.

However, in reflection of the advice that came from our repository community late last year on ‘What would you have liked to know when you started in scholarly communication?’ we also hold tight the philosophy that “It’s not all about the REF”.

Wide readership

The downloads from our repository, Apollo, point to the whole reason why we are trying to make Cambridge research available in the first place.  According to Jisc’s IRUS-UK service, our readers come from all over the world, with the US the biggest user. We experienced over 2.2 million downloads of material from the repository in 2018, of which close to 1 million (927,114) were of articles and nearly 40,000 were of datasets.

We are still feeling something of the ‘Hawking effect’ – in 2018, Professor Stephen Hawking’s PhD thesis received 424,141 downloads, representing more than half of our total theses downloads for 2018 of 740,441. Of those theses our most downloaded in order were:

The OSC’s Request a Copy service continues to provide access to embargoed content in Apollo. In 2018, the service managed more than 4600 requests, with nearly twice as many requests for theses as for journal articles. All of these have been processed by hand, which takes approximately 27 person hours per week. In 2019 we are planning to enhance our systems so that we can continue to provide access to Cambridge’s research outputs in a less manual manner. Find out more about the requests we receive in our blog post, What do you want, and why do you want it? An update on Request a Copy.

Policy changes

In a space that is already well known as fast moving, 2018 broke all records in relation to the pace of policy change.

On 1 April 2018, UK Research and Innovation (UKRI – not to be shortened to the phonetic acronym ‘you cry’) came into existence, subsuming RCUK and transforming part of HEFCE into Research England. UKRI is undertaking a review of their Open Access policy.

In 2018 Cambridge was an invited contributor to the Wellcome Trust Open Access Policy Review Consultation and Expert Evidence Gathering Session, which resulted in a new Open Access policy for Wellcome Trust released in November.

The issue of allowing articles uploaded to arXiv and other subject repositories to be compliant with REF rules remained a problem at the beginning of 2018. After several years of consultation with arXiv and across the sector about technological solutions, the paper ‘arXiv and the REF open access policy’ was published in April. In July, there was a change to the UKRI policy on the acceptance of works deposited to preprint servers and subject repositories. This is good news for a significant section of our research community but does require careful handling of workflows.

In August, Cambridge brought in new funding guidelines which support publishers showing progression towards open access. This is equally addressing Cambridge’s positive moves to an open future and the need to sensibly manage the UKRI block grant fund allocation.

Obviously, Plan S, announced in September, remains a hot topic. We have published a few blogs on the topic (see below) and continue to hold a watching brief on it.

Thesis news

This past year has seen a huge amount of work by the OSC to implement and consolidate the policy and processes around the agreement by the Board of Graduate Studies on the new access levels to theses, cementing in the policy for deposit of digital theses across the University, and agreement from UKRI that this is compliant with their Training Grant policies.

Our advocacy continues on the digitisation and release of our vast store of physical-only theses, which is only possible with the permission of the author. To that end, in 2018 we released a series of short films, My thesis, open access and me, to demonstrate the benefit of open access to those considering it. We also released a brilliant comic strip drawn by one of our librarian colleagues and Data Champions, Clare Trowell – it is available online. These activities were dubbed (in possibly the best pun of 2018) “The theses formerly known as prints” project.

We have continued our digitisation of alumni theses, through the support of the Arcadia Foundation and have now digitised 200 of these theses and made them open access. This includes the work we have been doing with the Scott Polar Research Institute to collect and digitise their full corpus of theses. We also finally went live with an online eSales system for the 1,400 theses we had digitised from the British Library’s expansive microfilm archive.

All this activity has resulted in a huge increase in the number of theses in the repository. When the Office of Scholarly Communication first began in 2015 there were 700 theses in the repository. We have now exceeded 6,000.

Data news

The biggest news for the Research Data Management Facility in 2018 was the consolidation of the staff onto ongoing permanent contracts. After a process that has lasted several years we are delighted (and relieved) that Dr Lauren Cadwallader is in post as the RDMF Manager, and Dr Sacha Jones has joined us as the RDMF Coordinator.

We extend a huge thanks to Clair Castle who joined us for much of 2018 to keep the RDMF rolling while the staffing was being resolved, in particular for her work with the Data Champions. During 2018 we ran a successful second call for applications to the Data Champions programme, resulting in a cohort of 50 people across the University participating. We also worked on a series of postcards and cartoons (again with our talented colleague Clare Trowell) to promote the Data Champion programme. There is a full description of these resources in the Cartooning the Data Champions blog post.

The numbers around our datasets continue to be impressive. According to Institutional Repositories Usage Statistics UK (IRUS UK), Apollo contains ~30% of all datasets (across 140 repositories) in the UK. This amounts to more than 1,500 research datasets, of which more than 70% are linked to RCUK funding.

Systems

We continued to integrate our Apollo repository with other systems to allow further automation of processes. The repository is now minting DOIs, displaying Altmetric information when available, linking to ORCID and our metadata is harvested into CORE and listed on IRUS-UK.

While these linkages have improved our offerings, they do mean we cannot upgrade the repository until we upgrade both Elements and Repository Tools, the repository-Elements connector.

Upgrades are also constrained by REF timeframes given the need for stability of our systems in the run up to REF2021, so we need to make a decision early in 2019 about whether we push further ahead or put everything on hold until our REF return is secured.

Training

Training continues to be a big focus for the OSC, with a strong move towards online training in 2018. This is explained in some detail in this related blog. We are also invested in a group that is looking at the question of competencies and associated training in scholarly communication, which was loosely titled the Scholarly Communication Professional Development Group.

Now newly named the “Scholarly Communication Competencies Coalition” (SC3), we will be launching an online presence early in 2019. Some of the activities of the group in 2018 included developing an online resource to try and showcase what this area is like as a place of work, see In their own words: working in scholarly communication. We also investigated the skills required to work in scholarly communication.

Outreach activities

The OSC puts a great deal of effort into sharing our work with our library, University and wider communities. Across the 57 events we ran this year for researchers, librarians and the wider Cambridge community, we welcomed over 800 attendees, plus more who watched recordings of events and webinars.

Open Access Week, as always, was a very big week for everyone in the team, during which four very engaging speakers joined us for a lively event, Is Open Research really changing the world?, to question if research outputs really are available to everyone when they are made open access.

We continue to improve and update our online resources, relaunching the open access website as a one-stop-shop for our research community to help demystify the process of meeting funder OA requirements and making a manuscript REF eligible. We also released a compilation of the best copyright resources on the web, featuring everything from training session slides to videos. As a bit of fun we published our second annual Advent calendar in December.

We have also considered the effort we put into our outreach activities, which has meant we are going to approach the decisions about which events we livestream and film differently.

And you, our readers

We continued to blog enthusiastically through our Unlocking Research and Open Research: Adventures from the frontline blogs, managing 35 blog posts this year. There was a glitch with the analytics for the first four months of 2018, with the system rebooting on 7 April this year. Even with a third of the year missing, this blog enjoyed 25,000 visits over the past nine months.

Looking at where visitors to the blog have originated, it is interesting to note that the number of ‘organic search’ readers remains consistent throughout the year, whereas the direct links are clearly affected by our own promotions or through discussions elsewhere.

Our most popular blog, with 1741 visits since 7 April, was “What does a researcher do all day?”  This is a perennial favourite – published on 1 February 2016, it was also the second most popular blog last year.

In order, our other popular blogs with over 1000 visits each were

Projects and plans

There have also been other interesting side projects we have undertaken in our ‘spare time’. We started a research project to understand what we contribute to the scholarly literature, what we pay and what we get out of it, to assist decision making about subscriptions and other expenditure across the University. We hope to write up and release some findings from this project soon. We have also been conducting a Text and Data Mining Test Kitchen Project to help define what a TDM service might look like within the library, and work will continue in 2019.

As always this remains an interesting and dynamic area to work in and we are looking forward to another exciting year!

Published  25 January 2019
Written by Dr Danny Kingsley
Creative Commons License

Orpheus, an Open Source solution for journal policies

As anyone who administers an institutional repository can tell you, repeatedly looking up journals’ policies and attributes is a pain in the neck. We have discussed this problem a few times, noting in 2017 the complex embargo situation and the confusion about publication dates. Indeed it has been clear since 2013 that this is so complicated it is unrealistic to expect researchers to navigate this situation. This means considerable amounts of repository staff time are typically spent traversing a confusing landscape of complex, inconsistent and fluid policies.

To stop or at least mitigate this pain, wouldn’t it be great if those policies and attributes were available in a structured, machine-readable format, so that the burden of retrieving and using such information could be transferred from people to repository software?

(Granted, an even better solution would be for publishers to have simpler and standardised policies across their journals, but there is little indication that this will happen any time soon – see links above.)

Our solution – Orpheus

JISC are currently working on and will shortly release version 2 of their SHERPA services, which have enormous potential for providing machine-readable data on embargo periods and at least some of the other attributes we need. However, circa two years ago we decided that, in the face of increasing demand for our services, we could not afford to wait to automate our workflows. Besides, we reasoned that any external solution would be unlikely to cover all the journal attributes we rely on beyond embargo periods, or to be updated at the frequency we require.

So, in the last trimester of 2017, I set out to develop a database that could store in a strictly structured and machine-readable format all bits of information from journals, publishers and conferences that we repeatedly look up. This would replace the time the team behind Cambridge’s DSpace repository Apollo was spending retrieving and manually applying those data to each deposited item.

Orpheus (named after the son of Apollo in Greek mythology) was thus born in January 2018. To mark its first birthday, we have just turned Orpheus into an Open Source project and released the code at https://github.com/osc-cam/orpheus.

In this blog post, I will provide an overview of Orpheus’ main features and of how we have been using it to increase the efficiency of our repository and services.

Supported attributes and available interfaces

On the web interface for editors and users, attributes are listed, for each journal, publisher and conference, in a detailed view that looks like this:

Orpheus currently supports the following attributes of journals/publishers:

  • name, synonyms, URL and, for journals, ISSNs and publisher
  • revenue model (subscription, hybrid, fully Open Access)
  • gold OA policy (article processing charges, licence choices, etc)
  • green OA policy (allowed versions and outlets, embargo period, licence, etc)
  • Europe PMC participation (whether or not the journal deposits papers in EPMC)
  • deals/discounts (whether the journal is included in an institutional deal such as Springer Compact or offers any discounts)
  • contacts (e-mail addresses for queries)

Orpheus’ RESTful API exposes journal attributes in JSON format and its response can be tailored to facilitate integration with repository platforms and other systems. For instance, the screenshot below shows only the attributes that we feed into Apollo and/or our helpdesk system (on the left below).

  

Like every project written in Django, Orpheus includes an additional web interface for administrators to manage users and permissions, and to perform bulk operations such as updating or deleting multiple entries at once. It looks pretty too (as seen on the right above).
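As a rough illustration of what tailoring a JSON response for one consumer might involve, the Python sketch below trims a journal record down to the attributes a repository could need. The payload, the field names and the choice of fields are all assumptions made for this example; they are not Orpheus’ actual schema or API.

```python
import json

# Hypothetical payload. The field names and values are illustrative only
# and do not reproduce Orpheus' real JSON schema.
SAMPLE_RESPONSE = """
{
  "name": "Journal of Examples",
  "issn": "1234-5678",
  "eissn": "8765-4321",
  "publisher": "Example Press",
  "revenue_model": "hybrid",
  "green_policy": {"version": "accepted manuscript", "embargo_months": 12},
  "gold_policy": {"apc_gbp": 1500, "licences": ["CC BY"]},
  "epmc_participation": false,
  "contacts": ["oa@example.org"]
}
"""

# Assumed selection of attributes that a repository integration might use.
REPOSITORY_FIELDS = ("name", "issn", "eissn", "green_policy")


def tailor(record, fields=REPOSITORY_FIELDS):
    """Return only the requested top-level attributes that are present."""
    return {key: record[key] for key in fields if key in record}


record = json.loads(SAMPLE_RESPONSE)
tailored = tailor(record)
print(json.dumps(tailored, indent=2))
```

A helpdesk integration could pass a different tuple of fields to the same function, which is one simple way of serving several systems from a single record.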

Current coverage

Orpheus includes parsers that allowed substantial datasets of journals and their attributes to be imported into the system, saving the Cambridge team the effort of populating the database from scratch. Data was imported from:

Orpheus currently has almost 40,000 journal entries belonging to more than 8,000 publishers (“preferred names”; the larger number of “total entries” includes synonyms).

While we may derive some satisfaction in achieving comprehensive coverage and including journals such as هیدروژیولوژی and Демографија, what really matters to us in terms of maximising the efficiency of our services is databasing those journals and conferences that Cambridge academics most often publish in.

A quick analysis of journal names contained in all Apollo submissions received since 2014 (29,598 submissions) reveals that we are now able to match 83% of those to a record in Orpheus and retrieve embargo periods, APC values and licensing information for, respectively, 72, 48 and 37% of past submissions. These results are encouraging, especially considering that (1) ‘journal name’ in this dataset of past submissions includes conference names and strings that do not correspond to true journals, such as 13 entries for ‘TBC’ (to be completed); and (2) for new submissions, our system tries to find matches by ISSN and eISSN before attempting matches by name, so we have a better chance of matching “Hepatology (Baltimore, Md.)” to the right journal than this analysis would suggest.
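The identifier-first matching order mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code Orpheus or Apollo runs: the record layout, the placeholder ISSNs and the name-normalisation rules (lowercasing, dropping parenthetical qualifiers and punctuation) are all hypothetical.

```python
import re

# Hypothetical records. The ISSNs are placeholders, not real identifiers.
JOURNALS = [
    {"name": "Hepatology", "issn": "0000-0001", "eissn": "0000-0002"},
    {"name": "Journal of Examples", "issn": "0000-0003", "eissn": None},
]


def normalise_issn(value):
    """Keep digits and X only; return None unless eight characters remain."""
    if not value:
        return None
    cleaned = re.sub(r"[^0-9X]", "", value.upper())
    return cleaned if len(cleaned) == 8 else None


def normalise_name(value):
    """Lowercase, drop parenthetical qualifiers and punctuation, tidy spaces."""
    text = re.sub(r"\([^)]*\)", " ", value.lower())
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def find_journal(journals, issn=None, eissn=None, name=None):
    """Try ISSN/eISSN first, then fall back to a normalised name match."""
    wanted = {normalise_issn(issn), normalise_issn(eissn)} - {None}
    if wanted:
        for journal in journals:
            known = {normalise_issn(journal.get("issn")),
                     normalise_issn(journal.get("eissn"))} - {None}
            if wanted & known:
                return journal
    target = normalise_name(name) if name else ""
    if target:
        for journal in journals:
            if normalise_name(journal["name"]) == target:
                return journal
    return None


match = find_journal(JOURNALS, name="Hepatology (Baltimore, Md.)")
print(match["name"] if match else "no match")
```

Dropping parenthetical qualifiers is a deliberately blunt rule chosen for the example. It helps with strings such as “Hepatology (Baltimore, Md.)” but could merge distinct journals that share a base title, which is one reason matching on identifiers first is the safer route.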

Integrations with Apollo and Zendesk

Without digging into the technical details of the integration of Orpheus with Apollo (to be honest, I would not be able to go into detail here, for the integration with Apollo was fully implemented by my colleague Agustina Martinez-Garcia), it suffices to say that Apollo has been querying Orpheus and successfully applying embargoes to many of the c. 900 submissions we receive per month (we received, on average, 892 monthly submissions in 2018).
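To give a flavour of what applying an embargo involves once the embargo period is known, the Python sketch below derives an embargo end date from a start date and a number of months. It is an assumption-laden illustration only: the post does not describe how Apollo performs this step, and counting the embargo from the publication date, like the function names used here, is a choice made for the example.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day is clamped to the length of the target month, so 31 August
    plus six months gives the last day of February.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def embargo_end(publication_date, embargo_months):
    """Return the first day of open availability, or None if unknown."""
    if embargo_months is None:
        return None
    return add_months(publication_date, embargo_months)


def is_under_embargo(publication_date, embargo_months, today):
    """True while `today` falls before the computed embargo end date."""
    end = embargo_end(publication_date, embargo_months)
    return end is not None and today < end


print(embargo_end(date(2018, 1, 15), 12))
print(is_under_embargo(date(2018, 1, 15), 12, date(2018, 12, 1)))
```

Returning None when no embargo period is available keeps the unknown case distinct from a zero-month embargo, so records without policy data can still be routed to a person for a manual decision.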

Orpheus has also been integrated with our helpdesk system (powered by Zendesk) via “Orpheus Lookup”, a small Open Source application available here. This enables relevant information about journals to be embedded in our helpdesk interface (see right hand side pane of screenshot below), facilitating the job of advising researchers on how to comply with their funders’ Open Access policies. The app also allows us to populate the relevant helpdesk ticket fields (see left hand side pane of screenshot) with one click. Information in these fields may then be processed by a Zendesk macro (also Open Source), to produce tailored auto-reply messages that can be further customised by the staff member.

In summary, our experience indicates that the benefits of integration of an institutional repository with an auxiliary database providing machine-readable representations of frequently required attributes of journals, conferences and publishers outweigh the costs of development and maintenance of the system. Other institutions or consortia interested in automating the processes of looking up and applying those attributes to repository records may benefit from hosting an instance of Orpheus.

If you are interested in more detail about the Orpheus integration, please email us at info@repository.cam.ac.uk and we will be happy to help.

Published  22 January 2019
Written by Dr Andre Sartori
Creative Commons License

Cartooning the Data Champions

Clair Castle, Librarian at the Department of Chemistry, describes how during her secondment to the Office of Scholarly Communication (OSC) as Research Data Coordinator, she collaborated with Clare Trowell, Data Champion and Marshall Librarian at the Faculty of Economics, to design some cartoons to use to advocate for the Data Champions Programme.

I have been collaborating with the OSC on various RDM (Research Data Management) activities since it was established in 2015. I was fortunate enough to be appointed on secondment to the OSC from May to October 2018, as Research Data Coordinator. One of my main responsibilities was to manage the Data Champions Programme (with which I was already involved in my department).

Data Champions are volunteers who advise members of the research community on proper handling of research data. In this, they promote good research data management (RDM) and support Findable, Accessible, Interoperable, and Re-usable (FAIR) research principles.

Data Champions form a network across different schools and departments of the University of Cambridge as well as affiliated institutes. The Data Champion Programme is open to all University members interested in research data handling, for example researchers (from PhD students to PIs), data managers, IT professionals, librarians, and data scientists.

Demonstrating the value of RDM

The Data Champions have bimonthly Forum meetings where they have the opportunity to hear speakers on RDM related topics, speak about their own RDM activities, and network. At the May 2018 Forum meeting Dr Danny Kingsley (Head of the OSC) led a stakeholder analysis exercise to try and work out: a) why RDM is of value to different stakeholders, b) their possible objections to RDM, and c) what responses a Data Champion could formulate to these objections. The idea was that if a Data Champion was stuck in a lift with one of these stakeholders, or sat next to one at a college dinner or a meeting, and that person raised an objection to RDM in conversation, the objection could be rebutted with a suitable response prepared in advance.

Stakeholders included were:

  • PhD students
  • PostDocs
  • Early Career Researchers
  • Principal Investigators
  • Undergraduate students
  • Masters students
  • University administration (e.g. research grant administrators, librarians)
  • University committee structure
  • Vice Chancellor
  • Funders
  • Members of the public.

We were divided into groups, each of which represented a particular stakeholder, and wrote down our thoughts on (a)-(c) as above on post-it notes. Unfortunately we ran out of people to write anything about the members of the public as stakeholders.

I collated what was written on the post-it notes into a table and this was discussed at the following Data Champions Forum meeting in July. Ideas were invited from everybody about how we should present this information for best use as a practical resource for RDM advocacy.

 

One idea from Dr Lauren Cadwallader (Research Data Facility Manager) was a cartoon design for use on small postcards or on posters and she asked if anyone could draw. It was at this point that Clare Trowell stuck her hand up – as she is also an artist!

Drawing up a plan

One of the main ideas behind the cartoons was that the Research Data Team wanted to create an ‘advocacy’ resource in the Data Champions’ Google Drive. Data Champions could then use them in posters, training sessions etc. that they would design themselves. The first use for the cartoons would be on postcards to promote the Data Champions Programme and the RDM services that the Research Data team offer.

I arranged to meet Clare a couple of times for a cup of tea and a chat about what would be required, and to catch up on progress, and we established the following:

  • Timescale – Clare wanted to complete the project by the end of the Summer Vacation due to the term-time commitments there would be at Economics in the Michaelmas Term.
  • Licensing – We agreed on the Creative Commons licence CC-BY-NC-ND (which only allows others to download your works and share them with others as long as they credit you, but they can’t change them in any way or use them commercially). Clare wanted to retain her copyright in her cartoons so she could use them for promotion on her personal website. She also wanted to prevent others from profiting from them as she did this work pro bono for the OSC. She was also concerned that without “No Derivatives” it might be possible to make disrespectful adaptations. She is not concerned about profiting from the designs herself.
  • Costs – postcards would be free to print by the University Library, where the OSC is based. Clare volunteered her services for free but we did remunerate her for the materials she used. I would be designing the postcard template as part of my usual role.
  • Workload – Clare felt that around 8 scenarios would be manageable for her to draw in the time available. I asked her to draw one more that could be specifically used to encourage people to become a Data Champion.
  • Cartoon content – we debated whether we should have 3 or 4 ‘boxes’ in a strip. I would provide text statements for Clare to illustrate. We agreed to use speech bubbles to contain the text, as is traditional with cartoon characters when speaking.
  • Stakeholders – which should we focus on? We describe the cartoon characters we finally decided on below. We needed some of the 8 postcards to be appropriate for STEM or HASS disciplines, or both. They should therefore feature a variety of characters that could be used in different situations.

The next step was for me to identify themes from the objections to RDM and the responses to them in the stakeholder analysis exercise and to translate them into scenarios for the cartoons:

My summary of themes looked like this:

  • Fear of being scooped
  • Unable to share
  • Unwilling to share
  • Time and effort
  • Cost
  • Waste of time

Meet the cast

Literally as we were talking, Clare started drawing and we eventually came up with a range of characters that we took from the stakeholder analysis exercise results. We grouped post-docs and early career researchers together, and the PhD and Masters students together, in order to rationalise the numbers involved. We left out undergraduates and funders, as they aren’t a priority for advocacy at the moment. (Please note this image is licensed CC-BY-NC-ND, attribution Clare Trowell).

Clare also invented ‘Corporate Man’ (very popular with the Data Champions and the Research Data Team!) and two Data Champion characters. Clare tried very hard to be as diverse as possible, in order to represent the Data Champions inclusively. Her inspiration for the characters has tended to come from real-life people she has encountered.

Here are some of the final scenarios I devised for Clare to illustrate. I found it was easier to include just three boxes in a strip – represented above by the number of columns. I had minimal space for text so I needed to be quite concise, as well as having to imagine scenarios that would be immediately understood. This was challenging but really enjoyable. I also received some useful feedback from Danny and Lauren at this stage.

Postcard design

The cartoons were scanned (using a high quality flatbed scanner at Economics) from the hand-drawn originals to create digital images in PDF and TIFF format. These files were too large to send to me by email so Clare made a few trips to the OSC with a memory stick!

I started off designing the postcards in Canva but this has quite a limited editing capacity (especially for cropping and resizing the images) so I moved on to using Inkscape. In contrast to Canva, this is free, open source graphic design software, which other members of the OSC had used previously. It has the advantage that anyone will be able to use this to amend the designs in future. I was given lots of advice and help but I really ended up learning as I went along due to the limited time available – a steep learning curve! Inkscape’s main output is in SVG format but images can be converted to PDF.

The nice thing about hand-drawn cartoons is that they don’t have completely straight lines, but this made it a bit difficult to orientate the drawings on the postcards. I did the best I could but I quite like the ‘hand-made’ feel of the final designs.

For the content on the reverse of the postcards I updated a version of the current Research Data postcard that the team were giving out at training sessions and other events. This provides links to sources of help and guidance on sharing research data, and to the Research Data Management website and social media accounts. It would now include a link to the Data Champions programme.

Feedback

The September Data Champions Forum meeting included a general discussion on the possible branding of the Data Champions programme. As part of this, Clare introduced her cast of characters and I shared a compilation of all the scenarios in a ‘comic strip’.

I also printed off some prototype postcards so that everyone could see what they could look like. The feedback was positive and just a few final tweaks were suggested: creating more space on the reverse for people to write a message and an address, so that the postcard can actually be posted, and adding the headline ‘Ask a Data Champion’.

Cartoons as an advocacy tool

The final designs were just about ready in time for the beginning of the new academic year, when we knew Data Champions would be inducting new students and staff and doing RDM training in their institutions. I uploaded the designs to the Data Champions Google Drive, and numbered them from 1 to 9. Data Champions could then choose which they would like printed copies of and request the designs and amounts required via an online form. We sent them out in the internal post.

The initial print run was 100 of each design, most of which were sent out to Data Champions upon request. Some requests were for a small number of each design; others were for larger numbers of just a few designs. We needed to make a further print run of 50 each of two designs, “Check out this course on research data management” and “Data Champion Wanted”, as they proved to be particularly popular for use at induction and training events.

The Research Data team now distributes the postcards at all RDM training sessions and, if there is a choice, they are apparently more popular than the usual, more formal research data ones, perhaps because of their more informal nature. I think colourful illustrations of people do tend to stand out more.

At forum meetings we discussed the possibility of using the cartoons in the following contexts:

  • Producing short videos that could include role-play.
  • Interactive feature on a website (e.g. objections to RDM as a word cloud/speech bubbles, hover over an objection to RDM to see a rebuttal for it)
  • Memes on social media.
  • Insert postcard in the welcome packs for students or as a flyer, and on PowerPoint slides for use in foyers/on TV screens.
  • Using the #askadatachampion Twitter hashtag alongside cartoons.
  • Pokémon-like game – collect all the different cards!
  • Animation with cartoons, potentially for use on the OSC YouTube channel. See Powtoon and Adobe Character Animator, which creates moving images from 2D drawings, for ideas.

Outcomes

Cartooning in the world of libraries and publishing is increasing; one example is the cartoon abstract of the paper on the Research Support Ambassador programme at Cambridge University, written by Claire Sewell and Danny Kingsley. As well as drawing the cartoons for the Data Champion postcards, Clare has drawn one for use by the OSC to promote the digitisation of theses at the University. Cartoons and drawings offer an interesting alternative to the traditional, perhaps more formal ways of communicating.

This project has proved to be an innovative and fun way for the Research Data team/the OSC to collaborate with its stakeholders, and to promote the Data Champions programme and theses digitisation. One significant outcome has been the role of the cartoons in the wider discussion of branding by the OSC that followed, and which is ongoing.

There were challenging issues around the technical side of designing the cartoons but these can be improved upon in future. The Data Champions will soon have an impressive set of designs they can use to promote their RDM activities.

I thank Lauren for steering me through the process, and both her and my other OSC colleagues for sharing their Inkscape skills. I also thank Clare for being such a good collaborator and allowing us to use her talents to create these eye-catching postcards.

NOTE: All the cartoons are available on the RDM website.

Published 10 January 2019
Written by Claire Castle, with contribution from Clare Trowell
Creative Commons License

Moving online: training librarians in 2018

As we move into 2019 it is a good time to look back at another year spent training the library community, both in Cambridge and more widely. Over the last 12 months, the Office of Scholarly Communication has held nearly 50 training sessions for Cambridge staff on topics ranging from navigating copyright issues to the mechanics of the publishing process.  

Face to face

We have continued to deliver high-quality face-to-face training sessions on many topics. Sometimes sessions just work better when participants are all together in a room, especially if there are a lot of activities. For example, our sessions looking at Research Data Management and Data Management Plans are designed to be interactive and so wouldn’t really work in any other format. Feedback from sessions tells us that participants really value the chance to meet other librarians and hear their perspectives on things.

Cambridge has more than 100 libraries including faculties, departments, colleges and connecting institutions. Many staff do not get to meet each other unless working on a specific project, and even when working in the same university it can be hard to avoid becoming too focused on local issues. Attending workshops and other training sessions allows conversations to happen and several people have told us that they really value the chance to connect with their colleagues.

Webinars to the rescue

Of course, librarians are very busy people so sometimes it’s just not possible for them to attend sessions in person. Working in small teams often means that staff are unable to leave the library to go to training, especially when travel time and family commitments are factored into the equation.

To help with this we introduced webinars as a delivery method in 2017. This means that staff can either attend training sessions remotely or catch up with a recording.  Because of the success of this project we have continued to deliver sessions via webinar in 2018 and feedback from attendees tells us we are doing something right! Several people have commented that they have attended sessions online which they would otherwise not have been able to make but others have had some suggestions for improvement.

It can be hard to carve time out of a busy schedule to attend even an hour-long webinar, so there needs to be some incentive, like an activity, so people get the benefit of attending live. We have taken this on board and tried to build in interactive elements where appropriate. The main lesson we have learnt about webinars is that they are particularly useful for information delivery sessions which would usually involve someone standing at the front of the class delivering a talk. People can easily listen to this at their desk and/or ask questions through the webinar chat box without having to leave work.

Most of these webinars are shared with a Cambridge audience only but a few have been released more widely such as our talk on How to Spot a Predatory Publisher. As discussed in our previous post on advertising videos we have discovered that naming our content something that people are likely to Google is a great way to increase hits! 

Increasing discoverability

As we offer more and more webinars we are starting to think about the best way to collate and share these. Although they can be useful resources, people need to know where to find them without having to hunt around. One of our priorities for 2019 is to gather both our webinars and online resources together to create a mini-hub where library staff can go to find more information.

These resources include webinar recordings but also the results of two other training projects from 2018: our Research in 3 Minutes videos and our Scholarly Communication Information Booklets. Research in 3 Minutes is a series of short videos which outline basic concepts in scholarly communication. Most of these areas can be quite complicated and terminology-laden and these videos aim to provide an accessible introduction. They can also be uploaded for display on screens around the library or on other webpages to engage users. We started to create Information Booklets when we realised that all librarians love a handout (at least in our experience!).

These four-page booklets can be viewed online or printed out and offer a more in-depth look at areas we are often asked about, for example what exactly is a Creative Commons license? There are six booklets in the series so far, covering everything from the publication lifecycle to academic social networking and we aim to add more in 2019. 

Online learning

One of our biggest forays into online learning took place with the Research Support Ambassador programme. This is an annual programme aimed at educating library staff on the core elements of research support and in previous years it has been run both face-to-face and via webinar.

This year we decided to do something different and used Moodle to create a completely online course. Participants were able to work through modules including video content, quizzes and discussions to test their understanding of the concepts. Each module was assessed by an activity which allowed learners to put their new knowledge into practice by undertaking a research support task. Examples of this included assessing a data management plan and attempting to spot a predatory publisher.

Overall the course was completed by 20 participants who gave us a lot of positive feedback on the format as well as suggestions for improvements. In the next few years this is something we would like to expand on, perhaps to those outside Cambridge… 

Beyond the University

That doesn’t mean we have neglected non-Cambridge librarians this year. In March our Research Support Skills Coordinator delivered two well-attended sessions on Moving Into Research Support with CILIP. The original session was so popular that we had to add a second and attendees came from around the UK to hear how they could get involved in this exciting new area. There was also a return visit to CILIP HQ in London for their 2018 Careers Day where attendees were introduced to the wonders of working in research support (including dealing with penguin poop and breaking the internet).

We also contributed to a range of other events such as LILAC 2018 and Dawson Day held in the summer – both of which gave us a chance to talk about the need for training in scholarly communication literacy for library staff. 

All in all 2018 has been a very busy year for training but we will not be slowing down in 2019. We have plans to expand our online training offer and deliver even more face-to-face sessions for our community. Who knows what this blog will contain this time next year? Readers had better stay tuned to find out! 

Published 8 January 2019
Written by Claire Sewell
Creative Commons License 

Book Review: Scholarly Communication – what everyone needs to know®

As we wind down towards the last days of 2018, thoughts go to gifts for family and friends. Here, as our last minute gift idea to you, is a book that should be under the tree of every scholarly communication aficionado.

The following book review appeared in Research Fortnight on 15th September 2018 with the title ‘New readers start here’. It was edited by John Whitfield and is reproduced here with permission.

Book Review

It is odd to be reviewing a book that stresses the importance of “positive reviews in…prestigious publications” to potential sales and publishers’ reputations. Nonetheless, it is safe to say that Scholarly Communication: What everyone needs to know, by Rick Anderson, is excellent.

Scholarly communication is a complex and fast-growing area. Even those working in it find keeping up to date a challenge. The challenge is much greater for those working more widely in research and academia, let alone the general public. The market is ripe for an understandable, generalist overview that explains what scholarly communication is and why anyone should care.

To address the latter point, Anderson notes in his introduction that “there are issues related to scholarly communication about which it would make sense for all of us to know something”. His argument is that decisions made worldwide on health, environment, economics and so on are all underpinned by academic research, reported through the scholarly communication system.

Anderson does a masterful job of distilling the stakeholders, issues and facts into an understandable whole. The discussions about open access and controversies and problems are handled sensitively; a challenge given the wide range of perspectives in this area.

The chapter on copyright is particularly helpful. This is a fundamental aspect of almost all scholarly communication and an area where many people are unsure. In clear language, Anderson explains fair use and fair dealing, licensing, the Creative Commons licences used in open-access publication, orphan works and patents. To do so without overwhelming or boring the reader is something of an achievement.

Anderson’s writing is eloquent and his explanations are clear and precise. Other highlights include discussions of how researchers use e-books, the projects from Google and the HathiTrust repository to digitise books, and an excellent description of how digitisation is allowing libraries to share their special and rare collections with a wider audience.

The book is structured as a series of questions with short answers of one to five pages. This format invites readers to dip in and out of the sections they are interested in. Mostly this approach is successful, although it does result in some repetition.

Anderson is also a victim, in a small way, of the very dynamism that he aims to capture. The volatility of scholarly communication means that most of the specialist discussion tends to occur in outlets that publish quickly, such as mega journals and blogs. The timescale for a regular journal article, which can take a couple of years to get from submission to publication, is too long and risks the contents losing relevance. Books have similarly long lead times; coupled with the dynamic nature of scholarly communication, this makes some out-of-dateness inevitable.

For example, in his chapter on metrics Anderson notes that “the universe of altmetrics is a highly dynamic one, and products and services…seem to be born and die nearly every month”. This is evidenced within that very chapter: among the companies offering research metrics, it mentions Thomson Reuters (whose intellectual property and science business was sold to private equity and changed its name to Clarivate Analytics in October 2016), Delicious (bought by Pinboard in June 2017) and Plum Analytics, which has kept its name but was bought by Elsevier in February 2017.

As the associate dean for collections and scholarly communication at the University of Utah, Anderson makes for a well-qualified author, although the text does reflect his North American perspective. Generally this is not a problem, although a statement such as “There is a professional organisation for university press publishing: the American Association of University Presses” implies, inaccurately, that the rest of the world lacks such organisations.

This is a vast topic, and clearly decisions needed to be made over what to include and omit. Some omissions are easier to justify than others. I would have liked a deeper exploration of the commercial academic publishing market, as this drives much of the activity in the open-access space. The lack of this might reflect the level of disagreement over even basic definitions in scholarly communication, something Anderson acknowledges.

But that’s a minor quibble. Given the need for a book such as this, it would not be surprising if it became compulsory reading for training courses in scholarly communication.

Published 18 December 2018
Written by Dr Danny Kingsley
Creative Commons License

Turn on, tune in, tweet out – experiments in engagement

This time of year is often one of reflection – what went well, what could be improved and so on. In this spirit we are putting up here an assessment of the livestreaming aspect of our outreach programme over the past couple of years.

This blog asks: what was successful? What flopped? Where did we get bang for buck? Read on and find out…

Lofty goals

The OSC works towards collaborative engagement with the research community and relevant stakeholders – amongst other things, this helps us to communicate policies, promote our services and identify needs and knowledge gaps within the communities we work with.  

It will come as no surprise, therefore, that the words ‘open’ and ‘transparent’ crop up frequently when we are planning our communications. In the context of events and outreach, we usually start from the position of wanting to invite as many guests to the table as possible – not just those within the University but across the whole scholarly communication community. Given the international span of this group of people, one obvious solution is to take the party on the road – virtually speaking, at least.  

Starting in October 2016 with the ambition to make the most of technological solutions to achieve this (whilst taking into account the limits of our A/V expertise and resources), we experimented over 18 months with various platforms and approaches in order to live-stream, record and share footage from the events we hosted. Evaluating the returns on these efforts has led to some useful lessons: whilst we’d like to share our events as widely as possible, we have had to make some strategic choices to make the venture worthwhile.  

In terms of evaluating the impact of online sharing, we acknowledge that social media marketing is one small part of our Communications remit – whilst the scope for digging down into statistics on YouTube and Twitter engagement is almost unlimited, the time available to devote to this activity is not.

Stepping into the stream: live broadcasting and video recording 

Livestreaming allows viewers to remotely attend events, and we hoped to find a method of broadcasting that would adequately capture all sound and visuals (including slide presentations) whilst allowing viewers to simultaneously contribute their questions and comments. We found these goals something of a challenge!  

1. Adobe Connect

When organising one-day workshops, we initially managed the streaming, recording and processing ourselves using the Adobe Connect package (which came with the advantage that we could use the University’s subscription without any additional costs for us).

However, this method required a stable connection to the wired Local Area Network (LAN), plus high-intensity input from our team members, neither of which were factors that could always be guaranteed – many of the University’s lecture rooms are in old buildings with minimal A/V infrastructure at best, and it was not always possible to plug into sound systems or connect to the ethernet.

After the events, we made recordings of the live-stream available via our YouTube channel, despite some of them falling short of our expectations in terms of sound quality and uninterrupted broadcasting. We concluded that whilst Adobe Connect was excellent for hosting webinars in a controlled environment (where the room was quiet and we were familiar with the available technological capacity) it was not suitable for livestreaming large events. 

2. Calling in the professionals

We took a different approach when organising higher profile events such as the Engaging Researchers in Good Data Management event in November 2017, hiring an external company to take care of both the livestream and video recordings. The difference in quality was remarkable – of course, you get what you pay for!

We also trialled the approach of making video recordings of one-day workshops, without live-streaming. Hiring professional recording equipment from the University Information Service to do this, and having a quick in-house tutorial on how to use it, again required high-intensity input from our team, although it produced higher quality results than filming through Adobe Connect.

Was it worth it? 

After 18 months of trying out these different methods, we needed to establish if the investment of time and money was reaping rewards, particularly given that the hire of professional equipment and services accounted for the largest single expense for an event. We needed to decide how much priority to give recording and streaming events for sharing in our Communications Strategy.  

A summary of the statistics showed: 

  • maximum livestream engagement reached 50 participants (for the Engaging Researchers event)  
  • engagement with our content on YouTube at the time of dissemination (through advertising in our newsletters, emails and Twitter accounts) varied from ten clicks to 600 clicks. 

The engagement statistics at the time of the event were moderate, and the audience for the livestream did not exceed the audience in the room. We therefore concluded that we would reserve the option of livestreaming for events where sharing on-the-spot footage was of significant benefit to the wider scholarly communication and research data community – for instance for high-profile conferences or politically urgent discussions.

We would continue to hire professional AV services to video ‘headline’ events that were of interest to the community, but would not make recording standard practice for every event.  

A last experiment… 

We realised there was another aspect to this question: after the initial promotion of the recordings, they sat dormant in the YouTube playlist and embedded on our websites, relying on users discovering them by serendipity. We needed to think about continuing to maximise returns on the investment.   

In order to address this additional concern, between March and June 2018 we used our Twitter accounts @CamOpenData and @CamOpenAccess to re-promote 33 and 54 videos respectively. We monitored the viewings on YouTube and looked at various metrics in the Twitter analytics.

During that period we saw an average 16% increase in the YouTube video clicks, with some videos attracting far more attention than others. These viewing figures were lower than we had anticipated, and there were various hypotheses as to why:

  • We were re-promoting the videos to an audience that may well have seen the videos the first time around, so were not offering anything new. 
  • Some videos were specialised in subject and therefore appealed to a limited audience. 
  • Some videos were lengthy and likely to hold the attention only of the most dedicated viewers. 

It was notable that the professional videos we’d commissioned performed better on YouTube as well as on Twitter in terms of engagement rates and impressions. Perhaps due to our confidence in the quality of these videos, we invested extra time in promoting them (for instance by adding images to our tweets), and engagement was indeed higher. However, the fact that many of these videos were short recordings of single presentations may also have added to their relative appeal.

What we learnt 

There were lots of positive outcomes of this final experiment. The re-promotion campaign helped to maintain the presence of our brand on Twitter and YouTube, resulting in almost 600 clicks on existing YouTube content over three months. It added diversity to the content of our tweets and increased tweet impressions as a whole. It contributed to our strategic aim to disseminate professional knowledge, maintained contact with our community, and influenced the acquisition of new followers. 

In addition, we observed the most popular themes amongst our Twitter followers:  

  • Open access monograph publishing  
  • How to spot a predatory publisher 
  • Peer review and the benefits of openness 
  • Copyright 
  • Text & Data Mining 
  • Data management needs for different disciplines and different institutions 
  • Standard practices for managing and sharing code 
  • How to make data publications first class research outputs 

These are insights that will inform our planning for future engagement activity. 

Looking ahead 

Our re-promotion experiment has given us a handy list of priorities that will allow us to keep using our film resources even when staff time is scarce, and will inform our event planning from the outset if we know we want to record the occasion.  Our top take-away tips: 

1. Less but better – Resources are limited: livestream important events only but don’t compromise on quality.  Short videos are better received: take into account the length of talks, panel discussions and workshops. Can longer talks be naturally broken into shorter segments? 

2. Specific, practical, catchy – Take time to create engaging and specific titles for videos, and emphasise their practical focus, for example by starting with “How to”. These items are instantly more appealing to the browsing viewer, and also appear higher on search rankings when the subject is Googled.    

3. Re-use and repurpose – Use short clips from older videos on social media when their content complements news or trends. Routinely reference videos when writing content, for example blogs or training slides.

Want to know more? 

You can explore these recordings of past events on the OSC’s website, and subscribe to our YouTube Channel.

For an alternative perspective on using video to engage with the research and scholarly communications communities, join our Research Support Skills Coordinator, Claire Sewell, with an expert panel for the MmIT webinar, Using Video in your library and information service, at 2pm on Wednesday 12 December, and look out for her upcoming blog on preparing online training.

Published 10 December 2018
Written by Hannah Haines and Maria Angelaki
Creative Commons License

Blood: in short supply?

Two years ago (almost to the day) we called out Blood for the misleading open access options it offered to Research Council and Charity Open Access Fund (COAF) authors. Unfortunately, little has changed since then:

Neither of these routes is sufficient to comply with either Research Councils’ or COAF’s open access policies which require that the accepted text be made available in PMC within 6 months of publication, or that the published paper is available immediately under a CC BY licence.

At the time, we called on Blood to change their offerings or we would advise Research Councils and COAF funded authors to publish elsewhere. And that’s exactly what’s happened:

Figure 1. All articles published in Blood since 2007 which acknowledge MRC, Wellcome, CRUK or BHF funding. Data obtained from Web of Science.

Over the last two years we’ve seen a dramatic decline in the number of papers being published in Blood by Medical Research Council (MRC), Wellcome Trust, Cancer Research UK (CRUK) and British Heart Foundation (BHF) researchers. The number of papers published in Blood that acknowledge these funders is now at its lowest point in over a decade.

It’s important to remember that the 23 papers published in Blood in 2017 are all non-compliant with the open access policies of Research Councils and COAF, and if these papers acknowledge Wellcome Trust funding then those researchers may also be at risk of losing 10% of their total grant. If you are funded by Research Councils or one of the COAF members, please consider publishing elsewhere. SHERPA/FACT confirms our assessment:

Sign the open letter

We’re still collecting signatures for our open letter to the editor of Blood in the hope that they’ll reconsider their open access options. Please join us by adding your name.