All posts by Office of Scholarly Communication

Lessons learned from Jisc Research Data Champions

In 2017 four Cambridge researchers received grants from Jisc to develop and share their research data management practices. In this blog, the four awardees each highlight one aspect of their work as a Jisc Data Champion.

The project

All four Champions embarked on a range of activities throughout the year, including creating local communities interested in RDM practices, delivering training, running surveys to understand their departments better, creating ‘how-to’ guides for would-be RDM mentors and testing Samvera as part of RDSS. They were excited by the freedom that the grant gave them to try out whatever RDM-related activities they wanted, which meant they could develop their skills, see ideas come to fruition and make them reusable for others. For example, Annemarie Eckes developed a questionnaire on RDM practices for PhD students and Sergio Martínez Cuesta has posted his training courses on GitHub.

However, throughout the duration of the award they also found some aspects of championing good RDM disconcerting. Whilst some sessions proved popular, others had very low attendee figures, even when a previous iteration of the session was well attended. They all shared the sense of frustration often felt by central RDM services that getting people to engage initially and turn up to a session is the hard part. However, when people did come they found the sessions very useful, particularly because the Champions were able to tailor them specifically to the audience and discipline. The similar background of all the attendees also provided an extra opportunity for exchanging the advice and ideas that were most relevant.

The Champions tried out many different things. The Jisc Research Data Champions were expected to document and publicise their research data management (RDM) experiences and practices and contribute to the Jisc Research Data Shared Service (RDSS) development. Here the Champions each highlight one thing they tried out, which we hope will help others with their RDM engagement.

BYOD (Bring your own data)

Champion: Annemarie Eckes, PhD student, Department of Geography

The “Bring your own data” workshop was intended for anyone who thought their project data needed sorting, who needed better documentation, or who even needed to find out who is in charge of, or owns, certain data. I set it up to give attendees time and space to do any kind of data-management related tasks: clean up their data, tidy up their computer or email inbox, etc. The workshop was, really, for everyone, whether they were at the planning stage at the start of their project or in the middle of a project and had neglected their data management to some extent.

For the workshop the participants needed a laptop or a login for the local computers to access their data, and a project to tidy up or prepare that could be done within two hours. I provided examples of file naming conventions and folder structures as well as instructions on how to write good READMEs (messages to your future self) and a data audit framework to give participants some structure to their organisation. After a brief introductory presentation about the aims and the example materials I provided, people would spend the rest of the time tidying up their data or in discussions with the other participants.

While this was an opportunity for the participants to sit down and sort out their digital files, I also wanted participants to talk to each other about their data organisation issues and data exchange solutions. Once I got everyone talking, we soon discovered that we had similar issues and were able to exchange information on very specific solutions.

1-on-1 RDM Mentoring

Champion: Andrew Thwaites, postdoc, Department of Psychology

I decided to trial 1-on-1 RDM mentoring as a way to customise RDM support for individual researchers in my department. The aim was that by the end of the 1-on-1 session, the mentee should understand how to a) share their data appropriately at the end of their project, and b) improve on their day-to-day research data management practice.

Before the meeting, I encouraged the mentee to compile a list of their funders and those funders’ data sharing requirements. During the meeting, the mentee and I would make a list of the data in the mentee’s project that they were aiming to share. I would then help them to choose a repository (or multiple repositories) to share this data on, and I’d also assist in designing the supporting documentation to accompany it. During the sessions I also had conversations with the mentee about GDPR, anonymising data, internal documentation and day-to-day practices (file naming conventions, file backups etc.).

As far as possible, I provided non-prescriptive advice, with the aim being to help the mentee make an informed decision, rather than forcing them into doing what I thought was best.

Embedding RDM  

Champion: Sergio Martinez Cuesta, research associate, CRUK-CI and Department of Chemistry

I came to realise early in the Jisc project that stand-alone training sessions focused exclusively on RDM concepts were not successful, as students and researchers found them too abstract, uninteresting or detached from their day-to-day research or learning activities. I think the aerial view of the concept of 1-on-1 mentoring and BYOD sessions is beautiful. However, in my opinion, both strategies may face challenges, with the necessary numbers of mentors and trainers increasing unsustainably as the number of researchers needing assistance grows and the research background of the audience becomes more diverse.

To facilitate take-up, I tapped into the University’s lists of oversubscribed computational courses and found that many researchers and students already shared interests in learning programming languages, data analysis skills and visualisation in Python and R. I explored how best to modify some of the already-available courses with an aim of extending the offer after having added some RDM concepts to them. The new courses were prepared and delivered during 2017-2018. Some of the observations I made were:

  • Learning programming naturally begs for proper data management as research datasets and tables need to be constantly accessed and newly created. It was helpful to embed RDM concepts (e.g. appropriate file naming and directory structure) just before showing students how to open files within a programming language.
  • The training of version control using git required separate sessions. Here students and researchers also discover how to use GitHub, which later helps them to make their code and analyses more reproducible, create their own personal research websites …
  • Gaining confidence in programming, structuring data / directories and version control in general helps students to acknowledge that research is more robust when it is open and checked by other researchers. Learning how researchers can identify themselves in a connected world with initiatives such as ORCID was also useful.
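
As an illustration of the first observation, here is a minimal Python sketch of how file naming and directory structure can be introduced just before opening a file. The naming convention (date, project, description, version), the folder names and the sample values are assumptions made up for this example, not the materials used in the courses.

```python
import csv
import re
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical convention: YYYYMMDD_project_description_vNN.ext
# (lower case, hyphens inside a part, underscores between parts).
NAME_PATTERN = re.compile(
    r"^\d{8}_[a-z0-9]+(-[a-z0-9]+)*_[a-z0-9]+(-[a-z0-9]+)*_v\d{2}\.[a-z0-9]+$"
)


def build_name(day, project, description, version, ext):
    """Build a file name that follows the hypothetical convention."""
    return f"{day:%Y%m%d}_{project}_{description}_v{version:02d}.{ext}"


def is_valid_name(name):
    """Return True if the file name follows the hypothetical convention."""
    return NAME_PATTERN.match(name) is not None


def make_project(root):
    """Create an example folder layout and return the raw-data folder."""
    for sub in ("data/raw", "data/processed", "docs", "scripts"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    # A README is a message to your future self.
    (root / "README.txt").write_text(
        "What the project is, who owns the data, how files are named.\n",
        encoding="utf-8",
    )
    return root / "data" / "raw"


# Work in a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    raw = make_project(Path(tmp) / "my-project")
    name = build_name(date(2018, 1, 15), "soil-survey", "site-a", 1, "csv")
    path = raw / name

    # Write a tiny made-up table, then open it again as a student would.
    with path.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows([["sample", "ph"], ["s1", "6.8"]])
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))

print(name, is_valid_name(name), rows)
```

A name such as "final data (2).xlsx" fails the check, which can be a useful starting point for a discussion about why a convention helps.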

Brown Bag Lunch Seminar Series: The Productive Researcher

Champion: Melissa Scarpate, postdoc, Faculty of Education

I created the Productive Researcher seminar series to provide data management and Open Access information and resources to researchers at the Faculty of Education (FoE). The aim of the brown bag lunch format was to create an informal session where questions, answers and time for discussion could be incorporated. I structured the seminars so they covered 1) a presentation and discussion of data management and storage; 2) a presentation about Open Access journals and writing publications; 3) a presentation on grant writing where Open Access was highlighted.

While the format of the series was designed to increase attendance, the average was four attendees per session. The majority of attendees were doctoral students and postdocs who had a keen interest in properly managing their data for their theses or projects. However, I suspect it may be the case that those attending already understood data management processes and resources.

In conclusion, I think that whilst the individuals that attended these seminars found the content helpful (per their feedback), the impact of the seminars was extremely limited. Therefore, my recommendation would be to have all doctoral students take a mandatory training class on data management and Open Access topics as part of their methodological training. Furthermore, I think it may be most helpful in reaching postdocs and more senior researchers to have mandated data management meetings with a data manager to discuss their data management and Open Access plans prior to submitting any grant proposals. Due to new laws and policies on data (GDPR) this seems a necessary step to ensure compliance and excellence in research.

Published 2 October 2018
Compiled and edited by Dr Lauren Cadwallader from contributions by Annemarie Eckes, Dr Andrew Thwaites, Dr Sergio Martinez Cuesta, Dr Melissa Scarpate
Creative Commons License

10 years on and where are we at? COASP 2018

Last week, the 10th Conference of the Open Access Scholarly Publishers Association (OASPA) was held in Vienna. Much was covered over the two and a half days. A decade in, this conference considered the state of the open access (OA) movement, discussed different approaches to OA, considered inequity and the infrastructure required to meet this need and argued about language. Apologies – this is a long blog.

Fracturing of the ‘OA movement’?

In an early discussion, Paul Peters, OASPA President and CEO of Hindawi, noted that, like movements such as organic food or veganism, the OA ‘movement’ is not united in purpose. When what appear to be ‘fringe’ groups begin, it is easy to assume that all involved take a similar perspective. But the reasons for people’s involvement and the end point they are aiming for can be vastly different. Paul noted that this can be an issue for OASPA because there is not necessarily one goal for all the members. He posed the question about what this might mean for the organisation.

It also raises questions about approaches to ‘solving’ OA issues. Many different approaches were discussed at the event.

Unbundling

The concept of ‘unbundling’ the costs associated with publishing and offering these to people to engage with on an as-needed basis was raised several times. This points to the concept put forward last year by Toby Green of the OECD. It also triggered a Twitter conversation about the analogy of the airline industry (and how poorly they treat their customers).

If the scholarly journal were unbundled, different players could deliver the functions. Kathleen Shearer, Executive Director of COAR, noted that not all functions of scholarly publishing need to take place on the same platform. She suggested next generation repositories as one of the options.

Jean-Claude Guedon provided several memorable quotes from the event, with the most pertinent being “We don’t need a ‘Version of Record’. We need a ‘record of versions’”. Kristen Ratan of the Coko Foundation agreed in her talk on infrastructure, stating “we publish like it’s 1999”. In that model, the Version of Record is the one that matters and it is static in time. But it is not 1999, she noted, and we need to consider the full body of work in its entirety.

After all, it was observed elsewhere at the conference, nothing radical has changed in the format of publications over the past 25 years. We are simply not using the potential the internet offers. Kathleen quoted Einstein stating “You cannot solve a problem from the same consciousness that created it. You must learn to see the world anew”.

New subscribing models

Wilma van Wezenbeek, from TU Delft and Programme Manager, Open Access, VSNU, discussed the approach to negotiations taken in The Netherlands. They are arguing that a comparison of how much is spent per article under the toll system with what it would cost to have everything published OA shows that enough money exists in the system. VSNU are being pragmatic, focusing on big publishers and going for gold OA (to avoid the duplication of journals). She also noted how important it is for libraries to have presidents of the University at the negotiation table. Her parting advice on negotiations was to hold your nerve, stay true to the principles and don’t waver.

This approach does not include smaller publishers and completely ignores fully gold publishers, an observation that was made a few times in the conference. An alternative approach, argued Kamran Naim, Director of Partnerships & Initiatives at Annual Reviews, was collective action. In his talk ‘Transitioning Subscriptions to OA Funding: How libraries can Subscribe to Open’ he asked what is required to flip the subscription cost to manage OA publication (instead of APCs). The challenge with this idea is that it requires people to continue subscribing even when material is OA and they don’t have to. Another problem is that the idea of ‘subscribing’ to OA material can become a procurement challenge, because the cost can be classified as a ‘donation’, which is not allowed by some library budgets. So the suggestion is that subscribing libraries will be offered the option to subscribe to select journals and receive 5% off the subscription cost. The plan is to roll out the project to libraries in 2019 for 2020 models.

Study – downloading habits when material is OA

A very interesting study was presented by Salvatore Mele and Alexander Kohls from CERN and SCOAP3. Entitled ‘Preprints vs traditional journals vs Open Access journals – What do scientists download?’ the study compared downloads of the same scientific artefact as a preprint on arXiv and as a published article on a (flipped) journal platform.

Their findings, which came from arXiv, Elsevier and SpringerNature’s statistics, showed that there is significant use of the version in arXiv during the first six months (when the only available version of the work is in arXiv), which drops off dramatically after the work is published (a point identified as when the DOI is minted).

They also compared downloads from 2013 – before the journals flipped to gold under the SCOAP3 arrangement – with those from 2016, when the journals were open access. The pattern over time was similar, but accesses in 2016 were higher overall and dramatically higher in the first three months after the DOI was minted.

 

The final slide demonstrated that having recent open access content was also driving up downloads of older works in the non-open access backfiles from the publisher platforms.

This work is not published “because we have day jobs”. I have included my poor images of the slides in this blog and will link to the slides when they are made available.

Nostalgia

Being the 10th OASPA conference there was some reminiscing throughout the presentations. In a keynote reflection on the Open Access movement, Rebecca Kennison from KN Consultants noted several myths about OA publishing that existed 10 years ago and still persist. Rebecca discussed “library wishful thinking” when it came to OA. This has included thinking that OA would solve the serials crisis, that practice would change ‘if only the academic community were aware’, and that institutional repositories and mandates would solve OA. (Certainly one of my own observations over the 16 or so years I have been involved in OA is there is always a palpable sense of glee at OA events when ‘real’ researchers bother to turn up.)

David Prosser, Executive Director of Research Libraries UK, was outed as the architect of the ‘hybrid’ option, which he articulated in his 2003 paper “From here to there: a proposed mechanism for transforming journals from closed to open access”. David defended himself by noting that the whole concept did not work because it was proposed with an assumption about the “sincerity of the industry to engage”.

This made me consider the presentation I gave to another 10th anniversary conference this year – Repository Fringe at Edinburgh. In 1990 Stevan Harnad wrote about ‘Scholarly Skywriting’ and described the obstacles to the ‘revolution’ as including ‘old ways of thinking about scientific communication and publication’, ‘the current intellectual level of discussion on electronic networks is anything but inspiring’, ‘plagiarism’, ‘copyright’ and ‘academic credit and advancement’ amongst others. Little appears to have changed in the past 28 years.

The more perceptive readers will note how long ago these dates are. This OA palaver has been going on for decades. And it seems even longer because, as Guido Blechl from the University of Vienna noted, “open access time is shorter than normal time because it moves so fast”.

But none of this wishful thinking has come to fruition. Rebecca asked “what shift do we need in our thinking?” Well in many ways that shift has landed in the form of Plan S. See the related blog for the discussions about Plan S that happened at the conference.

Language matters

Rebecca also mentioned “our own special language”, which is, she observed, a barrier to entry to the discussion. Indeed language issues came up often during the few days of the conference.

There were a few references to the problems with the terms ‘green’ and ‘gold’, and specifically gold. This has long been a personal bugbear of mine because of the nonsensical nature of the labels, and the associations of ‘the best’ and ‘expensive’ with gold. There has been a co-opting of the term ‘gold’ by the commercial publishing sector to mean ‘pay to publish’. Of course all *hybrid* journals charge an APC, and more articles are published where an APC has been paid than not, which is possibly why the campaign has been successful – see the Twitter discussion here. But it is inaccurate. In truth, ‘gold’ means the work is open access from the point of publication. More fully gold open access journals do not charge an APC than do.

There was also concern raised about the term ‘Open Science’ which, while it is an inclusive term in Europe that covers all types of research, is not perceived this way in other parts of the world. There was strong support amongst the group for using the term ‘Open Scholarship’ as an alternative. This also brought up a discussion about using the term ‘publication’ rather than the more inclusive research ‘outputs’ or ‘works’, which encompass publishing options beyond the concept of a book or a journal.

Inequity

“Inclusivity is not optional! We need a global (information/publishing) system!” was the rallying cry of Kathleen Shearer in her talk.

For many in the OA space, equity of access to research outputs lies at the centre of what the end goal is. It is clear that knowledge published by academic journals is inaccessible to the majority of researchers in low- and middle-income countries. But if we move to a fully gold environment, with the potential to increase the cost of author participation in the publishing environment, then we might have simply reversed the problem. Instead of not being able to read research, academics in the Global South will be excluded from participating in the academic discussion.

There was a discussion about the change in global publishing output since 2007, which reflects a big increase in output from China and Brazil, but otherwise shows that output is uneven and not inclusive.

One possible solution to this issue would be for open access publishers to make it clearer to authors that they offer waivers for authors who are unable to pay the APC. There was discussion about the question ‘what form should OA publishing take in Eastern and Southern Europe?’. The answer was that it should be inexpensive and use infrastructure that is publicly owned and cannot be sold.

Infrastructure

Ahhhh infrastructure. We are working within a fast consolidating environment. Elsevier continues to buy up companies to ensure it has representation across all aspects of the scholarly ecosystem and Digital Science is developing and acquiring new services to a similar end. See ‘Virtual Suites of tools/platforms supported by the same funder’ and ‘Vertical integration resulting from Elsevier’s acquisitions’. These are obvious examples but Clarivate Analytics has recently acquired Publons and ProQuest has absorbed Ex Libris which has in turn bought Research Research and has plans to create Esploro – a cloud-based research services platform, so this is prevalent across the sector.

This raises some serious concerns for the concept of ‘openness’. In his excellent round up, Geoff Bilder, Director of Strategic Initiatives at Crossref, commented that we are looking in the rear view mirror at things that have already happened and we are not noticing what is in front of us. While we might end up in a situation where publications are open access, these are not representative of the discussions that occurred to allow the authors to come to those conclusions. The REAL communication happens in coffee shops and online discussions. If these conversations are using proprietary systems (such as Slack, for example), then these conversations are hidden from us.

Who owns the information about what is being researched and the data behind it when the scholarly infrastructure is held within a commercial ecosystem? Is there an opportunity to reimagine? asked Kristen Ratan, referencing SPARC’s action plan on ‘Securing community controlled infrastructure’. “In scholarly communication,” she summarised, “we have accepted the limitations of the infrastructure with a learned helplessness. It’s time that these days are over.”

There are multiple projects currently in place around the world to collectively manage and support infrastructure. Kathleen Shearer described several projects:

  • Consortia negotiations such as OA2020 and SCOAP3
  • The Global Sustainability Coalition for Open Science Services (SCOSS) is an international group of leading academic and advocacy organisations that came together in 2016 to help secure the vital infrastructure underpinning Open Access and Open Science. SPARC Europe is a founding member.
  • The 2.5% commitment is a call that “Every academic library should commit to contribute 2.5% of its total budget to support the common infrastructure needed to create the open scholarly commons”. This is primarily a US and Canadian discussion.
  • OA membership models
  • APC funds

There are actually a couple of other projects not mentioned at COASP 2018. In 2017, several major funding organisations met and came to a strong consensus that core data resources for life sciences should be supported through a coordinated international effort to both ensure long term sustainability and appropriately align funding with scientific impact. The ELIXIR Core Data Resources project is identifying resources defined as a set of European data resources that are of fundamental importance to the wider life-science community and the long-term preservation of biological data.

OA Monographs

The final day of the event looked at OA monographs. Having come from a British Academy event on OA monographs the week before (see the Twitter discussion), this debate is fairly top of mind at the moment for me.

Sven Fund, who is Managing Director of both Knowledge Unlatched and fullstopp, which is running a consultation on OA monographs for Universities UK, spoke about the OA monograph market. He noted that books are important, and not just because “people like to decorate their living rooms with them”. But he suggested that rather than just adding a few hyperlinks, we should be using the technology available to us with books. It has been the smaller publishers who have been innovating; large publishers have not been involved, which has limited the level of interest.

The OA book market is still small, with only 12,794 books and chapters listed in the Directory of OA Books (DOAB) compared to over 3 million articles listed in the Directory of OA Journals (DOAJ). But growth in OA books is still strong even though the OA journal market matures. Libraries are the bottleneck, Sven argued, because they need to change the funding model significantly. There have been 10-15 years of discussion and now is the time to act. Libraries need to make a significant commitment that X% goes into open access.

There are also problems with demonstrating proof of impact of the OA book. Sven argued we need transparency and simplicity in the market, and said that no-one is doing an analysis of which books should be OA or not based on impact and usage. This needs to happen.

Sven said that royalties are important to authors – not because of the money but because they show how much the work has been read. For this reason he argued we need publishers to share their usage data for specific OA titles with the author. As an aside, it seems extraordinary that publishers are not already doing this, and I asked Sven why they don’t. He replied that it seems that ‘data is the new gold’ and therefore they do not share the information. Download information about open books is often protected because of the risk of providing information that gives their competitors a commercial advantage.

But Sven also noted there needs to be context in the numbers. Libraries in the past have done a simple analysis of cost per download without taking into consideration the differences in topics. Of course some areas cost more per download than others, he said. There is also the risk that if you share this data then you might have a situation where a £10,000 book only has a few downloads which ‘looks bad’.
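
To make the point about context concrete, here is a small Python sketch that compares each title's cost per download with the median for its own subject, rather than across all subjects. All titles, costs and download counts are invented for illustration; they are not figures from the talk or from any library or publisher.

```python
from statistics import median

# Hypothetical figures for illustration only.
titles = [
    {"title": "Book A", "subject": "history", "cost": 10000, "downloads": 200},
    {"title": "Book B", "subject": "history", "cost": 8000, "downloads": 400},
    {"title": "Book C", "subject": "economics", "cost": 10000, "downloads": 5000},
    {"title": "Book D", "subject": "economics", "cost": 6000, "downloads": 1500},
]


def cost_per_download(item):
    """Simple cost per download, ignoring subject differences."""
    return item["cost"] / item["downloads"]


def subject_medians(items):
    """Median cost per download within each subject."""
    by_subject = {}
    for item in items:
        by_subject.setdefault(item["subject"], []).append(cost_per_download(item))
    return {subject: median(values) for subject, values in by_subject.items()}


def relative_cost(items):
    """Cost per download divided by the median for the title's own subject."""
    medians = subject_medians(items)
    return {
        item["title"]: cost_per_download(item) / medians[item["subject"]]
        for item in items
    }


for title, ratio in relative_cost(titles).items():
    print(f"{title}: {ratio:.2f} times its subject median")
```

In this made-up data, Book A costs 25 times as much per download as Book C on the simple measure, but it is only about 1.4 times the median for its own subject. The choice of the median as the baseline is an assumption of this sketch; other baselines would be equally defensible.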

The profit imperative

There were some tensions at the meeting about profits. A question that arose early in the first panel discussion was: “Should we be ashamed as commercial publishers for making money?”. One response was that if you don’t make money you are not a commercial publisher. But the same person noted that the ‘anti commercial sentiment’ in these discussions indicates that something is wrong.

A secondary observation was that open access publishers are doing a good job “while the current incentive systems are in place”. This of course points to the academic reward system controlling the behaviour of all players in this game.

As is always the case at open meetings, the Journal Impact Factor was never far away, although Paul Peters noted that the JIF was partly responsible for the success of OA journals: PLOS ONE took off when it received an impact factor. It was noted in that discussion that OA journals obtaining and increasing their JIF is ‘not proof of success, it’s proof of adaptation’.

The final talk was from Geoff Bilder. One participant described his talk on Twitter as “the best part of the publishing conference, where Geoff Bilder tells us everything that we’re doing that’s wrong”. Geoff noted that throughout the conference people had used some terms fairly loosely, including ‘commercial’ and ‘for profit’. He noted that profit doesn’t necessarily mean taking money out of the system; often profit is channelled back into the business.

In the end

In all it was an interesting and thought-provoking conference. Possibly the most memorable part was the 12 flights of stairs between the lecture rooms and the breakout coffee and lunch space. This has been the first OA conference I have attended where participants improved their cardiovascular fitness as a side bonus to the event.

The Twitter hashtag is #COASP10

Published 24 September 2018
Written by Dr Danny Kingsley
Creative Commons License

The Plan S conversation continues

Last week, the 10th Conference of the Open Access Scholarly Publishers Association was held in Vienna (see the blog about the event). On the Tuesday afternoon Robert-Jan Smits spoke as part of a panel about Plan S [this link to a video of his talk was added 3 October 2018]. It was a calm, measured discussion in which he thanked many people who had worked with them to develop the plan. He noted that things went ‘wild’ after releasing the plan, with over 70,000 tweets on the first day. The comments, he said, were mostly positive but there were some negative comments from publishers and some academics – which is not surprising because the plan is so robust. He also noted multiple positive comments from developing countries, thanking him “because they struggle to access research outputs”.

There were some suggestions that came from the floor. One was the need for transparency in pricing. Questions were asked about infrastructure and how the plan would support it. Smits noted “we need to look into this and decide what will we support to get this plan on the road and reach these targets”.

Stuart Taylor asked if there was a green option that would be compliant. Smits noted that he doesn’t distinguish between green and gold; he prefers to think about access and, to that end, yes, publishing an AAM with zero embargo and CC-BY would comply.

Funders stepping up

Echoing other comments that had occurred throughout the conference, Smits noted that funding agencies have left open access negotiations to libraries, but the agencies are the ones that hold the key to the solution. This realisation is what led to the development of Plan S. While the funding agencies couldn’t push OA alone, Smits asked if they would be prepared to work together to move forward on open access.

Smits underlined that the principles are simple – “if you want to use the funds you are required to meet these rules”. He also emphasised that this “is not a menu”. If people sign up “they sign up to everything” to ensure a level playing field. The list of signatories may well increase, he noted, with discussions happening in Finland “and other countries”. He also mentioned charitable foundations and non-European organisations.

A question of timing

In terms of timing Smits gave three clear indicators. The first is to have a detailed implementation plan by the end of the year, including a “robust policy”. This will include information like the level of the cap on APCs and other details. Smits noted that the cap exists initially to ensure there isn’t an explosion of costs, “and then let the market decide”. He suggested that the clever high quality journals will offer more value for money within the cap. In terms of the cap amount, they are looking at how much it costs to create an article, which will inform a “fair price”.

The second indicator about timing related to flipping journals. Smits mentioned that he has extended his invitation to the larger publishers (Elsevier, Wiley and Springer-Nature) to join the discussion about how to flip their journals. It was “not acceptable” for hybrid journals to become the new norm or business model so these arrangements need to be transformative and only for 3-4 years before flipping the journals.

The final timing indicator related to books, which were, according to Smits, a big point for discussions over formulation of the plan. They have agreed not to have a deadline for books as the compromise to funding agencies, noting that full OA monographs won’t be ready for 2020. The implementation plan will include language about how they see this happening, and while there is no specific date, the range of timing for OA books could be between 2022 and 2026.

Academic freedom?

Inevitably the question of academic freedom came up. Smits noted that where to publish is about academic choice, not academic freedom. He said academic freedom is about researching what you want to pursue, not publishing where you want. If you take the academic freedom argument, he asked, what about wanting to stop people from publishing in predatory journals? Is that not also preventing academic freedom? Plan S pushes and encourages scholars to publish for access.

He also noted that academic freedom is the wrong argument in the debates we *should* be having. It is a way to stifle debate, and he noted the vested interests in scholarly publishing because it is so lucrative. In practice, currently many publication choices are not free anyway but are tied to impact factor. He said we should try to get rid of the “obsession with JIF”. Hiring on the basis of JIF is a “sad, sad, situation”, he argued, noting we need to adhere to the Declaration on Research Assessment (DORA) and find better metrics. This calls for a transformation of culture, to walk the walk and not just talk the talk.

The conference discussions can be viewed at the Twitter hashtag #COASP10

Published 24 September 2018
Written by Dr Danny Kingsley
Creative Commons License

Relax everyone, Plan S is just the beginning of the discussion

If you are working (or even vaguely interested) in the scholarly communication space then you will not have failed to hear about the release of ‘Plan S’ last week. There has been a slew of reports and commentary (at the end of the sister blog “Most Plan S principles are not contentious”). Here’s another (hopefully useful) addition to the mix.

The document identifies the key target as being: “After 1 January 2020 scientific publications on the results from research funded by public grants provided by national and European research councils and funding bodies, must be published in compliant Open Access Journals or on compliant Open Access Platforms.” There are 10 supporting principles to this statement.

The plan is specifically engineered to force the hand of publishers and academics to really embrace (begrudgingly adopt?) change. Personally I welcome a bit of disruption. It will be no surprise to anyone that I consider the policies that arose from Finch to have failed. But this new development has, understandably, given a few people the jitters.

First up, and if this is all you read remember this, Plan S is a statement of principle. Until we see the actual policies for our funding bodies everything is speculation. And while UKRI is one of the 13* funding bodies that have signed up to Plan S, it has said that the report from the review of the OA policy is unlikely to appear before the second half of next year.

[*changed on 14 October]

The reassuring part

So the first thing to say is – don’t panic. We have some time. The second is that fully half of the 10 principles are not contentious – see the sister blog. A further two may have some implications for institutional administration and possibly for managing budgets, but are again fairly non contentious from an academic, and mostly even from an institutional, perspective.

And then there were three

So we are down to three principles that need a little more unpacking. They relate to the retention of copyright and the ability to choose where to publish. It is worth looking at these in more detail, and considering the information contained in the accompanying document “cOAlition S: Making Open Access a Reality by 2020: A Declaration of Commitment by Public Research Funders”. As it happens, we are already well on our way with many of these principles in the UK anyway. Let’s take a closer look.

Retaining copyright

Authors retain copyright of their publication with no restrictions. All publications must be published under an open license, preferably the Creative Commons Attribution Licence CC BY. In all cases, the license applied should fulfil the requirements defined by the Berlin declaration.

With my OA advocacy hat on I agree with this statement. There is no need for a publisher to hold full copyright over a work. They are able to operate in a commercial environment with a first publication right. Currently the system means that researchers must apply for permission to reuse work of their own if writing a new piece of work. There is a significant side income stream for publishers in relation to copyright ‘management’. Publishers claim they need copyright so they can protect authors’ rights, but there appear to be few examples of a publisher protecting, say, the integrity of an author’s work rather than the income stream from the work.

And this is not the first statement of this kind. The University of California released on 21 June their Declaration of Rights and Principles to Transform Scholarly Communication which states as one of the principles: “No copyright transfers. Our authors shall be allowed to retain copyright in their work and grant a Creative Commons Attribution license of their choosing”.

However as a person responsible for implementing policy within a large research institution I can see some issues that will need to be managed.

For a start, currently, in the vast majority of cases, while researchers own the copyright of their work, they sign it over to the publisher of their articles. As it happens the retention of copyright is a fundamental principle of the UK Scholarly Communications Licence (UK-SCL) which allows institutions to provide a REF compliant green OA route while allowing authors to retain their rights.

The alternative is to negotiate (as the sector) with the publishing industry to ensure that the publishing agreements that each researcher signs retains the author’s copyright. This would also require a huge advocacy and education programme amongst our community. For an excellent analysis of why there remains such a high level of confusion and misunderstanding about copyright amongst our academic community, I strongly recommend Dr Lizzie Gadd’s guest post to the Scholarly Kitchen Academics and Copyright Ownership: Ignorant, Confused or Misled?

The requirement for an open license is also potentially an issue for some disciplines. While many science based disciplines are not concerned with a requirement to publish under a Creative Commons Attribution (CC-BY) licence, there are members of our Arts, Humanities and Social Science communities who only feel comfortable with a CC-BY-NC-ND license. It is the Non Derivative aspect of the license that is of greatest concern and has been the subject of considerable discussion.

Restriction on ability to publish in a hybrid journal

The “hybrid” model of publishing is not compliant with the above requirements.

The nuclear interpretation of this statement is that funders won’t pay for hybrid at all. There are several precedents for this. Several UK institutions have stopped supporting payment for hybrid. The London School of Hygiene & Tropical Medicine are now restricted to fully open access journals only. The University of St Andrews will no longer be able to pay APCs for articles via the ‘gold’ route in hybrid (subscription-based) journals. Their normal criterion is whether the journal is listed in DOAJ. A 2016 analysis showed this is a common position.

I have written extensively about hybrid mostly arguing against it. But I do support the position that we need to walk carefully here. In our analysis at Cambridge on what might be seen as a ‘progressive’ publisher we noted there is an extremely long tail of society and smaller journals that we don’t publish in much but that collectively are a not insignificant number of papers. Let’s just say that learned societies have some way to go on their open access journey. But if we were to prevent our researchers from being able to publish in these journals this could well deeply affect the learned societies.

That’s why I welcome the statement in the preamble document that ‘transformative’ type of agreements which include offsetting arrangements will be acceptable under certain circumstances. The interpretation of this statement by UKRI into their policy will determine which publishers will be acceptable or otherwise.

Restriction on choice of publication outlet

In case such high quality Open Access Platforms or journals do not yet exist, the Funders will jointly provide incentives and support to establish these.

This one is potentially problematic because of the perception there will be a restriction on choice of publication options. But that is not necessarily the case.

The publishing sector adopted the language of ‘a threat to academic freedom’ this year in relation to the question of funders refusing to pay for hybrid open access. Academic freedom refers to freedom of expression not freedom of choice of publication outlet. This language is again being used by  the publishing sector in light of Plan S.

This language is now also being used by the academic sector. In an impassioned post European scientists state that Plan S means researchers are “forbidden to publish in subscription journals, including in hybrid ones, where OA option is available at an extra cost.” This is simply not the case. As described above, not all hybrid is necessarily off the table.

The other point that seems to be missed is under Plan S, authors can publish wherever they choose if they deposit the Author’s Accepted Manuscript in an institutional repository under a CC-BY license with a zero month embargo. We are halfway there already in the UK where authors generally are already depositing their work to an institutional repository for REF compliance. The part that requires attention then goes back to the question of the authors retaining copyright over their own work.

The question of access to open access publishing options is more complicated. There are many disciplines in which there are very few open access journals at all. These will need specific support, especially initially, in relation to these policies. Even then this is going to be tricky because establishing a new journal takes time. There are a few precedents: the Wellcome Trust launched Wellcome Open Research in 2016 based on the F1000 platform, and the Bill and Melinda Gates Foundation followed suit using the same platform in 2017. But these are unlikely to reassure many of our researchers.

The elephant in the room

There are some serious concerns with Plan S which relate to the equity issue of moving to a pay to publish ecosystem. These are valid and need to be discussed in the broader context of the open research debate. But that is not the theme of the majority of concerns from the academic sector. Those worries about freedom of choice to publish point to the real problem – what is attached to publication.

The problem is not Plan S, or open access per se. Publishing in specific journals or with specific publishers is primarily an issue of career prospects rather than of disseminating the work, and has been for a long time. When researchers say that the right to publish in an outlet of their choosing threatens ‘academic freedom’ they are referring to their ability to subsequently succeed in future job applications, promotions and grant applications. It is the academic reward system in which everyone is trapped.

Indeed the Plan S preamble refers to a “misdirected reward system which puts emphasis on the wrong indicators (e.g. journal impact factor)”. It commits to “fundamentally revise the incentive and reward system of science” and suggests the San Francisco Declaration on Research Assessment (DORA) as a starting point.

This is the real conversation we need to be having. It is not an easy one to address, but for those who have been arguing for the need to have a serious, international, sector wide conversation about this, Plan S offers a welcome shot in the arm.

Published 12 September 2018
Written by Dr Danny Kingsley
Creative Commons License

Most Plan S principles are not contentious

This is a sister blog to “Relax everyone, Plan S is just the beginning of the discussion” and provides the ‘supplementary material’ to that blog. It discusses the points in the Plan S principles that are not particularly contentious.

At the end of this blog is a list of links and commentary to date on Plan S.

Not much new here

The Funders will ensure the establishment of robust criteria and requirements for the services that compliant high quality Open Access platforms and journals must provide.

This is perfectly reasonable. The amount of money being invested is huge and quite rightly, the funders want to articulate what they are prepared to pay for. It is also helpful from an institutional perspective to have guidelines that clearly identify which journals are compliant and which are not.

Indeed, there is a precedent. In 2017 the Wellcome Trust introduced a publisher requirement list stating that compliant publishers needed to deposit to Europe PubMed Central, apply the correct licence and provide invoices that contained complete and understandable information. They asked publishers to sign up to these principles to be listed on their ‘white list’.

Where applicable, Open Access publication fees are covered by Funding Agencies or universities…

This point reflects the status quo in the UK at least. Universities across the UK are currently managing open access payments through various funding models. In some instances, such as Cambridge, payments are only made from funds provided by funding bodies with no extra funds provided by the institution. Other institutions such as UCL provide central university funds in addition to those provided by funders. There are a small number of institutions which do not receive any funds from funders but do provide central funds for specific publications.

Of course, if journals were to flip to fully open access then funds currently being used to pay for subscriptions could be freed up to divert to expenditure on APCs for fully gold publications.

Funders will ask universities and libraries to align their policies and strategies, notably to ensure transparency.

While this might be a little tricky simply because of the individual governance arrangements at each institution, it is a sensible thing to aim for.

The above principles shall apply to all research outputs, but it is understood that the timeline to achieve Open Access for monographs and books may be longer than 1st January 2020.

Open Access monographs ARE contentious, don’t get me wrong. But in the context of this statement of principle, there is a concession that there is some work to be done in this space. And we already knew that UKRI intends to include monographs in the post-REF2021 exercise (as in, anything published from 1 January 2021). Wellcome Trust have had OA monographs in their policy for years.

The importance of open archives and repositories for hosting research outputs is acknowledged because of their long-term archiving function and their potential for editorial innovation.

Now I know this is contentious for us Open Access nerds because there is a sense that repositories are once again being pushed into the shadows, which is what happened with the Finch report. But as noted in the main blog, under Plan S, deposit of an Author’s Accepted Manuscript into a repository is compliant if it is there under a CC-BY licence and with a zero embargo.

Some issues are operational

In a few instances, the queries or concerns raised about Plan S are actually operational ones.

When APCs are applied, their funding is standardised and capped (across Europe)

Currently the RCUK (now UKRI) does cap funding to Universities, using a complex algorithm to determine allocations in a given year to support the institutions meeting the open access policy. This has led some institutions (including Cambridge) to identify a preference for publishers exhibiting actions towards an open access future.

Manchester University has introduced new criteria for payment of APCs. They support “Publishers who are taking a sustainable and affordable approach to the transition to OA, e.g. by reducing the cost of publishing Gold OA in hybrid (subscription) journals via offsetting deals or membership schemes are listed below:…” They include a list of journals for which APCs will not be paid.

The alternative interpretation of this statement will be that individual APCs will be capped. This would have implications for all administrators of APCs. It would have particular implications for Cambridge University because of the relatively high proportion of papers published in expensive open access journals such as Nature Communications. The University would both have to find funds to supplement the cost, and also provide the administrative support for this process. This is where discussions need to happen about redirecting subscription budgets towards open access activities. While Plan S adds some urgency, there is time to have these.

The Funders will monitor compliance and sanction non-compliance.

This is the statement that has some administrative staff highly concerned. In the end it will fall upon them to ensure their research community is up to speed and doing the required activities. But we have had sanctions for non-compliance with Wellcome Trust policies since 2014 so this in itself is not new.

Relevant documents from Science Europe

Commentary, news stories & press releases

There has been considerable discussion about Plan S – here are just a few links that might be interesting. NOTE this list has been moved and is now being maintained on a separate blog: ‘Plan S – links, commentary and news items‘.

Published 12 September 2018
Written by Dr Danny Kingsley
Creative Commons License

New to OA? Top tips from the experts

We have a fantastic community in the Scholarly Communication space. And this is one of the clear themes that emerged from a recent exchange on the UKCORR discussion list. The grandly named UK Council of Research Repositories is a self-organised, volunteer, independent body for repository managers, administrators and staff in the UK.

The main activity for UKCORR is a closed email list which has 570 members and is very active. Questions and discussions range from queries about how to interpret specific points of OA policy through to technical advice about repositories.

Recently, the OSC’s Arthur Smith (the current Secretary of UKCORR), posed the first ‘monthly discussion’ point, asking the group two questions:

  • What do you wish you were told before you started your job in repository management/scholarly communication?
  • What are your top three tips for someone just starting?

What followed was a flurry of emails full of great advice. Too good not to share – hence this blog. In summary:

  1. This is a varied and complex area
  2. Open access is bigger than mandates
  3. Things change fast in scholarly communication
  4. Don’t panic
  5. Work with your academic colleagues
  6. The OA community is strong and supportive

Top tips for someone just starting in Scholarly Communication

1. This is a varied and complex area

It’s complicated! Terminology, changing guidance and policies, publisher’s rules… everything is complicated and it takes time to learn it all.

You will experience A LOT of frustration (with publishers, financial constraints, lack of policy alignment, issues with interoperability) but there will be moments when it all comes together and you realise you have made a difference to someone and it is all worthwhile.

You’re not mad for wondering why open access policies/dates etc. are not easily found…

How varied and exciting the role is, with requirements (and opportunities) to develop expertise in diverse areas: communication/advocacy, copyright, systems, researcher training, project and team management, budget management…to name but a few.

To remember that this is an industry we have not traditionally been involved in, that it is a constantly changing landscape, that the community is incredibly supportive and endlessly useful, that Sherpa Romeo is still vital, that publishers really vary in their responses to open access – from behemoths to start-ups, and that everyone should back the collaborative effort behind the Scholarly Communications Licence!

2. Open access is bigger than mandates

Remember the bigger picture – open access/open research should not be about compliance; don’t allow yourself to become jaded.

Remember that it is not all just about compliance (the REF). Yes, it is concentrating researchers’ minds wonderfully at the moment but Open Access/scholarly communications should be about selling the benefits – the carrot not the stick.

Efface mandates & policy when possible – while the REF (along with funder and institutional) mandates are powerful driving forces, some people are not motivated by them, and OA and Open Science are bigger and better than any mandates.

It’s not all about compliance…

It’s not all about the REF.

3. Things change fast in scholarly communication

It’s not finished yet – we’re still building it and nothing is set in stone, so what do you think?

My advice is be adaptable – change is good. This field is rapidly evolving which demands that you remain flexible. What was true yesterday may not be applicable tomorrow.

It is a fluid constantly-changing field to be involved in and it will continue to evolve, so enthusiasm (or nosiness) and an enquiring mind helps

Identify ways to keep up-to-date as it is a rapidly evolving area and it’s impossible to keep on top of everything

Keep the big picture alive alongside the ‘how-to’, operational aspects. Reflect this in your communications.

Don’t be afraid to say you don’t know something – a lot of things in this area are based on interpretation of policies etc

Stay passionate (even when the details are dragging you down).

There is a lot more to it than meets the eye – and that is what is appealing – variety and challenge.

Don’t be afraid to try and change things.

4. Don’t panic!

Open Access Emergencies are very rare. If you’re sent a takedown notice, hide the record immediately and then think about what to do (I’ve had two in something like 6 years, they’re pretty rare). Other than that, very few things are actually urgent and you can afford to spend a bit of time thinking about them.

You’re not going to get everything right – mistakes can be made and for the most part easily rectified (in my position at least!)

Don’t worry about asking questions – Green? Gold? Need some context? Get some context!

5. Work with your academic colleagues

Recognise that some of your best allies will be researchers, although they will often be silent partners working away in the background. It’s easy to moan that they always get it wrong, but no amount of lecturing about policies will ever be as effective as a casual conversation between two researchers over lunch. Catalysing those discussions is what we should be aiming for.

Your academics do not care about the vagaries of policy and probably weren’t listening when you told them. Keep the message very simple. If a specific funder is more complicated you may be best off targeting those authors directly with an additional message that explains the difference.

Take time to understand the daily and yearly calendar of academic staff to better understand their pressures.

Engage academics in conversations – for me that is the most interesting and rewarding part of the role.

Be confident, you know what you’re doing. And if you don’t? Find out – you’ve checked the embargo/copyright regardless of what the academic might want you to do!

Customer focus is important – support rather than appear to police (even though we might be doing a bit of policing).

You have to remember that even if you are relatively new, that you will probably know more than the academics/researchers themselves, so don’t panic when you don’t know/understand something they ask/request. They are usually fine with the standard “I’ll get back to you….” to give you time to find out. Plus, a lot of them are happy that you are dealing with it so they don’t have to.

6. The OA community is strong and supportive

It takes time to build knowledge, so build your networks.

Make use of your colleagues’ expertise – it’s ok not to know everything about everything and you’ll become a stronger team.

Engage on Twitter – it’s where I find a lot of useful resources, updates and share ideas.

Join UKCORR (but I would say that).

You are part of a community that works together – UKCORR is a great platform for discussion, keeping up with news (eg the release of multiple REF2021 related guidance papers within a few days of each other) and finding out the answers to questions.

Network as much as you can; UKCORR is a fantastic community.

Use the support networks that are available – Colleagues/Local Groups/UKCoRR/ARMA – people are genuinely helpful and supportive and repetition of questions does not offend.

Join the Open Access Tracking Project or at least subscribe to notifications. I read the email digest every morning, there is always plenty going on.

7. General advice

The validation queue will very rarely reach zero. Your academics are publishing all the time. Don’t try to get the queue to zero, for that way madness lies. Instead set a time period (e.g. 2 weeks) and aim to have nothing take longer than that to validate. Don’t worry if this slips a bit during the busy times.

Don’t be intimidated by copyright – get expert advice when you need it, but most re-use & sharing rights are written down somewhere (in the agreement to publish, or in a publisher’s pages).

Don’t forget the Arts & Humanities – much of the lingo (& policy) in OA, e.g. “pre-print”, PubMed/EPMC deposits, etc. comes from the STEM side of the Two Cultures, and the Humanities tradition can be slightly different (for one thing, more publishing in books).

I’m also happy to admit that I was rather overwhelmed by acronyms and abbreviations. It took me an age to figure out that CRIS was Current Research Information System. Don’t be afraid to stop someone if they’re using a term that you don’t know.

Learn a little bit about code and the underpinnings of your platform so you can communicate more effectively with developers.

If you have the opportunity to learn how the technical infrastructure works, eg coding, APIs, go for it. This is on my wish list – so often I can’t tell if a development/improvement hasn’t happened because it’s technically not possible or if it’s for other reasons.

Published 20 August 2018
Compiled by Dr Danny Kingsley from responses amongst the UKCORR community
Creative Commons License

‘No free labor’ – we agree.

[NOTE: The introductory sentence to this blog was changed on 27 June to provide clarification]

Last week members of the University of California* released a Call to Action to ‘Champion change in journal negotiations’ which references the April 2018 Declaration of Rights and Principles to Transform Scholarly Communication.  This states as one of the 18 principles:

“No free labor. Publishers shall provide our Institution with data on peer review and editorial contributions by our authors in support of journals, and such contributions shall be taken into account when determining the cost of our subscriptions or OA fees for our authors.”

Well, this is interesting. At Cambridge we have been trying to look at this specific issue since late last year.

The project

Our goal was to have a better understanding of the interaction between publisher and researcher. The (not very imaginatively named) Data Gathering Project is a project to support the decision making of the Journal Coordination Scheme in relation to subscription to, and use of, academic journal literature across Cambridge.

What we have initially found is that the data is remarkably difficult to put together. Cambridge University does not use bibliometrics as a means of measuring our researchers, so we do not subscribe to SciVal, but we have access to Scopus. But Scopus does not pick up Arts and Humanities publications particularly well, so it will always be a subset of the whole.

Some information that we thought would be helpful simply isn’t. We do have an institutional Altmetric account, so we were able to pull a report from Altmetric of every paper with a Cambridge author held in that database. But Altmetric does not give a publisher view – we would have to extract this using DOI prefixes or some other system.
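
As a rough illustration of the DOI prefix idea, here is a minimal Python sketch. It assumes we already have a plain list of DOIs exported from a report, and that we maintain our own prefix-to-publisher lookup. The lookup table, the function names and the sample DOIs are all made up for illustration; a real table would be far longer, and prefixes do not map cleanly onto publishers (one publisher can hold many prefixes, and journals move between publishers).

```python
from collections import Counter

# Illustrative lookup only: a real table would need building and maintaining.
PREFIX_TO_PUBLISHER = {
    "10.1016": "Elsevier",
    "10.1002": "Wiley",
    "10.1038": "Springer Nature",
}

# Common ways a DOI is written in exported reports.
_LEADS = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_prefix(doi: str) -> str:
    """Return the registrant prefix, i.e. the part of the DOI before the first slash."""
    cleaned = doi.strip().lower()
    for lead in _LEADS:
        if cleaned.startswith(lead):
            cleaned = cleaned[len(lead):]
            break
    return cleaned.split("/", 1)[0]


def count_by_publisher(dois):
    """Count papers per publisher, grouping unrecognised prefixes as 'Unknown'."""
    counts = Counter()
    for doi in dois:
        publisher = PREFIX_TO_PUBLISHER.get(doi_prefix(doi), "Unknown")
        counts[publisher] += 1
    return counts
```

Anything that lands in the 'Unknown' bucket would still need checking by hand, which is where most of the effort is likely to go.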

Cambridge uses Symplectic Elements to record publications from which, for very complicated reasons, we are unable to obtain a list of publishers with whom we publish. As part of the subscription we have access to the new analysing product, Dimensions. However, as far as we have managed to see, Dimensions does not break down by publisher (it works at the more granular level of journal), and seems to consider anything that is in the open domain (regardless of licence) to be ‘open access’. So figures generated here come with a heavy caveat.

We are also able to access the COUNTER usage statistics for our journals with the help of  the Library eresources team. However these include downloads for backfiles and for open access articles, so the numbers are slightly inflated, making a ‘cost per download’ analysis of value against subscription cost inaccurate.

We know how much we spend on subscriptions (spoiler alert: a lot). We need to take into consideration our offsetting arrangements with some publishers – something we are taking an active look at currently anyway.

Reaching out to the publishing community

So to supplement the aggregated information we have to hand, we have reached out to those publishers our researchers publish with in significant quantities to ask them for the following data on Cambridge authors: Peer Reviewing, Publishing, Citing, Editing, and Downloading.

This is exactly what the University of California is demanding. One of the reasons we need to ask publishers for peer review information is because it is basically hidden work. Aggregating systems like Publons do help a bit, although the Cambridge count of reviewers in the system is only 492 which is only a small percentage of the whole. Publons was bought out by Clarivate Analytics (which was Thomson Reuters before this and ISI before that) a year ago. We did approach Clarivate Analytics for some data about our peer reviewing, but declined to pay the eye watering quoted fee.

What have we received?

Contrary to our assumptions, many of the publishers responded saying that this information is difficult to compile because it is held on different systems and that multiple people would need to be contacted. Sometimes this is because publishers are responsible for the publication of learned society journals so information is not stored centrally.  They also fed back that much of the data is not readily available in a digestible format. 

Some publishers have responded with data on Cambridge peer reviewers and editors, usage statistics, and citation information. A big thank you to Emerald, SAGE, Wiley, the Royal Society and eLife. We are in active correspondence with Hindawi and PLOS. [STOP PRESS: SpringerNature provided their data 30 minutes after this blog went live, so thanks to them as well].

However, a number of publishers have not responded to our requests and one in particular would like to have a meeting with us before releasing any information.

Findings so far

The brief for the project was to ‘understand how our researchers interact with the literature’.  While we wrote the brief ourselves, we have come to realise it is actually very vague. We have tried to gather any data we can to start answering this question.

What the data we have so far is helping us understand is how much is being spent on APCs outside the central management of the Office of Scholarly Communication (OSC). The OSC manages the block grants from the RCUK (now UKRI) and the Charities Open Access Fund, but does not look after payments for open access for research funded by, say, the Bill and Melinda Gates Foundation or the NIH. This means that there is a not insignificant amount of extra expenditure on top of that coordinated by the OSC. These amounts are extremely difficult to ascertain as observed in 2014.

We already collect and report on how much the Office of Scholarly Communication has spent on APCs since 2013. However some prepayment deals make the data difficult to analyse because of the way the information is presented to us. For example, Cambridge began using the Wiley Dashboard in the middle of the year with the first claim against it on 6 July 2016, so information after that date is fuzzy.

The other issue with comparing how much a publisher has received in APCs and how much the OSC has paid (to determine the difference) is dates. We have already talked at length about date problems in this space. But here the issue is that publisher-provided numbers are based on calendar years. Our reporting years differ – RCUK reports from April to March and COAF from October to September, so pulling this information together is difficult.
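
To make the mismatch concrete, here is a small Python sketch that labels a single payment date under each of the three reporting conventions. The function name and the "2016/17" style labels are our own invention for illustration, not something used in the funder reports themselves.

```python
from datetime import date


def reporting_year(d: date, start_month: int) -> str:
    """Label the reporting year containing d, for a year beginning in start_month.

    start_month=1 gives a calendar year (as publishers report),
    start_month=4 an April to March year (RCUK),
    start_month=10 an October to September year (COAF).
    """
    if start_month == 1:
        return str(d.year)
    # A date before the start month belongs to the year that began the previous autumn/spring.
    start = d.year if d.month >= start_month else d.year - 1
    return f"{start}/{str(start + 1)[-2:]}"


payment = date(2016, 7, 6)
labels = {
    "calendar": reporting_year(payment, 1),
    "rcuk": reporting_year(payment, 4),
    "coaf": reporting_year(payment, 10),
}
```

The same payment falls into three differently bounded years, so any comparison with publisher figures really needs payment-level dates rather than annual totals.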

Our current approach to understanding the complete expenditure on APCs, apart from analysing the data being provided by (some) publishers, is to establish all of the suppliers to whom the OSC has paid an APC and obtain the supplier number. This list of supplier numbers can then be run against the whole University to identify payments outside the OSC.
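In outline, the matching step could look something like the sketch below. It is illustrative only: the record layout, field names (`supplier_no`, `dept`, `amount`) and values are all hypothetical and do not reflect the University's finance system.

```python
# Hypothetical records: field names and values are invented for illustration.
osc_payments = [
    {"supplier_no": "S001", "amount": 1500.00},
    {"supplier_no": "S002", "amount": 2000.00},
]

university_payments = [
    {"supplier_no": "S001", "dept": "OSC", "amount": 1500.00},
    {"supplier_no": "S001", "dept": "Chemistry", "amount": 1800.00},
    {"supplier_no": "S003", "dept": "Physics", "amount": 900.00},
    {"supplier_no": "S002", "dept": "Zoology", "amount": 2200.00},
]


def payments_outside_osc(osc, university, osc_dept="OSC"):
    """Return University payments to OSC-known suppliers made by other departments."""
    # Collect every supplier number to which the OSC has paid an APC.
    suppliers = {p["supplier_no"] for p in osc}
    # Keep payments to those suppliers that did not come from the OSC itself.
    return [
        p for p in university
        if p["supplier_no"] in suppliers and p["dept"] != osc_dept
    ]


outside = payments_outside_osc(osc_payments, university_payments)
total_outside = sum(p["amount"] for p in outside)
```

A match on supplier number only shows that money went to a publisher, not that it was an APC, so any list produced this way would still need checking by hand.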

This project is far from straightforward. Every dataset we have will require some enhancement. We have published a short sister post on what we have learned so far about organising data for analysis. But we are hoping over the next couple of months to start getting a much clearer idea of what Cambridge is contributing into the system – in terms of papers, peer review and editorial work in addition to our subscriptions and APCs. We need more evidence-based decision making for negotiation.

Footnote

* There has been some discussion in listservs about who is behind the Call to Action and the Declaration. Thanks to Jeff MacKie-Mason, University Librarian and Professor, School of Information and Professor of Economics at UC Berkeley, we are happy to clarify:

  • The Declaration is by the faculty senate’s library committee – University Committee on Library and Scholarly Communication (UCOLASC)
  • The Call to Action is by the University of California’s Systemwide Library and Scholarly Information Advisory Committee, UCOLASC, and the UC Council of University Librarians, who: “seek to engage the entire UC academic community, and indeed all stakeholders in the scholarly communication enterprise, in this journey of transformation”.

Published 26 June 2018 (amended 27 June 2018)
Written by Dr Danny Kingsley & Katie Hughes
Creative Commons License

Observations on a data gathering project

The Office of Scholarly Communication provides information, advice and training on research data management. So when faced with running a research project that involves a considerable amount of data, it is telling to see if we can practise what we preach.

This blog post is a short list of how we have approached managing data for analysis. Judging by our colleagues’ faces when we described some of the advice here, this is blindingly obvious to some people. But it was news to us, so we are sharing it in case it is helpful to others.

Organising and storing the data

As is good practice, we have started with a Data Management Plan. Actually we ended up having to write two, one for the qualitative and one for the quantitative aspect of the project.

We have also had to think through where the data is being stored and backed up. All of the collected data is currently being stored on a shared Cambridge University Google Drive where only invited users with a Cambridge University email address can view the data. This is because it can handle Level 2 confidential information and is accessible on and off campus. Some of the data is confidential and publishers have asked us to keep it private.

The data is also stored on a staff member’s laptop computer in her Documents folder (the laptop is password protected) that is backed up by the Library on a daily basis. There is a second storage place on the Office of Scholarly Communication’s (OSC) Shared Drive to ensure that there are two backups in two different locations.

One dataset has proven difficult to use as it is 48MB and Google Drive does not seem to be able to handle that file size well.

Each dataset was renamed with the file naming syntax that the OSC uses. This includes a three letter prefix at the beginning (e.g. RAW for raw data), a short description, then a version, and finally the date that the data was received. Underscores separate each section and there are no spaces. An example is MEM_JCSBlogData_V1_20180618.docx
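A naming convention like this is easy to check automatically. The sketch below is an assumption-laden illustration, not an OSC tool: it infers from the example above that the prefix is three capital letters, the description has no underscores or spaces, the version is "V" plus a number, and the date is written YYYYMMDD. The function names are made up.

```python
import re
from datetime import datetime

# Pattern inferred from the example MEM_JCSBlogData_V1_20180618.docx (an assumption).
NAME_PATTERN = re.compile(
    r"^(?P<prefix>[A-Z]{3})_"
    r"(?P<description>[A-Za-z0-9]+)_"
    r"V(?P<version>\d+)_"
    r"(?P<date>\d{8})"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_name(filename):
    """Split a file name into its parts, or return None if it breaks the convention."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        # Reject impossible dates such as 30 February.
        datetime.strptime(parts["date"], "%Y%m%d")
    except ValueError:
        return None
    return parts


def next_version(filename):
    """Build the name for a re-formatted copy: same parts, version bumped by one."""
    parts = parse_name(filename)
    if parts is None:
        raise ValueError(f"Not a valid file name: {filename!r}")
    new_version = int(parts["version"]) + 1
    return (
        f"{parts['prefix']}_{parts['description']}_"
        f"V{new_version}_{parts['date']}.{parts['ext']}"
    )


example = parse_name("MEM_JCSBlogData_V1_20180618.docx")
bumped = next_version("MEM_JCSBlogData_V1_20180618.docx")
```

Because the date records when the data was received, the sketch leaves it unchanged when the version number goes up.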

To organise and summarise the metadata, we have created two spreadsheets. One is a logbook that records the name of the file, a description of the data, size of the file, if it is confidential, and what years it covers. The second spreadsheet records what information each dataset covered, i.e. Peer Review, Editing, Citing, APCs, and Usage. The spreadsheet also records correspondence with the publishers.

Assessing our data

At first glance, we were unsure whether we could do cross comparisons between publishers with the data that we had collected. Although most datasets were provided in Excel (with the exception of the Springer 2017 report on gold open access and eLife), they were formatted differently and covered different areas.

Dr Laurent Gatto, one of Cambridge’s Data Champions, very kindly agreed to meet with us and look over the data that we had collected so far. He suggested a number of ways that we could clean up the data so that we could do some cross comparison analysis. Somewhat to our surprise he was generally positive about the quality and analysability of the data that we had collected.

Cleaning up data for analysis

After an initial look at the data, Laurent gave us some excellent suggestions on managing and analysing the data. These are summarised below:

  • Have a separate folder where the original datasets will be saved. These files will remain untouched.
  • When doing any re-formatting, a new file will be created using the same naming convention, but updating the version. A record of any changes to the dataset will need to be recorded in a spreadsheet.
  • Ensure that all of the headers are uniform across the different spreadsheets, to allow analysis across datasets. Each header must be the same down to the last lowercase letter and cannot include any spaces.
  • Dates must also be uniform using Year-Month-Day format.
  • Only the first row of a spreadsheet can include the header. Having more than one row with header information will cause problems when you are starting to code.
  • Create a readme file where every header will be recorded with a short description.
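
Several of these suggestions can be sketched in a few lines of code. The example below is a rough illustration under stated assumptions, not the script used on the project: the sample headers and values are invented, replacing spaces with underscores is one possible choice, and incoming dates are assumed to be either Year-Month-Day already or UK-style day/month/year.

```python
import re
from datetime import datetime

# Assumed incoming date formats: ISO already, or UK day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalise_header(header):
    """Lowercase a header and replace spaces and punctuation with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", header.strip().lower()).strip("_")


def to_iso_date(value):
    """Convert a date string in one of the assumed formats to Year-Month-Day."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date: {value!r}")


def clean_rows(rows, date_columns):
    """Return new records with uniform headers and dates; the input is not modified."""
    header, *body = rows  # only the first row is treated as the header
    header = [normalise_header(h) for h in header]
    cleaned = []
    for row in body:
        record = dict(zip(header, row))
        for column in date_columns:
            record[column] = to_iso_date(record[column])
        cleaned.append(record)
    return cleaned


# Invented sample data standing in for one publisher's spreadsheet.
raw = [
    ["Article Title", "Date Accepted ", "APC (GBP)"],
    ["Example paper", "06/07/2016", "1500"],
]
cleaned = clean_rows(raw, date_columns=["date_accepted"])
```

The function builds new records rather than editing the rows it is given, in keeping with the advice to leave the original datasets untouched.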

Next steps

After speaking with Laurent we are more optimistic about the data that we have collected than we were before. We were concerned that there was not enough information to do analysis across publishers; however, we are more confident that this is not the case. As we start the analysis it will also give us a better understanding of what data is missing.

We will provide an update as we close in on our findings.

Published 26 June 2018
Written by Katie Hughes & Dr Danny Kingsley
Creative Commons License

Compliance is not the whole story

Today, Research England released Monitoring sector progress towards compliance with funder open access policies, the results of a survey they ran in August last year in conjunction with RCUK, Wellcome Trust and Jisc.

Cambridge University was one of the 113 institutions that answered a significant number of questions about how we were managing compliance with various open access policies, what systems we were using and our decision making processes. Reading the collective responses has been illuminating.

The rather celebratory commentary from UKRI has focused on the compliance aspect – see Research England’s press release, Over 80% of research outputs meet requirements of REF 2021 open access policy, and the post by the Executive Chair of Research England, David Sweeney, Open access – are we almost there for REF?

What’s it all about?

At the risk of putting a dampener on the party I’d like to point a few things out. For a start, compliance with a policy is not the end goal of a policy in itself. While clearly the UK policies over the past five years have increased the amount of UK research that is available open access, we do need to ask ourselves ‘so what?’.

What we are not measuring, or indeed even discussing, is the reason why we are doing this.

While the open access policies of other funders such as Wellcome Trust and Bill and Melinda Gates Foundation articulate the end goal: “foster a richer research culture” in the former and “information sharing and transparency” in the latter, the REF2021 policy is surprisingly perfunctory. It simply states: “certain research outputs should be made open-access to be eligible for submission to the next Research Excellence Framework”.

It would be enormously helpful to those responsible for ‘selling’ the idea to our research community if there were some evidence to demonstrate the value in what we are all doing. A stick only goes so far.

It’s really hard, people

Part of the reason why we are having so much difficulty selling the idea to both our research community and the administration of the University is because open access compliance is expensive and complicated, as this survey amply demonstrates.

While there may have been an idea that requiring the research community to provide their work on acceptance would mean they would become more aware of and engaged with Open Access, it seems this has not been achieved. Given that 71% of HEIs reported that AAMs are deposited by a member of staff from professional services, it is safe to say the past six years since the Finch Report have not significantly changed author behaviour.

With 335 staff at 1.0FTE recorded as “directly engaged in supporting and implementing OA at their institution”, it is clear that compliance is a highly resource hungry endeavour. This is driving the decision making at institutional level. While “the intent of funders’ OA policies is to make as many outputs freely available as possible”, institutions are focusing on the outputs that are likely to be chosen for the REF (as opposed to making everything available).

I suspect this is ideology meeting pragmatism. Not only can institutions not support the overall openness agenda, these policies seem to be further underlining the limited reward systems we currently use in academia.

The infrastructure problem

The first conclusion of the report was that “systems which support and implement OA are largely manual, resource-intensive processes”. The report notes that compliance checking tools are inadequate partly because of the complexity of funder policies and the labyrinth that is publisher embargo policies. It goes on to say the findings “demonstrate the need for CRIS systems, and other compliance tools used by institutions be reviewed and updated”.

This may be the case, but buried in that suggestion is years of work and considerable cost. We know from experience. It has taken us at Cambridge 2.5 years and a very significant investment to link our CRIS system (Symplectic Elements) to our DSpace repository Apollo. And we are still not there in terms of being able to provide meaningful reports to our departments.

Who is paying for all of this?

When we say ‘open’…

The report touches on what is a serious problem in the process. Because we are obtaining works at time of acceptance (an aspect of the policy Cambridge supports), and embargo periods cannot be set until the date of publication is known, there is a significant body of material languishing under indefinite embargoes waiting to be manually checked and updated.

The report notes that ‘there is no clear preference…as to how AAMs are augmented or replaced in repositories following the release of later versions’. Given the lack of any automated way of checking this information the problem is unmanageable without huge human intervention.

At Cambridge we offer a ‘Request a Copy’ service which at least makes the works accessible, but this is an already out of control situation that is compounding as time progresses.

Solutions?

We really need to focus on sector solutions rather than each institution investing independently. Indeed, the second last conclusion is that “the survey has demonstrated the need for publishers, funders and research institutions to work towards reducing burdensome manual processes”. One such solution, which has a sole mention in the report, is the UK Scholarly Communication Licence as a way of managing the host of licences.

Right at the end of the report in the second last point something very true to my heart was mentioned: “Finally, respondents highlighted the need for training and skills at an institutional level to ensure that staff are kept up to date with resources and tools associated with OA processes.” Well, yes. This is something we have been trying to address at a sector level, and the solutions are not yet obvious.

This report is an excellent snapshot and will allow institutions such as ours some level of benchmarking. But it does highlight that we have a long way to go.

Published 14 June 2018
Written by Dr Danny Kingsley
Creative Commons License

What’s new in OA?

The world of Open Access moves fast and it can be difficult to keep up. We run regular updates for our community here at Cambridge and following a recent webinar, figured a blog about it might be a good idea too. Strap yourselves in, this is a bumpy ride.

Sweden draws the line

After a breakdown in negotiations, the Bibsam Consortium in Sweden cancelled the agreement with Elsevier on 16 May. It is anticipated that after 1 July 2018, Swedish universities will not have access to new articles in Elsevier’s journals. Articles published before this date will remain accessible.

In his blog, The circuitous road towards open access: Swedish universities to pull the plug on Elsevier, Ole Petter Ottersen, Rector of Karolinska Institute in Sweden, noted: “Almost 600 years ago the development of the printing press led to dramatic changes in how knowledge was spread and communicated. This did not happen without opposition. Today digitalization opens for an equally dramatic and welcome change towards the democratization of knowledge. … It’s time that knowledge becomes a public good.”

Europe no-deals

Sweden is following a growing European trend in relation to pulling out of publishing deals.

On 30 March this year, the French national consortium representing 250 academic institutions, Couperin.org, cancelled subscriptions to SpringerNature journals. Despite expectations that the publisher would cut access, Springer is maintaining access to journals for French institutions while discussions continue.

Two weeks earlier on 12 March, the Dutch consortium VSNU announced that “Dutch universities and Royal Society of Chemistry Publishing (RSC) have been unable to reach a new agreement on access to scientific journals”. In anticipation of losing access to the material, the VSNU has advised that researchers use alternative ways to access materials including Unpaywall, Open Access button and requesting a copy from the author or from the library. At the end of the list ‘if all else fails’ they suggest using Sci-Hub, noting that “the use of Sci-Hub is considered as an illegal act”.

There were concerns that German researchers would lose access to Elsevier materials from January 2017 after negotiations broke down and subscriptions stopped being paid at the end of 2016.  But in January this year, Nature reported that German universities still had access to Elsevier journals as discussions continued.  It has been estimated that across the country, libraries are saving more than €10 million (£8.7 million) a year as a result of cancelling the subscriptions.

While not exactly ‘new news’ it is worth mentioning here that in October 2016, the French Law for a Digital Republic Act came into force, including Article 30, which is about Open Access and creates a legal right for authors to archive an OA copy, even if they have granted an exclusive right to publish.

Springer sinks

On May 9, Springer Nature was due to be listed on the German midcap index, offering 1.6 billion Euros in shares. However the day before, the float was cancelled due to ‘weak demand’.

An analysis of the prospectus recently published in the Times Higher Education has identified plans for Springer Nature to link the cost of Article Processing Charges for open access with a journal’s Impact Factor. This is interesting to say the least and indicates a move by large publishers to consolidate payments for open access into an effective new type of Big Deal. This approach also further cements the current flawed academic reward system despite Springer recently signing the Declaration on Research Assessment (DORA). The messages the academic/library community are being given are in stark contrast to the messages that were being sent to potential investors.

ResearchGate shenanigans

ResearchGate is an academic social networking site upon which researchers post copies of their works, often against copyright agreements they have signed with their publishers. In October last year, Elsevier and the American Chemical Society filed a lawsuit in Germany against ResearchGate, alleging copyright infringement on a mass scale. In November, ResearchGate restricted access to 1.7 million papers on their site. The court case began on 18 April in Germany with the intention to: “establish clarity on the legal responsibility of ResearchGate regarding copyright infringements”.

Not all publishers have agreed with this combative approach to ResearchGate. A day after the court case began, an agreement between Springer, Cambridge University Press, Thieme and ResearchGate was announced. The agreement is to work together on the sharing of articles on the scholarly collaboration platform “in a way that protects the rights of authors and publishers”.

All together now

The 1 April marked several important happenings in the open access space. The former Research Councils UK and Higher Education Funding Council for England (HEFCE) merged under the single banner of UK Research and Innovation (UKRI), with the latter being rebranded as Research England. Apart from the intensely irritating breaking of every link to RCUK webpages, this is a very positive step. Even before starting operations, the organisation was flagging a review of open access, including questioning whether it would continue to support the payment of hybrid article processing charges.

The 1 April also marked the first time HEFCE/Research England was implementing the ‘three months from acceptance’ rule for compliance of works for the Research Excellence Framework (REF). This timeframe for depositing works and making them open access was in the original policy, but in the first two years of operation of the policy, HEFCE relented to pressure from institutions concerned they were not prepared and amended the policy to ‘three months from publication’. This has been a tricky balancing act for those working in institutions (such as the Office of Scholarly Communication), with many opting not to inform their research community of the extended time frame from 1 April 2016 because of concerns about confusion. It is a relief to have the policy operating as originally written.

Speaking of REF, in March 2017, HEFCE conducted a consultation on the REF. The initial outcomes were made available in September last year. The guidance for REF 2021 is not far from being released for feedback, and signs are that there might be some movement on the question of the eligibility of arXiv as a repository. Those interested in this issue might find “ArXiv and the REF open access policy” by yours truly and Katie Shamash worth a read; it presents the case that articles deposited to arXiv are, in general, compliant with the requirements of the HEFCE policy.

Wellcome Trust consultation

Not to be left out, the Wellcome Trust is coming to the end of its consultation on its open access policy. Wellcome Trust, along with the National Institutes of Health, led the world by implementing open access requirements in 2005. This is the first wide scale review of the policy and the report from the review will be released by the end of 2018.

Questions being asked of funders, publishers and institutions have focused on the hybrid question. There has also been discussion of the merits or otherwise of the Wellcome Trust centralising the negotiation with publishers and managing the block grants centrally. Some respondents have made their responses public already, such as SCONUL.

Responsible metrics?

As we blogged about recently, a considerable amount of open access activity is tied into reproducibility issues in research. Universities UK is involved in a Forum for Responsible Research Metrics in conjunction with HEFCE to address these questions. At an event in February this year: ‘The turning tide: A new culture of responsible metrics for research’ researchers spoke about the impact of metrics on their careers and health. Spoiler alert: it is not good.

The first half of last year saw a spate of organisations signing the San Francisco Declaration on Research Assessment (DORA) including Nature (April 2017), Imperial College London and Birkbeck University of London (both in Feb 2017). However, University College London gets the prize, having signed up in 2015. In February 2018 all seven UK research councils signed up to DORA.

Early this year DORA revamped its steering committee. In addition funders (ASCB, Cancer Research UK, the European Molecular Biology Organization and Wellcome Trust) and publishers (the Company of Biologists, eLife, F1000, Hindawi and PLOS) invested in DORA to allow the hiring of a full-time community manager.

Data monetisation

There has been an increase in the offerings by publishers to manage data. Mendeley (owned by Elsevier) has long been in this space, but in 2018 Mendeley Data announced a ‘comprehensive research data platform for institutions’ and ‘superior data management for researchers’. Note that under the ‘Data Linking’ heading it only offers to link “datasets in repositories with research articles on ScienceDirect”, which suggests limited value of the ‘service’.

Elsevier is experimenting with ways to monetise freely available data. DataSearch allows people to: “Search for research data across domains and types, from many domain-specific, cross-domain and institutional data repositories”. The FAQs list the repositories that are indexed, including Cambridge’s very own Apollo. Because our metadata is available under a CC0 license there is nothing we can do. The FAQs also state: “At the moment, DataSearch is not a commercial product.” There is no guarantee of course that this will remain the status quo.

But Elsevier is not the only company moving into this space. Since 22 March, Springer Nature have been offering ‘Research Data Support’ which for £265 + VAT will deposit up to 50GB of data into figshare – a commercial repository owned by Digital Science, which shares a parent company with Springer Nature. The companies insist they are entirely separate organisations.

Ecosystem takeover

If this is all starting to sound a little incestuous, then you are on the right track. As I am arguing in an upcoming Group Editorial for the Journal of Librarianship and Scholarly Communication:

There has been a redirection of business strategy by some academic publishing companies to develop portfolios that address the entire research process. Rather than adjusting workflows and internal processes, several companies are moving away from publishing into scholarly infrastructure: the tools and services that underpin the scholarly research life cycle, many of which are geared toward data analytics. This has been effected through an aggressive acquisition program in the case of Elsevier, and through the development of new products in the case of Digital Science. In both cases, the individual products across the portfolio retain their own distinctive branding.

Possibly the most dramatic way to illustrate the extent of the situation is a graphic showing where Elsevier-owned products sit throughout the research lifecycle, appearing in Rent Seeking and Financialization strategies of the Academic Publishing Industry – Publishers are increasingly in control of scholarly infrastructure and why we should care- A Case Study of Elsevier.

This situation requires vigilance. Infrastructure is the next big battleground.

Stay up to date

Remember, the Office of Scholarly Communication tries to make our work as accessible as possible to all. In addition to this blog we have a sister blog called Open Research: Adventures from the frontline.

We publish two monthly newsletters – KaleidOSCope is focused on scholarly communication more broadly and the Research Data Newsletter keeps people up to date on data issues and opportunities.

Many of our presentations are filmed and uploaded to our YouTube channel – and there is a list of our recordings of past events including all the presentations from our TDM Symposium, our Open Access Week ‘getting published’ events and Engaging Researchers in Good Data Management.

Our presentations are freely available from Apollo, as are the slides from our training sessions. Our Twitter feeds are very popular: Cambridge Open Access @CamOpenAccess and Cambridge Research Data Management @CamOpenData.

You have no excuse!

Published 4 June 2018
Written by Dr Danny Kingsley
Creative Commons License