“Become part of the research process” – observations from RLUK2017

When is a librarian not a librarian? Rather than a bad joke, this was one of the interesting underlying discussions arising from the 2017 RLUK conference held earlier in March. The conference Twitter hashtag was #rluk17 and the videos are now available. The answer, it appears, is when we start talking about partnerships with, rather than support of, our research community.

As always with my write-ups of conferences, these are simply the parts that have resonated with me, and the impression I walked away with. This write up will be very different from anyone else’s from the conference, such as this blog from Lesley Pitman, and the RLUK conference report.

I have also written a sister blog describing the workshop I co-presented on the topic of Text and Data Mining.

Libraries’ role in research

The role of libraries and the people who work in them was the theme of one session – with arguments that libraries should be central to the research process.

Masud Khokhar, the Head of Digital Innovation and Research Services at Lancaster University, gave a talk on the Role of research libraries in a technological future. He said we need to get out of the culture of researchers only coming to the library with research outputs/outcomes. Language matters, he said. Lancaster University has made a deliberate decision not to use the word ‘support’, because “we have bigger aims than that”. Partnership is the future for libraries rather than just collaboration. We need to be creative co-developers working with the research community if we are to be a research library.

We need to generate a culture of experimentation: “Be creative, experiment fast, succeed or fail fast and learn from both”. It is a good challenge for us librarians to be more creative and less passive. We should embed the library in research questions and processes.

The issue of how we present information to our clients came up, with Khokhar saying consistency when searching should no longer be important – we should depend on the context of the searcher. “Content might be king, but context is the kingdom”, he said.

He also showed evidence of how data visualisation can lead to greater downloads of data, and it may be even more important to data use than good metadata. Indeed, Lancaster University Library has allowed 10TB of server space for analytics of library data alone, because this is a growing and important area to drive decision making.

This perspective was also put forward by Patrick McCann from the University of St Andrews Library. He talked about the new role of Research Software Engineers, who work with the research community to develop research solutions and research outputs. St Andrews has a senior librarian for digital humanities and research computing. He noted: “we are part of the research process”.

A comment was made during the conference that many speakers had identified themselves as ‘not a librarian’. There was a call for us to open the idea of what a librarian is. Masud Khokhar suggested he would consider himself to be an ‘honorary’ librarian.

But the ‘librarian or not’ debate raises an interesting question. William Nixon from the University of Glasgow noted that their Research Data Management team are not librarians, saying “it is a skill set in itself”. Khokhar argued that we need to develop digital leaders for libraries. Are these people already in libraries who we train up, or are they people with these skill sets we bring in and introduce to library culture?

Libraries’ role in the Open Science agenda

Libraries are the central pivot point for the move to open research across the world, was the message from presentations about activities in Europe and Canada. This fits with the narrative that libraries should be driving the agenda rather than reacting to it.

Susan Reilly, the outgoing Executive Director of LIBER talked about re-imagining the library space in the context of open science as she presented the LIBER 2020 vision.

Open Science (a term used in Europe for ‘open research’) is on the European agenda: every single member state has signed up to develop the necessary skills and to the development of the open science cloud. There has been an 80 million Euro investment in this. Given LIBER is a group of libraries with a common mission to enable world-class research, the question is whether LIBER should make the whole strategy about open science.

Reilly noted that libraries have been ‘bold’ on open science for years but have been held back by faculty and publishers. She argued we must be resilient on this agenda. Libraries need to be taking a leadership role in all research. “Libraries need to get into the researchers’ lifecycle”, she argued. They should provide tools throughout the research lifecycle to ensure ‘open science’. To achieve this, we need digital skills, which underpin a more open and transparent research lifecycle.

The end goal, said Reilly, is world-class research, and open science supports that by facilitating collaboration and ensuring the sustainability of research. The 2020 vision is: “Libraries powering sustainable knowledge in the digital age”.

The proposal is that by 2022, open access will be the predominant form of publishing and research data will be Findable, Accessible, Interoperable and Reusable (FAIR). Reilly noted that it is research data management “where we get the most pushback” – an experience reflected in many other institutions.

Libraries can provide platforms for innovative scholarly communications. They can facilitate open access to research publications, with services ranging from payment of APCs to becoming a publisher. Libraries also offer research data management, innovative metrics and innovative peer review.

This is an opportunity for libraries to disrupt the scholarly communications system. In order for us to achieve this goal, we need research skills that underpin a more open and transparent research lifecycle – and so we need to equip researchers to do this.

Reilly noted that when LIBER went out to stakeholders – “they bought into the vision”. To achieve these goals, Reilly said it is important for libraries to have a strong relationship with institutional leadership. There needs to be transparency around the cost of publications.

We need to work on diversifying librarians’ skills and research skills. This is a matter of ‘compete or fail’ or Elsevier could take over what libraries do. We need to get into the research workflow.

LIBER’s outcomes from their consultation with stakeholders were:

  • Importance of libraries having a strong relationship with institutional leadership
  • Transparency around the cost of publications
  • Working on diversifying librarians’ skills AND researchers’ skills
  • Be clear about what the role of libraries is/should be
  • Compete or fail
  • Get into the research workflow
  • Opportunity for libraries to disrupt scholarly communications system

It was interesting (for me) to note how similar these are to the Strategic Goals of the Office of Scholarly Communication.

The Open Scholarship theme was continued in a presentation by representatives of RLUK’s sister organisation, the Canadian Association of Research Libraries (CARL). This is a leadership organisation thinking of ways to enhance members’ capacity and leadership in this environment. Martha Whitehead, the President of CARL, and Susan Haigh, the Executive Director, presented the Canadian Roadmap for advancing Scholarly Communication.

There are issues with open access, they noted. Repositories need to improve in two major areas – we need to improve their functionality, and support and encourage the development of value added services such as peer review and tools.

There have been challenges in discussions with publishers about maximising openness which have become ‘somewhat fraught’. Libraries are working with Canadian journals to develop, assess and adopt sustainable open access funding models. The idea is that the model will be non-profit (where the money goes back in).  While it is not clear if the discussions will coalesce around anything new and bold, there is value in bringing together the communities.

The Canadians presented an initiative related to Research Data Management (RDM) called Portage. This is designed to help with RDM in the country. It has a director, and because it is an organisation with a facility, the library voice is well respected around the table. Experts are contributing their expertise to this. There is also a Federated Research Data Repository – a joint software development project with Compute Canada, and the Scholars Portal Dataverse offers data deposit and sharing at no charge to researchers.

New challenges for libraries

Torsten Reimer spoke about the new focus of the British Library on ‘everything accessible’. He discussed the implications for libraries as we move towards a more open access future. We need to change focus, he argued, with new skills and areas, and we should be working together with the research community.

Reimer asked what the role of a national library is as more material becomes openly available. Perhaps libraries need to provide infrastructure, and we should focus on preservation and adding value. Given the majority of academics use software in the context of their projects, should libraries be supporting, integrating and preserving it?

The ‘just in case’ model is no longer feasible for libraries. The British Library is looking at partnerships in content creation, research & infrastructure. Examples include plans to expose the EThOS API to allow for machine consumption of data about theses. They are also looking to replace the current “hand knitted” preservation system with a more robust, scalable and shareable solution.

Collaborate or die?

The opening keynote was by John MacColl, the University Librarian & Director of Library Services at St Andrews University (and outgoing president of RLUK). MacColl spoke about the ‘research commons’.

He referred to the ‘tragedy of the commons’ which was an argument put forward in 2003 that individuals cancelling subscriptions for the Big Deal had meant an increase of 129% in cost to access literature. Publishers are creating ‘artificial scarcity’ to the literature which means they can charge as they please. This is a ransack of the commons.

It is not just cost: these Big Deals have meant that most collections are becoming the same and we are losing access to other resources. MacColl also noted the lost need for bibliographers. But his call was that research libraries face a challenge in re-appropriating the responsibility for the preservation of key scholarly objects held on publisher servers and other vendors worldwide.

So, argued MacColl, we need to work collectively to ‘find means of getting around being held ransom by publishers’. We need a ‘post-collective Big Deal world’. This is Plan B, where we take back control, find post cancellation access, arrange document delivery and green open access.

But this is not something we can do individually. MacColl asked: “When we are doing things in our own institutions, who are we letting down by not thinking of the wider community?” We need some sort of formal governance to make that happen. The challenge is Higher Education is a very conservative world. People will not take a step unless convinced this is a sensible step to take.

We need to focus on the global – where libraries collaborate on shared bibliographic data and create a ‘collective collection’. Plan B needs to be national.

So much more

This blog has glossed over many very interesting presentations and talks. I do, however, wish to mention the last session of the event, which broadened the discussion outside of the library to the issue of ‘inclusion’ in the Higher Education sector. Libraries, as a neutral ‘safe’ place on campus, of course have a big role to play in this. As has been the case in every meeting I have attended since November last year, the double threats of Brexit and Trump have never been far from the discussion, and never more so than in the context of inclusion.

Darren Lund, a ‘middle aged white guy from Canada’ spoke very entertainingly about his work on diversity, making the point that if you have privilege you should use it to make positive change.

The final talk was a sobering walk through some research into the racial diversity of universities with plenty of data proving that universities are not as liberal as they are perceived to be by us. Statistics such as 92% of professors in the UK are white, and the fact there are only three Vice Chancellors from the black and minority ethnic community in the UK, supported Professor Kalwant Bhopal’s argument that we need to actively address the issue of inclusion.

Summary

This blog began with a fairly provocative statement – that people do not identify themselves as librarians when we start talking about partnerships with, rather than support of, our research community. This is an interesting question. Many librarians feel that their role is to support, not lead. Yet others argue that unless we do take a leading role we will become redundant.

So what is the solution? Do we widen the definition of a library? Do we widen the definition of a librarian? Or are we happy with the ‘honorary librarian’ solution? These are some of the questions that need further teasing out. One thing is sure, the landscape is changing rapidly and we need to change with it.

Debbie Hansen and Danny Kingsley attended the conference thanks to the support of the Arcadia Fund, a charitable fund of Lisbet Rausing and Peter Baldwin.

Published 30 March 2017
Written by Dr Danny Kingsley
Creative Commons License

Service Level Agreements for TDM

Librarians expect publishers to support our researchers’ rights to Text and Data Mining and not cut access off for a library if they see ‘suspicious’ activity before they establish whether it is legitimate or not. These were the conclusions of a group who met at a workshop to discuss provision of Text and Data Mining services in March. The final conclusions were:

Expectations libraries have of publishers over TDM

The workshop concluded with very different expectations to what was originally proposed. The messages to publishers that were agreed were:

  1. Don’t cut us off over TDM activity! Have a conversation with us first if you notice abnormal behaviour*
  2. If you do cut us off and it turns out to be legitimate then we expect compensation for the time we were cut off
  3. Mechanisms for TDM where certain behaviours are expected need to be built into separate licensing agreements for TDM

*And if you want to cut us off – please demonstrate there are all these illegal TDM activities happening in the UK

Workshop on TDM

The workshop “Developing a research library position statement on Text and Data Mining in the UK” was part of the recent RLUK2017 conference.  My colleagues, Dr Debbie Hansen from the Office of Scholarly Communication and Anna Vernon from Jisc, and I wanted to open up the discussion about Text and Data Mining (TDM) with our library community. We have made the slides available and they contain a summary of all the discussions held during the event. This short blog post is an analysis of that discussion.

We started the workshop with a quick analysis of who was in the room using a live survey tool called Mentimeter. Eleven participants came from research institutions – six large, four small and one from an ‘other research institution’. There were two publishers, and four people who identified as ‘other’ – which were intermediaries. Of the 19 attendees, 14 worked in a library. Only one person said they had extensive experience in TDM and four said they were TDM practitioners, but the largest group were the 14 who classified themselves as having ‘heard of TDM but have had no practical experience’.

The workshop then covered what TDM is, what the legal situation is and what publishers are currently saying about TDM. We then opened up the discussion.

Experiences of TDM for participants

In the initial discussion about experiences of the participants, a few issues were raised that would arise if libraries were to offer TDM services. Indeed, there was a question of whether this should form part of library service delivery at all. The issue is partly that this is new legislation, so currently publishers and institutions are reactive, not strategic, in relation to TDM. We agreed:

  • There is a need for clearer understanding of the licensing situation with information
  • We also need to create a mechanism of where to go for advice, both within the institution and the publisher
  • We need to develop procedures of what to do with requests – which is a policy issue 
  • Researcher behaviour is a factor – academics are not concerned by copyright.

Offering TDM is a change of role of the library – traditionally libraries have existed to preserve access to items. The group agreed we would like to be enabling this activity rather than saying “no you can’t”. There are library implications for offering support for TDM, not least that librarians are not always aware of TDM taking place within their institution. This makes it difficult to be the central point for the activity. In addition, TDM could threaten access through being cut off, so this is causing internal disquiet.

TDM activity underway in Europe & UK

We then presented to the workshop some of the activities in TDM that are happening internationally, such as the FutureTDM project. There was also a short rundown of the new copyright exception being proposed to the European Commission for research organisations carrying out research in the public interest. It would allow researchers to carry out TDM of copyright-protected content to which they have lawful access (e.g. a subscription) without prior authorisation.

ContentMine is a not-for-profit organisation that supplies open source TDM software to access and analyse documents. They are currently partnering with the Wikimedia Foundation, with a grant to develop WikiFactMine, a project aiming to make scientific data available to editors of Wikidata and Wikipedia.

The ChemDataExtractor is a tool built by the Molecular Engineering Group at the University of Cambridge. It is an open source software package that extracts chemical information from scientific documentation (e.g. text, tables). The extracted data can be used for onward analysis. There is some information in a paper in the Journal of Chemical Information and Modeling: “ChemDataExtractor: A Toolkit for Automated Extraction of Chemical Information from the Scientific Literature”.

The Manchester Institute of Biotechnology hosts the National Centre for Text Mining (NaCTeM), which works with research partners to provide text mining tools and services in the biomedical field.

The British Library had a call for applications for a PhD student placement to undertake thesis text mining on 150,000 theses held in EThOS to extract new metadata such as names of supervisors. Applications closed 20 February 2017, but according to an EThOS newsletter from March, they had received no applications for the placement. The suggestion is that “perhaps that few students have content mining skills sufficiently well developed to undertake such a challenging placement”.

The problem with supporting TDM in libraries

We proposed to the workshop group that libraries are worried about getting cut off from their subscription by publishers due to large downloads of papers through TDM activity. This is because publishers’ systems are pre-programmed to react to suspicious activity. If TDM invokes automated investigation, then this may cause an access block.
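To make the worry concrete, here is a minimal sketch in Python of how a download threshold of this kind might behave. It is an illustration under stated assumptions only: the class name `DownloadMonitor`, the limit of three downloads and the 60-second window are all hypothetical, and it does not describe any publisher's actual system.

```python
from collections import deque


class DownloadMonitor:
    """Hypothetical monitor: flags activity when downloads inside a
    sliding time window exceed a threshold. Not any real publisher's logic."""

    def __init__(self, max_downloads: int, window_seconds: float):
        self.max_downloads = max_downloads
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one download; return True if activity looks abnormal."""
        self._timestamps.append(timestamp)
        # Forget downloads that have fallen out of the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_downloads


# Illustrative values only: more than 3 downloads in 60 seconds is flagged.
monitor = DownloadMonitor(max_downloads=3, window_seconds=60)
flags = [monitor.record(t) for t in (0, 10, 20, 30, 100)]
```

In this sketch the fourth download inside the window is flagged, and the flag clears once the earlier downloads age out. A TDM script fetching thousands of papers would trip a rule like this almost immediately, which is why what happens after the flag (a block or a conversation) matters so much to libraries.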

However, universities need to maintain a support mechanism to ensure continuity of access. For this to occur we require workflows for swift resolution, fast communication and a team of communicators. This also requires education of researchers about potential issues.

We asked the group to discuss this issue – noting reasons why their organisation is not actively supporting TDM or, if it is, the main challenges they face.

Discussion about supporting TDM in libraries

The reasons put forward for not supporting TDM included practical issues such as the challenges of handling physical media and the risk of lockout.

The point was made that there was a lack of demand for the service. This is possibly because the researchers are not coming to the Library for help. There may be a lack of awareness in the IT areas that the Library can help and they may not even pass on the queries. This points to the need for internal discussion within institutions.

It was noted that there was an assumption in the discussion that the Library is at the centre of this type of activity; however, we are not joined up as organisations. The question is who is responsible for this activity? There is often no institutional view on TDM because the issues are not raised at academic level. Policy is required.

Even if researchers do come to the library, there are questions about how we can provide a service. Initially we would be responding to individual queries, but how do we scale it up?

The challenges raised included the need for libraries to ensure everyone understands the needs at the content owner level. The library, as the coordinator of this work, would need to ensure the TDM is not for commercial use and that people know their responsibilities. This means the library is potentially being intrusive on the researcher process.

Service Level Agreement proposal

The proposal we put forward to the group was that we draft a statement for a Service Level Agreement for publishers to assure us that if the library is cut off, but the activity is legal, we will be reinstated within an agreed period of time. We asked the group to discuss the issues if we were to do this.

Expectation of publishers

The discussion raised several issues libraries had experienced with publishers over TDM. One participant said the contract with a particular publisher to allow their researchers to do TDM took two years to finalise.

There was a recognition that for genuine TDM to be identified might require some sort of registry of TDM activity which might not be an administrative task all libraries want to take on. The alternative suggestion was a third party IP registry, which could avoid some of the manual work. Given that LOCKSS crawls publisher software without getting trapped, this could work in the same way with a bank of IP addresses that is secured for this purpose.
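As an illustration of the registry idea, the following Python sketch checks whether a request comes from an address range that has been set aside for TDM. Everything in it is an assumption made for the example: the function name, the registry list and the ranges themselves (reserved documentation addresses, not real institutional ones). It says nothing about how LOCKSS or any publisher platform actually works.

```python
import ipaddress

# Hypothetical registry of address ranges secured for TDM activity.
# These are reserved documentation ranges, used here purely as placeholders.
REGISTERED_TDM_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_registered_tdm_address(address: str) -> bool:
    """Return True if the address falls inside any registered TDM range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in REGISTERED_TDM_RANGES)


inside = is_registered_tdm_address("192.0.2.17")       # within the first range
outside = is_registered_tdm_address("198.51.100.200")  # beyond the /25 range
```

A publisher holding such a list could treat heavy downloading from a registered address as expected behaviour rather than as a trigger for blocking, without the library having to log each individual TDM project.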

Some solutions that publishers could help with include delivering material in different ways – not on a hard drive. The suggestion was that this could be part of a platform, with the material produced in a format that allowed TDM (at no extra cost).

Expectation of libraries

There was some distaste amongst the group for libraries to take on the responsibility for maintaining a TDM activity register. However, libraries could create a safe space for TDM, such as virtual private networks.

Licenses are the responsibility of libraries, so we are involved whether we wish to be or not. Large scale computational reading is completely different from current library provision. There are concerns that licensing via the library could be unsuitable for some institutions. This raises issues of delivery and legal responsibilities. One solution for TDM could be to record IP address ranges in licence agreements. We need to consider:

  • How do we manage the licenses we are currently signed up to?
  • How do we manage licensing into the future so we separate different uses? Should we have a separate TDM ‘bolt on’ agreement?

The Service Level Agreement (SLA) solution

The group noted that, particularly given the amount publisher licenses cost libraries, being cut off for a week or two weeks with no redress is unusual at best in a commercial environment. At minimum publishers should contact the library and give it a grace period to investigate rather than cutting it off automatically.

The basis for the conversation over the SLA includes the fact that the law is on the subscriber’s side if everyone is doing it legally. It would help to have an understanding of the extent of infringing activity going on with University networks (considering that people can ‘mask’ themselves). This would be useful for thinking of thresholds.

Next steps

We need to open up the conversation to a wider group of librarians. We are hoping that we might be able to work with RLUK and funding councils to come to an agreed set of requirements that we can have endorsed by the community and which we can then take to publishers.

Debbie Hansen and Danny Kingsley attended the RLUK conference thanks to the support of the Arcadia Fund, a charitable fund of Lisbet Rausing and Peter Baldwin.

Published 30 March 2017
Written by Dr Danny Kingsley
Creative Commons License

Where did they come from? Educational background of people in scholarly communication

Scholarly communication roles are becoming more commonplace in academic libraries around the world but who is actually filling these roles? The Office of Scholarly Communication in Cambridge recently conducted a survey to find out a bit more about who makes up the scholarly communication workforce and this blog post is the first in a series sharing the results.

The survey was advertised in October 2016 via several mailing lists targeting an audience of library staff who worked in scholarly communication. For the purposes of the survey we defined this as:

The process by which academics, scholars and researchers share and publish their research findings with the wider academic community and beyond. This includes, but is not limited to, areas such as open access and open data, copyright, institutional repositories and research data management.

In total 540 people responded to the calls for participation with 519 going on to complete the survey, indicating that the topic had relevance for many in the sector.

Working patterns

Results show that 65% of current roles in scholarly communication have been established in respondents’ organisations for less than five years, with fewer than 15% having been established for more than ten years. Given that scholarly communication is still growing as a discipline this is perhaps not a surprising result.

It should also be noted that the survey makes no distinction between those who are working in a dedicated scholarly communication role and those who may have had additional responsibilities added to a pre-existing position. These roles tend to sit within larger organisations which employ over 200 people although whether the organisation was defined as the library or wider institution was open to interpretation by respondents.

Responses showed an even spread of experience in the library and information science (LIS) sector with 22% having less than five years’ experience and 27% having more than twenty.  Since completing their education just over half of respondents have remained within LIS but given the current fluctuations in the job market it is not surprising to learn that just under half of people have worked outside the sector within the same period.

Respondents were also asked to list the ways in which they actively contributed to the scholarly publication process. The majority (72%) did so by authoring scholarly works, and 44% by contributing to the peer review process. Although not specified as a category, a number of respondents highlighted their work in publishing material, indicating a change in the scholarly process rather than a continuation of the status quo.

LIS qualifications

Most of those (71%) who responded to the survey either have or are currently working towards a postgraduate qualification in LIS, an anticipated result given the target population of the survey. The length of time respondents had held their qualification was evenly spread in line with the amount of time spent working in the sector, with 48% having achieved their qualification less than ten years ago whilst 49% have held their qualification for over a decade. Just over half of this group felt that their LIS qualification did not equip them with knowledge of the scholarly communication process (56%).

Around a fifth of respondents (21%) hold a library and information science qualification at a level other than postgraduate, with the majority of these being at bachelor level. Of these there was a fairly even divide between those who have held this qualification for five to ten years (31%) and those who qualified more than twenty years ago (28%). Only 17% of this group felt that their studies equipped them with appropriate knowledge of scholarly communication.

Qualifications outside LIS

A small number of respondents do not hold qualifications in LIS but hold or are working towards postgraduate qualifications in other subjects. Most of this group hold/are working on a PhD (69%) in a range of subjects from anatomy to mechanical engineering.

This group overwhelmingly felt that what they learnt during their studies had practical applications in their work in scholarly communication (74%). This was a larger percentage than those who had studied LIS at either undergraduate or postgraduate level. These results echo experiences at Cambridge where a large proportion of the team is made up of people from a variety of academic backgrounds. In many ways this has proven to be an asset as they have direct experience of the issues faced by current researchers and are able to offer insight into how best to meet their needs.

So what does this tell us?

The scholarly communication workforce is expanding as academic libraries respond to the changing environment and shift their focus to research support. Many of these roles have been created in the past five years in particular within larger organisations better positioned to devote resources to increasing their scholarly communication presence.

Although results from this survey indicate that the majority of staff come from a library background a diverse range of levels and subjects are represented. As noted above this can provide unique insights into researcher needs but it also raises the question of what trained library professionals can bring to this area. Given that the majority of those educated in LIS felt that their qualification did not adequately equip them for their role this is a potentially worrying trend which needs to be explored further.

We will be continuing to analyse the results of the survey over the next few months to address both this and other questions. Hopefully this will provide insight into where scholarly communications librarians are now and what they can do to ensure success into the future.

Published 9 March 2017
Written by Claire Sewell
Creative Commons License

Open Research Project, first thoughts

Dr Laurent Gatto is one of the participants in the Office of Scholarly Communication’s Open Research Pilot. He has recently blogged about his first impressions of the pilot. With his permission we have re-blogged it here.

I am proud to be one of the participants in the Wellcome Trust Open Research Project. The call was initially opened in December 2016 and was pitched like this:

Are you in favour of more transparency in research? Are you concerned about research reproducibility? Would you like to get better recognition and credit for all outputs of your research process? Would you like to open up your research and make it more available to others?

If you responded ‘yes’ to any of these questions, we would like to invite you to participate in the Open Research Pilot Project, organised jointly by the Open Research team at the Wellcome Trust and the Office of Scholarly Communication at the University of Cambridge.

This of course sounded like a great initiative for me and I promptly filed an application.

We had our kick-off meeting on the 27th January, with the aim of getting to know each other and somehow define/clarify some of the objectives of the project. This post summarises my take on it.

Here’s how I introduced myself.

Who are you?

Laurent Gatto, Senior Research Associate in the Department of Biochemistry, physically located in Systems Biology and the Maths Department. SSI fellow and Software/Data Carpentry instructor and generally involved in the Open community in Cambridge, such as OpenConCam and the Data Champions initiative.

What is your research about and what kind of data does your research generate?

My area of research is computational biology, with special focus on high-throughput proteomics and integration of different data and annotations. I use raw data produced by third parties, in particular the Cambridge Centre for Proteomics (mass spectrometry data), and produce processed/annotated/interactive data and a lot of software (and also here).

What motivated you to participate in the Pilot?

Improve openness/transparency (and hence reproducibility/rigour) in my research and communication, and participate in improving openness (and hence reproducibility/rigour) more widely.

What kind of outputs are you planning to share? Do you foresee any difficulties in sharing?

My direct outputs are systematically shared openly early on: open source software (before publication), pre-prints, improved data (as data packages). Difficulties, if any, generally stem from collaborators less willing to share early and openly.

A personal take on the project

It is a long project, 2 years, and hence a rather ambitious one, of a unique kind. Hence, we will have to define its overall goals as we go. The continued involvement of the participants over time will play a major role in the project’s success.

What are attainable goals?

It is important to note that there is no funding for the participants. We are driven by a desire to be open, to benefit from being open and from the visibility that we can gain through the project, and by the prospect that the Wellcome Trust will learn from our experience and implement any lessons learnt. We get to interact with each other and with research support librarians, who will help us throughout the duration of the project. We also commit to sharing research outputs beyond traditional publications and to engaging with the Project, by participating in Project meetings and contributing to Project publications.

A lot of our initial discussions centred around rewards for open research or, actually, the lack thereof and the perceived associated risks. Indeed, the traditional academic reward system and the competitiveness in research leave little room for reproducibility and openness. It is, I believe, the hope of all participants that this project will benefit us, in some form or another.

A critical point that is missing is the academic promotion of open research and open researchers, as a way to promote a more rigorous and sound research process and tackle the reproducibility crisis. What should the incentives be? How can we make sure that the next generation of academics genuinely values openness and transparency as a foundation of rigorous research?

Some desired outputs

Ideally, I would like the Wellcome Trust’s famous Research investigator awards to be de facto Open research investigator awards. There’s currently a split (opposition?) between doing research and supporting open science when doing research. In every grant I have written, I had to demonstrate that the team had a track record, or was in a good position to successfully pursue the proposed project. Well, how about demonstrating a track record in being good at opening and sharing science outputs? Every researcher submitting a grant should convincingly demonstrate that they are, have been and/or will be a proactive open researcher and will openly disseminate all the outputs. By leading by example in the frame of this Open Research Project, this is something that the Wellcome Trust could take away from it.

Unfortunately, it is a fact that open science is not on the agenda of many (most?) more senior researchers: they are not in a position to be open, nor is open science a priority for them at all. I find it particularly disheartening that many senior academics (i.e. those that will sit on the panel deciding if I’m worth my next job) consider investing time in open science and the promotion of open science as time taken away from actually doing research. A bit like time for outreach and promotion of science to the wider public is sometimes looked down on, as not being the real stuff.

Another desire is that this project will enable us to influence funders, such as the Wellcome Trust, of course, but also more widely the research councils.

As a concrete example, I would like all grants that are accepted to be openly published beyond the daft layman summary. Grants published after acceptance should include the data management plan, the pathway to impact, possibly more, and these could then be used to assess to what extent the project delivered as promised.

This serves at least two purposes. First, it is a way to promote transparency and accountability towards the funder, scientific community and public. Second, it is a great resource for early career researchers. Unless there is specific support in place, writing a first grant is not an easy job, especially given the multitude of documents to prepare in addition to the scientific case for support. And even for more experienced researchers, it can’t harm to explore different approaches to grant writing.

Another concrete output is the requirement for a dedicated software management plan for each grant that involves any software development. I certainly consider my software to be equivalent to data and document it as such in my DMPs, but there seems to be a need for clarification.

I believe that I do a pretty decent job in conducting open science: pre-prints, open access, release data, … In the frame of this project, I shall do a better job at promoting open science for its own sake.

I also hope that by bringing some of my projects under the umbrella of the Open Research Project, I will benefit from a broader dissemination that will, directly or indirectly, be beneficial for my career (see the importance of benefits and rewards above).

Next steps

It is important to make the most of this unique opportunity. We need to create momentum, define ambitious goals, and work hard to reach them. But I also think that it is important to get as much input as possible from the community. Nothing beats collective intelligence for such open-ended projects, in particular for open projects.

So please, do not hesitate to comment, discuss on Twitter or elsewhere, or email me directly if you have ideas you would like to promote and/or discuss.

Published 08 March 2017
Written by Dr Laurent Gatto
Creative Commons License

‘Be nice to each other’ – the second Researcher to Reader conference

Aaaaaaaaaaargh! was Mark Carden’s summary of the second annual Researcher to Reader conference, along with a plea that the different players show respect to one another. My take home messages were slightly different:

  • Publishers should embrace values of researchers & librarians and become more open, collaborative, experimental and disinterested.
  • Academic leaders and institutions should do their bit in combating the metrics focus.
  • Big Deals don’t save libraries money, what helps them is the ability to cancel journals.
  • The ‘green OA = subscription cancellations’ argument is only viable in a utopian, almost fully green world.
  • There are serious issues in the supply chain of getting books to readers.
  • And copyright arrangements in academia do not help scholarship or protect authors*.

The programme for the conference included a mix of presentations, debates and workshops. The Twitter hashtag is #r2rconf.

As is inevitable in the current climate, particularly at a conference where there were quite a few Americans, the shadow of Trump was cast over the proceedings. There was much mention of the political upheaval and the place research and science has in this.

[*please see Kent Anderson’s comment at the bottom of this blog]

In the publishing corner

Time for publishers to rise to the challenge

The conference opened with an impassioned speech by Mark Allin, the President and CEO of John Wiley & Sons, who started with the statement this was “not a time for retreat, but a time for outreach and collaboration and to be bold”.

The talk was not what was expected from a large commercial publisher. Allin asked: “How can publishers act as advocates for truth and knowledge in the current political climate?” He mentioned that ProQuest has launched a displaced researchers programme in reaction to world events, saying, “it’s a start but we can play a bigger role”.

Allin asked what publishers can do to ensure research is being accessed. Referencing “The Content Trap” by Bharat Anand, Allin said “We won’t as a media industry survive as a media content and putting it in a bottle and controlling its distribution. We will only succeed if we connect the users. So we need to re-engineer the workflows making them seamless, frictionless.” “We should be making sure that … we are offering access to all those who want it.”

Allin raised the issue of access, noting that ResearchGate has more usage than any single publisher. He made the point that “customers don’t care if it is the version of record, and don’t care about our arcane copyright laws”. This is why people use SciHub: ease of access. He said publishers should not give up protecting copyright but must realise its limitations and provide easy access.

Researchers are the centre of gravity – we need to help them spend more time researching and less time publishing, he said. There is a lesson here, he noted: suppliers should use “the divine discontent of the customer as their north star”. He used the example of Amazon to suggest people working in scholarly communication need to use technology much better to connect up. “We need to experiment more, do more, fail more, be more interconnected”, he said, adding that “publishing needs open source and open standards”, which are required for a transformational impact on scholarly publishing – “the Uber equivalent”.

His suggestion for addressing the challenges of these sharing platforms is to “try and make your experience better than downloading from a pirate site”, and that this would be a better response than taking the legal route and issuing takedown notices. He asked: “Should we give up? No, but we need to recognise there are limits. We need to do more to enable access.”

Allin challenged the current situation: publishing may have gone online, but how much has the internet really changed scholarly communication practices? The page is still a unit of publishing, even in digital workflows. It shouldn’t be; we should have a ‘digital first’ workflow. The question isn’t ‘what should the workflow look like?’, but ‘why hasn’t it improved?’, he said, noting that innovation is always slowed by social norms, not technology. Publishers should embrace values of researchers & librarians and become more open, collaborative, experimental and disinterested.

So what do publishers do?

Publishers “provide quality and stability”, according to Kent Anderson, speaking on the second day (no relation to Rick Anderson) in his presentation about ‘how to cook up better results in communicating research’. Anderson is the CEO of Redlink, a company that provides publishers and libraries with analytic and usage information. He is also the founder of the blog The Scholarly Kitchen.

Anderson made the argument that “publishing is more than pushing a button”, by expanding on his blog on ‘96 things publishers do’. This talk differed from Allin’s because it focused on the contribution of publishers.

Anderson talked about the peer review process, noting that rejections help academics because usually they are about mismatch. He said that articles do better in the second journal they’re submitted to.

During a discussion about submission fees, Anderson noted that these “can cover the costs of peer review of rejected papers but authors hate them because they see peer review as free”. His comment that a $250 journal submission charge with one journal is justified by the fact that the target market (orthopaedic surgeons) ‘are rich’ received (rather unsurprisingly) some response from the audience via Twitter.

Anderson also made the accusation that open access publishers take lower quality articles when money gets tight. This did cause something of a backlash in the Twitter discussion, with a request for a citation for this statement and a request for examples of publishers, other than scam publishers, lowering standards to bring in more APC income. [ADDENDUM: Kent Anderson below says that this was not an ‘accusation’ but an ‘observation’. The Twitter challenge for ‘citation please?’ holds.]

There were a couple of good points made by Anderson. He argued that one of the value adds that publishers do is training editors. This is supported by a small survey we undertook with the research community at Cambridge last year which revealed that 30% of the editors who responded felt they needed more training.

The library corner

The green threat

There is good reason to expect that green OA will make people and libraries cancel their subscriptions; at least it will in the utopian future described by Rick Anderson (no relation to Kent Anderson), Associate Dean at the University of Utah, in his talk “The Forbidden Forecast: Thinking about open access and library subscriptions”.

Anderson started by asking why, if we’re in a library funding crisis, we aren’t seeing sustained levels of unsubscription. He then explained that Big Deals don’t save libraries money. They lower the cost per article, but this is a value measure, not a cost measure. What the Big Deal did was make cancellations more difficult. To preserve the Big Deal, most libraries have cancelled every journal that they can without Faculty ‘burning down the library’. This explains the persistence of subscriptions over time. The library is forced to redirect money away from other resources (books) and into the serials budget. The reason we can get away with this is that books are not used much.

The wolf seems to be well and truly upon us. There have been lots of cancellations and reduction of library budgets in the USA (a claim supported by a long list of examples). The number of cancellations grows as the money being siphoned off book budgets runs out.

Anderson noted that the emergence of new gold OA journals doesn’t help libraries and does nothing to relieve the journal emergency. They just add to the list of costs because each is a unique set of content. What does help libraries is the ability to cancel journals. Professor Syun Tutiya, Librarian Emeritus at Chiba University, noted in a separate session that if Japan were to flip from a fully subscription model to APCs it would be about the same cost, so that would solve the problem.

Anderson said that there is an argument that “there is no evidence that green OA cancels journals” (I should note that I am well and truly in this camp, see my argument). Anderson’s response is that this is simply saying the future hasn’t happened yet. The implicit argument here is that because green OA has not caused cancellations so far, it won’t do so in the future.

Library money is taxpayers’ money – it is not always going to flow. There is much greater scrutiny of journal big deals as budgets shrink.

Anderson argued that green open access provides inconsistent and delayed access to copies which aren’t always the version of record, and this has protected subscriptions. He noted that Green OA is dependent on subscription journals, which is “ironic given that it also undermines them”. You can’t make something completely & freely available without undermining the commercial model for that thing, Anderson argued.

So, Anderson said, given green OA exists and has for years, and has not had any impact on subscriptions, what would need to happen for this to occur? Anderson then described two subscription scenarios. The low cancellation scenario (which is the current situation) is where green open access is provided sporadically and unreliably. In this situation, access is delayed by a year or so, and the versions available for free are somewhat inferior.

The high cancellation scenario is where there is high uptake of green OA because there are funder requirements and the version is close to the final one. Anderson argued that the “OA advocates” prefer this scenario and they “have not thought through the process”. If the cost of finding which journals have OA versions is low enough and the free versions are good enough, he said, subscriptions will be cancelled. The black and white version of Anderson’s future is: “If green OA works then subscriptions fail, and the reverse is true”.

Not surprisingly, I disagreed with Anderson’s argument, based on several points. To start, a certain percentage of the work would need to be available before a subscription could be cancelled. Professor Syun Tutiya, Librarian Emeritus at Chiba University, noted in a different discussion that in Japan only 6.9% of material is available Green OA in repositories and argued that institutional repositories are good for lots of things but not OA. Certainly in the UK, with the strongest open access policies in the world, we are not capturing anything like the full output. And the UK is itself only 6% of the research output for the world, so we are certainly a very long way away from this scenario.

In addition, according to work undertaken by Michael Jubb in 2015, most of the green Open Access material is available in places other than institutional repositories, such as ResearchGate and SciHub. Do librarians really feel comfortable cancelling subscriptions on the basis of something being available in a proprietary or illegal format?

The researcher perspective

Stephen Curry, Professor of Structural Biology, Imperial College London, spoke about “Zen and the Art of Research Assessment”. He started by asking why people become researchers and gave several reasons: to understand the world, change the world, earn a living and be remembered. He then asked how they do it. The answer is to publish in high impact journals and bring in grant money. But this means it is easy to lose sight of the original motivations, which are easier to achieve if we are in an open world.

In discussing the report published in 2015, which looked into the assessment of research, “The Metric Tide”, Curry noted that metrics & league tables aren’t without value. They do help to rank football teams, for example. But university league tables are less useful because they aggregate many things so are too crude, even though they incorporate valuable information.

Are we as smart as we think we are, he asked, if we subject ourselves to such crude metrics of achievement? The limitations of research metrics have been talked about a lot but they need to be better known. Often they are too precise. For example, was Caltech really better than the University of Oxford last year but worse this year?

But numbers can be seductive. Researchers want to focus on research without pressure from metrics, however many Early Career Researchers and PhD students are increasingly fretting about publications hierarchy. Curry asked “On your death bed will you be worrying about your H-Index?”

There is a greater pressure to publish rather than pressure to do good science. We should all take responsibility to change this culture. Assessing research based on outputs is creating perverse incentives. It’s the content of each paper that matters, not the name of the journal.

In terms of solutions, Curry suggested it would be better to put higher education institutions in 5% brackets rather than ranking them 1-n in the league tables. Curry called for academic leaders and institutions to do their bit in combating the metrics focus. He also called for much wider adoption of the Declaration On Research Assessment (known as DORA). Curry’s own institution, Imperial College London, has done so recently.

Curry argued that ‘indicators’ would be a more appropriate term than ‘metrics’ in research assessment because we’re looking at proxies. The term ‘metrics’ implies you know what you are measuring. Certainly metrics can inform but they cannot replace judgement. Users and providers must be transparent.

Another solution is preprints, which shift attention from container to content because readers use the abstract, not the journal name, to decide which papers to read. Note that this idea is starting to become more mainstream with the research by the NIH towards the end of last year, “Including Preprints and Interim Research Products in NIH Applications and Reports”.

Copyright discussion

I sat on a panel to discuss copyright with a funder – Mark Thorley, Head of Science Information, Natural Environment Research Council, a lawyer – Alexander Ross, Partner, Wiggin LLP and a publisher – Dr Robert Harington, Associate Executive Director, American Mathematical Society.

My argument** was that selling or giving the copyright to a third party that has a purely commercial interest and did not contribute to the creation of the work does not protect originators. That was the case in the Kookaburra song example. It is also the case in academic publishing. The copyright transfer form/publisher agreement that authors sign usually means that the authors retain their moral rights to be named as the authors of the work, but they sign away rights to make any money out of it.

I argued that publishers don’t need to hold the copyright to ensure commercial viability. They just need first exclusive publishing rights. We really need to sit down and look at how copyright is being used in the academic sphere – who does it protect? Not the originators of the work.

Judging by the mood in the room, the debate could have gone on for considerably longer. There is still a lot of meat on that bone. (**See the end of this blog for details of my argument).

The intermediary corner

The problem of getting books to readers

There are serious issues in the supply chain of getting books to readers, according to Dr Michael Jubb, Independent Consultant and Richard Fisher from Something Understood Scholarly Communication.

The problems are multi-pronged. For a start, discoverability of books is “disastrous” due to completely different metadata standards in the supply chain. ONIX is used for the retail trade and MARC is the standard for libraries. Neither has detailed information about authors, information about the contents of chapters, sections etc, or information about reviews and comments.

There is also a multitude of channels for getting books to libraries. The past few years have seen the involvement of several different kinds of intermediaries – metadata suppliers, sales agents, wholesalers, aggregators, distributors etc – who are holding digital versions of books that can be supplied through the different types of book platforms. Libraries have some titles on multiple platforms but others available on only one platform.

There are also huge challenges around discoverability and the e-commerce systems, which are “too bitty”. The most important change that has happened in books has been Amazon; however, publisher e-commerce “has a long way to go before it is anything like as good as Amazon”.

Fisher also reminded the group that there are far more books published each year than there are journals – it’s a more complex world. He noted that about 215 [NOTE: amended from original 250 in response to Richard Fisher’s comment below] different imprints were used by British historians in the last REF. Many of these publishers are very small with very small margins.

Jubb and Fisher both emphasised readers’ strong preference for print, which implies that much more work is needed on the ebook user experience. There are ‘huge tensions’ between reader preference (print) and the drive for e-book acquisition models at libraries.

The situation is probably best summed up in the statement that “no-one in the industry has a good handle on what works best”.

Providing efficient access management

Current access control is not functional in the world we live in today. If you ask users to jump through hoops to get access off campus then your whole system defeats its purpose. That was the central argument of Tasha Mellins-Cohen, the Director of Product Development, HighWire Press when she spoke about the need to improve access control.

Mellins-Cohen started with the comment “You have one identity but lots of identifiers”, and noted if you have multiple institutional affiliations this causes problems. She described the process needed for giving access to an article from a library in terms of authentication – which, as an aside, clearly shows why researchers often prefer to use Sci Hub.

She described an initiative called CASA – Campus Activated Subscriber-Access which records devices that have access on campus through authenticated IP ranges and then allows access off campus on the same device without using a proxy. This is designed to use more modern authentication. There will be “more information coming out about CASA in the next few months”.

Mellins-Cohen noted that tagging something as ‘free’ in the metadata improves Google indexing – publishers need to do more of this at article level. This comment was met with a call-out to publishers to make the information about sharing more accessible to authors through How Can I Share It?

Mellins-Cohen expressed some concern that some of the ideas coming out of RA21 (Resource Access for the 21st Century), an STM project to explore alternatives to IP authentication, will raise barriers to access for researchers.

Summary

It is always interesting to have the mix of publishers, intermediaries, librarians and others in the scholarly communication supply chain together at a conference such as this. It is rare to have the conversations between different stakeholders across the divide. In his summary of the event, Mark Carden noted the tension in the scholarly communication world, saying that we do need a lively debate but also need to show respect for one another.

So while the keynote started promisingly, and said all the things we would like to hear from the publishing industry, there is still the reality that we are not there yet. And this underlines the whole problem. This interweb thingy didn’t happen last week. What has actually happened to update the publishing industry in the last 20 years? Very little it seems. However it is not all bad news. Things to watch out for in the near future include plans for micro-payments for individual access to articles, according to Mark Allin, and the highly promising Campus Activated Subscriber-Access system.

Danny Kingsley attended the Researcher to Reader conference thanks to the support of the Arcadia Fund, a charitable fund of Lisbet Rausing and Peter Baldwin.

Published 27 February 2017
Written by Dr Danny Kingsley
Creative Commons License

Copyright case study

In my presentation, I spoke about the children’s campfire song, “Kookaburra sits in the old gum tree”, which was written by Melbourne schoolteacher Marion Sinclair in 1932 and first aired in public two years later as part of a Girl Guides jamboree in Frankston. Sinclair had to be prompted to go to APRA (Australasian Performing Right Association) to register the song. That was in 1975; the song had already been around for 40 years, but she never expressed any great interest in any proprietary claim to the song.

In 1981 the Men at Work song “Down Under” made No. 1 in Australia. The song then topped the UK, Canada, Ireland, Denmark and New Zealand charts in 1982 and hit No.1 in the US in January 1983. It sold two million copies in the US alone.  When Australia won the America’s Cup in 1983 Down Under was played constantly. It seems extremely unlikely that Marion Sinclair did not hear this song. (At the conference, three people self-identified as never having heard the song when a sample of the song was played.)

Marion Sinclair died in 1988 and the song went to her estate. Norman Lurie, managing director of Larrikin Music Publishing, bought the publishing rights from her estate in 1990 for just $6100. He started tracking down all the chart music that had been printed all over the world, because Kookaburra had been used in books for people learning flute and recorder.

In 2007 TV show Spicks and Specks had a children’s music themed episode where the group were played “Down Under” and asked which Australian nursery rhyme the flute riff was based on. Eventually they picked Kookaburra, all apparently genuinely surprised when the link between the songs was pointed out. There is a comparison between the music pieces.

Two years later Larrikin Music filed a lawsuit, initially wanting 60% of Down Under’s profits. In February 2010, Men at Work appealed, and eventually lost. The judge ordered Men at Work’s recording company, EMI Songs Australia, and songwriters Colin Hay and Ron Strykert to pay 5% of royalties earned from the song since 2002 and from its future earnings.

In the end, Larrikin won around $100,000, although legal fees on both sides have been estimated to be upwards of $4.5 million, with royalties for the song frozen during the case.

Gregory Ham was the flautist in the band who played the riff. He did not write Down Under, and was devastated by the high profile court case and his role in proceedings. He reportedly fell back into alcohol abuse and was quoted as saying: “I’m terribly disappointed that’s the way I’m going to be remembered — for copying something.” Ham died of a heart attack in April 2012 in his Carlton North home, aged 58, with friends saying the lawsuit was haunting him.

This case, I argued, exemplifies everything that is wrong with copyright.

We are going OPEN – the Open Research experiment has begun!

There has been much discussion recently about the reproducibility crisis and about the growing distrust among the public in the quality of research. As illustrated in our ‘Case for Open Research’ series of blog posts, one of the main reasons for this is that researchers are currently rewarded for the number of papers they publish in high impact factor journals, and not necessarily for the quality of work that they are doing.

Indeed, Cambridge researchers clearly indicated that the lack of incentives to do anything other than publishing in these types of journals is one of the main blockers discouraging them from adopting a more open research practice.

Joining forces with the Wellcome Trust

The Office of Scholarly Communication started talking about these problems with the Open Research team at the Wellcome Trust. The Wellcome Trust are natural allies, as they have consistently led their researchers towards greater openness. They were one of the first funding bodies to introduce policies on Open Access and on data management and sharing. Now the Wellcome Trust is moving towards proactively supporting Open Research beyond enforcing their compliance requirements.

To promote immediate and transparent research sharing, they have recently launched the Wellcome Open Research platform which allows researchers to submit articles about virtually any research output and get published within a couple of days. The Wellcome Trust is now considering making Open Research one of their strategic priorities.

We quickly realised that we have a lot of shared interests, and joining forces to tackle the problem together made a lot of sense. We came up with the idea to launch the Open Research Pilot Project.

The Open Research Pilot – understanding the barriers to “openness”

We conceived the project as a two year experiment, which would allow us to gain an understanding of what is needed for researchers to share and get credit for all outputs of the research process. These include non-positive results, protocols, source code, presentations and other research outputs beyond the remit of traditional publications.

The Project aims to understand the barriers preventing researchers from sharing (including resource and time implications), as well as what the incentives are. The Project aims to utilise the new Wellcome Open Research publishing platform, together with other channels, to share these outputs.

The invitation to take part in the Pilot was sent to all researchers at Cambridge funded by the Wellcome Trust. Participating researchers had to commit to sharing of research outputs beyond traditional publications and to engage with the Project, by participating in Project meetings and contributing to Project publications.

Is ‘doing the right thing’ enough incentive?

Our biggest question was whether anyone would be willing to participate in the Pilot. We did not offer any incentive other than encouraging researchers to contribute to the greater good. The only support available to those who wanted to take part in the project was that offered by the Wellcome Trust and Cambridge Open Research team members, but there was no financial aid available to prospective participants. We thought that, regardless of the outcome, inviting researchers would be a good exercise to go through – if no one applied, we would have learnt that doing ‘the right thing’ was not a good enough motivator.

Thankfully, we received several fantastic applications from individual researchers and research groups who demonstrated great interest in and motivation for Open Research. We initially planned to work with two research groups, but given the quality of applications received and passion for Open Research expressed by the applicants, we decided to extend the scope of the project to four research groups. We have selected researchers doing different types of research, with the aim of learning about distinct problems in sharing that are experienced in diverse research disciplines:

  • Dr Laurent Gatto – is doing computational biology research, with a special focus on proteomics data. His interest is: How to effectively share research data and the code needed to reproduce them?
  • Dr David Savage – is researching molecular pathogenesis of the consequences of obesity. His question is: What are the problems with sharing data coming from human participants?
  • Dr Benjamin Steventon – is a developmental biologist generating and analysing large-scale imaging datasets. He would like to know: Are there image repositories allowing one to share large image datasets in a re-usable way?
  • Dr Marta Costa and Dr Greg Jefferis (and others) – researchers leading the work on two collaborative projects: Connectomics and Virtual Fly Brain, which will create interactive tools to interrogate Drosophila neural network connections. They would like to understand: What are the issues with sharing complex interactive datasets? How to ensure long-term preservation of complex digital objects?

Motivations

So what motivated these researchers to apply for the project? We asked this question at the application stage and were positively surprised by the altruistic answers that we received. Our researchers were largely driven by a desire to improve the research process. We have seen responses like:

  • “Openness in research, from data and software to publication, is a central pillar of good research.”
  • “I am very concerned (disappointed as a scientist) by the current wave of ‘unreproducible’ and/or ‘irrelevant’ research, and am very passionate about contributing to improving scientific endeavour in this regard.”
  • “I am very enthusiastic about exploiting new ways of sharing my research output beyond the established peer-review journal system.”
  • “I believe that sharing research outputs fully, including data and code are essential to accelerate research, and I have benefitted from it in my own research.”

In summary, researchers expressed a great desire to contribute to a cultural change. They wanted to change the way in which research is disseminated and to increase research transparency and reproducibility.

Let’s get to work

We all met (the researchers, Wellcome Trust and Cambridge Open Research teams) on Friday 27 January to officially start the two-year project. Each research group was appointed a facilitator – a dedicated member of the Cambridge Open Research team to support researchers during the Project. Research groups will meet with their facilitators on a monthly basis in order to discuss shareable research outputs and to decide on the best ways to disseminate these outputs. Every six months all project members will meet together to discuss the barriers to sharing discovered and to assess the progress of the Project.

One of the main goals of the Project is to learn what the barriers and incentives are for Open Research and to share these findings with others interested in the subject to inform policy development. Therefore, we will be regularly publishing blog posts on the Unlocking Research blog and on the Wellcome Open Research blog with case studies describing what we have discovered while working together. There will be an update from each research group every six months. We will also be publicly sharing all main outputs of the Project.

We are all extremely excited about going “Open” and we suggest that anyone interested in the Open Research practice watches this space.

Published 08 February 2017
Written by Dr Marta Teperek
Creative Commons License

The art of software maintenance

When it comes to software management there are probably more questions than answers to problems – that was the conclusion of a recent workshop hosted by the Office of Scholarly Communication (OSC) as part of a national series on software sustainability, sharing and management, funded by Jisc. The presentations and notes from the day are available, as is a Storify from the tweets.

The goal of these workshops was to flesh out the current problems in software management and sharing and try to identify possible solutions. The researcher-led nature of this event provided researchers, software engineers and support staff with a great opportunity to discuss the issues around creating and maintaining software collaboratively and to exchange good practice among peers.

Whilst this might seem like a niche issue, an increasing number of researchers are reliant on software to complete their research, and for them the paper at the end is merely an advert for the research it describes. Stephen Eglen described this in his talk as an ‘inverse problem’ – papers are published and widely shared but it is very hard to get to the raw data and code from this end product, and the data and code are what is required to ensure reproducibility.

These workshops were inspired by our previous event in 2015, where Neil Chue Hong and Shoaib Sufi spoke with researchers at Cambridge about software licensing and Open Access. Since then the OSC has had several conversations with Daniela Duca at Jisc and together we came up with an idea of organising researcher-led workshops across several institutions in the UK.

Opening up software in a ‘post-expert world’

We began the day with a keynote from Neil Chue Hong of the Software Sustainability Institute, who outlined the difficulties and opportunities of being an open researcher in a ‘post-expert world’ (the slides are available here). Reputation is crucial to a researcher’s role and therefore researchers seek to establish themselves as experts. On the other hand, this expert reputation might be tricky to maintain since making mistakes is an inevitable part of research and discovery, which is poorly understood outside of academia. Neil introduced Croucher’s Law to help us understand this: everyone will make mistakes, even an expert, but an expert will be aware of this so will automate and share their work as much as possible.

Accepting that mistakes are inevitable in many ways makes sharing less intimidating. Papers are retracted regularly due to errors and Neil gave examples from a variety of disciplines and career stages where people were open about their errors so their communities were accepting of the mistakes. In fact, once you accept that we will all make mistakes then sharing becomes a good way to get feedback on your code and to help you fix bugs and errors.

This feeds into another major theme of the workshop which Neil introduced: that researchers need to stop aiming for perfect and adopt ‘good enough’ software practices for achievable reproducibility. This recognises that one of the biggest barriers to sharing is the time it takes to learn software skills and prepare data to the ‘best’ standards. Good enough practices mean accepting that your work may not be reproducible forever but that it is more important to share your code now so that it is at least partially reproducible now. Stephen Eglen built on this with his paper ‘Towards standard practices for sharing computer code and programs in neuroscience’, which recommends providing data, code and tests for your code, and using licences and DOIs.

Both speakers and the focus groups in the afternoon highlighted that political work is needed, as well as cultural change, to normalise code sharing. Many journals now ask for evidence of the data which supports articles and the same standards should apply to software code. Similarly, if researchers ask for access to data when reviewing articles then it makes sense to ask for the code as well.

Automating your research: Managing software

Whilst sharing code can be seen as the end of the lifecycle of research software, writing code with the intention of sharing it was repeatedly highlighted as a good way to make sure it is well-written and documented. This was one of several ‘selfish’ reasons to share, where sharing also helps the management of software, through better collaboration, the ability to track your work and being able to use students’ work after they leave.

Croucher’s Law demonstrates one of the main benefits of automating research through software: the ability to track mistakes, which improves reproducibility and makes fixing mistakes easier. There were lots of tools mentioned throughout the day to assist with managing software, from the well-known version control and collaboration platform GitHub to more dynamic tools such as Jupyter notebooks and Docker. As well as these technical tools there was also discussion of more straightforward methods to maintain software, such as getting a code buddy who can test your code and creating appropriate documentation.

Despite all of these tools and methods to improve software management it was recognised by many participants that automating research through software is not a panacea; the difficulties of working with a mix of technical and non-technical people formed the basis of one of the focus groups.

Sustaining software

Managing software appropriately allows it to be shared, but re-using it in the long (or even medium) term means putting time into sustaining code and making sure it is written in a way that is understandable to others. The main recommendations from our speakers and focus groups to ensure sustainability were to use standards, create thorough documentation and embed extensive comments within your code.

As well as thinking about the technical aspects of sustaining software there was also discussion of what is required to motivate people to make their code re-usable. Contributing to a community seemed to be a big driver for many participants so finding appropriate collaborators is important. However, larger incentives are needed, as creating and maintaining software is not currently well-rewarded as an academic endeavour. Suggestions to rectify this included more software-oriented funding streams, counting software as an output when assessing academics, and creating a community of software champions to mirror the Data Champions scheme we recently started in Cambridge.

Next steps

This workshop was part of a national discussion around research software so we will be looking at outcomes of other workshops and wider actions the Office of Scholarly Communication can support to facilitate sharing and sustaining research software. Apart from Cambridge, five other institutions held similar workshops (Bristol, Birmingham, Leicester, Sheffield, and the British Library). As one of the next steps, all organisers of these events want to meet up to discuss the key issues raised by researchers, to see what national steps should be taken to better support the community of researchers and software engineers, and also to consider if there are any remaining problems with software which could require a policy intervention.

However, following the maxim to ‘think global, act local’, Neil’s closing remarks urged everyone to consider the impact they can have by influencing those directly around them to make a huge difference to how software is managed, sustained and shared across the research community.

Published 29 January 2017
Written by Rosie Higman
Creative Commons License

‘Paperless research’ solutions – Electronic Lab Notebooks

The Office of Scholarly Communication started 2017 with a discussion about ‘going digital’ – on 13 January 2017 we organised an event at Cambridge University’s Department of Engineering to flesh out the problems preventing researchers from implementing Electronic Lab Notebook solutions. Chris Brown from Jisc wrote an excellent blog post with his reflections on the event* and agreed to let us re-blog it here.

Researchers working in laboratories have the importance of recording experiments, results, workflows, etc. in a notebook engrained into them as students. However, these paper-based solutions are not ideal when it comes to sharing and preservation. They pile up on desks and shelves, vary in quality and often include printed data stuck in. To improve on this situation and resolve many of these issues, e-lab notebooks (ELNs) have been developed. Jisc has been involved in this work through funding projects such as CamELN and LabTrove in the past. Recently, interest in this area has been renewed with the Next Generation Research Environment co-design challenge.

On Friday 13 January I attended the E-Lab Notebooks workshop at the University of Cambridge, organised by the Office of Scholarly Communication. Its purpose was to open up the discussion about how ELNs are being used in different contexts and formats, and the concerns and motivations for people working in labs. A range of perspectives and experience was given through presentations, group and panel discussions. The audience were mostly from Cambridge, but there was representation from other parts of the UK, as well as Denmark and Germany. A poll at the start showed that the majority of the audience were researchers (57%).

Institutional and researchers’ perspective on ELNs at Cambridge

The first part of the workshop focussed on the practitioners’ perspective with presentations from the School of Biological Sciences. Alastair Downie (Gurdon Institute) talked about their requirements for an ELN as well as anxieties and risks of adopting a particular system. Research groups currently use a variety of tools, such as Evernote and Dropbox, and often these are trusted more than ELNs. The importance of trust frequently came up during the day. Alastair conducted a survey to gather more detail on the use and requirements of ELNs and received an impressive 345 responses. Cost and complexity were given as the main reasons not to use ELNs. However, when asked for the most important features, respondents ranked ease of use highest and cost lower. Researchers want training, voice recognition and remote access. There is clear interest across the school at all levels, but it requires a push with guidance and direction.

Marko Hyvönen (Dept of Biochemistry) gave the PI perspective and the issues with an ELN for a biochemical lab. He reinforced what Alastair had said about ELNs. He showed how paper log books pile up, deteriorate over time and sometimes include printed information. They are hard to read and easy to destroy, give a poor return on effort, often disappear and are not searchable. It was interesting to hear about bad habits such as storing data in non-standardised ways, missing data, and printing out Word documents and sticking them into the lab books.

With 99% of their data electronic, many of the issues in the use of lab books generally are around data management and not ELNs. An ELN solution should be easy to use, be cross-platform, have a browser front end, be generic/adaptable, allow sharing of data and experiments, enforce Standard Operating Procedures when needed, have templates for standard work to minimise repetition, and allow data to be input from phones and other non-specific devices. What they don’t want are the “bells and whistles” features they don’t use. Getting buy-in from people is the top issue to overcome in implementing an ELN.

Views on ELNs from outside the UK

Jan Krause from the École Polytechnique Fédérale de Lausanne (EPFL) gave a non-UK perspective on ELNs. He described a study, as part of a national RDM project, where they separated ELNs (75 proprietary, 12 open source – 91 features) and Lab Info Management Systems (LIMS) (281 proprietary, 9 open source – 95 features) and compared their features. The two tools used mostly in Switzerland are SLims (commercial solution) and openBIS (homemade tool). To decide which tool to use they undertook a three-phase selection process. The first selection was based on disciplinary and technical requirements. The second selection involved detailed analysis based on user requirements (interviews and evaluation weighted by feature) and price. The third selection was tendering and live demos.

Data storage, security and compliance requirements

When using and sharing data you need to make sure your data is safe and secure. Kieren Lovell, from the University Information Services, talked about how researchers should keep their data and accounts safe. Since he started in May 2015, all successful hacks on the university have been due to human error, such as unpatched servers, failures in processes, bad password management, and phishing. Even if you think your data and research isn’t important, the reputational damage of security attacks to the university is huge. He recommended that any research data is shared through cloud providers rather than email, and that you never trust public wifi, as it is not secure, but use Cambridge’s VPN service instead. If using a local machine you should encrypt your hard drive.

Providers’ perspective

In the afternoon, presentations were from the providers’ perspective. Jeremy Frey, from the University of Southampton, talked about his experience of developing an open source ELN to support open and interdisciplinary science. He works on getting the people and technology to work together. It’s not just about recording what you have done; you need to include the narrative behind what you do. This is critical for understanding, and ELNs are one part of the digital ecosystem in the lab. The solution they’ve developed is LabTrove, partly funded by Jisc, which is a flexible open source web-based solution. Allowing pictures to be added to the notes has really helped with accessibility and usability, for example for users with dyslexia. Sustainability, as is often the case, came up, along with how a community is required to support such a system. It also needs to expand beyond Southampton. Finally, Jeremy used Amazon Echo to query the temperature within part of his lab. He hopes that this will be used more in the lab in the future when it can recognise each researcher’s voice.

In the next two presentations, it was over to the vendors to show the advantages of adopting RSpace (by Rory Macneil) and Dotmatics (by Dan Ormsby). The functionality on offer in these types of solutions is attractive for scientists and RSpace showed how it links to most common file stores. Any ELN should enhance researchers’ workflow and integrate with the tools they use.

Removing the barriers

After lunch there were three parallel focus group discussions. I attended the one on sustainability, something that comes up frequently in discussions, particularly when looking at open source or proprietary solutions. Each group reported back as follows:

Focus group 1: Managing the supplier lock in risk

Stories of use need to be shared. The PDF is not a great format for sharing. Vendors tell the truth like estate agents. We have to accept the reality that we won’t have 100% exporting functionality, so we need to decide the minimum level. Determine specific users’ requirements.

Focus group 2: Sustainability of ELN solutions

What is the lifetime of an ELN? How long should everything be accessible? Various needs come from group and funder requirements, e.g. 10 years. There is concern if you are relying on one commercial solution as companies can die, so how can you guarantee the data will be available? Have exit policies and support standards and interoperability so data can be moved across ELNs. Broken links and expiring file formats are not just an ELN problem, but relate to the archiving of data in general. Should selection and support of an ELN be at group, department, institution or national level? This is difficult if it’s in one group as adopting any technical solution requires support in place. It requires institutional level support.

Focus group 3: Human element of ELN implementation

The biggest hurdle is culture change and showing the benefits of using an ELN. Training and technical support cost money and time. It would cost more initially but becomes more efficient. You can incentivise people by having champions. There are different needs in a large institution. You may join a lab and find the ELN is not adequate. Legal issues around sensitive data complicate matters. You need to believe it will save time. Long term solutions include using cloud-based solutions, even MS Office, but what happens when people leave? Need support from higher level. Functionality should be based on user requirements. A start would be to set up a mailing list of people interested in ELNs.

Remaining barriers to wide ELN adoption

Finally, I chaired a panel session with all the presenters. Marta Teperek had kindly asked me to give a short presentation on what Jisc does as many researchers don’t know (in fact I was asked “what’s Jisc?” in the focus group) and to promote the Next Generation Research Environment co-design challenge. Following my presentation the discussion was prompted by questions from the audience and remotely via sli.do. Much of the discussion re-iterated what had been said in the presentations, such as the importance of an ELN that meets the requirements of researchers. It should allow integration with other tools and exporting of the data for use in other ELNs. Getting ELNs used within a department is often difficult so it does need institution level commitment and support. Without this ELNs are unlikely to be adopted within an institution, never mind nationally. One size does not fit all and we should not try to build an ELN that tries to satisfy the different needs of various disciplines. A modular system that integrates with the tools and systems already in use would be a better solution. Much of what was said tallied with the feedback received for the Next Generation Research Environment co-design challenge.

Closing remarks

Ian Bruno closed the workshop and he reiterated what was said in the panel discussion. I found the event extremely helpful and it provided lots of useful information to feed into the Next Generation Research Environment work. I’d like to thank Marta Teperek for inviting me to chair the panel and for all her hard work putting the event together with @CamOpenData. Marta has put together the tweets from the day into the following Storify. All notes and presentations from the event are now published in Apollo, the University of Cambridge’s research repository.

Follow-up actions at the University of Cambridge – give it a go!

Those of you who are interested in ELNs and who are based at the University of Cambridge might be interested in knowing that we are planning to offer some trial access to Electronic Lab Notebooks (ELNs). The purpose of this trial will be to test out several ELNs to decide on solutions which might best meet the requirements of the research community. A mailing list has been set up for people who are interested in being part of this pilot or would like to be involved in these discussions. If you would like to be added to the mailing list, please fill in the form here: https://lists.cam.ac.uk/mailman/listinfo/lib-eln

*Originally published by Jisc on 18 January 2017.

Published on 29 January 2017
Written by Chris Brown
Creative Commons License

2016 – that was the year that was

In January last year we published a blog post ‘2015 that was the year that was’ which not only helped us take stock of what we have achieved, but also was very well received. So we have decided to do it again. For those who are more visually oriented, the slides ‘The OSC a lightning Tour’ might be useful.

Now starting its third year of operation, the Office of Scholarly Communication (OSC) has expanded to a team of 15, managing a wide variety of projects. The OSC has developed a set of strategic goals to support its mission: “The OSC works in a transparent and rigorous manner to provide recognised leadership and innovation in the open conduct and dissemination of research at Cambridge University through collaborative engagement with the research community and relevant stakeholders.”

1. Working transparently

The OSC maintains an active outreach programme which fits with the transparent manner of the work that the OSC undertakes, which also includes the active documentation of workflows.

One of the ways we work transparently is to share many of our experiences and ideas through this blog, which receives over 2,000 visits a month. During 2016 the OSC published 41 blogs – eight blogs each on Scholarly Communication and Open Research, 14 on Open Access, nine on Research Data Management and two on Library and training matters. The blogs we published in Open Access week were accessed 1630 times that week alone.

In addition to our websites for Scholarly Communication and Open Access, our Research Data Management website has been identified internationally as best practice and receives nearly 3,000 visitors a month.

We also run Twitter feeds for both Open Access, with 1100 followers, and Open Data, with close to 1200 followers. Many of the OSC staff also run their own Twitter feeds which share professional observations.

We also publish monthly newsletters, including one on scholarly communication matters. Our research data management newsletter has close to 2,000 recipients. Our shining achievement for the year however has to be the hugely successful scholarly communication Advent Calendar (which people are still accessing…)

We practise what we preach and share information about our work practices such as our reports to funders on APC spend and so on, through our repository Apollo and also by blogging about it – see Cambridge University spend on Open Access 2009-2016. We also share our presentations through Apollo and in Slideshare.

2. Disseminating research

The OSC has a strong focus on research support in all aspects of the scholarly communication ecosystem, from concept, through study design, preparation of research data management plans, decisions about publishing options and support with the dissemination of research outputs beyond the formal literature. The OSC runs an intense programme of advocacy relating to Open Access and Research Data Management, and has spoken to nearly 3,000 researchers and administrators since January 2015.

2.1 Open Access compliance

In April 2016, the HEFCE policy requiring that all research outputs intended to be claimed for the REF be made open access came into force. As a result, there has been an increased uptake of the Open Access Service with the 10,000th article submitted to the system in October. Our infographics on Repository use and Open Access demonstrate the level of engagement with our services clearly.

Currently half of the entire research output of the University is being deposited to the Open Access Service each month (see the blog: How open is Cambridge?). While this is good from a compliance perspective, it has caused some processing issues due to the manual nature of the workflows and insufficient staff numbers. At the time of writing, there is a deposit backlog of over 600 items to put into the repository and a backlog of over 2,300 items to be checked to see whether they have been published so we can update the records.

The OA team made over 15 thousand ticket replies in 2016 – or nearly 60 per work day!

2.2 Managing theses

Work on theses continues, with the OSC driving a collaboration with Student Services to pilot the deposit of digital theses in addition to printed bound ones with a select group of departments from January 2017. The Unlocking Theses project in 2015-2016 has seen an increase in the number of historic theses in the repository from 700 to over 2,200 with half openly available. An upcoming digitisation project will add a further 1,400 theses. The upgrade of the repository and associated policies means all theses (not just PhDs) can be deposited and the OSC is in negotiation with several departments to bulk upload their MPhils and other sets of theses which are currently held in closed collections and are undiscoverable. This is an example of the work we are doing to unearth and disseminate research held all over the institution.

As a result of these activities it has become obvious that the disjointed nature of thesis management across the Library is inefficient. There is considerable effort being placed on developing workflows for managing theses centrally within the Library which the OSC will be overseeing into the future.

3. Research Support

3.1 Research Data Support

The number of data submissions received by the University repository is continuously growing, with Cambridge hosting more datasets in the institutional repository than any other UK university. Our ‘Data Sharing at Cambridge’ infographic summarises our work in this area.

A recent Primary Research Group report recognised Cambridge as having ‘particularly admirable data curation services’.

3.2 Policy development

The OSC is heavily involved in policy development in the scholarly communication space and participates in several activities external to the University. In July 2016 the UK Concordat on Open Research Data was published, with considerable input from the university sector, coordinated by the OSC.

We have representatives on the RCUK Open Access Practitioners Group, the UK Scholarly Communication License and Model Policy Steering Committee and the CASRAI Open Access Glossary Working Group, plus several other committees external to Cambridge. The OSC has contributed to discussions at the Wellcome Trust about ensuring better publisher compliance with their Open Access policy.

We are also updating and writing policies for aspects of research management across the University.

3.3 Collaborations with the research community

The OSC collaborates directly with the research community to ensure that the funding policy landscape reflects their needs and concerns. To that end we have held several town-hall meetings with researchers to discuss issues such as the mandating of CC-BY licensing, peer review and options relating to moving towards an Open Research landscape. We have also provided opportunities for researchers to meet directly with funders to discuss concerns and articulate amendments to the policies. The OSC has led discussions with the sector and arXiv.org, including visiting Cornell University, to ensure that researchers using this service to make their work openly available can be compliant under the HEFCE policy.

A new Research Data Management Project Group brings researchers and administrators together to work on specific issues relating to the retention and preservation of data and the management of sensitive data. We have also recruited over 40 Data Champions from across the University. Data Champions are researchers, PhD students or support staff who have agreed to advocate for data within their department: providing local training, briefing staff members at departmental meetings, and raising awareness of the need for data sharing and management.

The initiative began as an attempt to meet the growing need for RDM training, provide more subject-specific RDM support and begin more conversations about the benefits of RDM beyond meeting funders’ mandates. There has been a lot of interest in our Data Champions from other universities in the UK and abroad, with applications for our scheme coming from around the world. In response to this we have proposed a Birds of a Feather session at the 9th RDA plenary meeting in April to discuss similar initiatives elsewhere and the creation of RDM advocacy communities.

3.4 Professional development for the research community

The OSC provides the research community with a variety of advocacy, training and workshops relating to research data management, sharing research effectively, bibliometrics and other aspects of scholarly communication. The OSC held over 80 sessions for researchers in 2016, including the extremely successful ‘Helping researchers publish’ event which we are repeating in February.

Our work with the Early Career Researcher (ECR) community has resulted in the development of a series of sessions about the publishing process for the PhD community. These have been enthusiastically embraced and there are negotiations with departments about making some courses compulsory. While this underlines the value of these offerings, it does raise issues about staffing and how this will be financed.

The OSC is increasingly managing and hosting conferences at the University. Cambridge is participating in the Jisc Shared Repositories pilot and the OSC hosted an associated Research Data Network conference in September. In July 2016, the OSC organised a conference on research data sharing in collaboration with the Science and Engineering South Consortium, which was extremely well received and attracted over 80 attendees from all over the UK.

In November, the OpenCon Cambridge group – with which the OSC is heavily involved – held an OpenConCam satellite event which was very well attended and received very positive feedback. The storify of tweets is available, as is this blog about the event. The OSC was happy to both be a sponsor of the event and to be able to support the travel of a Cambridge researcher to attend the main OpenCon event in Washington and bring back her experiences.

Increasingly we are livestreaming our events and then making them available online as a resource for later.

3.5 Developing Library capacity for support

We have published a related post which details the training programmes run for library staff members in 2016. In total nearly 500 people attended sessions offered in the Supporting Researchers in the 21st Century programme, and we successfully ‘graduated’ the second tranche of the Research Support Ambassador Programme.

Conference session proposals on both the Supporting Researchers and the Research Ambassador programmes have been submitted to various national and international conferences. Dr Danny Kingsley and Claire Sewell have also had an abstract accepted for an article to appear in the 2017 themed issue of The New Review of Academic Librarianship.

4. Updating and integrating systems

The University repository, Apollo, has been upgraded and was launched during Open Access Week. The upgrade has incorporated new services, including the ability to mint DOIs, which has been enthusiastically adopted. A new Request a Copy service for users wishing to obtain access to embargoed material is being heavily used without any promotion, with around 300 requests a month flowing through. This has been particularly important because we are depositing works prior to publication, so we have to put them under an infinite embargo until we know the publication date (at which time we can set the embargo lift date). With over 2,000 items still needing to be checked for their publication date, a large percentage of the contents of the repository is discoverable but closed under embargo.

To reduce the heavy manual workload associated with the deposit and processing of over 4,000 papers annually, the OSC is working with the Research Information Office on a systems integration programme between the University’s CRIS – Symplectic – and Apollo, while retaining our integrated helpdesk system, which uses a program called ZenDesk. This should allow better compliance reporting for the research community, and reduce manual uploading of articles.

But this process involves a great deal more than just metadata matching and coding, and touches on the extremely siloed nature of the support services being offered to our researchers across the institution. We are trying to work through these issues by instigating and participating in several initiatives with multiple administrative areas of the University. The OSC is taking the lead with a ‘Getting it Together’ project to align the communication sent to researchers through the research lifecycle. The project spans a range of administrative departments, including Communication, Research Operations, Research Strategy and University Information Systems, together termed the ‘Joined up Communications’ group. In addition we are heavily involved in the Coordinated and Functional Research Systems Group (CoFRS), the University Research Administration Systems Committee and the Cambridge Big Data Steering Group.

5. Pursuing a research agenda

Many staff members of the OSC originate from the research community and the team has a huge conference presence. The OSC team attended over 80 events in 2016, both within the UK and at major conferences worldwide, including Open Scholarship Initiative, FORCE2016, Open Repositories, International Digital Curation Conference, Electronic Thesis & Dissertations, Special Libraries Association, RLUK2016, IFLA, CILIP and Scientific Data Conference.

Increasingly the OSC team is being asked to share their knowledge and experience. In 2016 the team gave four keynote speeches, presented 18 sessions and ran one Master Class. The team has also acted as session chair for two conferences and convened two sessions.

5.1 Research projects

The OSC is undertaking several research projects. In relation to the changing nature of scholarly communication services within libraries, we are in the process of analysing job advertisements in the area of scholarly communication. We have also conducted a survey (which received over 500 responses) on the educational and training background of people working in the area of scholarly communication. The findings of these studies will be shared and published during 2017.

Dr Lauren Cadwallader was the first recipient of the Altmetrics Research Grant, which she used to explore the types and timings of online attention that journal articles received before they were incorporated into a policy document. The aim was to see if there was some way to help research administrators make an educated guess, rather than a best guess, at which papers will have high impact for the next REF exercise in the UK. Her findings were widely shared internationally, and there is interest in taking this work further.

The team is actively pursuing several research grant proposals. Other research includes an analysis of the data needs of the research community, undertaken in conjunction with Jisc.

5.2 Engaging with the research literature

Members of the OSC hold several editorial board positions, including two on the Data Science Journal and one on the Journal of Librarianship and Scholarly Communication. We also hold positions on the Advisory Board for PeerJ Preprints, and one staff member is the Associate Editor of the New Review of Academic Librarianship. The OSC team also act as peer reviewers for scholarly communication papers.

The OSC is working towards developing a culture of research and publishing amongst the library community at Cambridge, and is one of the founding members of the Centre for Evidence Based Librarianship and Information Practice (C-EBLIP) Research Network.

6. Staffing

The organisational layout remained relatively stable between 2015 and 2016, but this belies the perilous nature of the funding of the Office of Scholarly Communication. Of the 15 staff members, fewer than half are funded from ‘Chest’ (central University) funding. The remainder are paid from a combination of non-recurrent grants, RCUK funding and endowment funds.

The process of applying for funding, creating reports, meeting with key members of the University administration, working out budgets and, frankly, lobbying just to keep the team employed has taken a huge toll on the team. One result of the financial situation is that many staff – including some in crucial roles – are on short-term contracts, and several positions have turned over during the year. This means that a disproportionate amount of time is spent on recruitment. The systems for recruiting staff in the University are, shall we say, reflective of the age of the institution.

In 2016 alone, as the Head of the OSC, I personally wrote five job descriptions and progressed them through the (convoluted) HR review process. I conducted 32 interviews for OSC staff and participated in 10 interviews for staff elsewhere in the University where I have assisted with the recruitment. This has involved the assessment of 143 applications. Because each new contract has a probation period, I have undertaken 27 probationary interviews. Given that each of these activities involves one (or mostly more) other staff members, the impact of this issue in terms of staff time becomes apparent.

We also conducted some experiments with staffing last year. We have had a volunteer working with us on a research project and have run a ‘hotdesk’ arrangement with colleagues from the Research Information Office, the Research Operations Office and Cambridge University Press. We also conducted a successful ‘work from home’ pilot (a first for the University Library).

7. Plans for 2017

This year will herald some significant changes for the University – with a new Librarian starting in April and a new Vice Chancellor in September. This may determine where the OSC goes in the future, but plans are already underway for a big year in 2017.

As always, the OSC is considering both a practical and a political agenda. On the ‘political’ side of the fence we are pursuing an Open Research agenda for the University. We are about to kick off the two-year Open Research Pilot Project, which is a collaboration between the Office of Scholarly Communication and the Wellcome Trust Open Research team. The Project will look at gaining an understanding of what is needed for researchers to share and get credit for all outputs of the research process. These include non-positive results, protocols, source code, presentations and other research outputs beyond the remit of traditional publications. The Project aims to understand the barriers preventing researchers from sharing (including resource and time implications), as well as what incentivises the process.

We are also now at a stage where we need to look holistically at the way we access literature across the institution. This will be a big project incorporating many facets of the University community. It will also require substantial analysis of existing library data and the presentation of this information in an understandable graphic manner.

In terms of practical activities, our headline task is to completely integrate our open access workflows into University systems. In addition we are actively investigating how we can support our researchers with text and data mining (TDM). We are beginning to develop and roll out a ‘continuum’ of publishing options for the significant amount of grey literature produced within Cambridge. We are also expanding our range of teaching programmes – videos, online tools, and new types of workshops. On a technical level we are likely to be looking at the potential implementation of options offered by the Shared Repository Pilot, and developing solutions for managed access to data. We are also hoping to explore a data visualisation service for researchers.

Published 17 January 2017
Written by Dr Danny Kingsley
Creative Commons License

Further developing the library profession in 2016

In this blog post, Claire Sewell, the OSC’s Research Support Skills Coordinator, reflects on a busy year for the professional development of Cambridge library staff.

Librarians are always learning and 2016 was a bumper year for training in the Office of Scholarly Communication (OSC). The OSC has taken an active role in professional development since its foundation but things have stepped up since the dedicated training role of Research Support Skills Coordinator was established at the end of 2015.

The OSC runs two parallel professional development schemes for library staff:

Supporting Researchers in the 21st Century Programme

The Supporting Researchers Programme offers training in the area of scholarly communication to all library staff at Cambridge University and is designed to equip staff with the skills they will need to work in a modern academic library.

In 2016 there were a total of 30 events attracting an audience of nearly 500 library staff. Attendees were drawn from across faculty, college and the University Library with several repeat attendees. Topics covered included:

  • Altmetrics
  • Bibliometrics
  • Copyright
  • Metadata
  • Open Access
  • Research data management
  • Research integrity
  • Presentation skills

Attendees have been quick to praise the sessions offered with an average of 71% rating sessions as excellent. Feedback has also been positive:

“[I learnt] a lot about metrics and the confidence to go and find out more”.

“Very engaging. Like the speed, got through a lot without it getting too boring or slow!”

“Appreciated that we were walked through the process and implications of funding requirements”

A presentation skills workshop – Presentations: From Design to Delivery – was by far our most popular session of 2016. Although originally scheduled to run twice, three extra sessions had to be added to cope with demand. In total 71 library staff attended these sessions and consistently rated them as excellent. We hope to build on this success by offering further presentation skills training in 2017.

Research Support Ambassador Programme

This intensive programme ran from June to October 2016 and included sixteen participants from across colleges, departments and the University Library. This spread across the University is particularly gratifying as participation is voluntary. The Research Ambassadors embarked on a training programme made up of three strands:

  1. Targeted training sessions in areas covered by the remit of the Office of Scholarly Communication such as Open Access and Research Data Management
  2. The development of transferable skills such as leadership, presentation skills and working in teams
  3. Small group project work to create tangible training materials which can be shared across the wider library community

This programme has been adapted in response to feedback received after an initial pilot run in 2015. More structure was introduced through the regular training sessions which Ambassadors were required to attend. Extra optional sessions were also offered according to demand, mostly in relation to group projects. Lastly there was a narrower scope to the group project element to ensure that Ambassadors could complete the task within the time available.

The small group projects Ambassadors worked on aim to give back to the Cambridge library community by producing training materials that can be used by all under a Creative Commons licence. In 2016 Ambassadors worked on three projects:

  1. Digital Humanities webpages – webpages highlighting the work that Cambridge University Library is doing in this increasingly important area of scholarship.
  2. Metadata toolkit – these slides and associated activities can be used to teach the research community about the importance of metadata creation.
  3. Online videos – bite-sized videos which showcase various tools that will be of use to researchers in disseminating their research.

The Research Ambassadors are now able to work confidently in their own libraries to provide point-of-need help to the research community. At the same time they have improved their knowledge of the scholarly communication landscape and the range of ways in which they can support the research community.

Promotion

We’ve also been working hard to promote the training we offer in the OSC, both to Cambridge librarians and the wider world.

Webpages have been created for both the Supporting Researchers in the 21st Century and Research Support Ambassador programmes so that interested parties have something to refer to and all information is kept in an accessible place. We held two Research Support Ambassador Showcase sessions in April and October to allow Ambassadors to demonstrate their outcomes and reflect on their participation on both a personal and professional level. There have also been two blog posts about the initial run of the Ambassador programme from both an insider and observer perspective which helped to give new insight into the initiative.

We have more formal plans for promotion of the programme through conference proposals and journal article submissions. More details of these will be made available once we know the outcome!

Moving forward

We have some exciting plans for training in 2017. The OSC recently sent out a survey to help with planning our next round of training and the response has been overwhelming. Re-runs of some popular topics such as copyright and presentation skills were requested along with new sessions on search skills and researching in the workplace. It looks like 2017 is going to be an exciting year for training so please follow our progress via this blog and our training webpages.

Published 17 January 2017
Written by Claire Sewell 

Creative Commons License