Tips for preparing and presenting online learning

This week we had a group of library staff contribute to a roundtable discussion about online training. We were lucky to have visiting Australian Tom Worthington* talk to the group. These are some notes from the wide-ranging discussion.

Online approaches

In face-to-face teaching, a unit in philosophy taught over a semester is very different to a single training session in how to find something in a library catalogue. However, in practice, in the online world they are the same.

Tom noted that five years ago he decided to stop giving lectures and only deliver courses online. It has taken that time for him to feel comfortable with the online delivery.

The electronic equivalent of the traditional lecture is that you prepare a reading block, mail it to the students and give them exercises to do; they write up their answers and you give them comments. But there is an opportunity to do much more. An example is the book “ICT Sustainability: Assessment and Strategies for a Low Carbon Future”, used for an ANU course.

The ‘flipped classroom’ is an approach where the online component comes first. However, unless you give students a task they will come to the first day full of excuses. The convenor can give students blocks of exercises. In the face-to-face section you can then have the informal discussion and help them with problems. That works well.

Text-based courses can use video that someone else has recorded on the topic. The process is that the students:

  • Read the summary of the course
  • Do the readings
  • Do the test
  • Then have a discussion online or in person together.

Asynchronous courses ask students to contribute to an online text-based forum. An example is the set of questions for week one of “ICT Sustainability”. The students might be asked to answer questions – for example, to find a paper or video on the topic and say why it is relevant. They should post to the forum by the middle of the week and must reply to posts by two other students by Friday. Peer assessment is then used so that students mark each other’s work. It is good if their contribution is used in some way. Usually 10-20% of the marks are allocated to their contribution to the marking of each other’s work. Students will go to remarkable lengths to get a small number of marks. The task needs to make sense in terms of the student’s long-term goal.

One way of presenting a course is to provide small ‘units’ of information which are not timed. At the end of a unit the student does a test and when they pass they move onto the next section.

Using traditional eLearning you would have at most 24 people in each group – you usually still have ‘loud’ people you have to tell to stop writing/talking.

Course structures

There are standards for learning materials. The University provides considerable resources for Learning Aims and Outcomes.

It helps to have rigid statements about what the course is about. These should include learning objectives, how the course is broken up eg: two components and three sub components. See the introduction to “ICT Sustainability”.  Without this structure the student does not know where they are up to. You need to show participants there is a plan.

Tom noted it is important to tell participants why the course will be useful to them and how it will be useful to them. It is very important to provide markers throughout the course. Where are we up to and what is this for? Eg: this will increase your chances of getting a paper published.

It is all in the preparation

Academic bravado is ‘I have a lecture in 5 minutes, I had better get something together’. With eLearning you have to design all the materials and exercises before the course starts. Gather the materials together – but you always need to consider licenses. What you end up with is a repository – the equivalent of an electronic book, videos and quizzes.

Don’t add things on the fly. Once it starts you need to keep things stable. You can take online material and deliver it in person easily. It is much harder to do it the other way.

Preparing online courses is very labour intensive and traditional universities do not provide for the preparation time. However, the delivery is much quicker. If you are at a distance education university this is built into the system. But in a traditional university you only get paid for delivery. So the first time you run a course, it is at a ‘loss’, but each time after that it is easier. So try to minimise the material beforehand.

The question about the inability to get feedback from people was raised. With online fixed courses you don’t have a way to improve the content for your students. Tom suggested observing the test results helps. The dropout rate is an indicator (and you can always ask people why they dropped out). You can look at what they have been accessing. There is considerable research on ‘Learning Analytics’ and there are products for extracting the data from Moodle. There may be things they have not been looking at – it might be that the link has broken. The flipped classroom will give people the chance to fix things.

Some concern was raised about reusing information. One person noted that ‘internalising the material requires creating it myself’. The group agreed it was important to ensure the information is stitched together well so there is a real narrative. Tom noted we do have standardised educational materials – they are called text books. You can still use text books in an online course. If we have standard published sources then we should use them.

Preventing cheating

If you just give students reading materials online then they will not read them. You need to give them tasks to do and monitor the results. Give them multiple choice questions using immediately marked systems. Knowing there is a test at the end increases their learning (even if they get all the answers wrong). Even if the test is not for credit, students will still cheat.

Ways to prevent people cheating at online tests:

  • Limited number of attempts
  • Questions are selected from a bank at random
  • Positions of the multiple-choice answers are randomised
  • If numeric answers, the system generates a random set of values so each student gets a different question
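The measures in the list above can be sketched in a few lines of code. This is a minimal illustration only, not how Moodle or any other system implements them: the names `MultipleChoice`, `build_quiz`, `numeric_question` and `AttemptLimiter`, and the energy-use example question, are all assumptions made up for the sketch.

```python
import random
from dataclasses import dataclass


@dataclass
class MultipleChoice:
    """A hypothetical question record: one correct answer plus distractors."""
    prompt: str
    correct: str
    distractors: list[str]


def build_quiz(bank, n_questions, rng):
    """Select questions at random from the bank and shuffle answer positions."""
    chosen = rng.sample(bank, n_questions)
    quiz = []
    for question in chosen:
        options = [question.correct, *question.distractors]
        rng.shuffle(options)
        # Store the prompt, the shuffled options and the index of the right one.
        quiz.append((question.prompt, options, options.index(question.correct)))
    return quiz


def numeric_question(rng):
    """Generate a numeric question with random values and its expected answer."""
    watts = rng.randint(50, 500)
    hours = rng.randint(1, 24)
    prompt = f"A server draws {watts} W for {hours} hours. How many kWh does it use?"
    return prompt, watts * hours / 1000


class AttemptLimiter:
    """Refuse further attempts once a student has used up the allowed number."""

    def __init__(self, max_attempts):
        self.max_attempts = max_attempts
        self.used = {}

    def start_attempt(self, student_id):
        count = self.used.get(student_id, 0)
        if count >= self.max_attempts:
            return False
        self.used[student_id] = count + 1
        return True
```

Seeding the generator per student, for example `random.Random("student-1:quiz-1")`, gives each student a different but reproducible quiz, which helps if a result is later queried.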

Note that young digital natives are still academically illiterate – they do not naturally know how to write things with proper referencing. They write assignments with broken jargon and without proper referencing, and will copy things from Wikipedia. Tom said that he doesn’t call it ‘plagiarism’ in the first few weeks; he calls it ‘poor referencing’.

Encouraging attendance

The conversation moved to the library training environment, where we often have an opportunity to see people only once face to face. Students need to get feedback on the quizzes – there might not be anything after that.

There is nothing telling the students they have to go to these sessions apart from them thinking it might be helpful. So how do we leverage off a one off teaching slot? In that one off session – eg: ‘How can you become an expert in 10 minutes’ – can we replicate that type of activity online in the same sort of way?

Suggestions for encouraging attendance included:

  • Doing this as part of something for someone they respect.
  • Provide students the materials in the live sessions eg: worksheets and reading and exercises, then collect them into a coherent ebook, step by step.
  • Give them a certificate at the end – shake their hands and give it to them.

Getting university buy-in

Tom suggested that it is a good idea to tap into the national standard for what a student needs to do in a particular area. Each department will have their own way of doing it.

He suggested going to the international standardised skills framework, finding the skill that is relevant, and using the text describing this to be part of the course outline. Accrediting bodies in some areas will be useful for this (for example Engineering). You can use the description from this body. Doing this makes it easier to get a course approved. The Executive of institutions will support that kind of course.

Examples include the Skills Framework for the Information Age (SFIA), which is a computer science based set, and the Seoul Accord.

Online course technologies

Cambridge University uses the Moodle Virtual Learning Environment, which Tom noted is the equivalent of buying a Vauxhall – you can buy lots of parts and find lots of people to fix it. It is not very exciting. But it is fine.

Moodle is not really built for creating an ePortfolio which shows evidence the students know how to use something. Students collect material into their ePortfolio and then present it. Use simple social media where you say “please work on this and discuss it”. This can show evidence the student knows how to use something. One tool is the Mahara open source ePortfolio program.

Recording technology

Recording lectures is a challenge at Cambridge where there are not universal recording facilities in lecture theatres. But Tom noted that while sometimes having good technology can be useful – a document camera can show students how equations are done for example – using simple tools can work.

If you are giving a presentation, simply set a recording device on the desk. Students really like the recordings of a presentation. Tom noted that when recording something to go online it is much easier to give a presentation to a live audience rather than to an empty room. Note it is important to consider the legal issues of recording people – when they approach you to speak privately about something you need to turn the microphone off. It is also important to remember to repeat questions asked by the audience into the microphone.

Recording helps international students. They listen to the recording a minimum of six times. If you use the echo360 active learning program you can see how many views each part of the course is getting.

People are prepared to listen for 6-20 minutes. When putting recordings online you can have a talking head or show the PowerPoint slides. A good way of presenting a video recording is to show a talking head for the first few seconds so viewers see a human, then flip to the PowerPoint slides, then have the human again at the end. An alternative is to just use a static photograph. People will treat a smiley face as a person.

Tom noted that he has never been to a webinar session at university that worked properly. You spend half the time trying to get the technology to work. It is necessary to train people in the technology. Unless there is a need for a live session don’t do it. Digital native young people still have trouble with the technology.

Examples of good online teaching

The Open University in the UK has Open Learn and the Australian equivalent is Open 2 Study. The way these are set up – you do a short course for free then you can enrol in the longer one for a cost. The courses all started at 12 weeks, and they are now four weeks. MIT have created Open EdX.

* About the speaker

Tom Worthington is an independent computer consultant and educator. He is an Adjunct lecturer in the Research School of Computer Science at the Australian National University and a member of the ANU Climate Change and Energy Change Institutes. Tom also designs and teaches on-line courses for the Australian Computer Society (ACS) Virtual College. He was previously an IT policy advisor at the Australian Department of Defence. Tom is a Past President, Honorary Life Member and Fellow of the ACS, as well as a voting member of the Association for Computing Machinery and a member of the Institute of Electrical and Electronics Engineers.

Published 23 July 2015
Written by Dr Danny Kingsley
Creative Commons License

Dutch boycott of Elsevier – a game changer?

A long running dispute between Dutch universities and Elsevier has taken an interesting turn. Yesterday Koen Becking, chairman of the Executive Board of Tilburg University who has been negotiating with scientific publishers about an open access policy on behalf of Dutch universities with his colleague Gerard Meijer, announced a plan to start boycotting Elsevier.

As a first step in boycotting the publisher, the Association of Universities in the Netherlands (VSNU) has asked all scientists that are editor in chief of a journal published by Elsevier to give up their post. If this way of putting pressure on the publishers does not work, the next step would be to ask reviewers to stop working for Elsevier. After that, scientists could be asked to stop publishing in Elsevier journals.

The Netherlands has a clear position on Open Access. Sander Dekker, the State Secretary of Education, has taken a strong stance, stating at the opening of the 2014 academic year in Leiden that ‘Science is not a goal in itself. Just as art is only art once it is seen, knowledge only becomes knowledge once it is shared.’

Dekker has set two Open Access targets: 40% of scientific publications should be made available through Open Access by 2016, and 100% by 2024. The preferred route is through gold Open Access – where the work is ‘born Open Access’. This means there is no cost for readers – and no subscriptions.

However Gerard Meijer, who handles the negotiations with Elsevier, says that the parties have not been able to come close to an agreement.

Why is this boycott different?

It is true that boycotts have had different levels of success. In 2001, the Public Library of Science started as a non-profit organization of scientists ‘committed to making the world’s scientific and medical literature freely accessible to both scientists and to the public’. In 2001 PLoS (as it was then) published an open letter asking signatories to pledge to boycott toll-access publishers unless they become open-access publishers. The links to that original pledge are no longer available. Over 30,000 people signed, but did not act on their pledge. In response, PLOS became an open access publisher themselves, launching PLOS Biology in October 2003.

In 2012 Cambridge academic Tim Gowers started the Cost of Knowledge boycott of Elsevier, which now has over 15,000 signatures of researchers agreeing not to write for, review for, or edit for Elsevier. In 2014 Gowers used a series of Freedom of Information requests to find out how much Elsevier is charging different universities for licence subscriptions. Usually this information is a tightly held secret, as individual universities pay considerably different amounts for access to the same material.

The 2015 Dutch boycott is significant. Typically negotiations with publishers occur at an institutional level and with representatives from the university libraries. This makes sense as libraries have long standing relationships with publishers and understand the minutiae of the licensing processes. However, the Dutch negotiations have been led by the Vice Chancellors of the universities. It is a country-wide negotiation at the highest level. And Vice Chancellors have the ability to request behaviour change of their research communities.

This boycott has the potential to be a significant game changer in the relationship between the research community and the world’s largest academic publisher. The remainder of this blog looks at some of the facts and figures relating to expenditure on Open Access in the UK. It underlines the importance of the Dutch position.

UK Open Access policies mean MORE publisher profit

There have also been difficulties in the UK in relation to negotiations over payment for Open Access. Elsevier has consistently resisted efforts by Jisc to negotiate an offsetting deal – where a publisher provides some sort of concession for the fact that universities in the UK are paying unprecedented amounts in Article Processing Charges on top of their subscriptions because of the RCUK open access policy.

Elsevier is the world’s largest academic publisher. According to their Annual Report the 2014 STM revenue was £2,048 million, with an operating profit of £762 million. This is a profit margin of 37%. That means if we pay an Article Processing Charge of $3,000 then about $1,110 of that (taxpayers’) money goes directly to the shareholders of Elsevier.
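The arithmetic behind these figures can be checked with a short calculation. This is a minimal sketch that assumes, as the paragraph above does, that the overall operating margin applies evenly to every charge paid; the function names are illustrative only.

```python
def profit_margin(operating_profit, revenue):
    """Operating profit as a fraction of revenue."""
    return operating_profit / revenue


def profit_share_of_charge(charge, margin):
    """Portion of a single charge that corresponds to profit at a given margin."""
    return charge * margin


# Figures quoted from the 2014 Annual Report, in millions of pounds.
margin = profit_margin(operating_profit=762, revenue=2048)

# Round the margin to whole percent, as the text does, before applying it
# to a $3,000 Article Processing Charge.
rounded_margin = round(margin, 2)
share = profit_share_of_charge(3000, rounded_margin)

print(f"Profit margin: {margin:.1%}")
print(f"Share of a $3,000 APC at {rounded_margin:.0%}: ${share:,.0f}")
```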

The numbers involved in this space are staggering. The Wellcome Trust stated in their report on 3 March 2015 The Reckoning: An Analysis of Wellcome Trust Open Access Spend 2013 – 14: ‘The two traditional, subscription-based publishers (Elsevier and Wiley) represent some 40% of our total APC spend’.

And the RCUK has had similar results, as described in a Times Higher Education article on 16 April 2015 Publishers share £10m in APC payments: “Publishers Elsevier and Wiley have each received about £2 million in article processing charges from 55 institutions as a result of RCUK’s open access policy”.

Hybrid open access – more expensive and often not compliant

Another factor is the considerably higher cost of Article Processing Charges for making an individual article Open Access within an otherwise subscription journal (called ‘hybrid’ publishing) compared to the Article Processing Charges for articles in fully Open Access journals.

In The Reckoning: An Analysis of Wellcome Trust Open Access Spend 2013 – 14, the conclusion was that the average Article Processing Charge levied by hybrid journals is 64% higher than the average Article Processing Charge of a fully Open Access title. The March 2015 Review of the implementation of the RCUK Policy on Open Access concluded the Article Processing Charges for hybrid Open Access were ‘significantly more expensive’ than fully OA journals, ‘despite the fact that hybrid journals still enjoyed a revenue stream through subscriptions’.

Elsevier has stated that in 2013 they published 330,000 subscription articles and 6,000 author paid articles. There is no breakdown of how many of those 6,000 were in fully open access journals and how many were hybrid. However, in 2014 Elsevier had 1600 journals offering their hybrid option, and 100 journals that were fully open access (6%). Note that the RCUK open access policy came into force in April 2013. It would be interesting to compare these figures with the 2014 ones; however, I have been unable to find them.

While the higher cost for hybrid Article Processing Charges is in itself an issue, there is a further problem. Articles in hybrid journals for which an Article Processing Charge has been paid are not always made available at all, or are available but not under the correct licence as required by the fund paying the fee. Here at Cambridge, the five most problematic publishers with whom we have paid more than 10 Article Processing Charges have a non-compliance rate of 11-25%. With this group of publishers we are having to chase up between three and 31 articles per publisher. This takes considerable time and significantly adds to the cost of compliance with the RCUK and COAF policies.

According to the March 2015 Review of the implementation of the RCUK Policy on Open Access, ‘Elsevier stated that around 40% of the articles from RCUK funding that they had published gold were not under the CC-BY licence and are therefore not compliant with the policy’ (p19).

We support our Dutch colleagues

In summary, the work happening in The Netherlands to break the stranglehold Elsevier have on the research community is important. We need to stand by and support our Dutch colleagues.

NOTE: This blog was subsequently reblogged on the London School of Economics Impact Blog and later listed as one of the Top Ten Posts for 2015: Open Access. It was also listed as one of the blogs that had an average minute per page measurement of over 6 minutes and 30 seconds.

Published 3 July 2015, added to on 22 January 2016
Written by Dr Danny Kingsley

Libraries of the future – insights from a talk by Lorcan Dempsey

There is no argument even from traditionalists that the library role is changing. But there is a great deal of confusion and sometimes fear about what that means, and what the future might look like.

On 3 June, Lorcan Dempsey* came to speak to staff at Cambridge University Library about how the role and purpose of libraries are changing. The slides from his talk are available on Slideshare.

The one sentence headline from the talk was that research libraries are moving from licensing published content to managing workflow and research outputs – which means the print collection needs to be managed down to free up resources for the new roles. The subhead is – if we don’t do this, the publishers are waiting in the wings to take over.

Modern libraries in research environments

The Library role is distilling from owned materials to facilitating access to many things. Changing focus to ‘discovery’ in the collection means there must be a loss of some of the items.

Lorcan noted that his sense is that there is still a low uptake of this concept. As someone who has been working in scholarly communications for over a decade, I agree.

The collection as a means to an end rather than an end in itself – in some ways this is obvious but in others it is a huge psychological shift.

  • In a print world, researchers and learners organised their workflow around the library. The library had a limited interaction with the full process.
  • In a digital world the Library needs to organise itself around the workflows of researchers and learners. Workflows generate and consume information resources.

In libraries there is a separation of the discovery and the collection – library users are on the global level. The library will make some available and own some of those.

Change of focus

The research endeavour has moved from a focus on outcomes to begin to think about a range of activities around the process and the aftermath.

The traditional role of a library – outside of special collections and manuscripts – deals with outcomes like the books and journals. In this model students and researchers interact with the books and journals and then turn it into classically published works that come back into the library.

But we live now in an online world and the Library is interacting with the content in many different ways. There is interest in the process of research – methods, evidence and research data. There is also interest in the discussion around research through pre-prints, working papers and a variety of prepublication activity. This involves revision, derivative works and reuse. Copyright is important in these cases to let people know how things will be used.

This means that ‘collections’ from a library perspective now include the process, methods, discussions and outputs as well as books and journals.

From curation to creation

Mediated access to licensed material is becoming more streamlined, and other items are becoming more available. Libraries are supporting creation, not just consumption.

Libraries need to be seen as a source for collaboration. There needs to be a partnership between the Library and the Faculty. The library is a partner in terms of the creation activity. The mediating role will continue.

Managing this transition is often done opportunistically – when people retire they are replaced with new skill sets.

The repository was seen by the Libraries initially as something in relation to artefacts but now it is seen as part of the workflow of the research lifecycle. There is less attention on what is coming in and more focus on sharing material back out.

Show me the money

There is a question around how much of this activity will be supported by the institution. And how is the resource shifting occurring in the libraries?

Lorcan said that while libraries talk about a growing interest in special collections and archives, there is no evidence from a budget perspective that this is being supported.

Publishers are trying to muscle in

Managing online identities

There is considerable interest amongst researchers in having a carefully tended online presence. This is time consuming, and would appear to be important to the researchers. This process is becoming intimately tied to publication – it is where people are announcing their publication.

Lorcan mentioned a study in Nature which was a survey of 3,000 scientists and engineers. They found 6% used Google Scholar, but more than half were using ResearchGate more regularly than LinkedIn. Not surprisingly this behaviour can be broken down by discipline. The social sciences tend to use Google Scholar, and Academia.edu has low use by engineering and sciences. There are many solutions to the workflow to help researchers. Many of these will go away, but some are quite heavily used.

Thinking historically, a catalogue covers the material a library owns. The library has a discovery layer and a license. However this activity will have to shift to support creation. We have repositories, Google Scholar, ResearchGate etc. The incentive to use the repository is very low compared with these services.

A gap in the market?

Workflow is the new content – managing identity is where a lot of the focus is. Publishers are trying to position themselves as the service provider in this space.

Many libraries do not see their role as managing evolving scholarly records – the research and learning material. The curation of identity for researcher profiles is a big interest. This is often currently managed by research offices.

However this is a space into which publishers are moving. Several big publishers are now trying to be part of the full cycle for researchers.

For example, Elsevier has two products – Pure is a content management system for research reporting and Mendeley is an academic social network. It is no coincidence that the word ‘solution’ is in the url thread. Similarly, Macmillan (publishers of Nature) recently bought Digital Science, the company that created the equivalent products Symplectic and figshare. Digital Science was not included in the Macmillan Springer merger, possibly because they still need substantial investment. Lorcan noted that people see them as ‘plucky start-ups’ but they are owned by big publishers. There has been a big take-up of these services.

Lorcan showed a quote from Annette Thomas, CEO of Macmillan Publishers, about ‘A publisher’s new job description’.

Her view is that publishers are here to make the scientific research process more effective by helping them keep up to date, find colleagues, plan experiments, and then share their results. After they have published, the process continues with gaining a reputation, obtaining funds, finding collaborators, and even finding a new job. What can we as publishers do to address some of the scientists’ pain points? 

As Lorcan observed – you can take out the word ‘publisher’ and replace with the word ‘library’.

Managing down collections

Libraries are increasingly wanting to organise their space around the student experience not around collections. Lorcan used a grid to illustrate the changing focus. The two distinctions were:

  • Whether items are in many collections or are rare or unique.
  • Whether items require stewardship. Items that are high stewardship items are looked after and resources are spent on them. Items that are low stewardship don’t get looked after in Libraries.

At one extreme are licensed materials, which are high stewardship/many collections. In the opposite corner are research materials, which are only available in a few collections and are low stewardship.

Lorcan said he thinks in the future there will be a focus on distinctive collections. There needs to be a lot more money to do this. So licensed purchased material will be more streamlined. Management attention 15 years ago was on highly managed, licensed items, but now the focus has shifted to items of low stewardship.

Inside out library

Market materials: licensed/purchased stuff. The Library acts as broker, telling users that these things are available in a special way.

Distinctive collections: the Library is the provider and wants to maximise discoverability. We want other people to know about faculty expertise and research data. Putting these into our own discovery layer doesn’t help there. We need to think about metadata and which aggregators are important. Libraries have been slow to realise that discoverability is vital.

These have very different dynamics. We want to share material held within the library with the rest of the world. The licensed stuff is external and libraries bring it in to share internally. This is inside out.

Traditionally libraries deal with published, purchased material (including special collections). However there is a shift away from acquisitions to demand. This means that libraries need to redirect their resources towards research support. One way of doing this is to manage down the print collection.

There was an explosion of publishing after the Second World War. In the same way that baby boomers are all retiring at the same time, we are now faced with the challenge of managing these collections down.

Challenges for identity

The managing down of print collections coincides with the push to repurpose space in libraries. There are many discussions with architects – managing down print means there must be refurbishment.

One of the issues emerging for libraries is: Without the books, does the campus see this as the Library? Is the space needed for the Library – could it be replaced by learning commons or the student union?

We can see the identity discussion about libraries emerging now. If we are managing down collections, what is the space for? What are the new services we offer? Lorcan mentioned media stories where librarians are being attacked by historians who see this as managerial, technocratic activity.

Lorcan described some of the shared collection activities happening in the USA.

Conclusion

We used to think of the Library as a collection. Now we need to think of the Library in terms of the user and their workflows.

We must move to more facilitated access to items, and also move to the management and disclosure of curated materials. The print and digital scholarly record needs curation and co-ordination at a conscious national level.

The job is about restructuring the means but we need to make decisions about moving resources or bets on the future. Libraries must shift from an organisation where the end was known to one where we must take some risk.

Published 6 June 2015
Written by Dr Danny Kingsley

* Lorcan coordinates strategic planning and oversees Research, Membership and Community Relations at OCLC. He has worked for library and educational organizations in Ireland, the UK and the US. His influence on national policy and library directions is widely recognized. He is on Twitter – @LorcanID

 

In conversation with Ben Ryan from EPSRC

Cambridge University hosted Ben Ryan and Amanda Chmura from the Engineering and Physical Sciences Research Council (EPSRC) on Friday 15 May for a discussion about how the University is meeting the EPSRC expectations for sharing research data.

We started the conversation with a demonstration of the services we offer our researchers including our Research Data Management website, and talked about the open data sessions and other training events we have been holding. So far we have managed to speak to 764 researchers about data sharing requirements (the numbers continue to grow).

Managing expectations

In 2011 EPSRC published nine key expectations on research data management. The expectations are directed principally at research organisations and highlight their role in supporting researchers to ensure research data is properly managed. EPSRC set a deadline, 1 May 2015, for research organisation compliance with their expectations.

One of the expectations is that data supporting publications arising from funded research is openly available – this reflects the Common Principles on Data Policy published by RCUK (2011) and in the Royal Society’s subsequent (2012) report ‘Science as a Public Enterprise’. To monitor compliance with this expectation EPSRC have said that this autumn they will conduct checks of papers published after 1 May 2015 to ensure these provide appropriate directions to the supporting data.

Ben clarified that the checks will help to determine the level of awareness of the policy and expectations. He noted that there is a balance in what the EPSRC is trying to do. They are trying to create a new research culture, and they are primarily focused on what the institution should be doing to support that.

According to the EPSRC policy, in situations where research arises from collaborations, or from work partially funded by commercial partners, any potential problems with research data sharing should be addressed before the start of the project, in a data management plan. We therefore asked Ben why the EPSRC – of all the RCUK funding bodies – doesn’t require researchers to create a data management plan. Ben indicated that the main value in data management planning is to the researcher and the research organisation. Adding plans to EPSRC’s funding submission process would simply add to the admin and peer review burden, and it is not clear how peer reviewers could properly judge them because they don’t know the infrastructure available where the research is being conducted.

The question arose of whether a single RCUK policy on research data might be possible. Ben noted that the different councils fund different types of work, which informs their individual policies, and explained that although a single policy might be achievable it would require every council to change their existing policy and would be very disruptive of current processes across the whole system. As such he felt it would need a ‘very strong steer externally’ to drive such a change.

However, the research councils recognise the need for more guidance and are about to publish cross-council guidelines presenting a collective position on what should be done with particular types of data.

Clarification

A question that often arises from researchers is ‘what data are we expected to keep and make available’? We were able to get confirmation that it is:

  • the data that underpins publications
  • the data that validates research findings
  • the data that is worth keeping

All questions should be answered by considering the principles behind the policy. The default position is data should be open – in a way that does not damage the research process. The important thing is that the validity of the published research findings is testable.

An example of the way this principle can be used is when considering another common question – what to do in the situation where several papers are expected to come out of the one set of data. Researchers are concerned that if they release the data on the first publication it jeopardises their subsequent publications as they may be scooped. Ben acknowledged this is a concern but asked whether it is reasonable to sit on data for, say, five years so that other people end up being funded to generate the same data again.

He pointed out that the RCUK Common Principles state that those who undertake Research Council funded work may be entitled to a limited period of privileged use of the data they have collected to enable them to publish the results of their research. However, the length of this period varies by research discipline.

There is also the consideration of the way another user can access the data and reproduce results. The question is – how far do we go to enable a user to reproduce the work? The minimum is that we should provide the information that someone would need to be able to validate published work – this is also critical to maximise the impact of publicly funded research and to maintain public trust in science and research.

The software situation

We had representatives from Cambridge Enterprise and from the School of Technology at the meeting who had specific questions about sharing software. While Ben indicated he might need to reflect on some of the questions, we did come to some clarification on others.

Although software is different from other forms of intellectual property the same basic question arises: “is the institution best served by making it freely available or by commercialising it?” Both approaches can lead to the creation of jobs and economic impact. EPSRC is clear that the choice of exploitation strategy rests with the research organisation.

The EPSRC does not have an expectation about the licence under which software should be released.

It was agreed that if there is material that is potentially commercial, then we should take the steps to make it available and commercialise the software. It was confirmed we are able to make software arising from a research project available free for non-commercial re-use by other researchers (within the academic community) while at the same time making it available to others under a commercial licence.

One can argue that since the taxpayer funded the work in the first place the taxpayer should not have to pay for it again, but this position, taken to its natural conclusion, of course would mean that no commercialisation of funded research should ever occur.

There is also the situation where a researcher has put their ‘life and soul’ into generating outputs and naturally feels they have some ownership of the work. Ben agreed that many of these questions are ‘very challenging’, but noted that researchers seldom ‘own’ their outputs – under RCUK grant conditions the research organisation owns all the intellectual assets arising from the funded research and is responsible for seeing that they are used to the benefit of society and the economy. Some of these questions stem from a mindset that insufficiently recognises the importance of ensuring that the economy and society as a whole benefits from publicly funded research, and a culture change is needed in addition to new processes.

The EPSRC do wish to avoid people sitting on data indefinitely because they don’t want to release their software. Ben said that in principle it is permissible for people to make software available through GitHub, but he would need to investigate how sustainable it is and how it is governed before being able to say whether GitHub is a reasonable option in terms of meeting EPSRC expectations.

Addressing (some) concerns

Time prevented us covering all of the topics we wished to raise. Many Cambridge researchers have raised questions about sharing data from collaborations – with concern that non-UK partners who do not have a data sharing requirement may find the UK requirements onerous and that this could decrease the number of international collaborations in which UK institutions are involved.

There was also no magic bullet for the challenge of paying the not insignificant cost of storing research data safely for 10 years+. The problem is that where researchers were unaware of this expectation at the time they applied for their grant there is no allowance for it in their budget. This will not be an issue in the future as current grants are approved, but we are in a transition period now as the research from existing grants is published and the supporting data is being made available and stored. When we discussed this, Ben explained that the EPSRC does not have any additional funds to support this transition period, and that the costs need to be found within existing resources.

There have been some challenges with communication of the EPSRC policy. Many researchers at the University of Cambridge have said they would have liked to be informed about it directly by EPSRC (as they would expect to have been by, for example, the Wellcome Trust). Ben explained that the approach had deliberately been to communicate the policy through research organisation senior managers (e.g. ProVCs Research), and that this was because the expectations are addressed principally to research institutions, which have primary responsibility for ensuring that researchers manage their data effectively and have access to appropriate facilities to do so. However, he acknowledged that EPSRC could have communicated more with researchers and undertook to explore how more information could be made available directly to researchers.

Therefore it was helpful to be able to express some of the concerns and fears amongst the research community. We have been collating the questions that people have asked during our sessions and will compile a FAQ from this that will appear on our Research Data Management website. Ben indicated that there might be a possibility of a selection of these FAQs also appearing on the RCUK website to help address the universal questions about sharing research data. This step would be welcomed by the University.

Published 21 May 2015
Written by Dr Danny Kingsley
Creative Commons License

Data management – one size does not fit all

As the Research Data Facilitator at the University of Cambridge, I am part of the team establishing a Research Data Management (RDM) Facility at the University. This blog is a note of my impressions from the Digital Curation Centre (DCC) meeting held in London on the 28th April 2015: Preparing Data for Deposit.

As always, the DCC meeting was extremely useful for networking. I met with people in similar roles at other institutions. And again, the breakout sessions were invaluable – they allowed us to exchange precious experience, feedback gained and lessons learnt while developing RDM services.

What could have been better, though, is the appreciation of differences between universities.

Unrealistic staffing

The talk from the keynote speaker, Louise Corti, the Associate Director at the UK Data Service, was very inspirational. I loved the uplifting expression that RDM supporters are like artists evangelising researchers. It was great to hear about RDM solutions available at the UK Data Service, and the professional approach to research data, with every aspect of data curation addressed by the excellent team of 70 dedicated people, with precise workflows for data processing.

However, how realistic is it for a university to develop similar solutions locally? Which University would be able to dedicate a similar amount of resources to the development of an RDM facility?

At the University of Cambridge, I am the only full-time employee dedicated to establishing and providing RDM services to our researchers. There is a team of people supporting the facility but these staff are shared with other projects. I would have very much appreciated hearing what scalable solution the UK Data Service would recommend universities develop, knowing that the resources available are nowhere near what a 70-person team could offer.

Scalability

On the other hand, we had a presentation from the University of Loughborough. The University, represented by Gary Brewerton, teamed up with Figshare and Arkivum (Mark Hahnel and Matthew Addis, respectively). The three of them explained to us the infrastructure developed to support RDM at the University of Loughborough. The University data repository, DSpace, has been equipped with archival storage provided by Arkivum, which guarantees 100% data integrity. Additionally, researchers at the University of Loughborough can benefit from the use of Figshare, which provides them with a user-friendly research data sharing platform.

These systems seemed to offer excellent solutions to researchers, but somehow I could not help having the impression of listening to sales pitches. Are there any disadvantages of these solutions? Are there any alternatives?

Figshare charges for the file transfer (downloading of openly accessible data is actually not free for institutions). How substantial would these charges be for bigger institutions, producing huge amounts of valuable research data, frequently sought after and downloaded by others? Would institutions be able to sustain the cost of data access to their most valuable research datasets?

Risk management

The Loughborough solutions do not appear to take into account risks associated with implementation of services from third party providers at bigger, research-intensive universities. At the University of Cambridge we have almost 300 EPSRC-funded research grants. In April this year alone our data repository received 40GB of research data deposits coming from EPSRC-funded projects. Producing valuable research outputs is business-critical for universities.

What would be the costs associated with the data transfer of supposedly open-access datasets if these were available via Figshare? Is there any upper limit on possible transfer charges?

What is the long-term risk of handing over a university’s research data holdings to a third party service provider? Note that some UK research funders expect data to be stored long-term, and in some cases in perpetuity (10 years from the last access). What will be the conditions for research data storage offered by these external providers in 10, 20, 30 years’ time? How will the cost change? Will it be easy/possible to transfer all research data somewhere else?

Figshare has recently entered into a legal partnership with Macmillan (you can read more about it in a blog post from Dr Peter Murray-Rust) – how will this partnership evolve in the future?

Suggestion

It would be extremely valuable if RDM solutions proposed at DCC meetings could be discussed taking into account the size of the institution, the amount of research conducted at the University, and the size of the RDM team locally available to work on the implementation of the solution.

One size does not and will not fit all, and a better recognition of differences between organisations would greatly help in developing optimal solutions for each individual institution. Additionally, it seems to me of key importance to talk openly about the drawbacks of each solution so that universities can efficiently mitigate future risks.

Published 14 May 2015
Written by Dr Marta Teperek
Creative Commons License

Benchmarking the Cambridge RDM program

Cambridge University released its Research Data Management Policy Framework today.

This is a good opportunity to assess whether Cambridge is fulfilling the 10 recommendations for libraries on how to get started in data management presented in the final report of the LIBER working group on E-Science / Research Data Management. Since publication in July 2012, this is the most downloaded item from the Association of European Research Libraries (LIBER) website. We list below the 10 recommendations and what Cambridge is doing to meet them.

Benchmarking against RDM recommendations

  1. Offer research data management support, including data management plans for grant applications, intellectual property rights advice and information materials. Assist faculty with data management plans and the integration of data management into the curriculum.

The Open Data team at the University of Cambridge has created a comprehensive dedicated website for research data management. The website provides researchers with guidance on various aspects of research data management from project design and data management planning, through data collection and maintenance, to data curation and sharing.

The University also offers numerous workshops and training on research data management. On-demand assistance with all aspects of research data management is available to researchers via a simple website support request form.

  2. Engage in the development of metadata and data standards and provide metadata services for research data.

The University of Cambridge is actively involved in developing metadata standards. All research data depositions to the University data repository occur via a simple website form. This form collects information on metadata descriptions and provides guidance on what should be included in each description field. All research data and metadata descriptions submitted to the University repository are carefully curated by our repository managers.

  3. Create Data Librarian posts and develop professional staff skills for data librarianship.

Cambridge Library has a dedicated research data management working group composed of librarians across various University departments who are actively involved in Open Access. The research data management working group is designing and delivering a series of training sessions and workshops for the broader library community to equip them with professional research data management support skills.

  4. Actively participate in institutional research data policy development, including resource plans. Encourage and adopt open data policies where appropriate in the research data life cycle.

The newly released Research Data Management Policy Framework builds on policy frameworks in place since 2013.  The policy framework encourages the University researchers and research students to share their research data as widely and openly as possible, and provides guidance on best practice for data sharing.

  5. Liaise and partner with researchers, research groups, data archives and data centers [sic] to foster an interoperable infrastructure for data access, discovery and data sharing.

The Open Data Project Working Group at the University of Cambridge consists of members from several independent operational units at the University. These include the Cambridge University Library, the Research Operations Office, the Research Strategy Office and the University Information Services. This ensures a deep integration and engagement within the broader University structure.

Additionally, members of the Open Data team are conducting daily consultations with researchers and with research support staff across all departments at the University, to ensure that the developed research data management services are tailored to meet their needs.

  6. Support the lifecycle for research data by providing services for storage, discovery and permanent access.

At the University of Cambridge the University Information Services provide researchers with day to day research data management solutions, such as platforms for file sharing, data storage and backup. The Open Data team ensures that shareable research data is deposited into a suitable data repository (guaranteeing long term data sustainability) and shared as widely and openly as possible.

  7. Promote research data citation by applying persistent identifiers to research data.

The University of Cambridge data repository mints persistent links to each deposited research dataset. Additionally, the repository is currently being upgraded to enable minting of DOIs (digital object identifiers). These are all persistent links, and their use ensures access to data over the long term preservation period and facilitates data citation.

  8. Provide an institutional Data Catalogue or Data Repository, depending on available infrastructure.

The University of Cambridge provides both a data repository and a data registry. Our institutional repository has accepted research datasets since 2005. The University of Cambridge aims to ultimately be able to streamline and record in an automated way information about metadata descriptions from all repositories used by our researchers.

  9. Get involved in subject specific data management practice.

Respect for subject-specific differences in data management practice is recognised and affirmed throughout the University of Cambridge Research Data Management Policy Framework. The University recognises that research data management solutions need to be tailored to researchers working in different disciplines. Therefore, the Open Data team conducts daily consultations with researchers from all different fields of study – to better understand individual needs and to tailor research data management support appropriately.

  10. Offer or mediate secure storage for dynamic and static research data in co-operation with institutional IT units and/or seek exploitation of appropriate cloud services.

The University Information Services (members of which are part of the Open Data Team) are currently developing a cloud-based, Dropbox-like storage solution to facilitate easy and secure data storage and sharing between collaborators.

Published 28 April 2015
Written by Dr Marta Teperek and Dr Danny Kingsley
Creative Commons License

Good news stories about data sharing?

We have been speaking to researchers around the University recently to discuss the expectations of their funders in relation to data management. This has raised the issue of how best to convince people this is a process that benefits society rather than a waste of time or just yet another thing they are being ‘forced to do’ – which is the perspective of some that we have spoken with.

Policy requirements

In general most funders require a Research Data Management Plan to be developed at the beginning of the project – and then adhered to. But the Engineering and Physical Sciences Research Council (EPSRC) have upped the ante by introducing a policy requiring that papers published from May 2015 onwards resulting from funded research include a statement about where the supporting research data may be accessed. The data needs to be available in a secure storage facility with a persistent URL, and it must be available for 10 years from the last time it was accessed.
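For those who find it easier to see the '10 years from last access' rule as a calculation, here is a minimal sketch in Python. It only illustrates the rolling nature of the window as described above; the function names and the handling of 29 February are my own assumptions, not anything specified by EPSRC.

```python
from datetime import date

# Retention window as described in the post: 10 years from the last access.
RETENTION_YEARS = 10


def retention_end(last_access: date, years: int = RETENTION_YEARS) -> date:
    """Return the date until which the data must remain available.

    Hypothetical helper: each new access pushes the end date out again.
    """
    try:
        return last_access.replace(year=last_access.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to
        # 28 February (an assumption made for this illustration).
        return last_access.replace(year=last_access.year + years, day=28)


def must_still_be_kept(last_access: date, today: date) -> bool:
    """True while 'today' is still inside the retention window."""
    return today <= retention_end(last_access)


# A dataset deposited on 1 May 2015 and never accessed again
first_deadline = retention_end(date(2015, 5, 1))

# The same dataset downloaded on 10 March 2020: the clock restarts
extended_deadline = retention_end(date(2020, 3, 10))
```

The point of the sketch is that the clock restarts on every access, so a frequently downloaded dataset effectively has to be kept indefinitely.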

Carrot or stick?

While having a policy from funders does make researchers sit up and listen, there is a perception in the UK research community that this is yet another impost on time-poor researchers. This is not surprising. There has recently been an acceleration of new rules about sharing and assessing research.

The Research Excellence Framework (REF) occurred last year, and many researchers are still ‘recuperating’. Now the Higher Education Funding Council for England (HEFCE) is introducing a policy in April 2016 that any peer reviewed article or conference paper that is to be included in the post-2014 REF must have been deposited to their institution’s repository within three months of acceptance or it cannot be counted. This policy is a ‘green’ open access policy.

The Research Councils UK (RCUK) have had an open access policy in place for two years, introduced on 1 April 2013 as a result of the 2012 Finch Report. The RCUK policy states that funded research outputs must be available open access, and it is permitted to make them available through deposit into a repository. At first glance this seems to align with the HEFCE policy. However, restrictions on the allowed embargo periods mean that in practice most articles must be made available gold open access – usually with the payment of an accompanying article processing charge. While these charges are supported by a block grant fund, there is considerable impost on the institutions to manage these.

There is also considerable confusion amongst researchers about what all these policies mean and how they relate to each other.

Data as a system

We are trying to find some examples of how making research data available can help research and society. It is unrealistic to hope for something along the lines of Jack Andraka’s breakthrough for a diagnostic test for pancreatic cancer using only open access research.

That’s why I was pleased when Nicholas Gruen pointed me to a report he co-authored: Open for Business: How Open Data Can Help Achieve the G20 Growth Target – A Lateral Economics report commissioned by Omidyar Network – published in June 2014.

This report is looking primarily at government data but does consider access to data generated in publicly funded research. It makes some interesting observations about what can happen when data is made available. The consideration is that data can have properties at the system level, not just the individual  level of a particular data set.

The point is that if data does behave in this way, once a collection of data becomes sufficiently large then the addition of one more set of data could cause the “entire network to jump to a new state in which the connections and the payoffs change dramatically, perhaps by several orders of magnitude”.

Benefits of sharing data

The report also refers to a 2014 report The Value and Impact of Data Sharing and Curation: A synthesis of three recent studies of UK research data centres. This work explored the value and impact of curating and sharing research data through three well-established UK research data centres – the Archaeological Data Service, the Economic and Social Data Services, and the British Atmospheric Data Centre.

In summarising the results, Beagrie and Houghton noted that their economic analysis indicated that:

  • Very significant increases in research, teaching and studying efficiency were realised by the users as a result of their use of the data centres;
  • The value to users exceeds the investment made in data sharing and curation via the centres in all three cases; and
  • By facilitating additional use, the data centres significantly increase the measurable returns on investment in the creation/collection of the data hosted.
So clearly there are good stories out there.

If you know of any good news stories that have arisen from sharing UK research output data we would love to hear them. Email us or leave a comment!

Interview with Nigel Shadbolt on The Life Scientific

Sir Nigel Shadbolt was interviewed on ‘The Life Scientific’ this morning on BBC Radio 4 about open data.

The general discussion ranged from his background to what got him interested in this area. The data being discussed is more government public data (such as medical information or cyclist black spots) than data generated in research projects, but it was an interesting conversation nonetheless. A couple of items that jumped out at me:

16:50 – When we talk about data, really we are talking about information … Data and information and knowledge are kinda different and mostly when we talk about open data we are talking about information. Data (such as a number) only becomes information if it is placed in context. If you can do something with the information then it becomes knowledge – ‘actionable information’. These are different strains of stuff that the computer holds.  We need open information to build knowledge. The semantic web.

16:00 – Do the risks of making data available outweigh the benefits? And do we ask the general public’s opinion or just tell them that this is what we do? They want some sort of empowerment in this but often there is no empowerment.

29:00 – We are barely scratching the surface in terms of the insights as we analyse and look for patterns in the information. We are living in a world that is increasingly emitting data – people are increasingly able to collect data onto and off their phones (or supercomputers, depending on how you look at it). This data richness demands a new world for applications we haven’t thought of and ways of analysing the information.

Listen to the half hour interview here.

Blurb from the BBC webpage:

Sir Nigel Shadbolt, Professor of Artificial Intelligence at Southampton University, believes in the power of open data. With Sir Tim Berners-Lee he persuaded two UK Prime Ministers of the importance of letting us all get our hands on information that’s been collected about us by the government and other organisations. But, this has brought him into conflict with people who think there’s money to be made from this data. And open data raises issues of privacy.

Nigel Shadbolt talks to Jim al-Khalili about how a degree in psychology and philosophy led to a career researching artificial intelligence and a passion for open data.

Published 14 April 2015
Written by Dr Danny Kingsley
Creative Commons License

A review of the RCUK review of implementation of its OA policy

The RCUK released its ‘Review of the implementation of the RCUK Policy on Open Access’ today and it makes interesting reading. First I should state that I think this is a good report: it seems well researched and balanced in tone, and it is well written and laid out. Jisc also welcomes the report.

Overall findings

It seems that a ‘common factor’ amongst all of the people and groups interviewed was ‘a general acceptance and welcome given to the concept of open access’. However, the administrative effort to implement the policy and distribute the funds is significant. This is not helped by a level of confusion about different funding policies, particularly relating to embargo length, licence usage and expectations of data collection for compliance monitoring.

Not only is this an administrative problem but it is ‘leading to researchers ultimately not engaging with open access at all as it was perceived as being ‘too difficult’.’ (p16) Certainly there have been instances of this view expressed by researchers at Cambridge University.

This blog will concentrate on a few aspects of the review I thought interesting – support or otherwise of hybrid, reporting issues, non-compliance amongst publishers, lack of awareness amongst researchers and licences. It finishes with an observation that the review validates some of the decisions Cambridge has made in relation to implementing the RCUK policy.

I should note the review includes some interesting information about learned societies, embargo periods and monographs but these are big issues that need teasing out on their own.

Supporting hybrid

As the Wellcome Trust found in their recent analysis of open access spend in 2013/2014, the RCUK reported that the amounts charged for APCs for hybrid open access continue to be ‘consistently more expensive’ than fully OA journals, ‘despite the fact that hybrid journals still enjoyed a revenue stream through subscriptions’.

The review recommended that this should be monitored and ‘if these costs show no sign of being responsive to market forces, then a future review should explore what steps RCUK could take to make this market more effective’ (p25).

The reported amounts being spent on APCs are also interesting. The average APC paid during the first year, at £1,600 inc VAT, was £472 less than the average APC assumed by the Finch Group, which was used as a proxy when calculating the size of the RCUK block grant (£1,727 + VAT = £2,072) (p11). While this in itself is not surprising, as the amount quoted in the Finch report was seen to be high by open access advocates at the time, it is interesting to note that the average APC paid by Cambridge in 2014, at £1,891.63, was higher than the average quoted in the review.
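For anyone wanting to check the sums, the comparison can be sketched in a few lines of Python. The monetary figures are those quoted above; the 20% VAT rate is my assumption (it reproduces the £2,072 figure), and I am also assuming the Cambridge average is quoted on the same VAT-inclusive basis as the review’s average, which the figures themselves do not confirm.

```python
# Figures quoted from the review (p11) and from Cambridge's 2014 data.
FINCH_APC_EX_VAT = 1727      # average APC assumed by the Finch Group (pounds)
REVIEW_AVG_APC = 1600        # average APC paid in the first year, inc VAT
CAMBRIDGE_AVG_APC = 1891.63  # average APC paid by Cambridge in 2014

# Assumption: standard UK VAT rate of 20%.
VAT_RATE = 0.20


def with_vat(amount: float, rate: float = VAT_RATE) -> float:
    """Add VAT to a net amount."""
    return amount * (1 + rate)


# Finch assumption including VAT, rounded to whole pounds as in the review
finch_inc_vat = round(with_vat(FINCH_APC_EX_VAT))

# How far the actual first-year average fell below the Finch assumption
gap_to_finch = finch_inc_vat - REVIEW_AVG_APC

# How far Cambridge's average sat above the review's average
cambridge_excess = round(CAMBRIDGE_AVG_APC - REVIEW_AVG_APC, 2)

print(f"Finch APC inc VAT: £{finch_inc_vat:,}")
print(f"Review average below Finch: £{gap_to_finch:,}")
print(f"Cambridge above review average: £{cambridge_excess:,.2f}")
```

On these assumptions Cambridge’s average sits between the review’s sector-wide average and the Finch figure.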

Despite this large amount of money being spent on APCs, publishers offering hybrid – not the fully open access publishers, it should be noted – ‘questioned’ the level of the block grant currently offered by RCUK. These publishers expressed the view that the block grant ‘was too low to properly fund the transition to gold. Publishers felt that the transition to full gold open access publishing would be successful only if it was fully funded’ (pp15-16). It does beg the question as to what ‘fully funded’ means in this context.

Researcher awareness

Researchers appear to remain unaware of the tsunami that is occurring in scholarly communication. By centralising the payment of APCs we once again have a situation where researchers are divorced from the economic realities of publishing, in the same way libraries have traditionally been the foil between the economics of subscriptions and the access to the materials.

This concern is supported by the review’s observation that: ‘There is little evidence to suggest that the introduction of the RCUK policy had much of an impact on author behaviour, with publishers reporting that authors did not seem to be changing their choices on where to publish.’ (p15)

If anything it has had a negative effect where ‘RCUK’s preference for gold has therefore been, at times, seen as a barrier to implementation and ‘buy-in’ from various communities across the disciplines’ (p26). Anecdotally we are seeing this happening at Cambridge.

The review did note that ‘further transparency on what is being paid in APCs by institutions to publishers will be crucial in helping to change behaviours and ease the transition towards open access’.

Reporting issues

The review noted at several stages that there have been difficulties with collecting data and that they ‘have been more reliant on opinion than perhaps we might have liked to at the outset of the review’ (p4). They acknowledge the process would have been assisted greatly if there had been some standardisation in what the RCUK was asking for, as the ‘template was, understandably, interpreted in a variety of ways’ (p9). I should note that Jisc is attempting to standardise the reporting.

When Cambridge was asked to report on compliance levels for the RCUK we were hampered by our inability to state the total number of published articles that had been funded by RCUK. The review recognises that this was a widespread problem, particularly in ‘larger, distributed institutions (such as the research intensive universities)’ (p9). Many institutions provided estimates for the compliance reporting.

The review also looked at the (substantial) costs associated with collecting this data and noted that publishers could help given that the sources of data held by publishers ‘would be administratively simpler to collect’ (p10).

Not only could publishers reduce the costs of compliance by providing data, but the review also noted that ‘complexities in working with publishers [was] one of the areas that had generated considerable administrative effort’ (p21). The problems include initial negotiations and ensuring that licences and invoicing were correct. The cost of this is borne by authors, library and administrative staff and the finance team.

Non compliant publishers

This then moves the focus to the compliance of publishers – which can be taken in a couple of ways. First, the review panel looked at how the publishers had helped institutions and researchers to comply with the policy by ensuring that their journals were ‘compliant’ (p11).

It seems that a considerable amount of funded research where an APC has been paid is not compliant with the RCUK policy because the license is not a CC-BY license. For example Elsevier stated that around 40% of the articles from RCUK funding that they had published gold were not under the CC-BY licence and are therefore not compliant with the policy. The American Society of Plant Biologists noted that its journal was not compliant as it did not offer the CC-BY licence and that was unlikely to change in the near future (p19).

Other publishers offer more than one type of license, which makes it confusing for the authors. Indeed, there was clear evidence that some publishers were offering a choice of licences, even when they knew that the author was RCUK-funded.

The question of publishers not making articles available even after an APC was paid was not singled out in the report, but it is implied in a few of the statements in the review, particularly those about institutions having to double check whether work is available after publication. This is an area which needs further analysis.

Licensing

The issue of the CC-BY licenses was a recurrent theme in the review. Many arts, humanities and social science disciplines hold ‘principled and practical objections to the use of CC-BY licences’ (p18). This is partly because work under a CC-BY license ‘could be both used commercially in ways of which the author does not approve and also might not be properly acknowledged as their work’ (pp19-20).

This does demonstrate a lack of full understanding of what a CC-BY license allows, but this is not surprising as ‘Many publishers … reported a significant number of researchers were signing licence agreements without understanding what they were signing’ (p19).

Also highlighted in evidence was an issue with third-party copyright, in that some rights owners (for example, image libraries) are reluctant to license material for digital reproduction, let alone for reproduction in an article that is published under a CC-BY licence.

Support for the University of Cambridge approach

It was heartening to read of a couple of areas that support the position that Cambridge University has taken towards the implementation of the RCUK and HEFCE policies.

The review mentioned visits to institutions and noted how long it takes for researchers to learn about open access, including the requirements, expectations and processes they need to follow. ‘One senior researcher commented that it had taken a full half a day to learn about open access.’ At Cambridge University we have taken a very soft touch approach: the researcher simply has to fill in a few fields and upload a file through a simple interface, and the Open Access team takes care of the rest.

Cambridge University has also taken a ‘first in best dressed’ approach to expenditure of the block grant. This seems to have been a good decision, as the review noted concerns raised in both written and oral evidence that, where institutions had distributed the block grant by department or faculty, it had a detrimental impact on some disciplines.

About the review

The review covered the period from April 2013 to July 2014. When the RCUK policy was announced they did say that there would be a review within a year; however, there was a need for a full year of implementation before they collected the data, hence the delay.

Chaired by an independent researcher, Professor Sir Robert Burgess, the review panel consisted of ‘knowledgeable members of the various communities and sectors with an interest in the policy and open access’. The evidence was collected through more than 80 submissions, some verbal evidence and a small number of visits to institutions to talk informally with researchers, librarians and institutional administrative staff about their experiences of implementing the policy.

The report mentions on no fewer than three occasions that it is a review of the policy implementation not a debate on the merits of open access.

The next planned review will be in 2016.

Published 26 March 2015
Written by Dr Danny Kingsley
Creative Commons License

Cambridge expenditure on APCs in 2014

Cambridge (along with many other institutions) was recently approached by Jisc to report on our article processing charge (APC) payments for 2014 as part of Jisc’s APC data collection project to address the Total Cost of Ownership of scholarly communication. Stuart Lawson, who is compiling these datasets, has made the files available on Figshare.

A couple of caveats – This dataset only contains APCs which were paid centrally; there will be many other APCs paid by the University of Cambridge and its staff which are not included in this dataset.

Also we ended up listing the publications that were submitted to our system in 2014 because that was our starting point, rather than considering the payments from 2014 and working back. This might be an issue for the analysis – it will depend on which way people have interpreted ‘2014’. I should note that 74 (12.13%) of the invoices listed in this data were actually paid in January 2015.

Headline numbers

  • 610 funded articles were submitted in 2014 to our system for publication
  • 495 have been invoiced and paid as at March 2015
  • The amount spent on APCs (including VAT) for these invoices was £936,224.86
  • This gives an average cost per APC paid (including VAT if charged) of £1891.36
  • The range of APCs is from £94.61 for an article published by Magnolia Press, to £3,869.72 for an article published by Wiley
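The headline figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers in the list; the variable names are illustrative and do not come from the Jisc template.

```python
# Headline numbers as listed above.
submitted = 610             # funded articles submitted to our system in 2014
paid = 495                  # invoiced and paid as at March 2015
total_spend = 936_224.86    # GBP, including VAT where charged

# Average cost per APC actually paid.
average_apc = total_spend / paid

# Share of submitted articles that have been invoiced and paid so far,
# and the number still without a paid invoice.
paid_share = paid / submitted
not_yet_paid = submitted - paid

print(f"Average APC paid: £{average_apc:,.2f}")
print(f"Invoiced and paid: {paid_share:.1%} of submissions")
print(f"Not yet invoiced and paid: {not_yet_paid}")
```

The average works out to the £1891.36 quoted in the list, and it also shows that 115 of the submitted articles had no paid invoice at the time of reporting.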

What does this mean?

It means we are spending a lot of (RCUK) money on APCs. We have also supported payment of page and colour charges, and have paid for researchers to take out memberships that offer a discount on APCs, out of the RCUK fund – neither of those categories of expenditure was captured in this data set.

The University is participating in the various Jisc Collections series of offsetting programs with publishers and we are discussing other ways of managing this expenditure. However, we really need to consider whether this is the way of the future.

Issues with reporting

Pulling the information together for this list revealed a few issues. First, while we agree with the collection of data to allow aggregation across the sector, for us to pull the required information together was challenging because we do not collect the information in this way.

However, there are some indications that this type of detail will be requested on a standard basis for reporting. Certainly Jisc is suggesting this as a way forward. In their ‘APC data collection’ blog they state:

HEIs will be able to benchmark their APC data. Using a standard template will help to produce comparable data between institutions which can be more easily aggregated. The data fields to be completed have been chosen from careful analysis of HEI needs. This means that the spreadsheet can be used for both internal reporting and also external reporting including to the Wellcome Trust for compliance monitoring of the Charity Open Access Fund, and potentially RCUK.

We therefore need to consider this information when designing new systems.

Issues with invoices

We have a considerable block of Purchase Orders that have not been invoiced. While there will always be a delay because of the length of time between acceptance and publication in some instances, some of these are very old.

The issue of items not being invoiced can partially be explained by the cancellation of Purchase Orders. In some cases the team has contacted the author and found that the email is bouncing because the author has moved to a different institution. In other cases the author decided not to go ahead with open access publication, so we have raised a Purchase Order against something that no longer exists.

Long-standing Purchase Orders (over 14 months old) are potentially a problem because the money is being held as committed funds. We are now adding the process of checking older un-invoiced Purchase Orders to the ever-growing list of things to do in the workflow for ensuring compliance.
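A check like this could be scripted. The sketch below is only an illustration: the record layout (the `po`, `raised` and `invoiced` fields), the function names and the sample data are all hypothetical and do not reflect our actual finance system.

```python
from datetime import date


def months_between(start, end):
    """Whole calendar months elapsed between two dates."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def stale_purchase_orders(orders, today, max_age_months=14):
    """Return un-invoiced Purchase Orders older than max_age_months."""
    return [
        po for po in orders
        if not po["invoiced"]
        and months_between(po["raised"], today) > max_age_months
    ]


# Hypothetical sample records, for illustration only.
orders = [
    {"po": "PO-001", "raised": date(2013, 11, 4), "invoiced": False},
    {"po": "PO-002", "raised": date(2014, 9, 15), "invoiced": False},
    {"po": "PO-003", "raised": date(2013, 10, 1), "invoiced": True},
]

stale = stale_purchase_orders(orders, today=date(2015, 3, 26))
for po in stale:
    print(po["po"], "raised", po["raised"].isoformat())
```

With the sample data only the first order is flagged: the second is too recent and the third has already been invoiced. The 14-month threshold is the one mentioned above and is passed as a parameter so it can be changed.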

Published 26 March 2015
Written by Dr Danny Kingsley
Creative Commons License