Tag Archives: theses

Who is requesting what through Cambridge’s Request a Copy service?

In October last year we reported on the first four months of our Request a Copy service. Now, 15 months in, we have had over 3000 requests and this provides us with a rich source of information to mine about the users of our repository.  The dataset underpinning the findings described here is available in the repository.

What are people requesting?

We have had 3240 requests through the system since its inception in June 2016. Of those, the vast majority have been for articles (1878, or 58%) and theses (1276, or 39%). The remaining requests are for book chapters, conference objects, datasets, images and manuscripts. Most datasets are already available open access, so there is little need to request them.
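
The shares quoted above can be rechecked from the counts. This is a minimal sketch in Python that uses only the figures in this post. The variable names are ours, and the ‘other’ count is an assumption derived by subtraction rather than a figure taken from the dataset.

```python
# Counts quoted in this post (requests since June 2016).
total_requests = 3240
counts = {"articles": 1878, "theses": 1276}

# Everything else (book chapters, conference objects, datasets, images,
# manuscripts) is derived by subtraction; it is not a figure from the dataset.
counts["other"] = total_requests - sum(counts.values())

# Share of all requests for each type, rounded to whole percentages.
shares = {kind: round(100 * n / total_requests) for kind, n in counts.items()}

for kind, pct in shares.items():
    print(f"{kind}: {counts[kind]} requests ({pct}%)")
```

Rounding to whole percentages matches the way the figures are reported in the text.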

Of the 23 requests for book chapters, it is perhaps not surprising that the greatest number, 9 (39%), were for chapters held in the collections of the School of Humanities and Social Sciences. It is more interesting that the second highest number, 7 (30%), were for chapters held in the School of Technology.

The School of Technology is home to the Department of Engineering, which is the University’s largest department. It is therefore perhaps not surprising that the greatest number of articles requested were from Engineering, which accounted for 311 of the 1878 requests (17%). The areas with the next highest numbers of requested articles were, in order, the Department of Public Health and Primary Care, the Department of Psychiatry, the Faculty of Law and the Judge Business School.

What’s hot?

Over this period we have seen a proportional increase in the number of requests for theses compared to articles. When the service started, 71% of requests were for articles and 29% for theses. More recently, however, requests for theses have overtaken requests for articles, at a ratio of 54% to 46%.

The most requested thesis over this period, by a considerable margin, was Professor Stephen Hawking’s, with double the number of requests of the following ten most requested theses. The remaining top 10 requested theses are heavily engineering focused, with a nod to history and social research. These theses were:

The top 10 requested articles have a distinctly health and behavioural focus, with the exception of one legal paper authored by Cambridge University’s Pro Vice Chancellor for Education, Professor Graham Virgo.

When are people requesting?

Looking at the day of the week people are requesting items, there is a distinct preference for early in the week. This reflects the observations we have made about the use of our helpdesk and deposits to our service – both of which are heaviest on Tuesdays.

When in the publication cycle are the requests happening?

In our October 2016 blog post we noted that, of the articles requested in the four months from the start of the service in June 2016 to the end of September 2016, 45% were yet to be published and 55% were published but not yet available to those without a subscription to the journal. To work this out, we identified the articles that had been requested and checked whether the publication date was after the request.

Now, 15 months after the service began it is slightly more difficult to establish this number. We can identify items that were deposited on acceptance because we place these items on a very long embargo (until 2100) until we can establish the publication date and set the embargo period. So in theory we could compare the number of articles with this embargo period against those that have a different date.

However, some articles would give a false positive, appearing to have been requested before publication when in fact they had been published and we had not yet identified this. To give an indication of how big an issue this is for us, as of the end of last week there were 1768 articles in our ‘to be checked’ pile. Other articles would give a false negative, appearing to have been requested after publication, because they were published between the request and the time of the report and the embargo was changed as a result. That said, our analysis of the requests for articles and conference proceedings suggests that 19% were made before publication. This is a slightly fuzzy number but does give an indication.
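
To make the logic concrete, here is a rough sketch in Python of how such a check might work. It is not the analysis we actually ran. The record layout, the field names and the sample records are all hypothetical; the only detail taken from this post is the placeholder embargo year of 2100.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence

# From the post: items deposited on acceptance are embargoed until 2100.
PLACEHOLDER_EMBARGO_YEAR = 2100


@dataclass
class Request:
    """One request for an article (hypothetical record layout)."""
    requested_on: date
    embargo_until: date
    # Publication date, if it has been established (hypothetical field).
    published_on: Optional[date] = None


def requested_before_publication(req: Request) -> bool:
    """Best-effort guess at whether a request predates publication."""
    if req.published_on is not None:
        # Comparing dates directly avoids the false negative caused by an
        # embargo that was changed after the request was made.
        return req.requested_on < req.published_on
    # No publication date on record: fall back to the placeholder embargo.
    # This is where false positives creep in (published but not yet checked).
    return req.embargo_until.year == PLACEHOLDER_EMBARGO_YEAR


def share_before_publication(requests: Sequence[Request]) -> float:
    """Percentage of requests that appear to predate publication."""
    if not requests:
        return 0.0
    before = sum(requested_before_publication(r) for r in requests)
    return 100 * before / len(requests)


# Illustrative records only; these are not taken from the dataset.
sample = [
    Request(date(2017, 3, 1), date(2100, 1, 1)),                      # still on placeholder embargo
    Request(date(2017, 3, 1), date(2018, 1, 1), date(2017, 1, 1)),    # requested after publication
    Request(date(2017, 3, 1), date(2018, 6, 1), date(2017, 6, 1)),    # published since the request
    Request(date(2017, 3, 1), date(2017, 12, 1), date(2016, 12, 1)),  # requested after publication
]
print(share_before_publication(sample))
```

Recording the publication date alongside each request, rather than inferring it from the embargo, is the kind of change that would remove most of the fuzziness.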

How many requests are fulfilled?

The vast majority of the decisions recorded (35% of the total requests for articles, but 92% of the instances where we had a decision) indicate that the author shared their article with the requestor. The small number (3%) of ‘no’ decisions recorded indicate that the request was actively rejected.

We do not have a decision recorded from the author in 62% of the requests. We suspect that in the majority of these cases the request simply expired because the author took no action. In some cases the author may have been in direct correspondence with the requestor. We note that the email sent to authors does look like spam, and we need to address this issue in our review of the service.
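
The two ways of expressing the ‘yes’ rate are related by simple arithmetic. A small Python sketch, using only the percentages quoted in this section (the variable names are ours):

```python
# Percentages of all article requests, as quoted in this post.
shared = 35        # a 'yes' decision was recorded
rejected = 3       # a 'no' decision was recorded
no_decision = 62   # nothing was recorded

# The three outcomes should account for every request.
assert shared + rejected + no_decision == 100

# Express the 'yes' decisions as a share of recorded decisions only.
decided = shared + rejected
shared_of_decided = round(100 * shared / decided)
print(f"{shared_of_decided}% of recorded decisions were 'yes'")
```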

Next steps

As we explained in October, the process for managing the requests is still manual. As the volume of requests increases, the time taken is becoming problematic: we estimate it is the equivalent of one person-day per week. We are scoping the technical requirements for automating these processes. A new requirement at Cambridge for the deposit of digital theses means there will be three different processes. Requests for these theses will be sent to the author for their decision; these authors will, in most cases, no longer be affiliated with Cambridge. Requests for digitised theses where we do not have the author’s permission are processed within the Library, and requests for articles are sent to the Cambridge authors.

Given the challenges with identifying when in the publication process the request has been made, we need to look at automating the system in a manner that allows us to clearly extract this information. The percentage of requests that occur before publication is a telling number because it indicates the value or otherwise of having a policy of collecting articles at the acceptance point rather than at publication.

Published 12 September 2017
Written by Dr Danny Kingsley
Creative Commons License

Theses – releasing an untapped resource

As part of Open Access Week 2016, the Office of Scholarly Communication is publishing a series of blog posts on open access and open research. In this post Dr Matthias Ammon looks at theses and their use.

It may sound obvious, but PhD theses are a huge reservoir of original research content, given that each thesis represents at least three or four years’ focussed engagement with a specialised research topic. Traditionally, however, the results of this work have not been easily accessible.

A print copy of the approved thesis would be deposited in the library of the university where the PhD was undertaken, so access was mainly restricted to members of that university. Interested readers had to travel to visit the library or rely on frequently costly interlibrary loans. While some of the research contained in theses would be published in articles or monographs, an enormous amount of research was, and is, effectively locked away.

Increasing access

With the changes in technology in recent decades, allied with the rise of Open Access and institutional repositories, the accessibility of PhD theses in general has improved. In Australia, the Australian Digital Theses program began in 1998, expanding to the Australasian Digital Theses program in 2005. This used VT-ETD software to host digital theses at individual institutions, which were collated into one search engine. The ADT website, a central metadata repository, was hosted at the University of New South Wales. This was decommissioned in 2011 as theses were migrated to their various institutional repositories. All Australian theses are now findable in the National Library of Australia’s Trove service. There are 334,000 theses listed in Trove, of which over 119,000 are available online.

A significant number of UK universities now require the deposit of a digital copy of a thesis in the university’s repository as a condition for awarding the PhD degree. Usually this entails making the thesis openly available although embargoes may be placed for reasons of confidentiality or commercial concerns. In addition, PhD students funded by any of the UK research councils under the RCUK Training Grant are required to make their theses available Open Access.

Although it is not yet mandatory at the University of Cambridge for PhD students to provide a digital copy of their thesis, students can voluntarily upload their approved dissertations to the institutional repository, Apollo. Approximately one in 10 PhD students do so. In the next couple of weeks, the Office of Scholarly Communication is embarking on a pilot for the systematic submission of digital theses with selected departments.

Finding theses

There are national and international repositories that aggregate access to PhD theses, such as the British Library’s EThOS (for the UK) or DART-Europe (for European universities), making it easier for interested researchers to find relevant material without having to trawl through individual repositories.

Open Access Theses and Dissertations aims to be the best possible resource for finding open access graduate theses and dissertations published around the world. Metadata (information about the theses) comes from over 1100 colleges, universities, and research institutions. OATD currently indexes 3,422,634 theses and dissertations.

NDLTD, the Networked Digital Library of Theses and Dissertations, provides information and a search engine for electronic theses and dissertations (ETDs), whether they are open access or not. The service also provides ‘Guidance Briefs’ on topics such as Copyright and Preserving and Curating ETD Research Data and Complex Digital Objects.

ProQuest Theses and Dissertations (PQDT) is a database of dissertations and theses published digitally or in print. Note these are made available for a fee that does not benefit the author. [In September 2017 ProQuest contacted us to say they do pay royalties. Their policy is here.] In addition, access to PQDT may be limited depending on local library licensing arrangements.

Looking to the past

So while it is looking likely that most future PhD theses will be available online (either freely or requestable), what about the vast number of PhD theses written up to this point? For context, Cambridge alone holds over 40,000 printed theses, with approximately 1100 being added every year. Approximately 2,000 of these have been digitised at the request of individuals wishing to have access to the theses.

Last year we ran an ‘Unlocking Theses’ project to increase the number of Open Access theses in the repository, which stood at about 600 at the beginning of 2015. The Library also held over 1200 scanned theses on an internal server. The Unlocking Theses project added all of these scanned theses held by the Library into the University repository. The Development and Alumni Office were able to provide contact details for just over 600 of these authors. The majority of these authors have now been contacted and we have had a 35% positive response rate from them.

As of today we hold 2257 theses in the repository, of which half are Open Access. The remaining theses are currently held in a Restricted Theses Collection, but the bibliographic information about these theses is searchable. Approximately one third of the requests we receive through our Request a Copy service are for these theses. In addition, some authors have found their restricted thesis online and asked us to open access to it.

Cambridge is currently working with the British Library to digitise some of the 14,000 Cambridge theses they hold on microfilm. Our finances do not stretch to the whole corpus, so we have decided to digitise ten percent. This has meant a process to determine which theses we choose to have digitised. Considerations have included the quality of digitisation from microfilm for typeset versus typewritten theses (and indeed whether the thesis is printed single or double sided, because of shadowing). We have also chosen theses on the basis of which disciplines are highly requested from our Digital Content Unit. This has proved to be challenging, not least because of the difficulty of determining the disciplines of theses from our library catalogue.

We are hoping to upload these theses to the repository towards the end of the year. Together with several hundred theses that have been digitised this year by the Digital Content Unit, this will double the number of theses we hold in the repository.

Considerations

There are several issues that need to be considered before theses can be made available openly. The first concerns third party copyright, that is to say the inclusion of quotations, images, photographs or other material that is not original work by the thesis author but has been taken from previously published work. There is generally no problem with including such material in the copy of the thesis submitted for examination and the print version deposited in the University library, but making the thesis freely available online constitutes a change of use and requires separate permissions. This is a problem that applies to both current and older theses and requires checks by the author and possibly the library.

Another issue related to copyright is the author’s permission to make the thesis available, which is necessary because the author retains the copyright for their work. For current theses, this permission can be incorporated into the submission process, either as part of the requirement for the PhD or by the author signing an agreement when the thesis is voluntarily uploaded.

However, it is not so easy to obtain permission for retrospective digitisation, as we discovered during our Unlocking Theses project. The contact details of alumni are not always known and, in cases where the original author is deceased, it may be challenging to establish the copyright holder, making it difficult to obtain an explicit ‘opt-in’ permission. Finally, there are financial considerations, as the digitisation of a large number of theses requires a significant outlay for staff, equipment and administrative costs.

Big projects

In recent years, a number of universities have undertaken large-scale digitisation projects of their holdings of PhD theses and have dealt with the permission issue in different ways.

The experience of these UK universities also appears to indicate that alumni are for the most part happy to see their theses made openly available. If more institutions follow suit and dedicate funding to opening up the research undertaken by generations of students this large reservoir of research will no longer remain untapped.

There are other challenges related to digital theses that still remain to be solved, such as the problem of linking theses to their associated data and the question of persistent identifiers to seamlessly integrate the output of both individual researchers and institutions. In the future, consideration should be given to non-text or multimedia PhDs, as was debated at a recent panel discussion at the British Library.

For now though, opening up access to decades’ or even centuries’ worth of scholarship sitting on university library shelves in the form of physical copies of PhD theses sounds like a good start.

Published 26 October 2016
Written by Dr Matthias Ammon and Dr Danny Kingsley

Research Support Ambassadors – a progress update

On Thursday 19th November the participants of the Research Support Ambassadors programme presented their work to date. This blog from Yvonne Creba, a member of the Research Data Facility team in the Office of Scholarly Communication, summarises these presentations of their progress so far.

A good start

Attending the Research Support Ambassadors programme presentation I can only say how impressed I was with the amount of time and effort contributed by the participants of each group. This is even more notable considering that the following was achieved outside of their normal working hours. Each of the groups produced an informative and interesting session on each of the topics.

What is the Ambassadors programme? It’s an opportunity for interested library staff to receive specialised training, to allow them to become the local ‘go to’ person on some scholarly communication issues. The programme is intended to develop a team of Ambassadors who feel confident and able to assist researchers with queries about publishing processes, data management, open access/open data policies and research sharing options, to name but a few.

The Ambassadors programme aims to provide ‘what the researchers want, where & when they want it’.  To start, the Ambassadors have embarked on development of training and information materials on the following four topics: Research Lifecycle, Research Support Services, Managing your Online Presence and Open Access to theses. Below are some of the highlights from their presentations.

Open Access to theses

The Ambassadors team assigned to this project (Matthias Ammon, Phillipa Grimstone, Charlotte Hoare and Stephanie Palek) aimed to develop guidance materials on how to make PhD theses Open Access.

There is a need for a one-stop webpage that answers PhD students’ basic questions about making their theses Open Access, and for the process of submitting a thesis to the institutional repository (now called Apollo) to be clarified in terms of Open Access.

The team have already developed an impressive amount of resources and collated information about Open Access to theses: the advantages for PhD students, the challenges with Open Access to theses and (traditional) publishing, copyright concerns, and patenting and sensitive data. They referred to some of the material they have found in their research, such as ‘Benefits of making theses available online’.

The team is now looking at how theses fit into the Open Access research landscape, the potential impact of making theses available online and the fulfilment of funder requirements.

Managing your Online Presence

This team, consisting of Andrew Alexander, Céline Carty, Kasia Drabek, Agnieszka Drabek-Prime, Agnieszka Kurzeja and Brendan King, initially discussed and brainstormed this subject, as it is a large area and they wanted to define the scope of support to be offered.  The team’s strategy was focused on creating a potential outline for a session that the Ambassadors could run.

The group presented a demonstration on the creation of an ORCiD ID. ORCiD stands for “Open Researcher & Contributor ID” and it is a free, unique, individual, global, permanent identifier ideal for researchers and scholars to help them keep track of their research outputs. The group proposed some ideas on how to attractively present ORCiD to researchers.

The group proposed that those who attend the session be asked to bring along their laptops so that, after a short demonstration on how to create an ORCiD, each participant can create their own. This will provide a tangible output of the session.

Research Support Services across the University

The idea for this topic was to provide clear signposts to the range of help on offer, rather than reinventing the wheel by creating something new. The group working on this topic are Colin Clarkson, Lindsay Jones, Mary Kattuman and Claire Sewell. There is a great deal of support available for researchers, both within the University and outside but there’s no one place where everything is listed in an accessible format.

The research doughnut available on the Office of Scholarly Communication (OSC) website has a nice, intuitive graphical display, hence the idea to use this format to present the services of the research lifecycle. The group hopes to make keywords within the cycle into clickable links, which will allow users to find related information and resources.

One of the sources that were highlighted in the directory was the LibrarySearch. Rather than just including a link to the static LibrarySearch interface, the group thought it would be a good idea to create a predefined search on various stages of the research process. That way the researcher can just click on a link and go straight to the required search results.
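
As a rough illustration of the idea, a predefined search is simply a link with the search terms already filled in. The sketch below is in Python. The base URL, the `query` parameter name and the stage-to-terms mapping are all hypothetical; the real LibrarySearch interface may use a different address and query syntax.

```python
from urllib.parse import urlencode

# Hypothetical base URL; not the real LibrarySearch address.
SEARCH_BASE_URL = "https://librarysearch.example.org/search"

# Hypothetical mapping from research lifecycle stages to search terms.
STAGE_QUERIES = {
    "Planning": "research data management plan",
    "Publishing": "open access publishing",
}


def predefined_search_link(stage: str) -> str:
    """Return a link that opens a ready-made search for a lifecycle stage."""
    # urlencode escapes the terms so they are safe to embed in a URL.
    query = urlencode({"query": STAGE_QUERIES[stage]})
    return f"{SEARCH_BASE_URL}?{query}"


print(predefined_search_link("Publishing"))
```

Keeping the search terms in one mapping means a stage’s predefined search can be updated without touching the links on the page.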

The group suggested promotional activities including a pop-up presentation of no more than ten minutes, which could be included at the start or finish of other taught sessions and would briefly introduce the concept of the site and showcase what it contains. This could be delivered by any Research Ambassador and would be a ‘presentation-in-a-box’ that people could just pick up and deliver.

One of the first things the group intends to do is to improve the general look and feel of the site and they intend to do some user testing with researchers to see how they use the site and get their feedback about the content.

Research Lifecycle

This is intended to be a web resource using the Research Lifecycle, with links out to information about each of the points in the cycle. It was presented by Clemens Gresser, Jo Milton, Veronica Phillips and Meg Westbury.

This team reviewed the Research Lifecycle from the perspective of a researcher. They have looked at existing websites to see what information is already available and reviewed the graphical displays used by different universities – to look for content which is accessible in a user-friendly manner.

The group’s ideas for reaching the intended audience were to plug into orientation sessions, to advertise through faculty librarians and to tie in with sessions on managing an online presence.

The group also suggested that a glossary of terms related to the Research Lifecycle would be useful. The group is still reviewing what type of information to put up for the cycle and which format would best suit researchers in Cambridge.

Published 14 December 2015
Written by Yvonne Creba and Dr Danny Kingsley