2017 – That was the year that was

The fact that we are sending this out in the fourth week of January reflects how busy 2018 is already shaping up to be. But it is important to take stock and reflect on the achievements of the past year.

In many ways 2017 was a year of numbers for the Office of Scholarly Communication. Some of them were large – we celebrated 1000 datasets deposited to the repository and 1 million downloads of items in the repository in 2017. Other numbers were enormous, like the overwhelming tide of people who tried to access Professor Stephen Hawking’s PhD thesis when we released it, and the very impressive Altmetric score of over 1500 for his thesis because the release “broke the internet”. There were some small but significant numbers too, including farewelling a few integral members of staff.

Celebrations

The OSC held a celebration in September to mark 1000 datasets in the repository. It was supposed to be a ‘garden party’ but inclement weather put paid to that! Apollo now contains the largest body of datasets of any UK HEI, and downloads from the site account for 36% (27,346) of all dataset downloads from 132 UK HEI repositories in 2017, according to IRUS-UK.

Our Repository Integration Manager, Dr Agustina Martinez, was a joint winner of a 2017 Professional Services Recognition Award. The judges used their discretionary powers to make this award, which did not fit into the given categories. The award was for a “particularly successful cross-departmental team partnership” between the OSC, the Research Office, and the University Information Service for “preparing, implementing and evaluating the synchronisation of two internal research systems, a complex and challenging project that succeeded because of effective collaborative working”.

This is a prestigious achievement and we are very proud of her.

Open access activity

The Open Access Team within the OSC processes a huge number of articles, datasets and theses into the Apollo repository to ensure the University is compliant with various funder requirements. Over the past few years this work has been a primarily manual process, causing significant backlogs for the team.

A huge 18-month project has been the linking of our DSpace repository to the University’s CRIS, Symplectic Elements. In June 2017, the OSC successfully launched a new deposit system for article manuscripts for the research community, with new workflows eliminating the manual uploading of information to the repository.

The Request a Copy function was very popular, with 3108 requests in 2017 alone. The vast majority of these were for articles (1599) and theses (1282). Currently these requests are processed and forwarded to the authors manually, a very time-consuming process.

Theses fun and games

During 2017 the team developed a new deposit form for theses which has allowed current and past students to easily upload their theses. The University introduced a new policy for PhD students to provide digital versions of their theses and the number of theses in the repository is now increasing exponentially.

In Open Access Week we opened up access to Professor Stephen Hawking’s thesis Properties of expanding universes. The response was incredible, with over 22,000 accesses an hour at one stage, and the story that this thesis had ‘broken the internet’ hit the world news. Over a million unique IP addresses hit the page during Open Access Week and in the weeks that followed. There were over 42,000 accesses of Professor Hawking’s thesis in December alone.

Possibly off the back of that huge publicity, our (relatively) quiet call to alumni to also share their PhDs openly online has been met with enthusiasm and we continue to receive regular queries from our alumni community.

Experiments/research

We are now halfway through our Open Research Pilot Project, which has identified the primary issues in this area as a lack of sustainable support for infrastructure and a lack of reward and incentive to work openly. The participants agree that having the dialogue is helping the participating researchers exchange ideas and helping the Wellcome Trust develop new services and policies.

We started investigating Text and Data Mining in 2017, with a meeting of interested library staff in February. This was followed up with a workshop at the RLUK conference which identified the fact that we do not have any Service Level Agreements with our publishers. Within Cambridge, our colleague James Cauldwell prepared a TDM libguide for our research community. In July we hosted an extremely well attended Text and Data Mining symposium which showcased many experiments and methods. The presentation slide decks and some of the recordings of the presentations are available on our website.

The Data Champions programme has been the subject of considerable interest worldwide and the website has been updated to be more useful to visitors. Regular meetings and training fora were held in 2017.

Training of the library community

Part of the work of the OSC is to develop the skills and knowledge of the large Cambridge library community in scholarly communication issues. Our Research Skills Support Coordinator Claire Sewell had an impressive 403 attendees at 23 events with a 70% rating of ‘excellent’. Claire launched successful online webinars in 2017 in addition to her face-to-face programme delivery. An example is ‘How to spot a predatory publisher’.

Claire has also been involved with Danny Kingsley on a sector wide project that is looking at the broader issue of training provision in scholarly communication. This project arose from a workshop held at UKSG in April. In 2017 the group, which includes representatives from Jisc, SCONUL, ARMA, RLUK, UKCORR, Vitae and CILIP, had its first meeting in June and will continue work this year.

Outreach to the research community

In 2017 the OSC consolidated its training provision for PhD students and Early Career Researchers to create a termly programme each for HASS and STEM. We have worked with other libraries across Cambridge to share our training capabilities, an approach that has been very successful.

We ran several very popular events and, in keeping with our philosophy of opening up our work to the widest audience, filmed them and livestreamed where possible; the recordings are publicly available on our website. Events included “How to get the most out of modern peer review”, a Text and Data Mining symposium and several “Helping Researchers Publish” events. We ended the year with a large event at St Catharine’s College called ‘Engaging Researchers in Research Data Management’ in conjunction with Jisc and SPARC Europe.

As a follow-on from that event we are encouraging those involved in data to pitch for some funds to develop an idea to further the engagement of researchers with data. Information about the project can be downloaded.

Some events fed into further research. A popular January event to discuss Electronic Lab Notebooks then morphed into a trial of ELNs amongst our research community at Cambridge. This trial completed towards the end of 2017 and the findings and advice are now online with a link to what is already becoming a useful online discussion resource.

Engagement

We continue to blog through Unlocking Research and the new Open Research: Adventures from the Frontline blog. Between the two we published 45 blogs in 2017, all of which received hundreds of visits. The most visited blog was Milestone – 1000 datasets in Cambridge’s repository with 1300 visits. The most popular blogs cover a wide range of topics that have grabbed the imagination of the readership, whose attention seems to alternate between interest in the processes and assessment of Cambridge’s scholarly communication services:

and discussions about the politics surrounding policy in this area:

Interestingly, the second most visited blog in 2017 was one published in February 2016 – What does a researcher do all day?, demonstrating the usefulness of this as a communication resource.

We continue to send out our two newsletters. The KaleidOSCope newsletter now has over 1100 subscribers of which just over 50% go to a Cambridge email address. The readership of the Research Data Management newsletter, with over 2200 subscribers, is more Cambridge focused, with approximately 77% having a Cambridge email address.

We also launched our new logo in 2017 – with the dandelion motif – to symbolize dissemination. As one of our team mentioned, dandelions are a weed, but they are also very effective at spreading themselves around. We are not sure what that says about us.

Review of the Office of Scholarly Communication

After two years of operation, the Office of Scholarly Communication was the subject of a Review by the University. The actual Review took place in mid-July, after a lead-in that required considerable work to reflect on and summarise what has been done in this initial stage of the Office.

The process did represent a significant amount of extra work for the team in document preparation, and there is no denying this was a very stressful period. But it was extremely edifying to have such a high level endorsement of the approach the OSC has taken, with the Panel commending the team “for a highly creative and professional approach to setting up the service, and for achieving rapid growth in awareness, engagement and compliance in open access throughout the University”.

The Panel noted they had received “very positive feedback from service users who had responded to the consultation, and from external stakeholders”.

The Panel also commented on a “pro-active stance towards external engagement and advocacy for scholarly communication, being recognised as an internationally renowned and pioneering leader in the field of scholarly communication”. The blogs here on Unlocking Research are an example of this work.

Generally the OSC staff have continued to contribute to professional activities within and outside the University to help develop robust policies and services in the area of scholarly communication.

The first recommendation of the Review of the OSC was that a Working Party should be convened to “clarify the University’s needs and expectations in relation to Open Research”. This will be an important piece of work for the OSC with the University throughout 2018.

Staff changes

In April 2017 the Cambridge Library community welcomed Dr Jess Gardner as the University Librarian. In October last year Professor Stephen Toope was admitted to the office of Vice Chancellor at Cambridge University. These are significant leadership changes for the Library and the University.

Locally, scholarly communications was further embedded in the fabric of core business for the Library with the creation of a new Directorate in Scholarly Communication and Research Services, which Dr Danny Kingsley is heading in her new role as one of the Deputy Directors of the Library. Drs Lauren Cadwallader and Arthur Smith moved into Acting Deputy Head roles for the OSC.

The OSC said farewell to two members of the Research Data Management Facility. In April Rosie Higman moved to a new role at Manchester University and in July, Dr Marta Teperek moved to a role at Delft University in the Netherlands. This has been a big loss for the OSC team, although we maintain a strong working relationship with them both and have now been able to extend our collaborations across institutional and country lines.

On the other side of the ledger, we welcomed several new people to the team. In January, Dr Andre Sartori joined the Open Access Team. In May, Dr Marta Busse started work as our Research Data Coordinator. Dr Melodie Garnier joined the team in a Scholarly Communication Support role. We have had Tony Malone working with us, undertaking a wide range of support across our activities. Most recently, Zoe Walker-Fagg has joined as our Project Manager and lead on the thesis work.

There were some internal changes from within the Cambridge library community as well. Dr Matthias Ammon moved to a new role in a departmental library. Katie Hughes and Lucy Welch joined the team on secondments from other areas of the library network to cover some of the internal movement resulting from those staff who had left. For a team of 17 people, this movement within one year (four out, seven in) has been substantial.

We continue to need to consolidate the contractual and funding arrangements for the staffing of the OSC, and while the situation is considerably more stable than it was at the beginning of 2017, there is still some significant work to be done.

Looking ahead

There are some large projects underway in 2018. We are consulting with our research community and continuing to refine the thesis requirement policy. We are working with Cambridge Digital Humanities and Cambridge University Press on a Text and Data Mining “Test Kitchen” to explore techniques, corpus and copyright issues. We are also looking to understand better how our research community interacts with the published literature.

On the technical side, one of our biggest challenges this year will be to automate our Request a Copy service and start promoting this more actively. We have some interesting plans for engagement including a competition for researchers to have their work explained by a professional storyteller for the Cambridge Science Festival.

The outcome of the University’s work on a position statement on Open Research will affect the direction of the OSC into the future. Regardless, we will continue to support and innovate in the area of scholarly communication with our local community while contributing to the discussions and activities nationally and globally. Plus ça change, plus c’est la même chose sums it up nicely.

Published 22 January 2018
Written by Dr Danny Kingsley
Creative Commons License

How do you know if you’re achieving cultural change?

On 15th November 2017, the University of Cambridge held its first research data management (RDM) conference, Engaging Researchers in Good Data Management. The Office of Scholarly Communication, in collaboration with SPARC Europe and Jisc, hosted the one-day event at St Catharine’s College. In attendance were researchers, administrators, and librarians, all sharing their experiences with promoting good RDM. Having a mixture of people from various disciplines and backgrounds allowed many different points of view on engaging researchers to be discussed. In the afternoon, the attendees split off into focus groups to concentrate on a number of nagging questions.

Our group’s topic of discussion: How do we effectively measure cultural change in attitudes towards data management? Leading the discussion was Marta Teperek from Delft University of Technology. There was a mixture of around 30 librarians and researchers from all over the world discussing strategies for engaging with researchers.

How do we set about achieving ‘cultural change’?

Marta started the conversation off by asking what everyone present was already doing at their institutions to engage researchers. Many shared their experiences and some frustrations at pushing good data management habits. One person shared that at his university the initial push toward better data management was achieved by creating and delivering RDM workshops for PhDs and young researchers in the Digital Humanities. These students were already interested in digital preservation, so they were a keen audience. Targeting PhD students and early career researchers may be a more effective strategy because they could develop good data management habits early in their careers. The earlier the intervention, the easier it would (hopefully) be.

Overall, most agreed that speaking directly to researchers is more effective than having initiatives relayed from the top down. Attendees perceived compliance as a driver rather than a useful stick to persuade researchers to take data management seriously. Even if only a few researchers turned up to data management events, it was still increasing exposure.

Some argued for a multi-pronged strategy. Initiatives like the Data Stewards at TU Delft and the Data Champions at the University of Cambridge were perceived as good ways to reach out to researchers in their departments and provide more customized advice. At the same time, having expectations of good data management relayed from on high could help create greater impetus.

What do we mean by ‘cultural change’?

Naturally, the conversation progressed to what the phrase ‘cultural change’ actually means. It was difficult to determine in 45 minutes what kind of ‘cultural change’ we wanted to see within our different institutions. We started by asking some questions. What were our goals? What would need to happen before we said yes, the culture is changing? Which really meant: what do we measure to find evidence of cultural change? Is it better metadata, more awareness of copyright, researchers reaching out to us for help, or an increase in the number of grants awarded that would signal an actual change? It would seem that there could be many definitions of ‘cultural change’, but the crucial takeaway is that it is essential to define what your parameters of cultural change will be in the planning stages of any RDM programme.

Where is the evidence?

The conversation progressed to how we find and gather evidence. With all of the work being done by researchers, librarians, and administrators, how do we know what is actually effective? We cannot state that engaging with researchers (which can be time-consuming) is working without having actual evidence to confirm it. A number of different ideas were discussed, and the point at which feedback is gathered varied considerably between them.

Quantifiable information such as number of datasets deposited, number of datasets downloaded and re-used, and number of grants with a Data Management Plan could be collected. For example, the University of Illinois conducted a detailed analysis of 1,260 data management plans using a controlled vocabulary list and looked at possible correlations between solutions for data management listed in funded and unfunded proposals.

Another method of benchmarking included asking researchers to periodically complete short surveys on data management practice in order to measure any noticeable changes. In that way, an institution can assess whether its engagement strategies work and whether they achieve the desired effects (improvement of data management practice). Delft, EPFL, Cambridge and Illinois collaborated on the development of an agreed set of survey questions. Conducting this same survey across different institutions enables benchmarking and comparison of the different techniques and how effective they are in achieving cultural change in data management. In addition to this survey, the team also interviews some researchers in order to gather additional qualitative data and more detailed insights into data management practice. The hope is that carrying out these quantitative surveys and qualitative interviews periodically will correct for the potential problem of self-selecting participants.

In the future

Ultimately, it turned out that most of those attending the focus group discussion were already working actively to develop systems to measure impact and gather feedback. However, the possibility of carrying out long-term cross-institutional research that would allow comparisons between different data management programmes is very tantalising. The final takeaway from this focus group discussion was that the majority of those attending would be very keen to take part in such research, so watch this space!

Published 18 December 2017
Written by Katie Hughes and Lucy Welch

From data curators to intellectual entrepreneurs: observations from IFLA

In this blog post, Clair Castle, Librarian, University of Cambridge, Department of Chemistry reflects on her experience at the IFLA Satellite Meeting 2017 in Warsaw, Poland.

Earlier this year I was invited by the Office of Scholarly Communication (OSC) at the University of Cambridge to present a paper on Data Curator’s Roles and Responsibilities: International and Interdisciplinary Perspectives. This was my first time writing a paper for a conference and presenting it; it was slightly daunting but exciting too!

IFLA is the International Federation of Library Associations and Institutions, the international body that represents the interests of library and information services and their users. It celebrates its 90th birthday in 2017. This conference was a pre-Congress Satellite Conference, taking place just before the IFLA World Library and Information Congress held in Wrocław, Poland, from 19–25 August.

There were three sessions of four presentations in the programme – which includes links to every presentation. You can find most of the papers that were presented here. The main conference hashtag on Twitter was #wlic2017 (learn more about the 2017 and upcoming 2018 congress by following @iflawic).

Conference focus

Data curation has emerged as a new area of responsibility for researchers, librarians, and information professionals in the digital environment. The huge variety and amount of data that needs to be processed, preserved, and disseminated is creating new roles, responsibilities and challenges for researchers and the library and information professionals who support them. The primary goal of the conference was to engage the international scholarly community in a conversation that led to a better understanding of these challenges, and to discuss the main trends in data curation and Research Data Management (RDM) practices and education.

To ‘curate’ means to ‘take care of’. What resonated with me the most from the conference was the fact that while we are curating data we are curating people as well. We are doing this by changing research culture, evolving the profession, changing research (and research support) practices, doing outreach and advocacy work, and liaising with related university support services. The conference presentations returned to this theme again and again.

I won’t discuss every presentation here; instead I will collate and relate the ideas that I found most thought-provoking.

Intellectual entrepreneurship

This term was introduced to me by Nitecki and Davis’ presentation ‘Expanding librarians’ roles in the research life cycle’. The definition I have since found that explains this the best is from Charles J. Chumas at Stony Brook University:

“Take … the textbook definition of entrepreneur: A person who organizes and manages any enterprise, especially a business, usually with considerable initiative and risk. Now, switch out the words “enterprise” and “business” with words such as “research” or “education”. This is the concept of intellectual entrepreneurship. It is the concept of taking risk, seizing opportunity, discovering and creating knowledge and employing one’s own innovation and strategies, with the ultimate goal of solving problems in corporate, societal or governmental environments. An intellectual entrepreneur … actively seeks out their own education … The philosophy of IE embodies four core values: vision and discovery, ownership and accountability, integrative thinking and action, and collaboration and teamwork”.

I feel that this describes the role of data curators exactly: researchers and the people supporting them are planning data curation strategically and innovatively, acquiring the necessary knowledge and skills to develop it in their institution, and working to bring systems, services and people together to achieve their overall goal of managing data effectively.

Zhang’s presentation ‘Data curators: A glimpse at their roles at the academic libraries in the United States’ mentioned the Association of Research Libraries’ Strategic Thinking and Design Initiative: Extended and Updated Report (2016), which estimates that the research librarian will have shifted from knowledge service provider to collaborative partner within the research ecosystem by 2033. In one example of this, librarians have shifted from providing a service support role to working with researchers to further open science: the FOSTER portal is an e-learning platform that brings together the best training resources addressed to those who need to know more about Open Science, or need to develop strategies and skills for implementing Open Science practices in their daily workflows. It provides training materials for many different users – from early-career researchers, to data managers, librarians, research administrators, and graduate schools. This reflects the self-education aspect of intellectual entrepreneurship.

Upskilling librarians

Many library science curricula around the world do not (yet) include an RDM module. Experienced librarians may not therefore have the necessary knowledge or skills to support RDM. Many data curation post advertisements require leadership, partnership, outreach and collaborative responsibilities but not a professional library qualification. Data curation posts have been repurposed from experienced librarian posts, taken up by new graduates, contractors, PhDs, or sometimes are joint appointments with different academic units. A review of the library profession with regard to RDM skills and knowledge is required to inform future education and training.

Peters’ presentation ‘Reskilling academic librarians for data management services’ highlighted Lewis’ research data management pyramid for libraries (p.16). Areas of early engagement with RDM are situated at the bottom of the pyramid, and as you get to the top you can take on the world!

Role of IT in data curation

Several speakers touched upon this: after all, IT underpins everything and IT support staff are often closer to researchers than librarians are. However, there may be a perception that data curation is not an IT role, per se. In another example of intellectual entrepreneurship, IT and data librarians can work together to provide research data support services: IT can bring skills such as UX (User Experience), systems design and project management, while data librarians can bring their expertise in areas such as repository infrastructures, digital preservation, and discovery and indexing methods.

The definition of data curation is evolving

The IFLA Library Theory and Research Panel Data Curation Project identified the role and responsibilities of data curators in an international context. The methodology included a review of the literature and vocabulary describing data curation roles (using a cool keyphrase digger tool!) and a content analysis of job advertisements (in 35 countries). They found varying terms to describe data curation (e.g. data stewardship, digital preservation, data science, and RDM, the preferred term). Outreach and advocacy to researchers was found to be an important aspect of roles, which again relates back to the theme of intellectual entrepreneurship.

Central vs. discipline-specific RDM activities at the University of Cambridge

As I have mentioned, I presented my paper on behalf of the OSC. Since its establishment in 2015 the OSC has developed many services to support RDM at the University, including a central website, RDM training and support, and a data repository. It communicates with researchers and support staff, including librarians and administrators, across the University using a variety of methods. There is therefore a considerable amount of outreach into departments and faculties where research takes place. However, its resources are limited: it is not possible for it to deliver RDM training, for example, in every department or faculty in the University, especially on a discipline-specific basis.

Most departments and faculties in the University have an embedded library service, which is discipline-specific. Librarians are in a key position to be able to collaborate with the OSC and their own researchers in developing and implementing RDM services locally. My paper presents a case study of how centralised RDM services have been rolled out in the Department of Chemistry, thus adapting the central RDM messages to discipline-specific needs. I describe how customising centralised RDM training for all new graduate students in the Department, being a member of the University’s RDM Project Group, and being involved in the OSC’s Data Champions programme have benefitted both the OSC and the Department.

Identity crisis?

The conference taught me that the identity of data curators is constantly evolving. Does it even matter what we call ourselves? Whatever the term used to describe us, we have similar roles and goals, and need to equip ourselves for future challenges. The concept of intellectual entrepreneurship is worth exploring further as a way of empowering ourselves.

The conference gave me a great opportunity to share and learn about RDM best practice from practitioners across the world. It reinforced for me the fact that we are all in it together, facing the same challenges and working together to come up with solutions.

Observations

The conference took place at the very impressive University of Warsaw Library, which is centrally located beside the Old Town in Warsaw, right next to the Vistula River. Around 40 delegates attended from all over the world.

Warsaw itself is a lively city with a rich, if at times tragic, history. After the conference dinner (a BBQ outside on a very warm evening!) we were treated to an entertaining evening bus tour around the city. We passed the amazing POLIN Museum of the History of Polish Jews, travelled through the area where the Warsaw Ghetto had been, and took in examples of communist-era architecture (in particular the imposing Palace of Culture and Science).

Published 15 December 2017
Written by Clair Castle @chemlibcam