
Open Research for Inclusion – event round up

Dr Mandy Wigdorowitz, Open Research Community Manager, Cambridge University Libraries

On Friday 17 November 2023, participants from across Cambridge and beyond gathered for a hybrid meeting on Open Research from different perspectives. Hosted by Cambridge University Libraries at Downing College, ‘Open Research for Inclusion: Spotlighting Different Voices in Open Research at Cambridge’ drew attention to different areas of Open Research that have been at the forefront of recent discussions in Cambridge by showcasing the scope and breadth of open practices in typically under-represented disciplines and contexts. These included the Arts, Humanities, and Social Sciences, the GLAM sector (Galleries, Libraries, Archives, and Museums), and research from and about the Global South. A total of 84 in-person and 75 online attendees participated in the day-long event consisting of a keynote address, two talks, two panels, and a workshop.

The conference opened with a welcome address from Professor Anne Ferguson-Smith CBE FRS FMedSci, Pro-Vice-Chancellor for Research and International Partnerships and the Arthur Balfour Professor of Genetics. Professor Ferguson-Smith emphasised the significance and timeliness of the conference and how it underscores the importance of the Open Research movement. She encouraged attendees to be open to new ideas, approaches, and perspectives that centre around Open Research and to celebrate the richness of diversity in research.

Our keynote speaker, Dr Siddharth Soni, Isaac Newton Trust Fellow at Cambridge Digital Humanities and affiliated lecturer at the Faculty of English, then addressed the audience with a talk on Common Ground, Common Duty: Open Humanities and the Global South, providing an account of how to think against neoliberal conceptions of ‘open’ and to reimagine what openness might look like if the Global South was viewed as a common ground space for building an open and international university culture. Dr Soni’s address set the tone for a rich, multi-layered exploration of Open Research on the day, urging attendees to think of open humanities as a form of knowledge that seeks to alter the form and content of knowledge systems rather than just opening Euro-American knowledge systems to global publics.

Dr Siddharth Soni Common Ground, Common Duty: Open Humanities and the Global South

The next talk was from Dr Stefania Merlo from the McDonald Institute for Archaeological Research and Dr Rebecca Roberts from the McDonald Institute for Archaeological Research and Fitzwilliam Museum, who further explored the theme of the Global South with a practical perspective on curating digital archives for heritage management, drawing on their work on two projects: Mapping Africa’s Endangered Archaeological Sites and Monuments (MAEASaM) and Mapping Archaeological Heritage in South Asia (MAHSA). They reflected on the opportunities and challenges relating to the production and dissemination of information about archaeological sites and monuments in projects across Africa and South Asia as well as their experience working with and learning from local communities.

Dr Stefania Merlo and Dr Rebecca Roberts Open Data for Open Research – Reflections on the Curation of Digital Archives for Heritage Management in the Global South

An Open Research panel session followed, featuring panellists with diverse backgrounds and expertise who addressed registrants’ pre-submitted and live questions. Some questions included the meaning of Open Research, its advantages and challenges, how Open Research can be engaged with by researchers (and in particular, early career researchers), and how it can be rewarded and embedded into the culture of research practices. There were engaging insights and debate amongst the panellists, led by Bertrand Russell Professor of Philosophy, Professor Alexander Bird. He shared the platform with Philosophy of Science Professor, Professor Anna Alexandrova, Psychiatry PhD student Luisa Fassi, Cambridge University Libraries (CUL) Interim Head of Open Research Services Dr Sacha Jones, Cambridge University Press & Assessment’s Research Data Manager Dr Kiera McNeice, and Cambridge’s Head of Research Culture Liz Simmonds.

Open Research Panel

Following lunch, a second panel of scholars working across the GLAM sector (Galleries, Libraries, Archives and Museums) took place. The panel was chaired by CUL’s Scholarly Communication Specialist, Dr Samuel Moore, and brought together experts who showcased their diverse work in this sector, from software development and museum practices to infrastructure and archiving support. The panel included Dr Mary Chester-Kadwell, CUL’s Senior Software Developer and Lead Research Software Engineer at Cambridge Digital Humanities, Isaac Newton Trust Research Associate in Conservation Dr Ayesha Fuentes from the Museum of Archaeology and Anthropology, Dr Agustina Martinez-Garcia, CUL’s Head of Open Research Systems, and Dr Amelie Roper, CUL’s Head of Research. Each panellist presented on a specialist area, including Open Research code and data practices in digital humanities, collections research, teaching and learning collections care, and Open Research infrastructure. A lively discussion followed from the presentations.

GLAM panel

In a workshop session, Tim Fellows, Product Manager for Octopus, outlined how Octopus is a free and alternative publishing model that can practically foster Open Research. The platform, funded by UKRI, is designed for researchers to share ‘micro publications’ that more closely represent how research is conducted at each stage of a project. In a demonstration of the platform, Tim Fellows showed how Octopus works, along with its design, user interface, and application, all with the aim of aiding reproducibility, facilitating new ways of sharing research, and removing barriers to both publishing and accessing research. An in-depth discussion followed which centred on the ways the platform can be used as well as its uptake and application across various disciplines.

Tim Fellows Octopus.ac: Alternative Publishing Model to Foster Open Research

The final talk of the day was on Open Research and the coloniality of knowledge presented by Professor Joanna Page, Director of CRASSH and Professor of Latin American Studies. She discussed the topic with a specific focus on the questions of possession and access by outlining projects by three Latin American artists who have engaged with Humboldt’s legacy and the coloniality of knowledge. Using videos and imagery, Professor Page encouraged the audience to consider how they might identify where the principles of Open Research conflict with those of inclusion and cognitive justice, and what might be done to reconcile those ambitions across diverse cultures and communities. An engaging discussion ensued.

Professor Joanna Page Open Research and the Coloniality of Knowledge

A drinks reception brought the event to a close, allowing attendees a chance to mingle, network and continue the discussions. 

Special thanks to all speakers, attendees, and volunteers for making this event such a success. Stay tuned for information about our 2024 Open Research conference.

Data Diversity Podcast #1 – Danny van der Haven

Last week, the Research Data Team at the Office of Scholarly Communication recorded the inaugural Data Diversity Podcast with Data Champion Danny van der Haven from the Department of Materials Science and Metallurgy.

As is the theme of the podcast, we spoke to Danny about his relationship with data and learned from his experiences as a researcher. The conversation also touched on the differences between lab research and working with human participants, his views on objectivity in scientific research, and how unexpected findings can shed light on datasets that were previously insignificant. We also learned about Danny’s current PhD research studying the properties of pharmaceutical powders to enhance the production of medication tablets.

Click here to listen to the full conversation.

If you have heart rate data, you do not want to get a different diagnosis if you go to a different doctor. Ideally, you would get the same diagnosis with every doctor, so the operator or the doctor should not matter, but only the data should matter.
– Danny van der Haven

   ***  

What is data to you?  

Danny: I think I’m going to go for a very general description. I think that you have data as soon as you record something in any way. If it’s a computer signal or if it’s something written down in your lab notebook, I think that is already data. So, it can be useless data, it can be useful data, it can be personal data, it can be sensitive data, it can be data that’s not sensitive, but I would consider any recording of any kind already data. The experimental protocol that you’re trying to draft? I think that’s already data.   

If you’re measuring something, I don’t think it’s necessarily data when you’re measuring it. I think it becomes data only when it is recorded. That’s how I would look at it. Because that’s when you have to start thinking about the typical things that you need to consider when dealing with regular data, sensitive data or proprietary data etc.   

When you’re talking about sensitive data, I would say that it is any data or information of which the public disclosure or dissemination may be undesirable for any given reason. That’s really when I start to draw the distinction between data and sensitive data. That’s more my personal view on it, but there’s also definitely a legal or regulatory view. Looking for example at the ECG, the electrocardiogram, you can take the electrical signal from one of the 12 stickers on a person’s body. I think there is practically nobody that’s going to call that single electrical signal personal data or health data, and most doctors wouldn’t bat an eye.

But if you would take, for example, the heart rate per minute that follows from the full ECG, then it becomes not only personal data but also becomes health data, because then it starts to say something about your physiological state, your biology, your body. So there’s a transition here that is not very obvious. Because I would say that heart rate is obviously health data and the electrical signal from one single sticker is quite obviously not health data. But where is the change? Because what if I have the electrical signal from all 12 stickers? Then I can calculate the heart rate from the signal of all the 12 stickers. In this case, I would start labelling this as health data already. But even then, before it becomes health data, you also need to know where the stickers are on the body.   

So when is it health data? I would say that somebody with decent technical knowledge, if they know where the stickers are, can already compute the heart rate. So then it becomes health data, even if it’s not on the surface. A similar point is when metadata becomes real data. For example, your computer always saves the date and time you modified files. But sometimes, if you have sensitive systems or you have people making appointments, even such simple metadata can actually become sensitive data.

On working within the constraints of GDPR  

Danny: We struggled with that because with our startup Ares Analytics, we also ran into the issues with GDPR. In the Netherlands at the time, GDPR was interpreted really stringently by the Dutch government. Data was not anonymous if you could, in any way, no matter how difficult, retrace the data to the person. Some people are not seeing these possibilities, but just to take it really far: if I would be a hacker with infinite resources, I could say I’m going to hack into the dataset and see the moments that the data were recorded. And then I can hack into the calendar of everybody whose GPS signal was at the hospital on this day, and then I can probably find out who at that time was taking the test… I mean is that reasonable? Is anybody ever going to do that? If you put those limitations on data because that is a very, very remote possibility; is that fair or are you going to hinder research too much? I understand the cautionary principle in this case, but it ends up being a struggle for us in that sense.

Lutfi: Conceivably, data will lose its value. If you really go to the full extent on how to anonymise something, then you will be dataless really because the only true way to anonymise and to protect the individual is to delete the data.  

Danny: You can’t. You’re legally not allowed to because you need to know what data was recorded with certain participants. Because if some accident happens to this person five years later, and you had a trial with this person, you need to know if your study had something to do with that accident. This is obvious when you’re testing drugs. So in that sense, the hospital must have a non-anonymised copy, they must. But if they have a non-anonymised copy and I have an anonymised copy… If you overlay your data sets, you can trace back the identity. So, this is of course where you end up with a deadlock.

What is your relationship to data?  

Danny: I see my relationship to data more as a role that I play with respect to the data, and I have many roles that I cycle through. I’m the data generator in the lab. Then at some point, I’m the data processor when I’m working on it, and then I am the data manager when I’m storing it and when I’m trying to make my datasets Open Access. To me, that varies, and it seems more like a functional role. All my research depends on the data.  

Lutfi: Does the data itself start to be more or less humanised along the way, or do you always see it as you’re working on someone, a living, breathing human being, or does that only happen toward the end of that spectrum?   

Danny: Well, I think I very much have the stereotypical scientist mindset in that way. To me, when I’m working on it, in the moment, I guess it’s just numbers to me. When I am working on the data and it eventually turns into personal and health data, then I also become the data safeguarder or protector. And I definitely do feel that responsibility, but I am also trying to avoid bias. I try not to make a personal connection with the data in any sense. When dealing with people and human data, data can be very noisy. To control tests properly, you would like to be double blind. You would like not to know who did a certain test, you would like not to know the answer beforehand, more or less, as in who’s more fit or less fit. But sometimes you’re the same person as the person who collected the data, and you actually cannot avoid knowing that. But there are ways that you can trick yourself to avoid that. For example, you can label the data in a certain clever way and you make sure that the labelling is only something that you see afterwards.

Even in very dry physical lab data, for example microscopy of my powders, the person recording it can introduce a significant bias because of how they tap the microscopy slide when there’s powder on it. Now, suddenly, I’m making an image of two particles that are touching instead of two separate particles. I think it’s also kind of my duty, when I do research, to make the data, how I acquire it, and how it’s processed as independent of the user as possible. Because otherwise user variation is going to overlap with my results and that’s not something I want, because I want to look at the science itself, not who did the science.

Lutfi: In a sentence, in terms of the sort of accuracy needed for your research, the more dehumanised the data is, the more accurate the data so to speak.   

Danny: I don’t like the phrasing of the word “dehumanised”. I guess I would say that maybe we should be talking about not having person-specific or operator-specific data. If you have heart rate data, you do not want to get a different diagnosis if you go to a different doctor. Ideally, you would get the same diagnosis with every doctor, so the operator or the doctor should not matter, but only the data should matter. 

             ***  

If you would like to be a guest on the Data Diversity Podcast and have some interesting data related stories to share, please get in touch with us at info@data.cam.ac.uk and state your interest. We look forward to hearing from you!  

Reflections from the Edinburgh Open Research Conference

Dr Mandy Wigdorowitz holds the position of Open Research Community Manager for Cambridge University Libraries where she is developing an open research community across Cambridge. She has a PhD in Theoretical and Applied Linguistics from the University of Cambridge and is a registered Research Psychologist with the Health Professions Council of South Africa. She also holds the position of Associate Editor for the Journal of Open Humanities Data.

The Edinburgh Open Research Conference 2023, offered by the University of Edinburgh Library Research Support Team and grassroots group Edinburgh ReproducibiliTea, provided a platform for the exchange of ideas and discussions about open research under the theme ‘Open Research as a Tool for Addressing Global Challenges’. Living up to its theme, the conference held numerous presentations focussing on the various ways in which open research practices can positively support efforts to address various challenges centring around open initiatives. The conference provided an opportunity for people from across the world to come together in a hybrid format to discuss how adopting the open research principles of open access, participation in research, transparency, and open data can ensure that the efforts of research are set up to help address global challenges, including in education, climate action, and global pandemics.

As a presenter and attendee, I reflect on the main take-homes from this event.  

With any conscientious and inclusive movement, clarification of terminology is important. The open research movement is no exception. Throughout the conference, many speakers acknowledged ‘open science’ as being an inclusive term, encompassing all areas of ‘openness’ or ‘open scholarship’, and one which extends beyond the ‘sciences’ to include all disciplines where knowledge synthesis and open research is considered. It was proposed that the phrase ‘open science’ is about intent and the larger goal of open research, and it should not be reduced to disciplines that fall under the ‘sciences’ umbrella per se. While the sentiment of this stance is reassuring and inclusive in intent, it is undeniable that there is weight behind the words we use. Instead, I would argue that it would be more inclusive to replace ‘science’ with ‘research’ when referring to the broad ‘open research’ movement. Doing so would safeguard against unintended misinterpretations about who may partake in and benefit from this movement.

A highlight from the conference was its celebration and acknowledgement of the growing impact of public engagement and citizen-led research. Case studies offered insight into how involving the public in data collection, analysis, and decision-making processes can enhance the relevance and societal impact of open research endeavours. For instance, UCL’s Institute for Global Prosperity aims to understand what prosperity means for people as informed by members of their respective communities. In addition, the Extreme Citizen Science: Analysis and Visualisation project employs the use of culturally appropriate geographical analyses and visualisation tools that can be used by varying communities with differing degrees of literacy to formulate research questions and collect relevant data. Attendees were encouraged to explore innovative ways of collaborating with non-academic communities to foster a culture of inclusivity, knowledge sharing, and insights that are driven from the communities under investigation, and to think about the value of smaller, local-scale projects in addition to large-scale projects.

Much attention was afforded to the dissemination avenues that prioritise FAIR principles (Findable, Accessible, Interoperable, and Reusable) and open practices, as well as who the contributors and accessors of such research outputs are. These efforts have largely been attributed to the increased availability of digital collections, the development of new data-intensive methods, increased pressure from funders, the requirement of data management plans for preservation purposes, the involvement and collaboration of research libraries, and the rollout of rights retention policies. Discussions centred around digital objects and data, including how these are produced, how and where they are openly and transparently shared, how they can be accessed and preserved, and what the potential of their reuse is. Such questions lead to the need for reputable sharing outlets that service people from all parts of the world and across all disciplines. Significant outlets that were mentioned included repositories, data dashboards, and data papers.

Data dashboards provide an overview of the various aspects associated with a research project, which allows for clear access to data insights when conducting large projects. An effective use of a data dashboard comes from DecodeME, the world’s largest study of ME/CFS (myalgic encephalomyelitis/chronic fatigue syndrome). Data papers are peer-reviewed publications that describe curated datasets. Data papers can be shared in traditional research journals as one subtype of article publication, or in data journals which are dedicated to the publication of data papers. This avenue of dissemination has been active in the STEM and Health disciplines, but it is being increasingly recognised and promoted within the Humanities and Social Sciences, largely driven by data journals in these areas, such as the Journal of Open Humanities Data. Overall, these discussions shed light on the challenges and potential solutions to ensure the quality and accessibility of open outputs derived from various research projects.

In addition to the many discussions about open software, which are ubiquitous in open research, open hardware was recognised as an emerging area in this arena. Open hardware can include, for instance, computing devices, scientific instrumentation, and remote sensing satellites that contribute to the conduct of research and discovery of knowledge. Typically, legal restrictions prohibit the investigation and modification of closed source hardware, resulting in a lack of reproducibility, duplication of effort, obsolescence, and financial burdens which ultimately reinforce global inequities. There have been recent efforts, however, to develop open-source hardware tools and devices to address global challenges, particularly in under-resourced communities. Real-world case studies were presented that explore where and how open hardware has been used to address global challenges (e.g., in microscopy, space exploration, environmental monitoring) and make a difference in the lives of everyday people. The Gathering for Open Science Hardware was identified as a community whose mission is to promote open hardware and the practices ensuring its success. Open hardware presents an exciting opportunity for progress as its potential for solving global problems is far-reaching and scalable.

Education also emerged as key to the open research movement. The conference presented best practices in research data management and open educational resources for postgraduate students and educators from the perspective of a university lecturer. Training and mentoring programmes about open practices were mentioned, where people interested in applying open principles in their work and becoming ambassadors in their communities could sign up to Open Life Science to participate in an open research training course, and to Open Hardware Makers to support open hardware projects.

In sum, the Edinburgh Open Research Conference was successful in showcasing the advancement of open research with a focus on addressing global challenges. Open research is a fundamentally iterative process where we can all learn and build upon the accumulated work and knowledge that has been done before us. In this way, the event illustrated the remarkable progress that has been made in various domains and throughout the research lifecycle. By bringing together individuals from diverse backgrounds and contexts, this conference provided a platform for knowledge sharing and community-building at the forefront of open research.

You can find all the talks and slides from the conference here.