Regulating big data research

Author
Prof. Bernice Simone Elger
University of Basel

Interview with the principal investigators of this NRP75 project.

What was the aim of your project?

Our project aimed to further a sensible, efficient, and safe framework for beneficial Big Data research and to reduce the ethical and regulatory uncertainty surrounding such research, by providing ethical guidance for individual researchers and research institutions and by addressing the need for harmonization in the collection, storage, and analysis of digital data.

What are the main messages of the project?

From an ethical standpoint, Big Data research raises unpredictable ethical challenges, including the risk of harm for research participants – such as discrimination, interference with privacy, and possible misuse of their data. Our project highlighted how the analysis of the ethical issues of Big Data, and the assessment of the harm that could result for research participants, is still in its infancy, and there is considerable uncertainty – both in the literature and among scholars globally – on how to regulate it appropriately. Specifically, the context-dependent nature of the ethical issues of Big Data creates substantial barriers to the creation of standardized and uniform guidelines and regulations.

According to our results, academic scholars carrying out human subject research are at least aware of the possibility of these unpredictable challenges; however, the distinction between human subject research and research using anonymous data is often blurred, and guidance is lacking on how to prevent possible harm appropriately. Therefore, the digitalisation of research practices in Switzerland should be accompanied by programmes that promote ethical education and reflection for scientists and scholars who want to undertake Big Data studies.

Further important findings?

There is a notable interdependency between legal, ethical, and social issues. This problem is not new, and regulatory uncertainty is common to many new technologies entering the field of science and being implemented in societies. Ubiquitous datafication, however, makes Big Data research particularly visible and relevant to society. Mitigating risks and harms requires a procedural approach that translates all aspects of governance, from ethical principles through guiding law to concrete implementation mechanisms.

Given the fast pace of technological evolution, this translation would be subject to iterative cycles and repeated adjustments along the way. For example, the introduction of the GDPR has affected the concrete implementation of consent mechanisms, and the discovery of new deidentification algorithms would affect anonymisation. These are only two of myriad examples that affect implementation, but they show the necessity of revising and updating guidelines, procedures, and codes of conduct on a regular basis.

What does that mean for ethical committees?

Due to the trend towards digitalisation of research and the numerous ethical issues that Big Data raises, the implementation of more specific ethical oversight and assessment should be considered. While one solution is to add experts in Big Data research to cantonal research ethics committees, this does not solve the problem of providing guidance for research that is considered to fall outside the scope of the Human Research Act. The role and purview of recently implemented Swiss ethics review boards at the institutional level (universities or faculties) should be expanded so that they can provide specific guidance for Big Data research. One of our papers provides suggestions and best practices for enhancing the ethics committee system that are also highly relevant to the Swiss context.

Review boards should be professionalised and should monitor the ethical soundness of research projects throughout their whole data lifecycle, becoming an integral part of the research project and thus sharing the responsibility and burden for the protection of research participants with the scholarly investigators themselves. In addition, due to the increased involvement of private companies in the academic research ecosystem, ethical assessment should also be expanded to research conducted by private companies, thereby enhancing the protection of research participants and fostering sustainable collaborative practices with academic institutions.

Your project has some policy recommendations. What are these?

Indeed, the results of this project are relevant mostly at the level of policy recommendations. Regarding the construction of a comprehensive and harmonised framework for Big Data research, our results show that such an endeavour is intrinsically difficult. There are currently many heterogeneous types of Big Data research and analysis that raise different ethical issues, and these cannot be reduced to a single overarching, one-size-fits-all regulation.

Therefore, guidelines and regulations for human subject research that uses Big Data methods, instead of imposing inflexible standardised norms, should emphasise contextually driven decision making, ethical deliberation, and balanced trade-off analysis throughout the lifespan of a research project – from the design of a particular study to after its completion.

Furthermore, in the context of ethical oversight, our analysis underscores the importance of expanding ethical appraisal and regulation to corporate research as well. This will, on the one hand, help prevent harmful human subject Big Data research from taking place outside academia and, on the other, contribute to the development of appropriate collaborative practices between private corporations and public and academic institutions for the design and conduct of beneficial Big Data studies.

And for the regulation of Big Data?

As for suggestions on how to regulate Big Data appropriately, results from our empirical study underscore that Big Data exceptionalism (treating Big Data differently from other types of data) is not the appropriate model to follow. An inclusive "all data" approach is more in line with the Swiss legislative context, which puts emphasis on consent and notice, and with the preferences expressed by Swiss lawyers and data protection officers during the interviews. The study among lawyers and data protection officers has shown that many ambiguities still need to be addressed in more detail in the future. For example, the concept of data ownership, which laypersons in particular regard as a way to offer better control over one's data, is regarded as contentious by lawyers. Although many lawyers on a personal level find it understandable that patients should benefit in some form from sharing their data, on a professional level the concept of ownership is incompatible with today's legal dogma.

Our analysis has shown that many data safeguards already exist in various sectorial laws, such as the Human Research Act and the Federal Act on Copyright and Related Rights, with which different institutional bodies have to comply. Making this know-how accessible to laypersons and scientists is a crucial task for the future. Consent has been at the heart of research with human subjects for the last half-century. However, the tremendous growth in the number of data subjects participating in large-scale Big Data studies makes this mechanism difficult to uphold for both researchers and data subjects/participants. Policymakers should therefore foster either dynamic and automatic consent portals or data cooperatives, to present an alternative to the only other possible option, broad consent.

Big Data is a very vague term. Can you explain to us what Big Data means to you?

Our research project contributed to the understanding of Big Data by performing an analysis of its definition. Our results point to an overall uncertainty or uneasiness towards the term itself within the academic milieu that might be a symptom of the tendency to recognise Big Data as a shifting and evolving cultural and scholarly phenomenon – or a cluster concept that includes a plethora of sophisticated and evolving computing methodologies – rather than a clearly defined and single entity, or methodology.

However, our research study underlines that conceptual clarity of the term Big Data would be of the utmost importance in order to devise appropriate guidelines to protect research subjects in Big Data research across different disciplines. We argue that as long as definitions are unclear, the laws, regulations, and guidelines that are bound to govern Big Data research are unlikely to be fully effective, especially if researchers are unaware of the regulatory framework or refrain from defining their research as Big Data research out of fear of regulatory restrictions.

A strategy to overcome this issue and advance the understanding of Big Data is to deconstruct or unfold the term Big Data into its different constituents, thus shifting from broad generalities – such as considering Big Data an umbrella term – to the specific qualities relevant to each subcategory of Big Data. This will enable us to make proper sense of the different ethical issues and to create applicable strategies to protect research participants.

About the project