Big Data in the Social Sciences

Big data in practice: sociology, data sciences and journalism

Author
Prof. Sophie Mützel
University of Lucerne

Interview with the principal investigator of the NRP 75 project “Confronting Big Data: Methods and skills for a 21st-century sociology”.

What are the main conclusions from your project?

Methods, tools and skills are vital for developing scientific disciplines, whether new or established. The process of learning and teaching methods plays a key role, not only in training skilled practitioners but also in developing a disciplinary and professional identity.

As our research into developments in data science and data journalism has shown, a wide range of expertise and activities that cross disciplinary and professional boundaries is vital for innovation.

Methods, tools and even visualisations thus form part of the transversal skills that enable areas of knowledge to be exchanged and developed further across these boundaries. Methods, tools and skills are important. On their own, however, they are not enough to handle the challenges and opportunities of the digital age. Insights into how the social environment is structured, how algorithmically determined patterns in data (e.g. data gained from social interactions online) can be interpreted in terms of content, and how structural distortions inherent in data sets can be remedied all require transdisciplinary approaches that combine knowledge from the social sciences and humanities with technical expertise. We see sociology as playing the vital role of translator between data, analyses, interpretations and potential effects.

What do you see as the potential implications?

Data science is an interdisciplinary field situated between industry, business, politics and science, and it requires transversal skills. Data journalism is also highly interdisciplinary and incorporates a wide range of expertise. The implication is that the creation of new fields of work must be viewed as an interdisciplinary development that draws on a wide range of expertise and on key players who compete with one another but can also work collaboratively.

What are your recommendations?

Given the changing data and methods, the new skills required will call for changes to specialist fields and for the promotion and improvement of transversal skills. However, remedies for the lack of skills in data, visualisation, modelling and computer-aided reasoning should not be limited to developing and providing programmes in engineering and computer science. Instead, big data and the methods required to analyse it call for transversal skills that cross disciplinary boundaries. Sociological expertise – e.g. to structure the social environment, identify the pitfalls created by distortions in data sets, and interpret the patterns identified – can serve as a bridge between disciplines in this area.

Another recommendation would be to further promote open knowledge and the shared use of data. This means not just open access to publications and making data sets available in repositories, but also the comprehensive concept of sharing knowledge for the benefit of science. Decision-makers should pay particular attention to this.

Particularly in the social sciences?

Social scientists face the ongoing problem of how to gain access to social media data unless they belong to a research team within a large technology company. Twitter provides researchers with API access, whilst Facebook only releases very limited data sets on selected topics for research purposes. Working with a company, however, is a laborious process. Access to data often depends on holding a position of privilege within the company, or on having access to people in such positions. Some social scientists who are interested in current data sets resort to workarounds in experimental settings or develop their own software to gather online data. Guidelines in this area can help researchers gain access to data from archives and companies.

About the project

Philippe Saner’s thesis ‘Data. Science. Society’, on the data science subject matter covered by the NRP 75 programme, was awarded the 2021 Ulrich Teichler Prize by the GfHf (Society for University Research). GfHf board member and chairman of the judging panel Dr Roland Bloch, from the Center for School and Educational Research at the Martin Luther University of Halle-Wittenberg, described Saner’s dissertation as follows during the online award ceremony: ‘This paper examines the emergence of the field of data science in Switzerland, a very topical subject standing at the crossroads between university-based and scientific research. Philippe Saner has achieved this in exceptionally exciting, thorough and innovative form.’
