Big Data in the Social Sciences

What is “Big Data”?

Author
Prof. Sophie Mützel
Universität Luzern

The NRP 75 project Facing Big Data: Methods and skills for a 21st-century sociology started from the premise that “Big Data” are data that are large, diverse, often unstructured, and concern a wide array of phenomena. The project has investigated different meanings and uses, as well as methods, tools, and skills necessary for the analysis of Big Data in three data-analytic fields: sociology, data science, and data journalism. In doing so, the project has also approached a definition of “Big Data.”

Among respondents from the data sciences, a prominent position, originating in computer science, is that the development of the field is closely tied to the sheer mass of data now available in different fields of society and to the technical infrastructure needed to collect, store, process, and analyse them at unprecedented scale. Data sciences are thus understood as a set of methods, tools, and skills for analysing “Big Data.”

Less and less in use, not very precise

However, the analysis of job advertisements for data scientists showed a discursive shift away from the term and idea of “Big Data” between 2017 and 2020. For many reasons, including ethical considerations and public image, organisations in various fields are abandoning the term in favour of alternatives such as artificial intelligence or machine learning. We found the same shift in degree programs that have changed or re-specified their names within the last couple of years.

Moreover, other respondents, mainly from statistics, found that the term lacks precision and scientific substance and that it predominantly addresses the realm of engineering; they therefore chose not to use the term where the decision was theirs. To be sure, at the level of degree programs, an engineering approach to big data currently dominates the curricula at Swiss universities. In many degree programs, courses on “Big Data engineering” or “Big Data analysis” are part of the mandatory core curriculum in data sciences.

Overall, in data sciences the category “Big Data” is characterized by a certain ambiguity and “conceptual vagueness.”[1] Its use serves both to orient and to differentiate from related disciplines and fields. While this was helpful for stakeholders during the constitutive phase of the field, our research indicates that with increased expertise and institutionalization of the larger field, the usage of the term “Big Data” will wane.

Integrating Big Data in social sciences

In the social sciences, and in sociology in particular, the situation presents itself differently. On the one hand, we found that respondents from the field of German-speaking sociology considered “Big Data” to be about large data sets, including large and longitudinal survey data, administrative data, and data derived from social media sites. On the other hand, the hesitancy we initially anticipated about teaching Big Data in obligatory courses has persisted.

Nevertheless, over the past four years, academic discussions on large-scale data in the social sciences, in the US and in Europe, have developed considerably.[2] The institutionalisation of computational social science has continued, including several training initiatives for doctoral students and early postdocs (e.g. the Summer Institutes in Computational Social Science with global satellites, GESIS summer schools) and conferences (e.g. IC2S2). Throughout Europe, we find new professorships and study programs for computational social sciences. In the German-speaking world, these are typically led by computer scientists or engineers; social scientists are only rarely in charge, although increasingly so (see e.g. the MA in Computational Social Science at the University of Lucerne, co-initiated by Sophie Mützel, the Principal Investigator of this project). It is apparent that the ability to work with big data, i.e. old and new types of data, and to know their strengths as well as their limitations is an invaluable asset for applicants in the academic and professional job market, even in non-technical subjects.

In Switzerland, universities, also following initiatives and policies on digital transformation, have picked up the pace in teaching digital skills, data literacy, and computational thinking. Some have launched institution-wide efforts to teach digital skills as part of the general curriculum, e.g. the “Studium Digitale” at Bachelor’s level at the University of Zurich, the “Certificate in Data Science Fundamentals” at the University of St. Gallen, or the introductory course “Comprendre le numérique” at the University of Geneva. The University of Lucerne launched the university-wide module “Computational Sciences and Digital Skills”, offering its BA students courses to strengthen their digital skills and work with large data sets.[3]

[1] Favaretto, M./E. De Clercq/C.O. Schneble/B.S. Elger, 2020: “What is your definition of Big Data? Researchers’ understanding of the phenomenon of the decade”, in: PLoS ONE 15 (2): e0228987, https://doi.org/10.1371/journal.pone.0228987

[2] For overviews, see for example: Salganik, M., 2017: Bit by Bit. Princeton: Princeton University Press; Lazer, D.M.J./A. Pentland/D.J. Watts et al., 2020: “Computational social science: Obstacles and opportunities”, in: Science 369: 1060-1062; Edelmann, A./T. Wolff/D. Montagne/C.A. Bail, 2020: “Computational Social Science and Sociology”, in: Annual Review of Sociology 46: 61-81

[3] Many of these institutional initiatives have been supported by swissuniversities through the programme “Strengthening Digital Skills in Teaching”, financed by the Swiss Confederation as part of the strategy “Digitale Schweiz”.
