Women in Big Data 2019

Author
Beatrice Huber
NRP 75 “Big Data”

On 20 and 21 June 2019, women from academia and industry met in Zurich at the 2nd “Women in Big Data” workshop.

“Women in Big Data” is a cross-disciplinary activity of NRP 75 with the goal of bringing together women working on Big Data research. Participants from academia and industry gather at the workshop to discuss their work and to exchange views on career challenges and possible solutions. The workshop format encourages the exchange of research and career insights across different disciplines and gives participants a unique opportunity to celebrate and support each other.

Machine Learning for Social Good

This June it was time for the second workshop, again in Zurich but at another location. Yvonne-Anne Pignolet, co-organiser of the workshop, welcomed the approximately 70 female participants, then handed over to keynote speaker Mihaela van der Schaar from the University of Cambridge. Mihaela spoke about “Machine Learning for Social Good: Joining Female Intelligence to Artificial Intelligence”. She emphasized the importance (for her personally and as good advice for young scientists) of working on motivating applications, not just on games and smart cars. She found her passion for machine learning in medicine, and her research involves developing cutting-edge machine learning, AI and operations research theory, methods, algorithms and systems to deliver precision medicine at the patient level. Her work is about understanding the basis of health and disease, supporting clinical decisions for patients, and informing and improving clinical workflows to better utilize resources and reduce costs. Ultimately, she aims to help transform public health and policy.

Mihaela presented her recent work on cancer, which encompasses a common, costly and important array of disorders. We learned that England has a national cancer data collection service, the goal of which is to learn and forecast patient-level trajectories. Why has machine learning not been used so far in medicine for decision support? The answer is simple: inadequate, simplistic models on the population level with a one-size-fits-all approach are not useful when you want to predict patient-level trajectories. Black-box predictions alone are not enough; trustworthy interpretations are needed as well. She concluded her keynote with some advice to young researchers: find a problem that motivates you, find collaborators who are eager to teach and learn and are supportive, and build a stimulating, supportive environment.

Panel discussion about career trajectories

After her keynote, Mihaela van der Schaar joined the first panel discussion about career trajectories together with Anita Schmid, data scientist at Migros, Sabine Suesstrunk, professor at EPFL, and moderator Olivia Kühni. Asked about the problems faced by women in Big Data, Sabine Suesstrunk pointed out that women are frequently not heard; women are interrupted. She mentioned mansplaining and quotas. It is also crucial to make judicious choices about what projects you want to pursue, so it is important not to make a commitment if your heart is not in it. Mihaela van der Schaar expressed the need to be assertive, even if it is not in one’s nature. Anita Schmid pointed out the challenges of being the only woman in a team, trying to belong but never really being “part of the club”, feeling inadequate, as well as personal issues such as navigating pregnancy and parenthood in the context of career expectations. She recommended keeping a balance between time spent with colleagues and with people outside work. Sabine Suesstrunk stressed the importance of building both professional and private networks.

Olivia Kühni mentioned the challenge that papers by male authors are more likely to be cited. Mihaela van der Schaar reminded us that citations are but one metric. To her, for example, good results are a successful PhD student or an impact on the real world. Sabine Suesstrunk noted that for an academic career, one has to play along until tenure, when you can finally pursue what you want. Anita Schmid confirmed that in industry as well, you have to “play the game”, but to do so, she recommended that you find a good fit. In summary, Sabine Suesstrunk pointed out that it is the majority that can change the state of the minority, not the minority itself.

After the first part of the madness session, Day 1 of Women in Big Data ended with networking at the dinner buffet.

Women in Big Data: Day 2

Day 2 opened with Sara Irina Fabrikant, professor at the Department of Geography of the University of Zurich and director of the Digital Society Initiative (DSI), a sponsor of WiBD. DSI is a network within the university that spans departments and disciplines. Sara gave an overview of the mission and organization of DSI.

The first keynote of Day 2 was by Natalie Schluter, Associate Professor at the IT University of Copenhagen, on “Summarising biases in public perception”. Women are scarce in computer science. After a peak in the 1980s, less than 20 percent of degrees in computer science are awarded to women today. The situation in the field of Natural Language Processing (NLP) is somewhat better, at 33 percent. However, this percentage says nothing about who holds power or enjoys success. Therefore, she evaluated 52 years of papers from different sources with automatic and manual annotation. Regarding power, just look at the authors of papers: the first author of a paper is the mentee and the last author is the mentor. She observed that, with the advent of machine learning around 2005, the gap between female and male mentors started to widen. She also found that it takes women significantly longer than men to reach mentor status, which of course is the so-called glass ceiling, the invisible, unethical, and yet virtually impenetrable barrier that prevents high-achieving women and minorities from obtaining equal access to senior career opportunities. In her own experience, the glass ceiling is real.

Diversity – beyond gender

Natalie Schluter then shared her findings in a panel discussion about “Diversity – beyond gender” joined by Daniela Gunz, Lydia Chen, professor at TU Delft and principal investigator of Dapprox and WiBD, Andrea Francke, Senior Software Engineer at Google, and moderator Olivia Kühni. The panel members described their experiences with the challenges of blending in, which can vary from country to country. Andrea Francke remarked that, in her team, there is no majority culture, so everyone needs to adapt. Daniela Gunz mentioned the availability of company manuals on how to deal with different cultures. Lydia Chen spoke for many by saying that sometimes you just have to play along. Andrea Francke described the constant uncertainty of not knowing whether there is bias at all. Natalie Schluter encouraged young scientists to grasp every opportunity to be a spokesperson in order to help open the door for other women (or other minorities), even if it can be challenging.

Change the system, not the woman

Caitlin Kraft-Buchmann, CEO and Founder of Women@TheTable, gave a powerful talk on “Triple A: Affirmative Action for Algorithms – A concrete call for action”. She presented studies showing that women tend to be rated lower than men for the same work. The standard is male, whether the topic is crash tests, VR helmets, uniforms, or thermostat settings. Hence, algorithms simply reflect the “normal” world. How can there be fairness when the data is biased? She encouraged us to ensure that machine learning does not embed an already biased system into all our futures, which requires affirmative action for algorithms in automated decision making. The system has to be changed, not women. She cited the Buenos Aires Declaration on Trade & Women’s Economic Empowerment as a step in the right direction, and said that we must establish new tools and new norms for lasting institutional and cultural systems now and for coming centuries.

Session on Data Ethics

Christin Schaefer, data scientist and member of the German Ethics Commission, gave an invited talk presenting a data ethics kaleidoscope. Issues such as data protection, privacy and sustainability need to use Big Data in order to flourish. Different aspects of ethics include power and responsibility: it is the boss, not the technician, who makes a system ethical. She also mentioned the ethical obligation to use data for the common good, such as for healthcare, as urged by Mihaela van der Schaar in her keynote on Day 1. One important issue is the very definition of “privacy”, being the places where one is not observed. How private are our homes? What is an adequate depth of a personality analysis? Consider the Facebook algorithm that searches for possible suicides — is this ethical? Who decides? Who sets the targets, for example regarding one’s health? For Christin Schaefer, it is clear that these decisions should not be made by a single person behind a desk.

For the panel discussion on data ethics, Petra Arends from the Davos Digital Forum, Sarah Etter, statistician at the Gesundheitsdirektion Zürich, Eiko Yoneki from the University of Cambridge, and moderator Olivia Kühni joined Christin Schaefer. Petra Arends opened the discussion by stating that the issue is still largely unexplored. Sarah Etter pointed out that statistics are strictly regulated in the public sector. Should data be more open or more closed? Christin Schaefer feels that this should be decided carefully. Eiko Yoneki mentioned differential privacy. Petra Arends encouraged us to see the big picture, to look at data as a source of information for whatever purpose, not from a feeling of danger, but from the bright side.

Best poster

The WiBD workshop included a poster session. The Best Poster award went to Preethi Lahoti for her contribution on fairness in algorithmic decision making. She is a PhD student at the Max Planck Institute for Informatics. She presented her work, entitled “iFair: Learning Individually Fair Representations for Algorithmic Decision Making”, in a short talk.

Day 2 ended with the second part of the madness session and closing remarks by the organizers.
