Celebrating Women in Data Science and Fighting Algorithmic Bias: Part 1

Written by Nathan Babcock

Many STEM fields, tech and data science in particular, have a diversity problem. Only a fraction of women with STEM degrees make it into careers related to data science and AI, and as of 2020, only 15% of data scientists are women. Some of this imbalance may be due to self-selection. According to a 2020 report by BCG, data science careers are seen as offering work cultures and environments that alienate women. In particular, the report suggests that some STEM students hold negative perceptions of the work and culture of the field, with a general sense that “data science feels abstract and without sufficient purpose”. The gender gap is a problem for us all, because the lack of representation of women can harm data science work itself. AI algorithms are widely understood to be susceptible to bias, yet they shape much of what we see and do each day. It takes a diverse team of scientists to ensure that models at risk of bias produce accurate and fair results, so it is imperative that women are represented in the work that produces them. This post is the first of several introducing women who are making strides in the field and whose work is anything but “abstract” and “without purpose”!

 

The Women in Data Science (WiDS) initiative at Stanford is centered on empowering women and on educating all data scientists, regardless of gender or identity. One of the initiative’s projects is the ‘Women in Data Science Podcast’, hosted by Professor Margot Gerritsen and Cindy Orozco Bohorquez. The show features female trailblazers in data science, who share their work, offer advice to women wanting to break into the field, and describe their journeys to becoming leaders in data science work and research. The podcast does a superb job of interviewing guests from diverse backgrounds, schools of thought, disciplines, and areas of expertise. This post features WiDS conversations with women working in career building and health care.

In the WiDS episode ‘Using Data to Create Economic Opportunities For All Members Of Global Workforce’, we hear from Ya Xu, the head of LinkedIn’s global data science team, who manages projects across the company’s entire platform. On the podcast, Xu dives into LinkedIn’s efforts to create job opportunities and connections for all of the app’s members, and into how seriously the company takes its responsibility for data privacy, especially for the sake of fairness. Xu leads multiple initiatives within the company that are centered on making the platform more diverse and fair. The goal of the fairness initiative is to ensure that people with equal talent and abilities are given an equal opportunity to secure a job. LinkedIn is constantly testing new features and products. Xu stated on the podcast that her team runs 500 to 600 simultaneous experiments on different products and analyzes the resulting data to determine not only whether a feature or product will benefit LinkedIn, but also whether it contributes to the company’s fairness initiative. As a trailblazer working in technology and data science, Xu offers career and leadership advice to women, saying “It’s up to you to define what kind of role you play in an organization. The more reactive you are, the more that people are going to give you orders. The more proactive you become, the better it is for the company.” She goes on to say that “women can be a lot more vulnerable, and it’s actually a strength. When we are vulnerable in front of our team, then they relate to us. There’s something different about women, and in a very good way.” Xu’s advice is empowering to hear and underscores the message that women are important leaders.

Another podcast guest, Dr. Marzyeh Ghassemi, Assistant Professor at the University of Toronto, also addresses the common theme of eliminating algorithmic bias in data science. She takes on this issue through the lens of machine learning in health care. According to Dr. Ghassemi, the data she works with carries existing biases in patients’ access to care, the treatment they receive, and the outcomes of that care. Her central goal is to remove these biases so that models trained on the data do not perpetuate them. She illustrates this on her episode of the WiDS podcast with an example of bias, referring to research showing that end-of-life care for minorities is significantly more aggressive: “[there is a] mistrust between patient and provider, which we can capture and model algorithmically, and is predictive of who gets this aggressive end-of-life care.” Another key aspect of her research concerns how health data is collected. Traditionally, health data is collected only when people visit the doctor, which happens irregularly. She argues that data science in health care should shift toward daily self-reporting, because that will provide data on what it truly means to be healthy. Dr. Ghassemi is a mother, a minority, and a woman in a traditionally male field, so she offers some especially important advice for women in the field: there is no one defined path for anyone, don’t stress about checking other people’s boxes, develop a passion for your work, and surround yourself with great people and a great mentor.

The Quantitative Methods in the Social Sciences program at UM tries to foster both diversity of people and diversity of thought, and the insights provided by these notable women underscore the importance of those efforts. In a field that is so susceptible to bias, it is imperative that women are supported and pushed to the forefront of the sector. Feel free to listen to the episodes of the WiDS podcast featured here, and explore the entire catalog to see how women are innovating in data science. Check back soon for features on a couple of other trailblazing women!

Sources: 

  • “What’s Keeping Women Out Of Data Science?”. BCG Global, 2020, https://www.bcg.com/publications/2020/what-keeps-women-out-data-science. Accessed 16 Mar 2022. (Graphic included).

  • “How AI Could Help—Or Hinder—Women In The Workforce”. BCG Global, 2020, https://www.bcg.com/publications/2019/artificial-intelligence-ai-help-hinder-women-workforce. Accessed 16 Mar 2022.

  • Orlovic, Martina et al. “Racial And Ethnic Differences In End-Of-Life Care In The United States: Evidence From The Health And Retirement Study (HRS)”. SSM – Population Health, vol 7, 2019, p. 100331. Elsevier BV, doi:10.1016/j.ssmph.2018.100331. Accessed 16 Mar 2022.

  • “WiDS Podcast”. Women In Data Science (WiDS), 2022, https://www.widsconference.org/podcast.html. Accessed 16 Mar 2022.

  • O’Neil, Cathy. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown Publishers, 2016.