CLASS PROFILE: DATA AND SOCIETY | STS.005J / 11.155J

The Social Life of Data
New course engages students in the ethics and societal implications of data
   


Eden Medina, Assoc Prof of Science, Technology & Society; Sarah Williams, Assoc Prof of Technology & Urban Planning



 

On a typical day in our data-saturated world, Facebook announces plans to encrypt its Messenger data, prompting uproar from child welfare activists who fear privacy will come at the cost of online safety. A new company, Tillable, an Airbnb for farmers, makes headlines for allowing the public to rent farmland while collecting and tracking massive swathes of data on land use and profitability. Tesla comes under fire for concealing autopilot data, while the FTC announces that 2019 was a record year in protecting consumer privacy.

Given the daily avalanche of news in the contemporary tug of war between privacy and safety, Data and Society (STS.005J / 11.155J) always begins with a discussion of current events.

One of 36 classes in the new Computing and Society Concentration in MIT-SHASS, Data and Society focuses on two linked concepts: the process of data creation and analysis; and the ethical quandaries and policy vacuums surrounding how that data impacts society.

A gestalt approach to data

“The purpose of this class is to engage MIT students in thinking about data — data creation, data analysis — in ways that are not only technical but are also societal,” says Eden Medina, Associate Professor of Science, Technology, and Society, who co-taught the class this spring with Sarah Williams, an Associate Professor of Technology and Urban Planning.

Medina is particularly well versed in the social, historical, and ethical aspects of computing, and Williams brings expertise as a practicing data scientist. Their multi-layered course is designed to “train practitioners who think about the ethics of the work that they’re doing” and who know how to use data in responsible ways.

Medina and Williams crafted the inaugural semester of Data and Society around the life cycle stages of a normal data science project, guiding students to consider project facts such as who is collecting the data, how the data is created, and how it is analyzed. Students then explore broader questions, including: How can power intersect with the way that data are created? What is the role of bias in data creation? What is informed consent, and what role might it play in the way that data sets are generated and then eventually used and reused?
 



Assignment IV

In the fourth assignment, students were given a data set with Covid-19 case numbers along with other socio-demographic information. They were asked to answer the same policy question in two different ways, using the same data as evidence. The question: which urban areas in the US should receive federal funding assistance for Covid-19 recovery? The goal of the exercise was to show that the same data can yield vastly different results depending on how it is visualized and interpreted, enabling students to look more critically at how data is used in policy arguments.
 

Image 1, above: Ann Zhang recommended funding be given to counties with the highest percentage of people who were 65 and over.


Image 2, below: Lilia Staszel recommended funding be given to counties with the highest number of Covid cases per uninsured people in the population.
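The effect the assignment was designed to reveal can be sketched in a few lines of Python. This is only an illustration, not the students' analysis: the county names and every number below are invented, and the two metrics are loose stand-ins for the two recommendations described in the captions above (share of residents 65 and over, and Covid cases per uninsured resident).

```python
# Hypothetical county records. Every number here is invented for illustration
# and is NOT taken from the course data set.
counties = [
    {"name": "County A", "population": 100_000, "age_65_plus": 25_000,
     "uninsured": 5_000, "cases": 1_000},
    {"name": "County B", "population": 500_000, "age_65_plus": 50_000,
     "uninsured": 100_000, "cases": 10_000},
    {"name": "County C", "population": 200_000, "age_65_plus": 30_000,
     "uninsured": 10_000, "cases": 4_000},
]


def share_65_plus(county):
    """Fraction of the county's population aged 65 and over."""
    return county["age_65_plus"] / county["population"]


def cases_per_uninsured(county):
    """Covid cases divided by the number of uninsured residents."""
    return county["cases"] / county["uninsured"]


def rank(records, metric):
    """Return county names ordered from highest to lowest metric value."""
    return [c["name"] for c in sorted(records, key=metric, reverse=True)]


# Same data, two defensible metrics, two different funding priorities.
by_age = rank(counties, share_65_plus)
by_uninsured = rank(counties, cases_per_uninsured)

print("Ranked by share aged 65+:     ", by_age)
print("Ranked by cases per uninsured:", by_uninsured)
```

With these made-up figures, the county that tops one ranking does not top the other, even though neither calculation alters the underlying data. The choice of metric is itself a human decision.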



The impacts of data collection in daily life

As the course continues, students begin to discover the fine threads of cause and effect that can often slip under a purely technical radar. Bias in data collection, for instance, can have subtle and insidious effects on how the world is constructed around us: the way in which data is collected could further pre-existing bias rooted in social inequality. Practices of data collection, aggregation, and reuse can also present challenges for ethical practices such as informed consent. How can we make an informed decision without fully understanding how our data might be used in the future and the ramifications of that use?

“I have worked a lot on the technical side with data both in my computer science classes, and with work experiences and my UROP,” says Darian Bhathena, a senior in the class whose studies span Computer Science and Engineering, Biomedical Engineering, and Urban Studies and Planning. “As engineering students we sometimes forget that, to be useful and applicable, all the technical material we’re learning has to fit within society as a whole.”

The intricate impacts of data collection in the students’ daily lives — from what they see in their Twitter feeds to how they interact with health-tracking apps — are front and center in the class, making the curriculum material and its implications personal.

A challenge at the core of a data-driven society

For one assignment, students created visualizations from data they collected, endeavoring to be as neutral as possible, then wrote about the decisions they made, including non-technical decisions, to build the dataset and use it for analysis.

One student downloaded all her text messages for a week, trying to track a correlation between weather and texting patterns. Another tried to determine which MIT dorm was the healthiest, entering diet data into a program they designed. Another student tried to track her own water usage against self-reported norms across the Cambridge area. All of the students ran into assumptions in their data models — for instance, about how much water is used to wash hands or how diets change over time. One by one, the students faced a series of built-in human decisions that prevented their data from being truly neutral.

The exercise illustrated the challenge at the core of our data-driven society: data is easy to gather, but its implications are far less easy to discern and manage. “A lot of decisions around data in the world are ours to make,” says Williams. “Technology moves much more quickly than regulation can.”
 




 
Fluency in the ethics of technology

The new Computing and Society Concentration, of which Data and Society is a core course, is part of a larger push across the Institute, echoed in the mission of the new College of Computing, to enable a holistic view of how technology both shapes, and is shaped by, the nuances of the world, and to develop Institute-wide fluency in the ethics of technology.   

Zach Johnson, a sophomore majoring in Computer Science and Engineering, is also pursuing the new Computing and Society concentration. He says his experience in simultaneous technical and humanistic instruction has been eye-opening. “I get to see all the application of what I am learning in the real world and get to learn the ethics behind what I am doing,” he says. “While I am learning how to write the code in my course 6 classes, this class is showing me how that code is used to do incredible good or incredible harm for the world.”

Amid the current public health crisis, Johnson is eager to apply his new insights to this unprecedented moment through the course’s final project. The assignment: study how another country is using data to address the coronavirus pandemic and identify which aspects of this approach, if any, the United States should adopt.

Johnson says, “I am entering this final project with a much greater sense of excitement than I would normally have. While all the topics of this course are interesting, it is particularly fascinating to be able to apply what is happening in the world during a time of crisis to my study of data science.”

Does tech provide more objective decisions?

Medina, herself a graduate (PhD ’05) of the MIT STS program, joined the faculty last July. Her current research centers on technology and human rights with a focus on Chile. Much of her previous and current scholarship relates to how people use data to bring certainty to highly uncertain situations and how our increased trust in technology and its capabilities echoes through social realities.

“I see [this research] as very relevant to emerging issues in artificial intelligence and machine learning — because we are now putting our faith in new technological systems that are built on large repositories of data and whose decision-making processes are often not transparent. We are trusting them to give us a more objective decision — often without having the means to consider how flawed that 'objective' decision might be. What harms can result from such practices?”

Williams' Civic Data Design Lab is immersed in questions of how data can be used to expose and inform urban policies. In one example from her book Data Action, she created a model to identify cities in China that were built but never inhabited. The model was based on the idea that thriving communities need amenities (grocery stores and schools). Analysis of Chinese social media data showed that in many Chinese cities these basic resources did not exist, marking them as “Ghost Cities.” Williams' lab then went further, visualizing the data to “ground truth” the results with Chinese officials. The approach allowed more candid conversations with the government and produced a more accurate model for understanding the phenomenon of China’s vacant cities.

“We hear a lot about how data can be used for bad things, which is true, but it also can be used for good,” reflects Williams. “Like anything in the world, data is a tool, and that tool can be used to improve society rather than cause harm.”

Based on the inaugural class, Williams thinks Data and Society is exactly the kind of rigorous, thoughtful environment that will empower MIT graduates, helping them develop the awareness, analytical/ethical framework, and skills needed to act consciously as data practitioners in the field. “Engaging students across disciplines — that’s how innovation happens,” she says. 

 

Suggested links

Eden Medina: STS webpage

Medina Commentary: Humanistic Perspectives on Computing and AI

Sarah Williams: DUSP webpage

Sarah Williams: Civic Data Design Lab

Concentration in Computing and Society

MIT Program in Science, Technology, and Society

 


Story prepared by MIT SHASS Communications
Office of Dean Melissa Nobles
Editorial and Design Director: Emily Hiestand
Writer: Alison Lanier, Senior Communications Associate
Photograph of Eden Medina by Allegra Boverman