Tune in to the Data Games on April 16th!
Week 1 is all about using Data Science to understand representation in the media. We’ll learn what data science is and how you can use it to explore impactful questions by collecting your own data and spotting any patterns and insights!
Hey teachers! Check out this facilitator guide before you walk through this lesson with your students. It provides step-by-step instructions to help you set up the group activity.
If you are a student going through the lesson on your own, you can also read through this guide and do your own version of the activity!
Part 1: Introduction
Getting Started
Start by watching the intro video.
Make sure to follow the Norms and Expectations around being respectful when having discussions with teammates!
Example
Watch a video from London Holmes on the representation of her natural hair!
Reflect
Have you ever thought about how represented you feel by mass media?
Think about your favorite movies (or TikToks, or even news stories). How often do you see people who look like you or your peers?
How about people of all skin tones and races?
Part 2: What Question Are We Going to Answer?
Suppose you are on a committee for representation and equity for a media outlet or company. Your group is tasked with understanding how well different skin tones are represented in the media your organization produces.
How would you collect data to answer this?
Take a couple of minutes to think about this question and discuss it amongst yourselves before moving on.
What are we going to be doing?
We're going to be collecting data on skin tones!
How? By collecting RGB (Red Green Blue) values of the skin tones we see in online magazines.
Then, we will analyze the data we've collected.
Finally, we will draw some conclusions.
Part 3: Time to Collect Some Data!
How do we collect this data?
Choose a magazine or media outlet to represent. You may choose your own, or use one of the following:
Vogue (link to archive)
Time (link to archive)
New York Times “T Magazine” (link to archive)
See More Archives at Magazine Lib (link)
Find examples of human faces in the media produced by your chosen source.
For each picture you see of a person’s face, right-click and copy the image.
For each image you copied, open ImageColorPicker (imagecolorpicker.com) and Ctrl+V to load the image into the tool.
Get a skin tone color sample by clicking on a spot on the person’s face. Copy the RGBA value from the tool (shown below).
Finally, submit the value to the Google Form link from your teacher (note that R, G, and B each have their own entry box)! Try to collect at least 5 unique examples for each member of the group.
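If you are curious how the copied color value could be split into the three separate numbers the form asks for, here is a minimal Python sketch. It assumes the copied text looks something like rgba(224, 172, 105, 1); the exact format your tool shows may differ. The name parse_rgb is a made-up helper for this example, not part of ImageColorPicker or Google Forms.

```python
import re


def parse_rgb(text):
    """Return (R, G, B) from a color string such as 'rgba(224, 172, 105, 1)'.

    Assumption: the first three whole numbers in the text are R, G and B.
    """
    numbers = re.findall(r"\d+", text)
    if len(numbers) < 3:
        raise ValueError(f"Expected at least three numbers in {text!r}")
    r, g, b = (int(n) for n in numbers[:3])
    # Each channel must fit in the usual 0-255 range.
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError(f"Channel value {value} is outside 0-255")
    return r, g, b


# Example value made up for illustration only.
red, green, blue = parse_rgb("rgba(224, 172, 105, 1)")
print("R:", red, "G:", green, "B:", blue)
```

The fourth number (the "A" in RGBA) describes transparency, so this sketch ignores it and keeps only R, G, and B.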
Part 4: Let's Visualize and Explore Our Findings!
Now that you've collected data as a class, there's a problem: it’s not in a form that’s very easy to understand. There are a lot of numbers!
As young data scientists we need to visualize and model our data.
One way we can visualize the data is by using a chart called a histogram, where different data ranges are shown on the horizontal axis and counts are shown vertically.
Here's an example of a histogram created from the data our young data scientists collected in 2021!
Fig 1. Real histogram produced by students in YDSQ 2021
This histogram sorts the skin tones that the students collected into buckets on the horizontal axis. Each bucket represents an equal number of possible skin tones, sorted from lightest to darkest (according to their RGB values). The height of each bar represents how many data points fall into that bucket.
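The bucketing idea can be sketched in a few lines of Python. This is only an illustration, not the method used to make the 2021 chart: it assumes "lightness" is the simple average of R, G, and B, it uses 8 equal-width buckets, and the sample colors are made up for the example. The names brightness and bucket_counts are hypothetical helpers.

```python
def brightness(rgb):
    """Assumed lightness measure: the plain average of R, G and B (0-255)."""
    r, g, b = rgb
    return (r + g + b) / 3


def bucket_counts(colors, num_buckets=8):
    """Count how many colors fall into each equal-width bucket.

    Bucket 0 holds the lightest colors and the last bucket the darkest.
    """
    width = 256 / num_buckets
    counts = [0] * num_buckets
    for rgb in colors:
        # Flip the scale so that lighter colors get smaller bucket numbers.
        index = int((255 - brightness(rgb)) // width)
        counts[index] += 1
    return counts


# Made-up sample values, for illustration only (not real class data).
sample = [(255, 224, 189), (241, 194, 125), (224, 172, 105),
          (198, 134, 66), (141, 85, 36), (90, 56, 37)]

# Print a simple text histogram: one '#' per data point in the bucket.
for number, count in enumerate(bucket_counts(sample)):
    print(f"bucket {number}: {'#' * count}")
```

Changing num_buckets, or swapping in a different lightness measure, changes how the chart looks. That is a choice the person making the histogram has to make.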
Take a look at the histogram you made with your class, and use that to answer a few questions:
What do you notice in the histogram? What questions does it generate for you?
What conclusions can we draw from this? What can / can’t this visual tell us?
Can you think of some reasons why the data appears this way?
Note: As students, you may be familiar with a related visual called a bar chart, which looks similar to a histogram! The key difference is that histograms are used specifically to compare how often data points fall into sorted ranges of data (such as skin tone ranges from light to dark).
If you want to dig a little deeper...
You may have noticed that this activity can be error-prone. This is done on purpose. We want to show that data is generated through a human process, so it can be “messy”:
If you were to repeat this experiment again, or with another classroom, do you think the results would look the same? Why or why not?
Instructor Note: This can illustrate a data science concept known as variability.
How did you pick which faces to copy (did you skip any)? How did you select the spot on the person’s face? How might those choices affect the results?
Instructor Note: This can illustrate a data science concept known as sampling bias.
How reliable is RGB at measuring skin tone? Did your group find any images with different lighting, filters, or shadows? What happens then?
Instructor Note: This can illustrate a data science concept known as measurement error.
What other features would you measure besides RGB? (e.g. gender, race, etc.)
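To get a feel for variability, here is a small Python sketch that pretends several classrooms each pick a few faces from the same pool of images. Everything in it is an assumption for illustration: the pool of colors is made up, lightness is taken to be the average of R, G, and B, and simulate_classroom and average_brightness are hypothetical helper names.

```python
import random


def average_brightness(colors):
    """Average of the (R + G + B) / 3 lightness values for a list of colors."""
    return sum(sum(rgb) / 3 for rgb in colors) / len(colors)


def simulate_classroom(pool, sample_size, seed):
    """Pick sample_size colors at random from the pool, like one class would.

    The seed only makes each pretend classroom repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(pool, sample_size)


# Made-up pool of skin tone colors, for illustration only.
pool = [(255, 224, 189), (241, 194, 125), (224, 172, 105), (198, 134, 66),
        (141, 85, 36), (90, 56, 37), (233, 185, 149), (120, 72, 45)]

# Each pretend classroom samples 4 faces from the same pool.
for classroom in (1, 2, 3):
    picked = simulate_classroom(pool, 4, seed=classroom)
    print("classroom", classroom, "average:", round(average_brightness(picked), 1))
```

Even though every pretend classroom draws from the same pool, the averages will usually differ, because each one happened to pick different faces.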
Part 5: Important Takeaways
Data helps us understand the world around us, and empowers us to tell our story. It is not a replacement for lived experiences and personal opinions. But it can help us spot and quantify trends that can generate further discussion and inquiry.
Data scientists use modelling and visualization to bring clarity to complex datasets. Data science is not just about Excel and number crunching - it is inherently creative.
There is a need for data scientists who understand how to work with biased datasets, such as the one we experienced in this activity. Biased datasets have real-world consequences - see: Scientific American - Healthcare AI are Biased
The world needs students like you to help address these unsolved problems!
Here are Some Other Articles on the Data Science of Skin Color!
Colorism and Fashion: New Yorker - Shudu Gram Instagram model
The Bias of Facial Recognition Software: Ford Foundation - Fighting the “coded gaze”
Colorism and Healthcare: Scientific American - Healthcare AI are Biased
The Science of Skin Tone: TED Video
Bias and Photography: New York Times - “The Ratio Bias”
Work in groups or individually to work through the lab below!
This week, as a team, decide upon three common interest areas for your story-project that you will enter into the Data Games.
Here are some prompts to help you think about topics of interest:
Make a list of what is important to you.
Think about how you enjoy spending your time, some of your favorite things - hobbies, activities, sports, etc.
Make a list of what is impactful for your community - locally or around the world. Send this list to your mentors!