Understanding data science is vital for today’s students, says Mahmoud Harding, Instructional Design Specialist at Data Science 4 Everyone, a coalition that advocates for equitable access to data science for all students.
“We’re in a data-rich world,” he says. “When I talk about data science, I'm coming at it from the perspective of how can we take something that's abundant in our society and use it throughout the content area in K-12 to bring relevancy to the classroom. I'm not unrealistic. I'm not thinking that every lesson is going to be this rich data-infused lesson.”
Instead, Harding believes data science lessons can be added into STEM content as well as other subject areas that educators are already working on across grade levels.
“Why can't we take relevant data and design instructional activities that help students develop data literacy skills where they can look at a graph and read it and understand what it's saying?” he asks. “Or they can look at poll results. They can look at probabilities and statistics surrounding their financial choices or their health choices, and they can just be better-informed citizens.”
Teaching Data Science Sites & Resources
One of the first steps to teaching data science is finding data resources.
Data Science 4 Everyone offers a resource page with tools for teaching and learning data science, along with other resources.
Educators can also find free data sets from a variety of providers, Harding says. For instance, Kaggle.com offers computer science-related data sets. Data is Plural features free data sets on various topics, and Tuva Labs has a variety of data sets and data visualization tools.
Once you find data relevant to a topic you already teach, the next step is to incorporate it into a lesson. For instance, Harding says a middle school teacher teaching about health and nutrition might use a Tuva Labs data set that looks at the composition of 40 different food groups.
“They can plot distributions and see what the pattern is in the frequency of carbohydrates,” he says. “Is there a relationship between protein and total calories? Which food groups have the most calories? Which food groups have the most protein?”
This process teaches students which foods are more nutritious, but the teacher doesn’t impart that knowledge directly. Students discover it on their own, which can be a more powerful way to learn.
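The kind of exploration Harding describes can be sketched in a few lines of Python. This is only a rough illustration, not the Tuva Labs tooling: the records below are invented for the example, and the function names (`carb_distribution`, `pearson`, `top_group`) are hypothetical helpers written for this sketch.

```python
from collections import Counter
from math import sqrt

# Hypothetical records, invented for illustration only (not Tuva Labs data):
# (food group, calories, protein in grams, carbohydrates in grams) per serving.
FOODS = [
    ("grains", 220, 6, 45),
    ("legumes", 230, 15, 40),
    ("dairy", 150, 8, 12),
    ("fruit", 90, 1, 23),
    ("vegetables", 50, 2, 10),
    ("meat", 250, 26, 0),
    ("nuts", 580, 20, 21),
]


def carb_distribution(records, bin_width=10):
    """Count how many records fall into each carbohydrate bin."""
    return Counter((carbs // bin_width) * bin_width for _, _, _, carbs in records)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def top_group(records, column):
    """Name of the food group with the largest value in the given column index."""
    return max(records, key=lambda r: r[column])[0]


if __name__ == "__main__":
    # Frequency of carbohydrate amounts, grouped into 10-gram bins.
    for start, count in sorted(carb_distribution(FOODS).items()):
        print(f"{start}-{start + 9} g carbs: {'#' * count}")

    # Is there a relationship between protein and total calories?
    calories = [r[1] for r in FOODS]
    protein = [r[2] for r in FOODS]
    print("protein vs. calories correlation:", round(pearson(protein, calories), 2))

    # Which food groups have the most calories and the most protein?
    print("most calories:", top_group(FOODS, 1))
    print("most protein:", top_group(FOODS, 2))
```

In a classroom, students would more likely build these plots in a visual tool or a spreadsheet. The sketch simply shows that each of Harding's questions maps to a small, concrete computation on a data set of a few dozen records.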
Start Small
Sometimes educators make the mistake of treating data science as an all-or-nothing endeavor and choose data sets so large that they overwhelm students, particularly younger ones.
“You don't need a million records. You need more than it would be reasonable to use by hand on paper,” Harding says. For middle school students, he recommends about 40 records or so, enough to see patterns and trends without overwhelming students.
In addition, teaching data science to every student doesn’t mean preparing all of them to become data scientists, in the same way that students who learn history don’t necessarily become historians.
“When we say, 'data science,' one of the main things I want educators to understand is that we're not saying we're going to try to take high school students and turn them into Google engineers once they graduate,” he says. “We're saying there is a data science method, similar to the scientific method, in which the teacher has a framework where they infuse data into their lesson. Students go through this iterative process of analyzing, and then in the end, the students are able to draw conclusions, propose solutions, and speak about what they've gained by investigating data through this process.”