Clickers in chemistry: a classic case of data munging

Clickers are a common piece of educational technology across college campuses, used for in-class quizzes or to allow many voices to “speak” in a class discussion. At Reed, clickers are most commonly used in the first two years of Chemistry courses, which are some of our larger classes (75 students per section in Chem 101) and bring together students from a variety of academic backgrounds. Clickers allow faculty to check student comprehension throughout class and actively engage students in a larger lecture environment.

In addition to on-the-fly uses in class, the data from clicker polling could be used to check student comprehension over time. Are a handful of students consistently incorrect in their answers? (If so, the instructor may be able to reach out to those students and offer additional assistance.) At a more basic level, “clicking in” could be used as a metric for attendance.

[Image: A friendly (bit of) Python]

However, before any of that can happen, one needs information on the clickers themselves, the individual students, and which person is connected to which device.

Ladies and gentlemen, welcome to a classic data problem. We will solve this given three pieces of information, a good dose of logic, and a little help from our friends (in the form of a Python script).

Here are our three pieces of (separate but equally useful) information:

1) Class lists. These come by request from the Registrar’s Office, and contain:

  • Student first name
  • Student last name
  • Student Reed ID number
  • Student email (e.g. bottk@reed.edu)
  • Course name and number (Chem 201 + Chem 101)
  • Course section (two sections each)

This should represent the population of clicker users for the term.

2) Clicker sales. The Reed Bookstore manages clicker sales, and is kind enough to provide reports featuring:

  • Clicker number
  • Student name (last, first)
  • Student full name
  • Reed ID (number) of student

However, not all students buy clickers from the Bookstore. Someone enrolled in Chem 201 may still have a clicker from Chem 101; a student may buy or borrow a clicker from a friend. So we need a way to capture those not purchased this term.

3) Clicker self-report. Students in Chem 101/201 are asked to fill out a Google Form with their:

  • Clicker number
  • Full name (first last)
  • Reed username (e.g. bottk)
  • Course name
  • Course number
  • Section number

These three pieces can certainly be combined manually. You might start with the class list, add data from the bookstore file one-by-one, and then flesh out the rest with the self-report data. If matching manually, you might use student name as your criterion for a match, or Reed ID, depending on which datasets you are linking. Fields like these serve as keys: they identify each record (Reed ID uniquely, names only if no two students share one) and are crucial when combining datasets.
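Keys only work if they look the same in every dataset. The bookstore report lists names as "last, first" while the Google Form asks for "first last", and the class list has an email address where the form has a username. Here is a minimal Python sketch of how those fields might be normalized before matching. The function names are hypothetical, and it assumes a Reed username is the part of the email address before the `@` (as in the `bottk@reed.edu` example); none of this is taken from the actual script.

```python
def username_from_email(email):
    """Derive a username key from an email address (assumes user@domain form)."""
    return email.strip().lower().split("@")[0]


def normalize_name(name):
    """Turn 'Last, First' or 'First Last' into a lowercase 'first last' key."""
    name = name.strip()
    if "," in name:
        # Bookstore-style "Last, First": swap the two halves.
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    # Collapse repeated whitespace and ignore capitalization.
    return " ".join(name.lower().split())
```

With both sides run through the same function, a form entry typed with stray spaces or capitals can still line up with the Registrar's record. Names remain a weaker key than Reed ID, since middle names and nicknames will not normalize away.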

This same process can be executed using scripting tools; in this case, I worked with a colleague to write a Python script that combines all relevant pieces of information in a few seconds. This required knowing what information we (a) had available as inputs, (b) trusted as inputs, and (c) needed for the final output.

You can find the script here. The basic steps are:

  • tell the script where to look for files
  • define your input datafiles and all fields
  • define an error file and output files
  • use data dictionaries to match individuals across datasets
  • write errors to file
  • write outputs to file
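The steps above can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions, not the script itself: the column names (`reed_id`, `first`, `last`, `email`, `username`, `clicker`) and function names are hypothetical, the inputs are assumed to be CSV files with header rows, and the choice to trust the bookstore record over a conflicting self-report is one possible answer to the "what do we trust" question.

```python
import csv


def read_rows(path):
    """Load one CSV file as a list of dicts keyed by its header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def match_clickers(class_rows, sales_rows, report_rows):
    """Return (matched, errors) for the three lists of row dicts."""
    # Index the class list by Reed ID and by username (assumed to be
    # the part of the email address before the "@").
    by_id = {row["reed_id"]: row for row in class_rows}
    by_user = {row["email"].split("@")[0].lower(): row for row in class_rows}

    clickers = {}  # Reed ID -> clicker number
    errors = []

    # Bookstore sales are matched on Reed ID.
    for sale in sales_rows:
        if sale["reed_id"] in by_id:
            clickers[sale["reed_id"]] = sale["clicker"]
        else:
            errors.append(f"sale for unknown Reed ID {sale['reed_id']}")

    # Self-reports are matched on username and only fill in gaps:
    # on a conflict the bookstore number is kept (an assumption).
    for report in report_rows:
        student = by_user.get(report["username"].strip().lower())
        if student is None:
            errors.append(f"self-report from unknown username {report['username']}")
            continue
        known = clickers.get(student["reed_id"])
        if known is not None and known != report["clicker"]:
            errors.append(f"conflicting clicker numbers for Reed ID {student['reed_id']}")
        clickers.setdefault(student["reed_id"], report["clicker"])

    # Anyone on the class list without a clicker is flagged for follow-up.
    for reed_id in by_id:
        if reed_id not in clickers:
            errors.append(f"no clicker found for Reed ID {reed_id}")

    matched = [dict(by_id[rid], clicker=num) for rid, num in clickers.items()]
    return matched, errors


def write_outputs(matched, errors, out_path, err_path):
    """Write matched rows to a CSV file and errors to a plain-text file."""
    fields = ["reed_id", "first", "last", "email", "clicker"]
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(matched)
    with open(err_path, "w") as handle:
        handle.writelines(line + "\n" for line in errors)
```

A run would call `read_rows` once per input file, pass the three lists to `match_clickers`, and hand the results to `write_outputs`. Keeping the errors in their own file means the unmatched cases (a mistyped username, a student with no clicker on record) can be resolved by hand without rereading the whole roster.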

This is one example of a data management problem that mirrors work I see students doing in their theses: understanding input data from different sources, deciding which data to trust, designing a data structure, and defining specific goals for the data project.
