with teachers administering quizzes, they would see immediately how well students are tracking the study material and use the results to guide further discussion or study.) Unit exams were the normal pencil-and-paper tests given by the teacher. Exams were also given at the end of the semester and at the end of the year. Students had been exposed to all of the material tested in these exams through the teacher’s normal classroom lessons, homework, worksheets, and so on, but they had also been quizzed three times on one-third of the material, and they had seen another third presented for additional study three times. The balance of the material was neither quizzed nor additionally reviewed in class beyond the initial lesson and whatever reading a student may have done.
The results were compelling: The kids scored a full grade level higher on the material that had been quizzed than on the material that had not been quizzed. Moreover, test results for the material that had been reviewed as statements of fact but not quizzed were no better than those for the nonreviewed material. Again, mere rereading does not much help.
In 2007, the research was extended to eighth grade science classes, covering genetics, evolution, and anatomy. The regimen was the same, and the results equally impressive. At the end of three semesters, the eighth graders averaged 79 percent (C+) on the science material that had not been quizzed, compared to 92 percent (A−) on the material that had been quizzed.
The testing effect persisted eight months later at the end-of-year exams, confirming what many laboratory studies have shown about the long-term benefits of retrieval practice. The effect doubtless would have been greater if the retrieval practice had continued and occurred once a month, say, in the intervening months.8
Make It Stick ê 36
The lesson from these studies has been taken to heart by many of the teachers at Columbia Middle School. Long after concluding their participation in the research studies, Patrice Bain’s sixth grade social studies classes continue today to follow a schedule of quizzes before lessons, quizzes after lessons, and then a review quiz prior to the chapter test. Jon Wehrenberg, an eighth grade history teacher who was not part of the research, has knitted retrieval practice into his classroom in many different forms, including quizzing, and he provides additional online tools at his website, like flashcards and games.
After reading passages on the history of slavery, for example, his students are asked to write down ten facts about slavery they hadn’t known before reading the passages. You don’t need electronic gadgetry to practice retrieval.
Seven sixth and seventh graders needing to improve their reading and comprehension skills sat in Michelle Spivey’s English classroom one period recently with their reading books open to an amusing story. Each student was invited to read a paragraph aloud. Where a student stumbled, Miss Spivey had him try again. When he’d gotten it right, she probed the class to explain the meaning of the passage and what might have been going on in the characters’ minds. Retrieval and elaboration; again, no technology required.
Quizzes at Columbia Middle School are not onerous events.
Following completion of the research studies, students’ views were surveyed on this question. Sixty-four percent said the quizzing reduced their anxiety over unit exams, and 89 percent felt it increased learning. The kids expressed disappointment on days when clickers were not used, because the activity broke up the teacher’s lecture and proved enjoyable.
Principal Chamberlain, when asked what he thought the study results indicated, replied simply: “Retrieval practice has a significant impact on kids’ learning. This is telling us that
To Learn, Retrieve ê 37
it’s valuable, and that teachers are well advised to incorporate it into their instructional technique.”9
Are similar effects found at a later age?
Andrew Sobel teaches a class in international political economics at Washington University in St. Louis, a lecture course populated by 160–170 students, mostly freshmen and sophomores. Over a period of several years he noticed a growing problem with attendance. On any given day by midsemester, 25–35 percent of the class would be absent, compared to earlier in the semester when maybe 10 percent would be absent.
The problem wasn’t unique to his class, he says. A lot of professors give students their PowerPoint slides, so the students just stop coming to class. Sobel fought back by withholding his slides, but by the end of the semester, many students stopped showing up anyway. The class syllabus included two big tests, a midterm and a final. Looking for some way to leverage attendance, Sobel replaced the big tests with nine pop quizzes. Because the quizzes would determine the course grade and would be unannounced, students would be well advised to show up for class.
The results were distressing. Over the semester, a third or more of the students bailed out. “I really got hammered in the teaching reviews,” Sobel told us. “The kids hated it. If they didn’t do well on a quiz they dropped the course rather than get a bad grade in it. Of those who stayed, I got this bifurcation between those who actually showed up and did the work, and those who didn’t. I found myself handing out A-plusses, which I’d never given before, and more Cs than I’d ever given.”10
With so much pushback, he had little choice but to drop the experiment and reinstate the old format, lectures with a midterm and final. A couple of years later, however, after hearing a presentation about the learning benefits of testing, he added a third major test during the semester to see what effect it might have on his students’ learning. They did better, but not by as much as he’d hoped, and the attendance problems persisted.
He scratched his head and changed the syllabus once again.
This time he announced that there would be nine quizzes during the semester, and he was explicit about when they would be. No surprises, and no midterm or final exams, because he didn’t want to give up that much of his lecture time.
Despite fears that enrollments would plummet again, they actually increased by a handful. “Unlike the pop quizzes, which kids hate, these were all on the syllabus. If they missed one it was their own fault. It wasn’t because I surprised them or was being pernicious. They were comfortable with that.” Sobel took satisfaction in seeing attendance improve as well. “They would skip some classes on the days they didn’t have a quiz, particularly the spring semester, but they showed up for the quizzes.”
Like the course, the quizzes were cumulative, and the questions were similar to those on the exams he used to give, but the quality of the answers he was getting by midsemester was much better than he was accustomed to seeing on the midterms. Five years into this new format, he’s sold on it. “The quality of discussions in class has gone way up. I see that big a difference in their written work, just by going from three exams to nine quizzes.” By the end of the semester he has them writing paragraphs on the concepts covered in class, sometimes a full-page essay, and the quality is comparable to what he’s seeing in his upper division classes.
“Anybody can design this structure. But I also realize that, Oh, god, if I’d done this years ago I would have taught them that much more stuff. The interesting thing about adopting this strategy is I now recognize that as good a teacher as I might think I am, my teaching is only a component of their learning, and how I structure it has a lot to do with it, maybe even more.” Meanwhile, the course enrollment has grown to 185 and counting.
Exploring Nuances
Andy Sobel’s example is anecdotal and likely reflects a variety of beneficial influences, not least being the cumulative learning effects that accrue like compounded interest when course material is carried forward in a regime of quizzes across an entire semester. Nonetheless, his experience squares with empirical research designed to tease apart the effects and nuances of testing.
For example, in one experiment college students studied prose passages on various scientific topics like those taught in college and then either took an immediate recall test after the initial exposure or restudied the material. After a delay of two days, the students who took the initial test recalled more of the material than those who simply restudied it (68 v. 54 percent), and this advantage was sustained a week later (56 v. 42 percent). Another experiment found that after one week a study-only group showed the most forgetting of what they initially had been able to recall, forgetting 52 percent, compared to a repeated-testing group, who forgot only 10 percent.11
How does giving feedback on wrong answers to test questions affect learning? Studies show that giving feedback strengthens retention more than testing alone does, and, interestingly, some evidence shows that delaying the feedback briefly produces better long-term learning than immediate feedback. This finding is counterintuitive but is consistent with researchers’ discoveries about how we learn motor tasks, like making lay-ups or driving a golf ball toward a distant green. In motor learning, trial and error with delayed feedback is a more awkward but effective way of acquiring a skill than trial and correction through immediate feedback; immediate feedback is like the training wheels on a bicycle: the learner quickly comes to depend on the continued presence of the correction.
In the case of learning motor skills, one theory holds that when there’s immediate feedback it comes to be part of the task, so that later, in a real-world setting, its absence becomes a gap in the established pattern that disrupts performance. Another idea holds that frequent interruptions for feedback make the learning sessions too variable, preventing establishment of a stabilized pattern of performance.12
In the classroom, delayed feedback also yields better long-term learning than immediate feedback does. In the case of the students studying prose passages on science topics, some were shown the passage again even while they were asked to answer questions about it, in effect providing them with continuous feedback during the test, analogous to an open-book exam. The other group took the test without the study material at hand and only afterward were given the passage and instructed to look over their responses. Of course, the open-book group performed best on the immediate test, but those who got corrective feedback after completing the test retained the learning better on a later test. Delayed feedback on written tests may help because it gives the student practice that’s spaced out in time; as discussed in the next chapter, spacing practice improves retention.13
Are some kinds of retrieval practice more effective for long-term learning than others? Tests that require the learner to supply the answer, like an essay or short-answer test, or simply practice with flashcards, appear to be more effective than simple recognition tests like multiple choice or true/false tests.