Have We Overdone Retrieval Practice? A Timely Recalibration for Language Teachers

For the past decade or so, retrieval practice has enjoyed near-mythical status in education circles, often presented, somewhat uncritically, as a universal good: test more, retrieve more, and learning will automatically follow. And while the underlying principle is undoubtedly sound, recent research and, in my experience, classroom reality suggest that the picture is rather more nuanced than the slogan would have us believe.

Let us be clear from the outset: retrieval practice DOES work. A robust body of research shows that retrieving information strengthens memory more effectively than re-exposure alone (Roediger & Karpicke, 2006; Karpicke & Blunt, 2011), and in the context of vocabulary learning this has been repeatedly confirmed, with retrieval supporting long-term retention when used appropriately (Nation, 2013; Kang, 2016). However, what more recent syntheses in cognitive psychology and its application to SLA are beginning to emphasise is that retrieval is highly condition-dependent, and that when those conditions are not met, its benefits may be attenuated, or, in some cases, even reversed.

The following three variables, in particular, appear to be critical:

First, success rate: retrieval needs to be largely successful—typically in the region of 60–80%—for it to strengthen memory traces effectively, because repeated failure leads not to learning but to guesswork or disengagement (Karpicke, 2017; Rowland, 2014).

Second, prior encoding: the material must have been sufficiently processed before retrieval is attempted, since asking learners to retrieve poorly encoded information places excessive demands on working memory and often results in the reinforcement of incorrect hypotheses (Sweller, 2010; Baddeley, 2000). Hence the importance of input processing and effective elaborative rehearsal prior to retrieval: this is a tenet of the EPI methodology, in which such work occurs in the MAR segment of the MARSEARS sequence.

Third, feedback quality: errors must be corrected promptly and clearly, otherwise retrieval risks consolidating inaccuracies rather than strengthening correct representations (Butler & Roediger, 2008).

And here, if one may pause for a moment, lies the pedagogical crux: in beginner and mixed-ability classrooms, where learners’ lexical and grammatical representations are still fragile and processing capacity is easily overwhelmed, these conditions are not always present. In my observation, this is precisely when retrieval tasks become less a tool for consolidation and more an exercise in frustration. Is it not reasonable, therefore, to question whether what we are witnessing in such cases is retrieval at all, or merely the appearance of it?

From a cognitive load perspective, the issue becomes even clearer. Working memory is limited (Baddeley, 2000), and when learners are asked to retrieve language that has not yet been automatised, they must simultaneously search for forms, map them to meaning, and assemble them into coherent output. This operation, for novices, is often simply too demanding. Under such conditions, retrieval may not only fail to support learning but may actually impair encoding, because attentional resources are diverted away from processing input and towards struggling with output (Sweller, 2010; DeKeyser, 2007).

So what does this mean for classroom practice? Not, certainly, that retrieval should be abandoned—far from it—but rather that it should be repositioned within a carefully sequenced instructional framework, one in which learners are first provided with rich, comprehensible input, then guided through structured processing activities that stabilise form–meaning connections, and only then asked to retrieve and produce the language in question. In other words:

input → processing → rehearsal → retrieval

not the other way round, as often happens.

This is not a trivial adjustment. It represents a shift from viewing retrieval as the engine of learning to understanding it as a powerful consolidator of learning, effective precisely because it strengthens representations that are already in place, rather than creating them ex nihilo—an assumption which, though rarely stated explicitly, often underpins premature retrieval tasks.

In conclusion, retrieval practice remains one of the most valuable tools at our disposal, but like all powerful tools it must be used with precision; and if there is one lesson emerging from the most recent research—and, indeed, from careful classroom observation—it is that timing matters as much as technique, because learning is not simply a matter of doing the right thing, but of doing it at the right moment.

References

Baddeley, A. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences.
Butler, A. C., & Roediger, H. L. (2008). Feedback enhances the positive effects and reduces the negative effects of multiple-choice testing. Memory & Cognition.
Carpenter, S. K. (2022). Retrieval practice. In J. Dunlosky & K. A. Rawson (Eds.), The Cambridge handbook of cognition and education (2nd ed., pp. 347–369). Cambridge University Press.
DeKeyser, R. (2007). Practice in a second language: Perspectives from applied linguistics and cognitive psychology.
Kang, S. H. K. (2016). Spaced repetition promotes efficient and effective learning. Policy Insights from the Behavioral and Brain Sciences.
Karpicke, J. D. (2017). Retrieval-based learning: A decade of progress.
Nakatsukasa, K. (2023). Retrieval practice in second language vocabulary learning: A research synthesis. Language Teaching Research, 27(4), 987–1008.
Nation, I. S. P. (2013). Learning vocabulary in another language.
Roediger, H. L., & Karpicke, J. D. (2006). Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science.
Rowland, C. A. (2014). The effect of testing versus restudy on retention: A meta-analytic review. Psychological Bulletin.
Sweller, J. (2010). Cognitive load theory: Recent theoretical advances.