Introduction
In my experience, one of the most persistent myths in language education is that vocabulary growth comes from introducing lots of new words quickly. Research, however, tells a very different story. Vocabulary learning is slow, cumulative, and constrained by cognitive limits, especially when it comes to working memory and processing speed. These limits differ markedly between primary and secondary learners, which means the “right” number of words per lesson is not the same across phases.
What often goes missing from this discussion, however, is how vocabulary is taught. Much research suggests that teaching words in isolation and teaching them as chunks or multi-word units place very different demands on the brain, and this has important implications for how much learners can realistically handle.
A necessary caution: what does it mean to “learn” a word?
Before addressing how many words can be taught in a lesson, I believe it is important to clarify what learning a word actually entails. Vocabulary knowledge is not an all-or-nothing phenomenon. Research consistently shows that knowing a word involves multiple dimensions: recognising its spoken and written form, understanding its meaning, knowing how it behaves grammatically, and being able to retrieve and use it appropriately (Nation, 2001; Schmitt, 2008). In classroom terms, this means that many words students encounter in a lesson may be noticed or partially understood without being fully learned or retained. The figures discussed in this chapter therefore refer to words or chunks that can realistically be taught for durable learning, not merely encountered or temporarily recognised.
Table 0. The dimensions of word knowledge
| Dimension | What it involves | Classroom implications |
|---|---|---|
| Spoken form (phonological) | Recognising and producing the word’s sounds accurately | Learners may know a word in writing but fail to recognise it in listening |
| Written form (orthographic) | Recognising and spelling the word correctly | Spelling knowledge can support memory, especially at secondary level |
| Meaning (semantic) | Understanding what the word refers to | Meaning is often partial at first and becomes more precise over time |
| Form–meaning connection | Linking the sound/spelling to the correct meaning | This link is fragile in early learning and easily breaks under time pressure |
| Conceptual knowledge | Understanding the concept behind the word | Abstract or culturally unfamiliar concepts are harder to learn |
| Grammatical behaviour | Knowing the word’s part of speech and how it behaves grammatically | Includes gender, agreement, verb patterns, count/uncount status |
| Collocations | Knowing which words typically occur with it | Crucial for fluency and naturalness (e.g. make a mistake, not do a mistake) |
| Formulaic use / chunks | Knowing how the word functions inside common phrases | Supports faster processing and listening comprehension |
| Register | Knowing whether the word is formal, informal, slang, etc. | Prevents inappropriate usage in speaking and writing |
| Frequency | Knowing how common the word is | High-frequency words deserve more classroom time |
| Associations | Knowing related words (synonyms, antonyms, semantic fields) | Supports lexical networks and faster retrieval |
| Pragmatic use | Knowing when and why the word is used | Includes politeness, social norms, and discourse function |
| Receptive knowledge | Understanding the word when heard or read | Usually develops before productive knowledge |
| Productive knowledge | Being able to use the word accurately | Requires more practice and stronger memory traces |
| Automaticity | Retrieving the word quickly under pressure | Essential for fluent listening and speaking |
L2 primary learners (approx. ages 5–11)
I have taught primary learners between the ages of 7 and 10 for 18 years, and one thing that has never ceased to surprise me is how quickly they forget new words without constant revision. This is because young learners face particularly strong cognitive constraints when learning vocabulary in an additional language. Working memory capacity is limited, attentional control is still developing, and phonological representations in the L2 are fragile and slow to stabilise. In addition, primary learners often have limited literacy skills in both their first language and the target language, which reduces their ability to use orthography as a support. As a result, vocabulary learning at this stage is highly incremental and depends heavily on repetition, salience, and recycling across time.
Table 1. Research findings: vocabulary learning in L2 primary learners
| Research | Key findings relevant to “words per lesson” |
|---|---|
| Cameron (2001) | Vocabulary learning in young learners is gradual and fragile; introducing too many new words at once leads to shallow learning and rapid forgetting |
| Nation (2001) | Small numbers of new words should be taught explicitly, with repeated encounters over time; depth of processing matters more than quantity |
| Gathercole & Alloway (2008) | Children’s working memory capacity is very limited, strongly constraining how many unfamiliar items can be processed simultaneously |
| Pinter (2017) | Young learners benefit most when new vocabulary is embedded in familiar routines and recycled frequently |
| Kersten et al. (2010) | Vocabulary uptake improves when lexical load is low and exposure is distributed over time |
Table 2. Studies informing how many words can be taught per lesson (Primary)
| Study | Learners | Implication for words per lesson |
|---|---|---|
| Nation (2001) | Primary & early L2 learners | Around 3–5 new items can be taught effectively when recycling is built in |
| Cameron (2001) | Primary L2 learners | Fewer than 5 items per lesson supports retention |
| Gathercole & Alloway (2008) | Children | Working memory limits suggest very small lexical loads |
| Kersten et al. (2010) | Young L2 learners | Learning improves when lessons focus on few items, frequently recycled |
| Pinter (2017) | Primary learners | Depth over breadth; typically 3–4 items per lesson |
What changes when words are taught in chunks?
In my experience, younger learners retain words better when vocabulary is taught as formulaic chunks (e.g. I like football, on the table, there is a dog). More words can also be covered this way, because the brain does not treat each word as a separate unit. Instead, the entire sequence can be processed as a single cognitive chunk.
Psycholinguistic research shows that:
- working memory operates on chunks rather than individual words (Miller, 1956; Cowan, 2001)
- frequently occurring multi-word sequences are stored and retrieved holistically (Wray, 2002; Ellis, 2003)
- chunking reduces the need for online grammatical computation, freeing cognitive resources for meaning (Ellis, 1996; Nation, 2013)
For primary learners, this is particularly important. Because attentional resources are limited and processing is slow, treating a phrase as one unit allows learners to engage with meaningful language without having to assemble it word by word.
In sum, while primary learners can typically only learn around 3–5 new items per lesson, those items can be multi-word expressions rather than isolated words. Chunking does not increase memory capacity, but it significantly increases the amount of functional language that can be processed and retained.
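The difference between counting items and counting words can be made concrete with a small sketch. It assumes, purely for illustration, that each planned item (a single word or a whole chunk) counts once toward the lesson load, and that the words inside an item can be counted by splitting on spaces. The function name `lesson_load` is hypothetical, and the example items are the chunks mentioned above.

```python
def lesson_load(items):
    """Return (number of items, total words carried by those items).

    Assumption: every item counts once toward the lesson load,
    whether it is a single word or a multi-word chunk.
    """
    n_items = len(items)
    n_words = sum(len(item.split()) for item in items)
    return n_items, n_words


# Four isolated words versus three chunks built around the same vocabulary.
isolated = ["football", "table", "dog", "like"]
chunks = ["I like football", "on the table", "there is a dog"]

print(lesson_load(isolated))  # (4, 4)
print(lesson_load(chunks))    # (3, 10)
```

On this rough count, three chunks stay within the 3–5 item range while carrying more than twice as many words as four isolated items. The sketch does not model which words inside a chunk are already familiar, which in practice also affects the real load.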
How this translates into KS2 practice (Years 3–6)
Based on the research above, and taking into account developmental changes in working memory, phonological automatisation, and classroom listening demands, the following ranges are realistic teaching targets, not exposure limits.
Table 3. Recommended teachable vocabulary load per lesson (KS2)
| Year group | New items per lesson (taught for retention) | Notes |
|---|---|---|
| Year 3 | 2–3 items | Strong reliance on chunks; heavy recycling essential; listening load must be very light |
| Year 4 | 3–4 items | Chunks preferred; begin gentle variation within familiar frames |
| Year 5 | 4–5 items | Mix of chunks and high-frequency single words; listening tasks still limit capacity |
| Year 6 | 5 items (occasionally 6) | Greater tolerance for analysis, but chunking remains more efficient than isolation |
These figures assume that items are recycled across lessons and revisited in multiple modalities. Teaching more items in a single lesson does not increase long-term retention.
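For teachers who plan lessons in a spreadsheet or script, the ranges in Table 3 can be turned into a simple planning check. What follows is a minimal sketch, not a validated tool: the names `KS2_LOAD` and `check_lesson` are hypothetical, Year 6 is encoded as 5–6 to reflect "5 items (occasionally 6)", and the `listening_heavy` option assumes that a listening-heavy lesson should be capped at the lower end of the range. The example plan is invented for illustration.

```python
# Lower and upper bounds of new items per lesson, copied from Table 3.
# Year 6 is listed as "5 items (occasionally 6)", encoded here as (5, 6).
KS2_LOAD = {
    3: (2, 3),
    4: (3, 4),
    5: (4, 5),
    6: (5, 6),
}


def check_lesson(year, new_items, listening_heavy=False):
    """Return True if the number of new items fits the range for this year group.

    Assumption: listening-heavy lessons are capped at the lower bound.
    """
    low, high = KS2_LOAD[year]
    limit = low if listening_heavy else high
    return len(new_items) <= limit


# A hypothetical plan of four chunks.
plan = ["I like football", "on the table", "there is a dog", "I have a cat"]

print(check_lesson(3, plan))                        # False: Year 3 tops out at 3 items
print(check_lesson(4, plan))                        # True: 4 items is the Year 4 maximum
print(check_lesson(4, plan, listening_heavy=True))  # False: capped at 3 for listening
```

The check counts only items taught for retention. Words that learners merely encounter in the input are deliberately left out, in line with the distinction between teaching and exposure.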
L2 secondary learners (approx. ages 11–16)
Secondary learners benefit from several cognitive and experiential advantages compared with their primary counterparts. First, working memory capacity is greater, especially by around age 16, when it reaches adult-like levels. Second, attentional control is more stable and learners are better able to analyse language explicitly. They also tend to have more developed literacy skills, allowing them to use spelling and morphology to support retention. As a result, vocabulary learning becomes more efficient, although it remains constrained by time pressure and real-time processing demands, particularly in listening.
Table 4. Research findings: vocabulary learning in L2 secondary learners
| Research | Key findings relevant to “words per lesson” |
|---|---|
| Nation (2001) | Vocabulary acquisition is cumulative; teaching too many items at once reduces retention |
| Hulstijn (2001) | Intentional vocabulary learning is effective only when cognitive load is manageable |
| Schmitt (2008) | Knowing a word involves multiple dimensions, requiring repeated encounters |
| Field (2008) | Lexical overload impairs listening comprehension; fewer new items improve decoding |
| Vandergrift & Goh (2012) | Lexical familiarity is a strong predictor of listening success |
Table 5. Studies informing how many words can be taught per lesson (Secondary)
| Study | Learners | Implication for words per lesson |
|---|---|---|
| Nation (2001) | Adolescent L2 learners | Typically 6–10 new items per lesson with recycling |
| Hulstijn (2001) | Secondary learners | More than 10 items overloads processing |
| Schmitt (2008) | Secondary & adult learners | Learning requires multiple encounters; limits effective intake |
| Field (2008) | Secondary L2 listeners | Listening lessons should stay toward lower end (6–8 items) |
| Vandergrift & Goh (2012) | Secondary learners | Lexical familiarity constrains how many items can be processed |
What changes when words are taught in chunks?
At secondary level, chunking supports processing efficiency and fluency rather than basic capacity expansion. Research shows that formulaic sequences are retrieved faster than novel combinations (Pawley & Syder, 1983; Conklin & Schmitt, 2008) and reduce the cognitive cost of real-time comprehension.
To sum up, while secondary learners can typically learn 6–10 new items per lesson, teaching these items as chunks allows teachers to expose learners to a far greater volume of language without increasing cognitive load.
Teaching vs exposure: revisited through chunking
The distinction between teaching and exposure becomes clearer when chunking is considered.
- Teaching isolated words often leads to fragmented knowledge
- Teaching chunks supports immediate comprehension and production
- Exposure to many words inside a small number of chunks is cognitively efficient
Chunking therefore allows teachers to teach fewer items while delivering richer input.
Pros and cons of teaching words in isolation
Teaching vocabulary in isolation is not inherently wrong, but it has specific strengths and limitations.
Advantages
- supports semantic precision
- useful for low-frequency or content-specific nouns
- facilitates dictionary skills and explicit form–meaning mapping
- easier to assess in short written tasks
Limitations
- high cognitive load during listening
- weak support for fluency and real-time processing
- encourages word-by-word decoding
- delays access to functional language use
Isolated-word teaching is most effective when it is limited in quantity and quickly integrated into phrases or chunks.
When the words are cognates
Cognates occupy a special position in vocabulary learning. Because they share form and meaning with words in the learner’s first language, they place a much lighter burden on working memory and phonological decoding.
When teaching cognates:
- learners can often process more items per lesson
- sound–meaning mapping is faster
- retention is generally higher
In practical terms, lessons focusing on transparent cognates may safely exceed the usual word-count limits, provided pronunciation differences are explicitly addressed to avoid fossilisation.
Factors affecting the learnability of words
Before considering how many words to introduce in a lesson, it is essential to recognise that not all words are equally learnable. Learnability refers to the extent to which a lexical item can be easily noticed, processed, stored, and retrieved by learners. Cognitive factors such as phonological complexity and length interact with experiential factors like frequency, transparency, and conceptual familiarity. Pedagogically, this means that raw word counts are misleading unless we also consider what kinds of words are being taught.
Table 6. Factors influencing how easily words are learned
| Factor | Effect on learnability |
|---|---|
| Frequency | High-frequency words are learned faster |
| Phonological simplicity | Simple, familiar sound patterns are easier to retain |
| Transparency / cognacy | Cognates reduce cognitive load |
| Imageability | Concrete words are easier than abstract ones |
| Morphological regularity | Regular forms are easier to generalise |
| Length | Shorter words and chunks are easier to process |
| Contextual support | Rich context aids retention |
| Prior knowledge | Familiar concepts are learned more easily |
Learnability directly affects how many words can be taught in a lesson. Highly learnable items allow for slightly higher word counts; low-learnability items sharply reduce capacity. Effective planning therefore requires managing both quantity and quality of vocabulary.
Why listening lowers the threshold (even with chunks)
Listening remains demanding because learners must decode sounds, segment speech, and hold information in working memory under time pressure. Chunking, of course, reduces these demands but does not remove them, which is why listening-heavy lessons should operate at the lower end of recommended word counts.
Conclusion
Vocabulary learning is governed not by ambition but by cognition. Across both primary and secondary phases, learners can only process and retain a limited number of new items in a single lesson. Teaching vocabulary in chunks does not change these limits, but it allows each item to carry more meaning, structure, and communicative value. Effective curricula therefore prioritise fewer items, taught more deeply, recycled more often, and embedded in meaningful input over time.
Chunking does not allow us to teach more items; it allows us to teach language more effectively.
