Zipf’s Law and What It Means for Vocabulary Teaching in Instructed Second Language Acquisition

Introduction: When Everything Is Important, Nothing Is

One of the most common and dangerous assumptions in vocabulary teaching is that all words are equally important. This belief — often reinforced by thematic lists, textbook sequencing, or “fun” vocabulary — leads to wasted effort, cognitive overload, and, frankly, poor communicative payoff.

The truth is this: in real language, some words matter a great deal more than others. A handful of high-frequency words do the heavy lifting, while thousands of others live in the shadows of occasional use. This isn’t an opinion. It’s a statistical reality, first observed by linguist George Zipf (1935) and later confirmed across a wide range of languages and domains.

However, if we only focus on those top 1,000 words, we risk draining our lessons of personal relevance, cultural richness, and opportunities for self-expression. For example, words like le pingouin or la trottinette may be low frequency, but they often spark curiosity and provide hooks for meaningful communication. The art of good teaching is in finding the right balance — prioritising the high-frequency core, but leaving space for the occasional low-frequency gem that truly resonates with learners.

In this post, we’ll explore what Zipf’s Law is, why it matters, and how it should shape vocabulary teaching in instructed second language acquisition (ISLA). Along the way, we’ll also look at what this means for Modern Foreign Language (MFL) teaching in school settings — including both the opportunities and the potential pitfalls.

Please note that Zipf’s Law has been around for nearly a century; it was one of the first things I was ever taught on the ‘L2 Acquisition Principles’ module of my MA TEFL back in 1997! It is funny how some people in UK MFL circles treat it as a new breakthrough in L2 pedagogy… It isn’t. In the EFL world, language programming has been based on high-frequency word teaching for several decades!

What Is Zipf’s Law?

Zipf’s Law describes how word frequency is distributed in natural language: a word’s frequency is roughly inversely proportional to its frequency rank, so the most common word occurs about twice as often as the second most common, about three times as often as the third, and so on. This creates a power-law distribution, where a tiny number of words account for the vast majority of all word occurrences in speech and writing.
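For readers who like to see it written down, the idealised form of the law can be expressed as a simple formula. The symbols here are my own labels rather than anything fixed by the sources cited in this post: f(r) is the frequency of the word at rank r, C is a constant that depends on the corpus, and s is an exponent that is close to 1 in many corpora. Real data only approximate this curve.

```latex
f(r) = \frac{C}{r^{s}}, \qquad s \approx 1
\qquad\text{so that, for } s = 1,\qquad
\frac{f(1)}{f(2)} = 2, \quad \frac{f(1)}{f(3)} = 3, \quad \dots, \quad \frac{f(1)}{f(r)} = r
```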

For example, in English, the top 100 words cover about 50% of all written or spoken text, and the top 1,000 words account for roughly 80% of tokens in a standard corpus (Nation, 2001). The remaining words — tens of thousands of them — each appear rarely and carry diminishing returns for communication. The same holds true for French. According to the Lexique corpus and oral frequency studies (New et al., 2001), the top 150 words account for approximately 50% of everyday speech, and the top 1,000 cover around 85% of most texts. Beyond that, vocabulary frequency drops off steeply, with tens of thousands of words appearing only once in hundreds of thousands of words of input. This reinforces the principle that frequency-driven vocabulary selection is just as critical in French as it is in English.
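To get a feel for why so few words cover so much text, here is a minimal Python sketch of cumulative coverage under a perfectly Zipfian distribution. It is an idealisation, not a corpus measurement: the function name `zipf_coverage`, the vocabulary size of 50,000 word types, and the exponent `s = 1` are all assumptions chosen purely for illustration.

```python
def zipf_coverage(top_n: int, vocab_size: int, s: float = 1.0) -> float:
    """Return the share of all tokens accounted for by the top_n ranked words,
    assuming frequencies follow f(rank) = C / rank**s exactly (an idealisation)."""
    if not 0 < top_n <= vocab_size:
        raise ValueError("top_n must be between 1 and vocab_size")
    # Relative weight of each rank; the constant C cancels out in the ratio.
    weights = [1.0 / rank**s for rank in range(1, vocab_size + 1)]
    return sum(weights[:top_n]) / sum(weights)


# Hypothetical vocabulary of 50,000 word types (an assumption, not a corpus figure).
for n in (100, 1_000, 2_000):
    print(f"top {n:>5} words: {zipf_coverage(n, 50_000):.1%} of tokens")
```

Under these assumptions the idealised curve puts the top 1,000 words somewhat below the corpus-based figures quoted above. That is a useful reminder that Zipf’s Law is an approximation and that measured coverage depends on the corpus and on how words are counted.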

In other words: if you’re teaching “la myrtille” (blueberry) or “le homard” (lobster) before your students can say “je veux”, “il y a”, or “je prends”, you’re teaching against the grain of how language actually works, according to current wisdom in the Applied Linguistics research community.

Why This Matters for ISLA

In ISLA, where input and contact time are limited, learners cannot rely on incidental exposure to acquire the breadth and depth of vocabulary needed for real-world comprehension. This makes the principled selection of vocabulary absolutely essential.

Zipf’s Law reminds us that we shouldn’t treat all vocabulary as equal. Some words are vastly more useful than others, not just for comprehension but also for production, task success, and confidence building. Prioritising high-frequency vocabulary early on gives learners the best chance of gaining access to comprehensible input and generating meaningful output from the start.

Research shows that learners need knowledge of around 4,000 to 5,000 word families to achieve 95% lexical coverage of typical written texts, the minimal threshold for adequate comprehension (Laufer & Ravenhorst-Kalovski, 2010). To reach 98% coverage, the level associated with comfortable, unassisted reading, learners would need around 8,000–9,000 word families (Nation, 2006).
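Lexical coverage is easy to estimate for any text you plan to use in class. The sketch below is a rough approximation only: the function name `lexical_coverage` is my own, and it counts individual word forms rather than word families, which is a simplification of how the studies above measure coverage.

```python
import re


def lexical_coverage(text: str, known_words: set[str]) -> float:
    """Return the proportion of running words (tokens) in text that are known.

    Simplification: matches exact word forms, not word families or lemmas.
    """
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    known = {word.lower() for word in known_words}
    return sum(token in known for token in tokens) / len(tokens)


# Toy example: a learner who knows four function words but not "homard".
known = {"il", "y", "a", "un"}
coverage = lexical_coverage("Il y a un homard.", known)
print(f"coverage: {coverage:.0%}")  # 4 of 5 tokens known -> 80%
print("meets 95% threshold:", coverage >= 0.95)
```

Running a check like this on a reading text before a lesson gives a quick indication of whether it sits near the 95% mark or whether it needs simplifying or pre-teaching.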

Implications for Vocabulary Teaching

1. Teach the most frequent words first

The top 1,000–2,000 words offer disproportionate access to real-life input. These are the words that learners need for basic survival communication and to begin noticing patterns in input. They allow learners to understand a wide range of texts and participate in simple conversations. Starting with these items gives learners a base of lexical material that can be quickly recycled, reused, and expanded upon.

According to Schmitt and Schmitt (2014), teaching the first 2,000 word families should be the top priority in any second language vocabulary programme, as these offer the highest utility for both receptive and productive language.

Table 1 – Words commonly taught in the food topic which are low-frequency

Word | Why It’s Problematic
la cerise | Taught in fruit lists but low in input frequency and rarely used communicatively.
le poireau | Included in Studio food vocab; infrequent in real-world communication and difficult to recycle.
la betterave | Appears in healthy eating contexts but is obscure, unmemorable, and low frequency.
l’ail | Taught in food vocab but rarely used productively by learners; low recall.
le foie | Found in meat sections; culturally specific, rarely used, and often off-putting to learners.
la confiture | Common in breakfast sets but mid-to-low frequency in real input and limited for communicative use.
le miel | Introduced in food units; rarely heard in oral corpora and not central to beginner tasks.
le canard | Included in meat lists in Studio 2, but low-frequency and not learner-relevant.
la dinde | Taught in food/meat sets; low frequency and culturally narrow in scope.
l’agneau | Found in meat vocabulary; extremely rare and often unknown even to learners in L1.
les champignons | Taught early but not high frequency and not often used by learners in speech or writing.

2. Prioritise frequency within thematic contexts

While frequency should guide word selection, thematic teaching can still play a valuable role in maintaining learner motivation and fostering meaningful communication. Themes like “food”, “daily routines”, or “school life” provide a coherent context for vocabulary, allowing learners to develop interconnected networks of meaning. They also support the development of topic-based conversations and help learners feel that they can talk about real-life experiences.

The compromise is to embed high-frequency vocabulary within thematic units, ensuring that learners are not wasting time on isolated or obscure items, while still benefiting from the motivational and organisational advantages of theme-based teaching. Research by Daller, Milton, and Treffers-Daller (2007) suggests that motivation and lexical relevance are strong predictors of vocabulary uptake, especially in school-based foreign language learning contexts.

Ask yourself, for instance: would you categorically refuse to teach any of the words in Table 1 above simply because they are not frequent enough?

3. Emphasise multiword chunks and collocations

High-frequency words often combine into high-frequency phrases: I don’t know, Do you want to…?, There is/are. These chunks are essential for fluency and are processed faster than isolated words. They support both listening comprehension and fluent output. Teaching chunks helps learners acquire ready-made building blocks for communication and fosters grammatical awareness implicitly. Chunks also reflect the way language naturally occurs, reinforcing learner intuition and confidence.

Studies by Boers and Lindstromberg (2008) highlight the pedagogical value of formulaic sequences, showing that learners who are taught high-frequency collocations and chunks exhibit faster processing times and more fluent language production.

4. Engineer comprehensible input

Texts and listening materials can be crafted or selected to recycle high-frequency words intentionally. This repetition supports incidental learning and increases opportunities for retrieval and consolidation. Teachers can create texts that flood learners with targeted vocabulary while maintaining naturalness and interest. Rich input enables learners to notice how words behave in different contexts and contributes to the development of a robust mental lexicon.
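If you write your own texts, a quick script can tell you whether the target items really are being recycled. This is a minimal sketch under stated assumptions: the helper names and the default of three encounters per text are hypothetical choices of mine, not a threshold taken from the research cited in this post. Multiword chunks are matched as exact token sequences.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def count_occurrences(tokens: list[str], target: str) -> int:
    """Count how often a word or multiword chunk appears in the token list."""
    target_tokens = tokenize(target)
    n = len(target_tokens)
    if n == 0:
        return 0
    return sum(tokens[i:i + n] == target_tokens for i in range(len(tokens) - n + 1))


def under_recycled(text: str, targets: list[str], min_encounters: int = 3) -> dict[str, int]:
    """Return the targets that occur fewer than min_encounters times, with counts.

    min_encounters is an illustrative default, not a research-based figure.
    """
    tokens = tokenize(text)
    counts = {target: count_occurrences(tokens, target) for target in targets}
    return {target: n for target, n in counts.items() if n < min_encounters}


draft = "Il y a un chat. Il y a un chien. Je veux un chat."
print(under_recycled(draft, ["il y a", "je veux", "chat"], min_encounters=2))
# {'je veux': 1}
```

Anything the function returns is a candidate for another sentence or two in the text before it goes in front of learners.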

Research by Webb and Nation (2017) shows that repeated exposure to vocabulary in meaningful contexts is one of the most effective ways to support retention and depth of processing.

5. Don’t overload beginners with rare words

Words like la myrtille (blueberry), la pastèque (watermelon), le homard (lobster), or la carotte râpée (grated carrot) can be fun, but they appear rarely in input and are hard to retain. Introducing them too early results in poor long-term retention and delays learners’ ability to engage with authentic language. Schmitt and Schmitt (2014) argue that L2 instruction should focus on the words that unlock the most language, both in terms of input and output. Lower-frequency words can be introduced gradually, once learners have developed a solid high-frequency base.

This aligns with the work of other eminent researchers (e.g. Waring & Nation, 1997; Laufer & Ravenhorst-Kalovski, 2010), who argue that introducing low-frequency vocabulary too early places an unnecessary burden on working memory and that such words are often forgotten without extensive reinforcement.

6. Base word selection on corpus-informed frequency lists

Tools like the BNC/COCA corpus, the CEFR-based frequency bands, or the NCELP lists offer an empirically grounded way to prioritise vocabulary. These can guide syllabus design and resource selection far more reliably than textbook-driven intuition. Frequency lists allow for systematic coverage of core vocabulary and ensure learners are exposed to words that are genuinely useful. They also help teachers maintain consistency and transparency in their planning.
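In practice, using a frequency list can be as simple as checking each candidate word in a topic list against its rank. The sketch below assumes you have already loaded a list into a dictionary mapping words to ranks. The function name, the cut-off of 2,000, and above all the ranks in the example are invented placeholders for illustration, not real corpus figures.

```python
def split_by_frequency(
    candidates: list[str],
    rank_of: dict[str, int],
    cutoff: int = 2000,
) -> tuple[list[str], list[str]]:
    """Split candidate words into (core, long_tail) by frequency rank.

    Words missing from the frequency list are treated as long-tail items.
    """
    core: list[str] = []
    long_tail: list[str] = []
    for word in candidates:
        rank = rank_of.get(word.lower())
        if rank is not None and rank <= cutoff:
            core.append(word)
        else:
            long_tail.append(word)
    return core, long_tail


# PLACEHOLDER ranks, invented for this example; substitute a real frequency list.
placeholder_ranks = {"vouloir": 50, "manger": 300, "poireau": 9000}
core, long_tail = split_by_frequency(
    ["vouloir", "manger", "poireau", "betterave"], placeholder_ranks
)
print("teach first:", core)
print("hold back or teach selectively:", long_tail)
```

The long-tail list is not a ban list. It simply flags the items that need a deliberate decision rather than automatic inclusion.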

7. Use SRS for rarer vocabulary

Low-frequency words (the long tail) need intentional retrieval practice. Spaced repetition systems (SRS) can help keep these words alive once learners have built a strong core lexicon. These tools help learners counteract the forgetting curve by revisiting difficult items at widening intervals, which builds long-term retention. For school settings, digital flashcards or quiz platforms can be used to integrate SRS into class time or homework routines.

As mentioned in several previous posts of mine, research by Bahrick (1984) and later Karpicke and Roediger (2008) provides strong evidence that spaced retrieval and repeated testing lead to superior long-term vocabulary retention compared to passive review.
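For the curious, the scheduling logic behind spaced repetition can be sketched in a few lines. What follows is a simple Leitner-style box system, which is only one of several SRS designs and not necessarily what any particular flashcard platform uses. The interval values and all names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical review intervals (in days) for boxes 0-4; tune to taste.
INTERVALS_DAYS = (1, 3, 7, 14, 30)


@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 0
    due: Optional[date] = None  # None means "never reviewed, so due now"


def review(card: Card, correct: bool, today: date) -> Card:
    """Promote the card one box if recalled correctly, otherwise reset it to box 0."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        card.box = 0
    card.due = today + timedelta(days=INTERVALS_DAYS[card.box])
    return card


def due_cards(cards: list[Card], today: date) -> list[Card]:
    """Return the cards that should be retrieved today."""
    return [card for card in cards if card.due is None or card.due <= today]


card = Card("le homard", "lobster")
review(card, correct=True, today=date(2024, 1, 1))
print(card.box, card.due)  # 1 2024-01-04
```

Each successful retrieval pushes the next review further into the future, while a failed one brings the word straight back into daily practice.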

An IMPORTANT Note of Caution: Motivation, Relevance, and the Long Tail

As useful as Zipf’s Law is, it should not be followed with blind dogmatism. A curriculum that teaches only the top 1,000 words may be efficient, but it can quickly become dry, demotivating, and disconnected from the learners’ lived experiences or interests. Not all high-frequency words are equally exciting to teenagers — and not all low-frequency words are useless. A Year 9 student may find quicksand more memorable than some, thing, or there.

In MFL, where student motivation is already fragile, it’s crucial to find a balance between utility and relevance. That means making room for high-interest, low-frequency words — particularly when they support personal expression, curiosity, or cultural engagement. The goal is not to ignore the long tail, but to introduce it strategically, in meaningful contexts, and after learners have built a functional core lexicon. In other words: start with the high-frequency core, but layer in personalised, memorable vocabulary to keep learners engaged.

As many of my readers will know, this is the approach we have taken on The Language Gym website and, of course, in the books.

Summary Table: Zipf’s Law and ISLA Vocabulary Teaching

Zipfian Principle | Pedagogical Implication
A few words dominate real usage | Prioritise the top 1,000–2,000 words
Most words are rare | Avoid low-frequency vocabulary early
Language is chunked | Teach high-frequency phrases and collocations
Input matters more than intention | Use rich, repetitive, input-based tasks
Frequency predicts retention | Teach with frequency in mind, not themes alone
Motivation matters | Include personally meaningful vocabulary selectively
Rare words are harder to acquire | Recycle intentionally or use SRS for long-tail words

Conclusion: Frequency Isn’t Everything — But It’s a Great Place to Start

Zipf’s Law offers a valuable corrective to decades of inefficient vocabulary teaching. It reminds us that words are not equally useful and that real-world communicative value should trump textbook convenience or topic neatness. For ISLA, where time and input are limited, frequency is a powerful filter for what to teach first.

But frequency is not everything. Learner interest, identity, and curiosity also matter. Vocabulary instruction, then, is not just a matter of statistical prioritisation — it’s a matter of pedagogical tact: knowing when to stick to the core and when to deviate in order to sustain motivation and spark genuine engagement.

The best teachers will do both: build fluency through high-frequency input, while allowing for low-frequency magic when the moment is right.

References

  • Bahrick, H. P. (1984). Semantic memory content in permastore: Fifty years of memory for Spanish learned in school. Journal of Experimental Psychology: General, 113(1), 1–29.
  • Boers, F., & Lindstromberg, S. (2008). Phraseology and language teaching. John Benjamins.
  • Daller, H., Milton, J., & Treffers-Daller, J. (2007). Modelling and assessing vocabulary knowledge. Cambridge University Press.
  • Karpicke, J. D., & Roediger, H. L. (2008). The critical importance of retrieval for learning. Science, 319(5865), 966–968.
  • Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners’ vocabulary size and reading comprehension. Reading in a Foreign Language, 22(1), 15–30.
  • Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge University Press.
  • Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? The Canadian Modern Language Review, 63(1), 59–82.
  • Schmitt, N., & Schmitt, D. (2014). A reassessment of frequency and vocabulary size in L2 vocabulary teaching. Language Teaching, 47(4), 484–503.
  • Waring, R., & Nation, P. (1997). Vocabulary size, text coverage and word lists. In Schmitt, N., & McCarthy, M. (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 6–19). Cambridge University Press.
  • Webb, S., & Nation, P. (2017). How vocabulary is learned. Oxford University Press.
  • Zipf, G. K. (1935). The psychobiology of language: An introduction to dynamic philology. Houghton Mifflin.