Why L2 Writing Feels So Hard (and What We Should Do About It) – A Cognitive Perspective

Introduction

I still remember marking a set of Year 10 writing tasks years ago and thinking, not without a certain irritation tinged with professional unease, “They understand this… so why can’t they write it?” They could recognise the language with little difficulty in aural tasks, manipulate it orally when supported, and even demonstrate decent control in tightly scaffolded activities; and yet, the moment they were asked to write independently, the output collapsed into short, brittle sentences, missing endings, hesitant word order, and a palpable sense of effort that went far beyond what the task seemed to warrant. Why? Why here? Why now?

What teachers often interpret as a lack of effort, resilience, or ambition in L2 writing is, in my experience, more accurately understood as a mismatch between task demands and cognitive capacity. For a long time, I nevertheless explained the problem away as a practice deficit, reassuring myself that what pupils needed was simply more writing, more exposure, more rehearsal.

The moment when that explanation began to feel intellectually untenable came when I started my PhD in the early 2000s and immersed myself in the L2 writing literature, particularly research examining the cognitive differences between writing in one’s first language and writing in an additional one. What became increasingly obvious to me was that L2 writing is not merely slower or less accurate L1 writing, but a qualitatively different cognitive activity, one that places radically different demands on working memory, attentional control, and linguistic retrieval, such that pedagogical approaches borrowed wholesale from L1 contexts are almost guaranteed to misfire.

In my opinion, unless pedagogy is designed with this difference explicitly in mind, persistent weakness in writing should surprise no one… and yet, how often do we still default to “just write more”? And why do we expect a different outcome?

1. Two languages are active simultaneously and compete for selection (Kroll, Bobb & Wodniecka, 2006)

When learners write in an L2, their L1 does not simply step aside and… wait its turn. Lexical, syntactic, and even pragmatic representations from both languages remain active and compete for selection, which means that hesitation, reformulation, and interference are not symptoms of poor learning or insufficient effort, but the predictable consequence of bilingual activation unfolding in real time. The brain, quite simply, is doing two things at once, juggling competing representations while attempting to maintain coherence. Should we really be surprised? Please note: this happens with more proficient L2 writers like myself too!

Pedagogical implications.
Implication 1: Allow planned use of L1 at the planning stage (ideas, notes), because content generation does not need to drain scarce L2 resources that will be required later for accurate encoding.
Implication 2: Teach contrastive chunks explicitly (L1 ≠ L2 structures), so that interference is anticipated and managed rather than discovered painfully through error.
Implication 3: Avoid banning L1 outright; manage interference instead, because in my experience such bans tend to increase cognitive friction and anxiety rather than promote fluency.

2. Grammar competes directly with idea generation (Skehan, 1998)

In L1 writing, grammatical encoding is largely automatised and therefore cognitively ‘cheap’, allowing writers to focus almost entirely on meaning, organisation, and nuance. In L2 writing, it is not. Formulation draws on the same limited working-memory resources as idea generation, so when grammatical decisions require conscious attention, something else must give way — and it is almost always content, complexity, or risk-taking. How could it be otherwise?

Pedagogical implications.
Implication 1: Separate content planning from language encoding in time and task design, so that pupils are not asked to generate ideas and encode unfamiliar language simultaneously.
Implication 2: Use sentence builders so grammar is effectively pre-loaded, reducing the processing burden at the point of writing.
Implication 3: Delay extended writing until forms are automatised, not merely “covered”, because exposure without control does not lead to fluency.

3. Working memory overloads quickly in L2 writing (Sweller, 1988)

As already implied above, one point that teachers consistently underestimate is how early working memory overload occurs in L2 writing. Add new vocabulary, unfamiliar grammar, new content, and time pressure, and the system saturates fast; performance does not decline gently but collapses, often in ways that look like carelessness yet are entirely predictable cognitive consequences.

Pedagogical implications.
Implication 1: Reduce task length deliberately (quality > quantity), recognising that fewer sentences written with control are more developmentally valuable than longer texts produced under strain.
Implication 2: Limit the amount of new language per writing task, so that attention can settle on consolidation rather than constant retrieval.
Implication 3: Scaffold heavily at first, then fade support gradually, because, as often reiterated on this blog, overload is not challenge; it is interference.

4. Pauses happen at morphology and function words (Spelman Miller, 2006)

Keystroke-logging studies show that L2 writers pause most frequently around verb endings, agreement, prepositions, and connectors, not around ideas or content generation. This finding, while awkward for certain assumptions about creativity and spontaneity, is deeply revealing of where cognitive effort is actually being spent. It was also one of the most interesting findings to emerge from the think-aloud protocols I staged with my informants during my PhD study. When asked about it, every single one of my students replied that they needed to think harder about these items, especially verb endings they had learnt by memorising verb tables (due to the TAP phenomenon) and prepositions (due to the differences between L1 and L2 usage).

Pedagogical implications.
Implication 1: Over-teach verb endings, agreements, and connectors as chunks, treating them as high-frequency building blocks rather than incidental details.
Implication 2: Practise micro-writing (one or two sentences) so that attention can focus on these pressure points without overwhelming the system.
Implication 3: Recycle the same structures across many tasks, relentlessly, because, as often reiterated on this blog, working-memory overload — not lack of ambition — is the real enemy.

5. Accuracy is prioritised over fluency under pressure (Ellis, 2009)

Under time pressure, even advanced L2 writers protect accuracy first and sacrifice fluency, shortening sentences, simplifying syntax, and avoiding risk. This is not a motivational issue, nor a lack of resilience; it is a rational response to finite cognitive resources. And yet… how often do we assess both as if they were the same thing?

Pedagogical implications.
Implication 1: Don’t time extended writing too early, especially before core structures are stable.
Implication 2: Use untimed drafting before timed exam practice, allowing control to consolidate before speed is imposed.
Implication 3: Assess fluency and accuracy separately, at least diagnostically, so that pupils are not penalised for unavoidable cognitive trade-offs.

6. L2 writers plan less and monitor locally (Hayes & Chenoweth, 2006)

This was the most obvious phenomenon I observed during my PhD research into L2 writers’ self-monitoring habits: unlike L1 writers, L2 writers often move sentence by sentence, monitoring locally rather than structuring ideas globally, because attentional resources are already stretched thin by encoding demands. Planning does not magically transfer… why would it, given the load?

Pedagogical implications.
Implication 1: Teach explicit planning frames (who / when / where / why) so that organisation is not left to chance.
Implication 2: Model planning aloud before writing, making invisible cognitive processes visible.
Implication 3: Use paragraph-level sentence starters until patterns are internalised, rather than withdrawing support prematurely.

7. L2 writing relies on effortful executive control (Bialystok, 1990)

Early L2 writing is governed not by creativity, but by executive control: inhibition, selection, monitoring, and constant checking. There is simply no spare capacity for “free expression” at this stage, however desirable it may seem pedagogically. This is not an argument against creativity, but an argument about timing.

Pedagogical implications.
Implication 1: Avoid “creative free writing” with novices, not because creativity is unimportant, but because the system is already overloaded.
Implication 2: Build automatisation through repetition of familiar language, recognising repetition as a cognitive necessity rather than a pedagogical failure.
Implication 3: Treat writing as skill-building, not self-expression (yet), postponing creativity until control is secure.

8. Under time pressure, learners regress (Robinson, 2001)

When under pressure – especially during high-stakes tests – learners retreat to safer grammar and simpler syntax, relying on what is most reliable rather than what is most ambitious. This regression is by design, not by weakness.

Pedagogical implications.
Implication 1: Train exam conditions gradually, rather than imposing them suddenly.
Implication 2: Practise speed on familiar language only, ensuring that pace does not come at the expense of accuracy.
Implication 3: Teach safe grammar strategies for exams, so pupils know what to fall back on when pressure rises.

9. Translation causes heavy cognitive interference (Kern, 1994)

Translating from L1 to L2 activates both systems simultaneously and forces alignment, creating interference that often makes the task cognitively heavier than composing directly in the L2. Counter-intuitive, perhaps, but well attested.

Pedagogical implications.
Implication 1: Avoid L1→L2 translation as a main writing task, particularly for extended output.
Implication 2: Prefer guided L2 composition, supported by models and chunks.
Implication 3: Use translation sparingly and diagnostically, to reveal interference patterns rather than generate text.

10. Chunks dramatically reduce cognitive load (Wray, 2002)

Here we go again, some of you will say! Conti’s obsession: chunks! However, chunks are not just a methodological preference of mine; they are a cognitive solution. Sequences such as “in my opinion”, “I think that”, “because I like”, “when I was younger”, “in the future I would like to”, “on the one hand… on the other hand”, or “it is important to” are processed as single units, freeing working memory and allowing attention to be redirected toward meaning. Why would we deny learners that advantage? As I am writing this article, right now, I can feel myself retrieving one chunk after another, sequencing them and moving them around, inserting connectives, adverbs and adjectives here and there.

Pedagogical implications.
Implication 1: Teach whole sentences, not isolated words, as the primary unit of instruction.
Implication 2: Recycle chunks across listening, reading, and writing until retrieval is fast and automatic.
Implication 3: Make chunk recall the core success criterion, as often reiterated on this blog, because availability rather than originality is what ultimately drives fluency.

11. Editing is not writing, and treating it as such is a mistake (Hayes, 1996)

One final cognitive distinction that is too often blurred in classroom practice is the difference between writing and editing. In L1 contexts, editing is often treated as a natural extension of composition: writers draft, reread, revise, and refine with relatively little additional cognitive cost. In L2 writing, however, editing constitutes a separate, highly demanding task, one that places additional strain on working memory. Why? Because it requires learners to reread text they have already struggled to produce while simultaneously evaluating form, meaning, and accuracy.

For many learners, this creates a perfect storm! By the time they reach the editing stage, cognitive resources are already depleted, which means that “editing” frequently degenerates into either superficial tinkering or indiscriminate rewriting, rather than targeted improvement. Errors that teachers expect pupils to notice remain invisible, not because pupils are careless, but because the act of noticing itself requires cognitive capacity that is no longer available.

Pedagogical implications.
Implication 1: Editing must be taught and sequenced as a distinct phase, not bolted on at the end of writing tasks, with clear limits on what pupils are expected to attend to (e.g. verb endings only, or agreement only).
Implication 2: Editing should be selective rather than comprehensive, focusing on a small number of high-frequency features, so that attention is not diffused across too many competing demands.
Implication 3: Editing routines should be highly scaffolded and repetitive, using checklists, models, and shared correction, until learners develop the procedural knowledge required to edit with some degree of independence.

Crucially, editing should not be treated as evidence of autonomy or maturity, but as another skill that needs to be explicitly taught, practised, and automatised. Without this, we risk mistaking cognitive overload for indifference, and missed errors for lack of effort.

Table 1: A Cognitive Taxonomy of Editing in L2 Writing

| Editing type | What it targets | Cognitive load | When it is viable | Classroom example | Main pedagogical risk if misused |
|---|---|---|---|---|---|
| 1. Surface accuracy editing | Verb endings, agreement, articles, high-frequency prepositions, spelling of familiar forms | Low–moderate (narrow attentional focus) | After highly scaffolded writing; with short texts; when target forms are already practised | “Check only past tense verb endings.” | Overload if combined with higher-level editing; pupils change nothing or everything |
| 2. Lexical precision editing | Word choice, replacing vague words, retrieval of taught chunks | Moderate | Once a core lexical repertoire is secure; with models available; limited alternatives | “Replace ‘I like’ with one practised alternative chunk.” | Slips into creative rewriting rather than editing |
| 3. Morphosyntactic restructuring | Sentence structure, word order, subordination already taught | High (partial re-encoding) | After sentence-level automatisation; with sentence builders; one sentence at a time | “Rewrite sentence 3 using ‘because’.” | Accuracy collapses; gains made earlier are lost |
| 4. Discourse & organisation editing | Paragraph order, logical sequencing, basic connectors | Very high (global monitoring) | With very short texts; higher proficiency; explicit planning frames | “Check each paragraph answers one bullet from the plan.” | Form accuracy deteriorates rapidly |
| 5. Stylistic editing | Register, tone, variety, expressiveness | Extremely high | Very late in development; with highly familiar language only | Rarely appropriate below advanced level | Competes with all other processes; derails learning focus |

12. Conclusion

Across these strands of research, a pattern emerges with uncomfortable consistency: pupils struggle with L2 writing not because they lack ideas or resilience, but because the task routinely exceeds their cognitive capacity. Why would they keep taking risks in such conditions?

An approach such as Extensive Processing Instruction, with its emphasis on rich input, structured processing, chunking, and delayed output, aligns naturally with what cognitive research tells us about how writing develops. In practical terms, this means a curriculum in which rich input, repeated processing, and controlled output precede extended writing, rather than the other way around. In my experience, writing improves not because pupils are pushed harder, but because the task is redesigned to fit how cognition actually works… and once you see that, it is very hard to unsee.

Table 2: Summary table

| What the brain does in L2 writing | Pedagogical implication 1 | Pedagogical implication 2 | Pedagogical implication 3 |
|---|---|---|---|
| 1. L1 and L2 are both active during L2 writing | Allow planned use of L1 at the planning stage (ideas, notes) | Teach contrastive chunks (L1 ≠ L2 structures) explicitly | Avoid banning L1: manage interference instead |
| 2. Grammar competes with idea generation | Separate content planning from language encoding | Use sentence builders so grammar is pre-loaded | Delay extended writing until forms are automatised |
| 3. Working memory overloads quickly in L2 | Reduce task length (quality > quantity) | Limit new language per writing task | Scaffold heavily, then fade support gradually |
| 4. Pauses happen at morphology and function words | Over-teach verb endings, agreements, connectors as chunks | Practise micro-writing (1–2 sentences) | Recycle the same structures across many tasks |
| 5. Accuracy is prioritised over fluency under pressure | Don’t time extended writing too early | Use untimed drafting before timed exam practice | Assess fluency and accuracy separately |
| 6. L2 writers plan less and monitor locally | Teach explicit planning frames (who / when / where / why) | Model planning aloud before writing | Use paragraph-level sentence starters |
| 7. L2 writing relies on effortful executive control | Avoid “creative free writing” with novices | Build automatisation through repetition | Treat writing as skill-building, not self-expression |
| 8. Under time pressure, learners regress | Train exam conditions gradually | Practise speed on familiar language only | Teach “safe grammar” strategies for exams |
| 9. Translation causes heavy cognitive interference | Avoid L1→L2 translation as a main writing task | Prefer guided L2 composition | Use translation sparingly and diagnostically |
| 10. Chunks dramatically reduce cognitive load | Teach whole sentences, not isolated words | Recycle chunks across listening, reading, writing | Make chunk recall the core success criterion |

References

Bialystok, E. (1990). Communication strategies: A psychological analysis of second-language use. Oxford: Blackwell.

Ellis, R. (2009). The differential effects of task planning on fluency, complexity and accuracy. Applied Linguistics, 30(4), 474–509.

Flower, L., & Hayes, J. R. (1981). A cognitive process theory of writing. College Composition and Communication, 32(4), 365–387.

Hayes, J. R. (1996). A new framework for understanding cognition and affect in writing. In C. M. Levy & S. Ransdell (Eds.), The science of writing (pp. 1–27). Mahwah, NJ: Lawrence Erlbaum.

Hayes, J. R., & Chenoweth, N. A. (2006). Is working memory involved in the transcribing and editing of texts? Written Communication, 23(2), 135–149.

Kellogg, R. T. (1996). A model of working memory in writing. In C. M. Levy & S. Ransdell (Eds.), The science of writing (pp. 57–71). Mahwah, NJ: Lawrence Erlbaum.

Kern, R. G. (1994). The role of mental translation in second language reading. Studies in Second Language Acquisition, 16(4), 441–461.

Kroll, J. F., Bobb, S. C., & Wodniecka, Z. (2006). Language selectivity in bilingual speech. Bilingualism: Language and Cognition, 9(2), 119–135.

Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.

Manchón, R. M. (Ed.). (2011). Learning-to-write and writing-to-learn in an additional language. Amsterdam: John Benjamins.

Nation, I. S. P. (2013). Learning vocabulary in another language (2nd ed.). Cambridge: Cambridge University Press.

Robinson, P. (2001). Task complexity and language production. Applied Linguistics, 22(1), 27–57.

Roca de Larios, J., Manchón, R. M., & Murphy, L. (2008). Strategic behaviour in L2 writing processes. Journal of Second Language Writing, 17(1), 30–47.

Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.

Spelman Miller, K. (2006). The pause phenomenon in second language writing. In K. P. H. Sullivan & E. Lindgren (Eds.), Computer keystroke logging and writing (pp. 11–30). Amsterdam: Elsevier.

Sweller, J. (1988). Cognitive load during problem solving. Cognitive Science, 12(2), 257–285.

VanPatten, B. (2015). Input processing in second language acquisition. New York: Routledge.

Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University Press.

Common Reasons Why GCSE MFL Writing Underperforms (and how to fix them through concrete classroom routines)

Introduction

After nearly three decades of teaching, examining, observing lessons, reading scripts, delivering CPD, and, perhaps most importantly, sitting next to pupils as they struggled to produce writing under exam conditions or as part of research projects, I have come to a fairly consistent conclusion: GCSE MFL writing underperformance is rarely about motivation or even knowledge.

In my experience, it is about poorly trained processes.

What follows is not a critique of effort but a set of practical failure points, each paired with specific, teachable solutions that I have used myself or seen work repeatedly in real classrooms. And if some of this feels uncomfortably familiar, that is precisely the point: quid est enim veritas?

1. Inadequate attention to bullet-point coverage (sine qua non)

The single most common reason students fail to access higher content bands is simple: they do not fully address the bullet points! In my observation, students often believe that if they write enough, or sound fluent enough, the examiner will infer that they have “covered” the task. But since when has implication ever been rewarded in a mark scheme? It hasn’t — and it won’t! This aligns closely with what examiner reports repeatedly flag and with assessment research showing that explicit task fulfilment is a primary determinant of score outcomes (e.g. Shaw & Weir, 2007).

Bullet points are marking hooks, not suggestions, sine qua non.

Concrete classroom fixes:

  • Teach a mandatory bullet routine: underline, micro-plan, tick off.
  • Build bullet audits into marking.
  • Run bullet-only drills with no extended writing.

Is it glamorous? No. Is it transformative? Absolutely.

2. Writing practice that does not transfer across prompts (mutatis mutandis)

Many students perform quite well on familiar task types but falter when the exam changes the angle slightly. Why does this happen, despite “lots of practice”? In my experience, because students practise topic content, not transferable structures. This is a textbook example of poor transfer, a phenomenon long documented in educational psychology (Perkins & Salomon, 1988), which I have written about extensively on this blog (see my posts on TAP). Students know what to say, but not how to adapt it, mutatis mutandis.

Concrete classroom fixes:

  • Recycle key structures across topics.
  • Vary prompts relentlessly while holding form constant.
  • Rewrite answers to new questions using the same frames.

After all, if the exam changes the rules slightly, shouldn’t our preparation anticipate that?

3. Insufficient automatisation of core grammar (festina lente)

In my opinion, grammar remains the main bottleneck in GCSE writing — not because students have never been taught it, but because it has not been automatised.

Under exam pressure, grammar that is not routinised collapses, and collapses fast. The lesson here is old but enduring: festina lente. This mirrors findings from skill acquisition theory, which consistently show that declarative knowledge does not reliably convert into performance without extensive procedural practice (DeKeyser, 2007).

Concrete classroom fixes:

  • Define a high-frequency grammar core.
  • Practise it daily through retrieval.
  • Mark it selectively.
  • Enforce grammar self-check routines.

Slow, secure mastery beats rushed coverage every time, does it not?

4. Excessive linguistic ambition without sufficient control (primum non nocere)

Students are often encouraged to “include complex language” without really being taught how to control it. This well-intentioned advice does more harm than good. Cognitive Load Theory explains why: adding complexity before automatisation overwhelms working memory and degrades performance (Sweller, Ayres & Kalyuga, 2011). Why reward risk when risk predictably lowers accuracy?

Concrete classroom fixes:

  • Teach tiered ambition explicitly: a secure core plus one controlled stretch.
  • Require students to choose their stretch in advance (e.g. tense shift or subordinate clause, not both).
  • Judge success by execution, not aspiration: inaccurate complexity scores lower than accurate simplicity.
  • Build stretch rehearsal into lessons: practise the same upgrade repeatedly across different topics until it becomes automatic.
  • Use post-task self-evaluation: students identify whether their chosen stretch was accurate and, if not, rewrite only that sentence.

Complexity is not bravery. It is control.

5. Reliance on word-for-word translation during writing (cui bono?)

One of the most damaging habits students develop is writing in English first and translating as they go.

It feels logical — but cui bono? Who actually benefits from this cognitively punishing process? Certainly not the student under exam pressure! Research on L2 writing repeatedly shows that heavy reliance on L1 translation increases cognitive load and error rates, particularly for less proficient writers (Kormos, 2012). So, whilst translation is an effective way to scaffold more creative writing in that it promotes retrieval practice and preps the students for the translation tasks in the GCSE exams, it shouldn’t be considered as the main way to build writing proficiency.

Concrete classroom fixes:

  • Ban sentence-level English planning.
  • Plan in bullet notes or L2 cues.
  • Model assembly writing.
  • Restrict dictionary use.
  • Reward simplification.

When students stop translating, accuracy rises, almost embarrassingly quickly.

6. Limited development of sentence complexity (gradatim)

Many students produce grammatically correct but simplistic writing because they have never been taught how to combine sentences.

Why would complexity appear spontaneously, without instruction? It doesn’t. It is built gradatim. Writing research shows that sentence-combining instruction is one of the most effective ways to improve syntactic maturity and control (Graham & Perin, 2007). Over the years I have published quite a few free resources on the TES platform which involve sentence recombining.

Concrete classroom fixes:

  • Teach sentence combining explicitly.
  • Practise orally before writing.
  • Limit complexity focus per task.

Step by step beats leap and collapse.

7. Insufficient practice under timed conditions (tempus fugit)

Students often practise writing slowly and then panic in the exam.

Is this surprising? Not at all: tempus fugit, and writing is a speeded performance. Research on writing fluency and assessment performance shows that time pressure fundamentally alters output quality unless fluency has been trained (Hayes & Flower, 1980; revisited in Hayes, 2012).

Concrete classroom fixes:

  • Weekly micro-timed writes.
  • Minimal success criteria.
  • Untimed vs timed comparisons.

Confidence comes from familiarity, not reassurance.

8. Writing practice without explicit strategy instruction (ars docendi)

Writing is often treated as an output activity rather than a taught process.

But teaching is an art (ars docendi), and processes must be made visible. Decades of research on writing instruction confirm that explicit modelling and strategy instruction outperform unguided practice (Graham, Harris & Santangelo, 2015).

Concrete classroom fixes:

  • Model thinking aloud.
  • Use joint construction.
  • Build reflection into every task.

Students cannot imitate what they have never seen.

9. Overly detailed or unfocused feedback (multum non multa)

Teachers often mark too much and students fix too little.

In my experience, less is more: multum non multa. Feedback research consistently shows that focused, actionable feedback has greater impact than extensive commentary (Hattie & Timperley, 2007).

Concrete classroom fixes:

  • Two targets only.
  • Mandatory short rewrite.
  • Whole-class feedback for patterns.

Feedback should move learning forward, not exhaust goodwill.

10. Lack of an explicit proofreading routine (ultima ratio)

Many marks are lost to errors students could correct themselves. Proofreading is the last line of defence (ultima ratio), yet it is rarely trained! Research on self-regulation in writing shows that explicit editing routines significantly improve accuracy when automatised (Zimmerman & Risemberg, 1997). I have written extensively about how to foster self-monitoring in the MFL classroom based on my PhD research and classroom practice.

Concrete classroom fixes:

  • Error-hunt classroom routines focusing on common errors.
  • Fixed checking order.
  • Guided proofreading on models.
  • Visible evidence of checking.
  • Explicit training in self-monitoring strategies (see my dedicated post on this).

If it matters in the mark scheme, it deserves rehearsal.

Conclusion (ex usu)

In my opinion, GCSE MFL writing improves not through more content, but through better processes.

Everything outlined here comes directly from classroom experience — mine and others’ — and reflects the principles underpinning the EPI approach and much of what I share in CPD sessions. None of it is theoretical. All of it is teachable. And all of it works — ex usu.

When we stop hoping students will “just get better” and instead train the processes that writing actually requires, outcomes improve: quietly, steadily, and predictably.

And really, what more could we reasonably ask for?

How many words can we really teach in one lesson?

Introduction

In my experience, one of the most persistent myths in language education is that vocabulary growth comes from introducing lots of new words quickly. Research, however, tells a very different story. Vocabulary learning is slow, cumulative, and constrained by cognitive limits, especially when it comes to working memory and processing speed. These limits differ markedly between primary and secondary learners, which means the “right” number of words per lesson is not the same across phases.

What often goes missing from this discussion, however, is how vocabulary is taught. Much research suggests that teaching words in isolation and teaching them as chunks or multi-word units place very different demands on the brain, and this has important implications for how much learners can realistically handle.

A necessary caution: what does it mean to “learn” a word?

Before addressing how many words can be taught in a lesson, I believe it is important to clarify what learning a word actually entails. Vocabulary knowledge is not an all-or-nothing phenomenon. Research consistently shows that knowing a word involves multiple dimensions: recognising its spoken and written form, understanding its meaning, knowing how it behaves grammatically, and being able to retrieve and use it appropriately (Nation, 2001; Schmitt, 2008). In classroom terms, this means that many words students encounter in a lesson may be noticed or partially understood without being fully learned or retained. The figures discussed in this chapter therefore refer to words or chunks that can realistically be taught for durable learning, not merely encountered or temporarily recognised.

Table 0. The dimensions of word knowledge

| Dimension | What it involves | Classroom implications |
|---|---|---|
| Spoken form (phonological) | Recognising and producing the word’s sounds accurately | Learners may know a word in writing but fail to recognise it in listening |
| Written form (orthographic) | Recognising and spelling the word correctly | Spelling knowledge can support memory, especially at secondary level |
| Meaning (semantic) | Understanding what the word refers to | Meaning is often partial at first and becomes more precise over time |
| Form–meaning connection | Linking the sound/spelling to the correct meaning | This link is fragile in early learning and easily breaks under time pressure |
| Conceptual knowledge | Understanding the concept behind the word | Abstract or culturally unfamiliar concepts are harder to learn |
| Grammatical behaviour | Knowing the word’s part of speech and how it behaves grammatically | Includes gender, agreement, verb patterns, count/uncount status |
| Collocations | Knowing which words typically occur with it | Crucial for fluency and naturalness (e.g. make a mistake, not do) |
| Formulaic use / chunks | Knowing how the word functions inside common phrases | Supports faster processing and listening comprehension |
| Register | Knowing whether the word is formal, informal, slang, etc. | Prevents inappropriate usage in speaking and writing |
| Frequency | Knowing how common the word is | High-frequency words deserve more classroom time |
| Associations | Knowing related words (synonyms, antonyms, semantic fields) | Supports lexical networks and faster retrieval |
| Pragmatic use | Knowing when and why the word is used | Includes politeness, social norms, and discourse function |
| Receptive knowledge | Understanding the word when heard or read | Usually develops before productive knowledge |
| Productive knowledge | Being able to use the word accurately | Requires more practice and stronger memory traces |
| Automaticity | Retrieving the word quickly under pressure | Essential for fluent listening and speaking |

L2 primary learners (approx. ages 5–11)

I have taught primary learners between the ages of 7 and 10 for 18 years, and one thing that never ceased to surprise me was how quickly they forgot new words without constant revision! This is because young learners face particularly strong cognitive constraints when learning vocabulary in an additional language. Working memory capacity is limited, attentional control is still developing, and phonological representations in the L2 are fragile and slow to stabilise. In addition, primary learners often have limited literacy skills in both their first language and the target language, which reduces their ability to use orthography as a support. As a result, vocabulary learning at this stage is highly incremental and depends heavily on repetition, salience, and recycling across time.

Table 1. Research findings: vocabulary learning in L2 primary learners

Research | Key findings relevant to “words per lesson”
Cameron (2001) | Vocabulary learning in young learners is gradual and fragile; introducing too many new words at once leads to shallow learning and rapid forgetting
Nation (2001) | Small numbers of new words should be taught explicitly, with repeated encounters over time; depth of processing matters more than quantity
Gathercole & Alloway (2008) | Children’s working memory capacity is very limited, strongly constraining how many unfamiliar items can be processed simultaneously
Pinter (2017) | Young learners benefit most when new vocabulary is embedded in familiar routines and recycled frequently
Kersten et al. (2010) | Vocabulary uptake improves when lexical load is low and exposure is distributed over time

Table 2. Studies informing how many words can be taught per lesson (Primary)

Study | Learners | Implication for words per lesson
Nation (2001) | Primary & early L2 learners | Around 3–5 new items can be taught effectively when recycling is built in
Cameron (2001) | Primary L2 learners | Fewer than 5 items per lesson supports retention
Gathercole & Alloway (2008) | Children | Working memory limits suggest very small lexical loads
Kersten et al. (2010) | Young L2 learners | Learning improves when lessons focus on few items, frequently recycled
Pinter (2017) | Primary learners | Depth over breadth; typically 3–4 items per lesson

What changes when words are taught in chunks?

In my experience, when vocabulary is taught as formulaic chunks (e.g. I like football, on the table, there is a dog), words are retained better by younger learners. One can also expose them to more words this way, as the brain does not treat each word as a separate unit. Instead, the entire sequence can be processed as a single cognitive chunk.

Psycholinguistic research shows that:

  • working memory operates on chunks rather than individual words (Miller, 1956; Cowan, 2001)
  • frequently occurring multi-word sequences are stored and retrieved holistically (Wray, 2002; Ellis, 2003)
  • chunking reduces the need for online grammatical computation, freeing cognitive resources for meaning (Ellis, 1996; Nation, 2013)

For primary learners, this is particularly important. Because attentional resources are limited and processing is slow, treating a phrase as one unit allows learners to engage with meaningful language without having to assemble it word by word.

In sum, while primary learners can typically only learn around 3–5 new items per lesson, those items can be multi-word expressions rather than isolated words. Chunking does not increase memory capacity, but it significantly increases the amount of functional language that can be processed and retained.

How this translates into KS2 practice (Years 3–6)

Based on the research above, and taking into account developmental changes in working memory, phonological automatisation, and classroom listening demands, the following ranges are realistic teaching targets, not exposure limits.

Table 3. Recommended teachable vocabulary load per lesson (KS2)

Year group | New items per lesson (taught for retention) | Notes
Year 3 | 2–3 items | Strong reliance on chunks; heavy recycling essential; listening load must be very light
Year 4 | 3–4 items | Chunks preferred; begin gentle variation within familiar frames
Year 5 | 4–5 items | Mix of chunks and high-frequency single words; listening tasks still limit capacity
Year 6 | 5 items (occasionally 6) | Greater tolerance for analysis, but chunking remains more efficient than isolation

These figures assume that items are recycled across lessons and revisited in multiple modalities. Teaching more items in a single lesson does not increase long-term retention.

L2 secondary learners (approx. ages 11–16)

Secondary learners obviously benefit from several cognitive and experiential advantages compared with their primary counterparts. First, working memory capacity is greater, especially at 16, when it reaches adult-like levels. Second, attentional control is more stable and learners are better able to analyse language explicitly. They also tend to have more developed literacy skills, allowing them to use spelling and morphology to support retention. As a result, vocabulary learning becomes more efficient, although it remains constrained by time pressure and real-time processing demands, particularly in listening.

Table 4. Research findings: vocabulary learning in L2 secondary learners

Research | Key findings relevant to “words per lesson”
Nation (2001) | Vocabulary acquisition is cumulative; teaching too many items at once reduces retention
Hulstijn (2001) | Intentional vocabulary learning is effective only when cognitive load is manageable
Schmitt (2008) | Knowing a word involves multiple dimensions, requiring repeated encounters
Field (2008) | Lexical overload impairs listening comprehension; fewer new items improve decoding
Vandergrift & Goh (2012) | Lexical familiarity is a strong predictor of listening success

Table 5. Studies informing how many words can be taught per lesson (Secondary)

Study | Learners | Implication for words per lesson
Nation (2001) | Adolescent L2 learners | Typically 6–10 new items per lesson with recycling
Hulstijn (2001) | Secondary learners | More than 10 items overloads processing
Schmitt (2008) | Secondary & adult learners | Learning requires multiple encounters; limits effective intake
Field (2008) | Secondary L2 listeners | Listening lessons should stay toward lower end (6–8 items)
Vandergrift & Goh (2012) | Secondary learners | Lexical familiarity constrains how many items can be processed

What changes when words are taught in chunks?

At secondary level, chunking supports processing efficiency and fluency rather than basic capacity expansion. Research shows that formulaic sequences are retrieved faster than novel combinations (Pawley & Syder, 1983; Conklin & Schmitt, 2008) and reduce the cognitive cost of real-time comprehension.

To sum up, while secondary learners can typically learn 6–10 new items per lesson, teaching these items as chunks allows teachers to expose learners to a far greater volume of language without increasing cognitive overload.

Teaching vs exposure: revisited through chunking

The distinction between teaching and exposure becomes clearer when chunking is considered.

  • Teaching isolated words often leads to fragmented knowledge
  • Teaching chunks supports immediate comprehension and production
  • Exposure to many words inside a small number of chunks is cognitively efficient

Chunking therefore allows teachers to teach fewer items while delivering richer input.

Pros and cons of teaching words in isolation

Teaching vocabulary in isolation is not inherently wrong, but it has specific strengths and limitations.

Advantages

  • supports semantic precision
  • useful for low-frequency or content-specific nouns
  • facilitates dictionary skills and explicit form–meaning mapping
  • easier to assess in short written tasks

Limitations

  • high cognitive load during listening
  • weak support for fluency and real-time processing
  • encourages word-by-word decoding
  • delays access to functional language use

Isolated-word teaching is most effective when it is limited in quantity and quickly integrated into phrases or chunks.

When the words are cognates

Cognates occupy a special position in vocabulary learning. Because they share form and meaning with words in the learner’s first language, they place a much lighter burden on working memory and phonological decoding.

When teaching cognates:

  • learners can often process more items per lesson
  • sound–meaning mapping is faster
  • retention is generally higher

In practical terms, lessons focusing on transparent cognates may safely exceed the usual word-count limits, provided pronunciation differences are explicitly addressed to avoid fossilisation.

Factors affecting the learnability of words

Before considering how many words to introduce in a lesson, it is essential to recognise that not all words are equally learnable. Learnability refers to the extent to which a lexical item can be easily noticed, processed, stored, and retrieved by learners. Cognitive factors such as phonological complexity and length interact with experiential factors like frequency, transparency, and conceptual familiarity. Pedagogically, this means that raw word counts are misleading unless we also consider what kinds of words are being taught.

Table 6. Factors influencing how easily words are learned

Factor | Effect on learnability
Frequency | High-frequency words are learned faster
Phonological simplicity | Simple, familiar sound patterns are easier to retain
Transparency / cognacy | Cognates reduce cognitive load
Imageability | Concrete words are easier than abstract ones
Morphological regularity | Regular forms are easier to generalise
Length | Shorter words and chunks are easier to process
Contextual support | Rich context aids retention
Prior knowledge | Familiar concepts are learned more easily

Learnability directly affects how many words can be taught in a lesson. Highly learnable items allow for slightly higher word counts; low-learnability items sharply reduce capacity. Effective planning therefore requires managing both quantity and quality of vocabulary.

Why listening lowers the threshold (even with chunks)

Listening remains demanding because learners must decode sounds, segment speech, and hold information in working memory under time pressure. Chunking, of course, reduces these demands but does not remove them, which is why listening-heavy lessons should operate at the lower end of recommended word counts.

Conclusion

Vocabulary learning is governed not by ambition but by cognition. Across both primary and secondary phases, learners can only process and retain a limited number of new items in a single lesson. Teaching vocabulary in chunks does not change these limits, but it allows each item to carry more meaning, structure, and communicative value. Effective curricula therefore prioritise fewer items, taught more deeply, recycled more often, and embedded in meaningful input over time.

Chunking does not allow us to teach more words — it allows us to teach language more effectively.

Teaching MFL to SEN Pupils: Ten Things That Really Matter (And why most materials still get them wrong)

Introduction

If I have learnt one thing throughout 28 years of teaching, it is that there is no such thing as “teaching SEN pupils” in the abstract. SEN profiles are so diverse, messy, and contradictory! However, research from cognitive psychology, SLA, and special education converges on a clear set of principles that consistently make MFL more accessible to learners with additional needs.

What follows are ten key things to bear in mind when teaching languages to SEN students, grounded in research and translated into classroom practice and materials design. Rest assured that none of these are gimmicks. Most are uncomfortable because they require us to slow down, simplify, and rethink what we mean by “progress”.

Finally, do note that I have included in this post a section dedicated solely to teaching dyslexic children, as up to around 10% of the UK population are estimated to have dyslexia, meaning that a significant proportion of MFL learners may struggle with reading, processing and recall in ways that traditional materials and assessments do not adequately support.

1. SEN pupils struggle more with retrieval than understanding

One of the most persistent myths in MFL – one that I always try to debunk in my posts – is that if a pupil cannot produce language, they do not “know” it. Research on working memory and retrieval (Gathercole & Alloway; Hulstijn) shows that recognition and recall are different cognitive processes.

Classroom implication

A pupil who can match la piscine to a picture but cannot spell or say it is not failing! They are operating at a different stage of acquisition.

Materials implication

Design tasks that separate recognition from production:

  • matching before recall
  • word banks before blank pages
  • partial dictation before full dictation

If your material jumps straight from exposure to free writing, SEN pupils fall off a cliff. In my approach (EPI) this translates into having a robust Receptive Processing phase (the first ‘R’ in the MARSEARS framework).

2. Cognitive load kills learning faster than lack of ability

Cognitive Load Theory (Sweller) is, of course, brutally relevant to SEN learners. Overloaded working memory doesn’t result in slower learning — it results in no learning at all.

Classroom implication

If a task requires pupils to simultaneously:

  • read instructions
  • decode new vocabulary
  • apply grammar rules consciously step by step
  • and write accurately

…you are not teaching language; you are testing executive function.

Materials implication

Reduce load by design:

  • one linguistic focus per task
  • minimal text per page
  • predictable task formats (this is key! Don’t be afraid to be repetitive)
  • visual consistency
  • 98% comprehensible input (note: 98% comprehensible input does not mean simplified content, but content made accessible through scaffolding, repetition, and chunking)

A “busy” worksheet is often inaccessible before the pupil even starts!

3. Listening must be the engine, not the afterthought

Many SEN pupils (especially dyslexic learners) process language far more effectively through sound than print. SLA research consistently supports the primacy of input — yet textbooks still privilege reading and writing – and not the accessible sort either!

Classroom implication

Listening should dominate early sequences:

  • teacher modelling
  • choral repetition
  • narrow listening
  • listening with purpose, not just “play and answer”

Materials implication

Design materials where:

  • listening precedes reading
  • texts are short, repeated, and recycled (e.g. the EPI’s narrow listening)
  • audio is exploited multiple times in different ways (e.g. the EPI’s thorough processing techniques)

If listening is just “activity 3”, SEN pupils are already excluded.

4. Patterns must be made visible

SEN pupils are less likely to infer grammatical patterns implicitly. This is not laziness; it’s a cognitive difference. Research on explicit instruction (Norris & Ortega; Spada & Tomita) shows that guided noticing matters. However, research also shows that complicated grammatical explanations are less accessible to students with a lower IQ or less developed executive function.

Classroom implication

Do not assume pupils will “pick it up”. Show them:

  • colour-coded structures
  • sentence frames (e.g. EPI’s sentence builders)
  • chunked patterns

Materials implication

Avoid presenting:

  • vocabulary lists without structure
  • grammar rules without exemplars

Instead, design lexico-grammatical chunks:

voy a + infinitive
me gusta + noun

Patterns first. Labels later. The greatest applied linguists on the planet agree that teaching chunks should come first and explicit grammar explanation should come later: Ellis & Shintani (2014), N. Ellis (2015), Nation (2013), VanPatten (2015), Webb & Nation (2017), etc. In EPI, this is reflected in a short and snappy Awareness-raising phase (the first A in MARSEARS) immediately after the initial modelling through sentence builders and visual aids, followed by a more robust explicit grammar teaching phase (the E in MARSEARS) after three or four lessons of receptive and productive retrieval of the target chunks.

5. SEN pupils need overlearning, not coverage

Forgetting curves are much steeper for many SEN learners. What looks like “they’ve done this already” is often “they’ve seen it once”.

Classroom implication

Hence, recycling is not revision; it is core instruction. If the average child requires

Materials implication

Good SEN-friendly materials:

  • reuse the same language across lessons
  • vary tasks, not language
  • return to the same chunks in different contexts

If your scheme introduces new language every lesson without revisiting old material, SEN pupils are permanently behind. This is one of the biggest shortcomings of the textbooks currently in use in most UK schools, e.g. Stimmt, Viva, Mira, Dynamo, Tricolore, Studio. Possibly the worst ones are the recently published textbooks based on the new GCSE, often through no fault of the authors, in my opinion, who are constrained by the number of pages set by the publishers and by the ridiculously high volume of content they need to cover.

Table 1 – Encounters with a lexical item required by learners of different abilities (from those of average ability to those with severe SEN) to develop a BASIC knowledge of it

Key clarification (important)

  • These encounters must be meaningful, not just visual exposure
  • Repetitions work best when they are:
    • spaced (not crammed)
    • multimodal (listening, reading, speaking, matching)
    • embedded in chunks, not isolated words

6. Independence must be earned, not demanded

Textbooks often assume pupils can work independently after one model despite decades of research suggesting otherwise! Sociocultural theory argues that learning happens most reliably in the Zone of Proximal Development—i.e., when pupils can succeed with structured guidance that is then gradually withdrawn (Vygotsky, 1978). Reviews of scaffolding research emphasise that effective support is not “help for the weak”, but a deliberate design feature that enables learners to process language they only partially control and to internalise procedures over time (Malik, 2017; Ertugruloglu, 2023). In the UK MFL context, the Teaching Schools Council’s MFL Pedagogy Review also warns—implicitly for exactly this reason—that textbooks should be chosen for how well they support planned teaching of vocabulary/grammar/phonics and should often be supplemented rather than relied on as the sole engine of learning, because many published materials don’t provide enough structured practice and guided attention to detail for all learners to access them independently. Read this article if you want to know more on this topic: https://www.tandfonline.com/doi/full/10.1080/2331186X.2017.1331533

Classroom implication

Remove scaffolds gradually, not suddenly. This is what we do in EPI, where students arrive at production only after a highly structured journey from input to output. In the Receptive phase, that journey moves gradually from receptive retrieval at sentence level to more challenging work with connected text; in the Structured Production phase, it scaffolds the progression from easier productive retrieval at sentence level (e.g. Oral Ping-pong) to harder information-gap tasks (e.g. ‘Back-to-back’ or ‘Ask the experts’).

Materials implication

Build tasks that move from:

  • full support → partial support → no support

Not:

  • support → (nearly) nothing

For SEN learners, the “blank page” is often the point of collapse. This is another major pitfall of currently available textbooks, even when a dedicated lower-ability version of the textbook exists. The scaffolding is so poor that the Listening and Reading activities are not logically linked with the ensuing Speaking and Writing activities! Bizarre, of course, as the former are meant to scaffold the latter. Hence, do ensure that, as we do in EPI, the receptive activities are carefully designed and implemented so that speaking and writing skills emerge seamlessly and organically from the listening and reading activities staged at the beginning of your instructional sequences.

7. Writing is the hardest output — treat it as such

Writing combines:

  • recall
  • spelling
  • grammar
  • motor skills
  • working memory

For SEN pupils, this is the highest-load skill. Even higher than speaking! The typical textbook expects students to read one or two texts, do a reading comprehension task or two on each and then write something similar. This is not going to help the average learner, let alone an SEN child!

Classroom implication

Do not use writing as your default proof of learning. This is one of the most commonly made mistakes with SEN pupils. Do plenty of scaffolding (see the previous point)! Give them highly structured, 100% feasible output.

Materials implication

Before extended writing, include:

  • sentence completion
  • sentence manipulation
  • ordering tasks
  • easy sentence-puzzle games
  • copying with attention

If the first time pupils write independently is for assessment, you’ve set them up to fail. Delay writing assessment with SEN pupils as much as humanly possible.

8. Pace matters more than enthusiasm

Fast-paced lessons are often praised — but for SEN pupils, speed frequently equals panic.

Classroom implication

Calm, predictable pacing reduces anxiety and improves retention.

Materials implication

Design sequences with:

  • repeated task types
  • familiar routines
  • clear expectations

Surprise is motivating for some pupils; it is destabilising for others.

9. Differentiation should be built in, not bolted on

SEN pupils should not always be working on “the easier sheet”. Research on inclusive design stresses universal design for learning.

Classroom implication

Design tasks with multiple entry points, not multiple worksheets.

Materials implication

A good task allows:

  • all pupils to start
  • some to go further
  • no one to be exposed as “different”

Ramped difficulty beats personalised worksheets every time.

10. Progress for SEN pupils is often invisible unless you know where to look

Traditional assessments privilege speed, accuracy, and written output. SEN progress often shows up first in:

  • faster recognition
  • reduced hesitation
  • improved pronunciation
  • willingness to attempt

Classroom implication

If you only value what you can mark, you will miss most progress.

Materials implication

Include low-stakes checks:

  • oral responses
  • mini whiteboards
  • matching and sorting tasks
  • listening discrimination

These reveal learning long before writing does.

Specific advice for teaching MFL to dyslexic children

Table 2 – Teaching strategies specifically aimed at dyslexic children

Teacher Strategy | Why this Matters Specifically for Dyslexic Learners | Research Basis
Explicit teaching of sound–spelling correspondences (as we do in EPI) | Dyslexia is strongly associated with phonological processing difficulties; learners do not reliably infer grapheme–phoneme links implicitly | Snowling (2000); Hulme & Snowling (2016)
Overt phoneme segmentation and blending in the target language (as we do in EPI) | Dyslexic learners struggle to segment spoken words into phonemes, which directly affects spelling, decoding and pronunciation | Goswami (2008); Ziegler & Goswami (2005)
Slow, exaggerated modelling of pronunciation (as we do in EPI) | Reduced phonological sensitivity means fast or “natural” speech often collapses into noise | Szenkovits & Ramus (2005)
Consistent font, spacing and visual layout across materials (as we do in EPI) | Visual stress and reduced visual tracking make dense or changing layouts disproportionately difficult | British Dyslexia Association (2018)
Avoidance of copying from the board as a learning activity (in EPI this is done sparingly after much modelling) | Copying overloads visual processing and working memory without strengthening language representations | Elliott & Grigorenko (2014)
Teaching spelling as pattern-based, not word-by-word (as we do in EPI) | Dyslexic learners do not retain arbitrary orthographic forms well but benefit from rule-based generalisations | Seymour (2014)

Why EPI is particularly suitable for SEN learners

Many of the principles outlined above are not incidental features of Extensive Processing Instruction (EPI); they are foundational to its design. EPI is particularly well suited to SEN learners because it systematically removes the very barriers that traditional MFL materials create.

First, EPI places input before output. SEN learners are not rushed into premature production; instead, they are given repeated, highly comprehensible exposure to language through listening and reading before being expected to retrieve it independently. This aligns closely with what we know about the recognition–recall gap in SEN profiles.

Second, EPI actively controls cognitive load. Sentence builders, chunked input, and tightly staged activities mean that learners are rarely asked to process multiple new elements simultaneously. The linguistic focus is narrow, explicit, and sustained over time, which allows SEN pupils to build secure mental representations without overload.

Third, EPI makes patterns visible and reusable. Grammar and vocabulary are not treated as separate pillars but as interlocking parts of lexico-grammatical chunks. For SEN learners who struggle with abstraction, this concreteness is critical: they are not asked to infer rules from sparse examples but are immersed in recurring, meaningful structures.

Fourth, EPI is built around recycling and overlearning. The same language appears again and again across different tasks and modalities, reducing forgetting and increasing automaticity. This is precisely what SEN learners need, yet what textbooks rarely provide.

Finally, EPI embeds scaffolding as a progression, not a crutch. Sentence builders, guided tasks, and structured production phases allow learners to move gradually towards independence. Support is not removed abruptly; it fades as confidence and competence grow.

In short, EPI does not “adapt” to SEN learners after the fact. It is inherently inclusive by design, and what makes it effective for SEN pupils is exactly what makes it effective for everyone else.

Conclusion

Teaching MFL to SEN pupils is not about lowering expectations. It is about changing the route.

When we slow input, reduce cognitive load, foreground patterns, recycle relentlessly, and scaffold intelligently, SEN pupils do not merely cope — they often outperform our expectations. The uncomfortable truth is that many of the practices we label as “SEN strategies” are, in fact, good language teaching full stop.

If SEN pupils struggle in our classrooms, the question is not whether they are capable of learning a language, but whether our materials and sequences are capable of teaching one.

Design for the margins, and the centre takes care of itself.