How many new words should you teach per lesson?

Introduction – The wrong question

The question in the title is one of the most common I am asked by colleagues from all corners of the globe. Whenever I have googled it in the past ten years, I have invariably found the same answer cropping up in EFL and MFL forums, blogs and websites: 8 to 10 words per contact hour. I have always wondered where those numbers came from, as there is no consensus amongst researchers as to what constitutes an ideal number of new words to teach per lesson. Unsurprisingly so. As I will argue below, it is impossible to answer the question with a precise figure unless we define clearly what we mean by ‘teaching’ and ‘learning’ new words and have a 360-degree awareness of the target learning context, with its unique interaction of affective and cognitive factors as well as other important variables such as the methodology in use, available resources, logistics, timelines, socio-economic factors, etc.

I personally ‘teach’ 20 to 25 words minimum per lesson, but what the word ‘teach’ means to me may not be what other colleagues take it to mean.

The good news 

The good news for MFL teachers in England and Wales is that by the end of a typical GCSE course the estimated vocabulary size of a successful MFL student should be 2,000 words at GCSE Higher and 1,000 at GCSE Lower (Milton, 2006). If we divide the Higher figure by 5 years of learning French (from yr 7 to yr 11) at two hours per week, that equates to about 5.2 words per lesson: in truth, a very manageable burden. In 2006, however, the national average showed that GCSE students in English state schools had accrued a vocabulary amounting to less than 1,000 words each (see picture, below, from Milton, 2006).
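The division above can be checked with a quick sketch. The function name and the number of teaching weeks per year (38 or 39) are my own assumptions for illustration; the 2,000-word target, the 5 years and the two lessons per week come from the paragraph above.

```python
def words_per_lesson(target_words, years, lessons_per_week, teaching_weeks_per_year):
    """Average number of new words per lesson needed to reach a vocabulary target."""
    total_lessons = years * teaching_weeks_per_year * lessons_per_week
    return target_words / total_lessons


# Assumption: a school year has roughly 38 to 39 teaching weeks.
for weeks in (38, 39):
    higher = words_per_lesson(2000, 5, 2, weeks)
    lower = words_per_lesson(1000, 5, 2, weeks)
    print(f"{weeks} weeks/year: Higher {higher:.2f}, Lower {lower:.2f} words per lesson")
```

Under these assumptions the Higher target works out at between roughly 5.1 and 5.3 words per lesson, which is consistent with the figure of 5.2 quoted above, and the Lower target at about half of that.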


Why the title question is the wrong question to ask yourself

In deciding how many words to teach per lesson one has to take into account a number of contextual factors which play a decisive role in vocabulary acquisition and, more importantly, the depth and range of one’s learning intentions. The question ‘How many words should I teach?’ cannot be answered unless we first consider the following:

(1) Depth of knowledge – Knowing a word entails knowing many things about the word: its literal meaning, its various connotations, its spelling, its derivations, collocations (knowing the words that usually co-occur with the target word), frequency, pronunciation, the syntactic constructions it is used in, the morphological options it offers and a rich variety of semantic associates such as synonyms, antonyms, homonyms (Nagy and Scott, 2000). The deeper one intends to go, the more time each word takes and hence the fewer words one can teach. E.g., if I teach a set of French irregular adjectives in terms of how they change from masculine to feminine, rather than just focusing on the main meaning and pronunciation of the masculine form, I will evidently have less time, which will in turn limit the number of words I can teach.

(2) Receptive vs Productive knowledge – as Nation (1990) notes, vocabulary items in the learners’ receptive vocabulary might not be readily available for productive purposes, since vocabulary reception does not guarantee production. In other words, students may learn to recognize words whilst not being able to use them in speech or in writing. This difference is often overlooked, yet it is crucial in planning a vocabulary lesson. If one is planning simply to teach new words for receptive use, one can teach, in my experience, as many as 40 with an able group, as recognition – especially through the written medium – is easier than production.

Moreover, although listening and reading are both receptive modalities, learning vocabulary through each of them requires providing students with a different type of extensive training, which means that if you really aim to develop the two skill sets thoroughly – as you should – you will inevitably have less time available.

(3) Speed of recognition and production and degree of contextualisation – When we talk of recognition and production we need to consider (a) the element of speed and (b) the ability to understand the target words in unfamiliar contexts as markers of mastery. How fast a student recognizes a word as heard or read (in familiar and unfamiliar contexts) tells us to what degree it has been automatized. The same applies to written and oral production (the hardest to automatize).

A vocabulary item can only be said to be fully acquired when it can be produced spontaneously (and correctly) in the context in which it was taught as well as in unfamiliar contexts. With this in mind, to say ‘I taught ten words in yesterday’s lesson’ is flawed. I may have presented those words and got the students to practise them, and maybe they could recall them in isolation at the end of the lesson or even in one or more sentences. However, that does not mean the words have been learnt, because words are never used in isolation, nor only in two or three sentences learned by rote. Moreover, acquiring a vocabulary item takes weeks and in certain cases even months of practice in context.

(4) Word learnability – the learnability of the target word places further constraints on the number of words one decides to teach. ‘Learnability’ refers to the level of challenge a word poses to the learner. For instance: long polysyllabic words with unfamiliar phonemes will be harder for beginners to retain; abstract and connotative words are usually more difficult to acquire than concrete and denotative lexis; cognates are easier to recognize, etc. When deciding how many words to teach, the learnability factor is crucial.

(5) Shallow vs Deep processing – the method you use will also play an important role in deciding how many words you aim to teach. The deeper the degree of semantic processing, the more likely the students are to recall the words in the future. Deep processing includes activities such as: establishing associations between new and old words; categorizing them; finding opposites and synonyms; writing the definition; inferring their meanings from context; creating mnemonics to enhance future recall; odd one out; etc. Shallow processing involves little cognitive effort (e.g. learning by repeating aloud; the games). Teachers with effective vocabulary teaching methods are usually more successful at teaching larger numbers of words.

(6) Time, recycling opportunities and learning habits – the number of words you can teach will also depend on how many chances you can find in your lesson to recycle them. Do you have enough time, resources or activities in your repertoire to recycle each word you set out to teach a minimum of 5 to 8 times (through deep processing tasks) within the lesson? Do you have resources to ensure the recycling of the same items in subsequent lessons?

It takes me a lot of time and effort to create resources that allow me to effectively recycle all the target words I set out to teach in lesson 1, as well as in all the subsequent lessons in which I revisit them. The more words you aim to teach, the more effort you will have to put into follow-up lessons to create recycling opportunities. This is something you have to factor in when you decide on the number of words to teach in a given lesson, or your teaching will have been in vain.
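To get a feel for what the ‘5 to 8 times’ guideline implies, here is a rough sketch of the recycling budget for a single lesson. The function name and the sample word counts are illustrative assumptions of mine; only the 5-to-8 range comes from the text above.

```python
def recycling_budget(num_words, min_exposures=5, max_exposures=8):
    """Return the (lowest, highest) number of word encounters a lesson must provide."""
    return num_words * min_exposures, num_words * max_exposures


# Hypothetical lesson sizes, chosen only for illustration.
for n in (10, 20, 40):
    low, high = recycling_budget(n)
    print(f"{n} words -> {low} to {high} encounters within the lesson")
```

Seen this way, doubling the number of target words doubles the number of encounters your tasks have to generate, which is why the resources available for recycling, rather than the presentation phase, tend to set the ceiling.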

Connected with this is the issue of homework and learning habits and strategies. Are your students the kind of learners who do their homework consistently? If you flip vocabulary learning to them, will they actually do it? What the students do at home and how effective their learning strategies are will have an impact on how many words you plan to teach. In the case of one of the two year 9 groups I currently teach, the amount of work they do outside the classroom – not their aptitude – profoundly affects the number of words I plan to teach each day.

(7) Chunks – The memorization of chunks is productive and powerful. It serves two objectives: it enables the student to have chunks of language available for immediate use and it also provides the student with information that can be broken down and analysed at later stages. Chunks allow you to teach more words in one go, as Working Memory can hold about 7 ± 2 items at a time (Miller, 1956) and a multi-word chunk counts as a single item. Moreover, in real life we rarely process words in isolation.

The main advantage of lexical chunks is that they build the fluency of the language learner, as they facilitate clear, relevant and concise language and are stored as ready-to-use units that can be retrieved and used without the need to compose on-line through word selection and grammatical sequencing. This means that there is less demand on cognitive processing capacity.

I hardly ever teach vocabulary in isolation, unless I am focusing on speed of recognition, decoding/pronunciation or spelling (e.g. through the games). I always present vocabulary for the first time either through texts containing comprehensible input, which allows easy inferencing from context, or through sentence builders (see figure below). Teaching in chunks and short sentences allows me not only to recycle old material whilst presenting new material but also to include more vocabulary.


(8) Chunking and word awareness – Chunks have another important impact on how many words you will be able to teach. Once you have unpacked each chunk you taught, made the students notice the underlying grammatical pattern (e.g. I want you to go to the cinema) and got them to use that pattern over and over again with new lexical items, you will have enhanced the generative learning power of that chunk. The more morphological (e.g. prefixes, suffixes) and syntactic patterns (rather than grammar rules) you teach your students, the greater the chances for them to learn new words by ‘hooking’ them to those patterns. This process, known as ‘chunking’, happens in the brain at incredibly high speed in L1 acquisition and plays a crucial role in L2 vocabulary acquisition; hence, the more automatized your students’ ability to recognize those patterns in aural and written input, the more likely they will be to learn more words in your lessons.

Word awareness refers to a learner’s ability to ‘unpack’ the way words work in relation to other words (synonyms, antonyms, collocations, etc.), their word class (adjectives, nouns, etc.) and how they are formed (prefixes, suffixes, etymology, similarities with mother tongue words, etc.). Word awareness promotes chunking and, hence, acquisition. Creating a culture of word awareness in your classroom does not require much preparation, just asking lots of questions such as: Is it an adjective or a noun? Does this go before or after the verb? Does it remind you of a word in our language? Why does this word end in ‘-ly’?, etc. Research in word awareness (also referred to as word-consciousness) is still pretty scant, but many scholars believe that a strong emphasis on it in the classroom can greatly impact vocabulary acquisition. The more word-aware your students are, the greater the number of words you will be able to teach them in lessons.

(9) The students – last but not least. This is self-evident. Your students are the best source of evidence as to whether you are gauging the amount of vocabulary input correctly. Regular low-stakes assessment will tell you how much of what you have taught gets retained or lost along the way as the term advances. Online surveys through Google Forms or the like will allow you to find out in a few minutes how they feel about their vocab learning and whether you are being too ambitious or spot on. They can also help you find out about their learning habits.

Not all students have the same ability to learn vocabulary. Students who are low in any of the crucial components of language aptitude, especially Working Memory span and Phonemic sensitivity, will be particularly disadvantaged, and their presence in your class will have to be taken into account as they will be more prone to cognitive overload. Differentiated instruction will be a must in mixed-ability classes.

The students’ current level of proficiency will also be an important variable to consider. The more advanced the learner is, the easier it will be for them to use conscious and subconscious learning strategies to acquire vocabulary. Hence you will be able to teach far more new words per lesson to your advanced-level students than to your GCSE ones.

Motivation is obviously another crucial factor. I am not going to discuss it as it is beyond the scope of this post. Suffice it to say that motivation enhances cognitive and affective arousal, which in turn increases Working Memory span and the chances of memorizing words. Hence, the more fun and relevant to your students’ lives and interests your vocabulary teaching is, the more words you will be able to teach effectively.

Concluding remarks

The issues above are but a few of the many factors one needs to consider in deciding how many words to teach per lesson. The most important thing I would like the reader to take home from this post is that, vocabulary acquisition being a long process, planning a successful vocabulary lesson is about zooming out and thinking about the bigger picture and the longer term: what matters is not how many words you teach in a given lesson but how your subsequent teaching is going to ensure that those words will be automatized both receptively and productively by your learners across a wide range of contexts, both familiar and unfamiliar. In order to do so, the language instructor must master effective vocabulary teaching strategies, know the students well and implement skilful and systematic recycling, never losing sight of the challenges that words, and the contexts those words are taught in, pose to the learner. A culture of word awareness that you build day in, day out through regular questioning, both metalinguistic and metacognitive in nature, will also facilitate your task and allow you to teach an increasingly large number of words per lesson, as your students become more alert to the morpho-syntactic properties of the target language words. Ultimately, it will be student feedback and regular low-stakes assessments that will tell you whether you are teaching the right number of words per lesson.


Promoting learning-to-learn: 12 top tips for effective vocabulary learning



Once a month I stage one-to-one conferences with my students in which we address the issues in their language learning brought up in their reflective journals and/or any other concerns they may have (e.g. a recurring error that is annoying them or a grammar rule they cannot seem to grasp). Last week a couple of my students asked me what they could do to improve their retention of words. My first answer was that they could use my website or others like Quizlet or Memrise. ‘But what if I do not want to use any language learning websites or online games, sir?’ they replied.

Although I was puzzled and disappointed to learn that my website was not seen as a useful or stimulating means to learn vocabulary, it was refreshing to find that some students are not dependent on language websites for their learning but are eager to find autonomous ways to propel their language acquisition further. After all, the best learners are those who seek self-direction and autonomous mastery.

Personalizing one’s learning experience pays enormous dividends as it involves more cognitive and affective involvement on the part of the learner. But when the learner draws entirely on her cognitive and emotional resources without the help of specialised websites, her effort and investment are likely to be even greater as she is being totally self-directed in her language learning management, without following a pre-determined instructional path dictated by others.

As a result, these very pedagogically sound requests prompted me to set up a little workshop on vocabulary learning strategies which I will deliver to my students next week and which I intend to follow up every so often with short and snappy reminders to use those strategies. This post was written whilst brainstorming the vocabulary learning strategies to include in my workshop. In selecting the tip-top strategies I largely drew on the techniques I myself used as a learner of English, French, Spanish, German, Swedish, Latin, Greek and Malay.

2. 12 tips for self-directed vocabulary learning

The following are the tips I am planning to give my students in the workshop (in more simplified language). They are but a few of the vocabulary learning strategies available in the literature; they were chosen over others because (a) they require minimum preparation in terms of resources; (b) they do not involve any financial commitment; (c) they have high surrender value; (d) for most of them there is plenty of research pointing to their effectiveness; (e) I use them on a daily basis and they have helped me learn 7 languages.

  1. Phonological hooks

Jot down the to-be-learnt words and, drawing a mind-map, associate them with as many words as you can that start or end with the same sounds. These do not necessarily have to be words in the target language; you may use words in your L1 or L3, too. The retentive power of this technique can be enhanced by creating logical sentences using the rhyming words. Example: target word ‘fair’ – my hair is fair; target word ‘hired’ – he was hired then fired.

This tip is based on the principle that words in the brain are more closely connected with words they alliterate, chime and rhyme with. So, on learning a word, give yourself a time limit and brainstorm as many words as you can recall that rhyme with the target items. You can turn this into a competition with your classmate(s).

  2. The Key-Word Technique

This is a memory strategy that has never failed me and that I have been using over the years with all of the languages I have learnt. It involves using phonological and/or graphological hooks synergistically with visual imagery to creatively come up with a mnemonic (memory device). Example: in Malay, the official language of Malaysia, the word ‘sedia’ means ‘ready’. To remember it I hooked it phonologically to ‘sedia’, which means ‘chair’ in Italian, my first language, whilst associating it with the image of one of my students, Balaj, who is always ready to leap up out of his chair whenever I ask a question, super eager to answer it. At the early stages of learning ‘sedia’ I would think of Balaj ready to leap out of the chair to answer my questions and never failed to retrieve the correct meaning of the word.

This technique is especially useful when one is dealing with long words. Example: the word ‘jidohanbaiki’ (‘vending machine’ in Japanese) could be anchored to the three words ‘judo on bike’ whilst one would visualize a massive bright red drinks vending machine dressed up in a judo outfit on a bike. I learnt this word through this imagery twenty-five years ago and still remember it.

This technique is effective because it engages the brain through visual (the imagery and the spelling) and auditory (the phonological hook) processing whilst at the same time involving the brain in higher order thinking (creativity) and deep processing through elaboration (i.e. establishing complex semantic connections between two or more items). Another learning principle at play here is Distinctiveness – the more we make a word or concept stand out in our memory, the more we increase working memory span and the likelihood of its retention.

  3. Emotional associations

When you are learning new words try to associate them with people or objects that are very meaningful to you. So, for instance, on learning about physical or personality attributes in English, use them to describe people that mean a lot to you and whose personality has deeply affected you – possibly by virtue of those very attributes. If your father has often argued with you over pocket money issues you would make up sentences like: ‘My dad is tight or stingy’. If your younger brother is always unwilling to do his chores: ‘My brother is lazy’, etc. You can do this using celebrities, too (e.g. Kim Kardashian is annoying).

This strategy’s effectiveness is based on the principle that an emotional investment in learning any information increases its distinctiveness and consequently the chances of its retention.

  4. Categorizing by meaning and word-class

Sort the words you are attempting to memorize into as many categories based on their meaning as possible, using your imagination as wildly as possible. Do as many rounds of categorization with the same words as you can; by doing more rounds you will force yourself to process the words semantically over and over again from different angles, therefore following different neural pathways each time. Remember that (1) the heading of each category can be in your first language; (2) there is no right or wrong, as long as the categories make sense to you.

Example (adjectives again): you are trying to memorize the words ‘stingy’, ‘argumentative’, ‘noisy’, ‘talkative’, ‘lying’, ‘poor’, ‘lazy’, ‘active’, ‘toned’, ‘smart’, ‘hard-working’, ‘petty’, ‘cheerful’, ‘amusing’, ‘bright’, ‘well-built’, ‘slim’, ‘stunning’, ‘overweight’, ‘bad-tempered’, ‘treacherous’, ‘unstable’, ‘dodgy’ (slang), ‘choleric’, ‘fit’, ‘affluent’, ‘muscular’, ‘depressed’, ‘elated’, ‘underprivileged’, ‘sneaky’ (slang), ‘frustrated’, ‘vindictive’, ‘ecstatic’.

In the first round you may simply divide them into 2 categories: Positive and Negative; in the second round into Physical Appearance, Personality, Emotional states; in the third round you may want to narrow it down further, as follows:

(a) Physical fitness: toned, fit, well-built, muscular

(b) Uplifting emotions: elated, cheerful, ecstatic

(c) Money: poor, affluent, rich, underprivileged, stingy

(d) Dishonesty: sneaky, dodgy, treacherous, lying

(e) Negative emotions: vindictive, frustrated, bad-tempered, choleric


This approach works because it requires cognitive investment and creates connections between the words you are processing whilst at the same time involving creativity, all of which results in deep processing of the target items.

When the words do not belong to the same word-class but you have a mix of nouns, adjectives, verbs, prepositions, etc., a further type of classification you can perform is by grammatical category. This metalinguistic activity is much more important than teachers give it credit for, as (a) words belonging to the same word class seem more strongly linked in the brain (as studies of aphasic patients have shown) and (b) when we try to comprehend target-language input via listening, the brain uses its knowledge of the grammar to analyze it (a process psycholinguists call ‘parsing’).

  5. Ordering, arranging by size, weight, length, intensity, etc.

During evolution, assessing the dimensions of things and the intensity of phenomena has played a crucial role in our survival. Hence any operations involving sorting people, objects and their attributes are perceived by the brain as important and distinctive; moreover, since they involve evaluation they elicit a fair degree of cognitive investment and higher order thinking. Example: in learning the adjectives ‘ugly’, ‘unattractive’, ‘horrible’, ‘cute’, ‘beautiful’, ‘disgusting’, ‘stunning’ and ‘pleasant looking’, you may arrange them in ascending order from least attractive to most attractive. Another example: in learning geographical terms, you may arrange them in ascending order of size: pebble – rock – cliff – mountain – mountain chain – planet.

  6. Building semantic associations with existing material in Long-term memory

All learning is enhanced by building associations between the to-be-learnt information and information that already exists in your brain: the more associations, the better. For every new L2 word you intend to learn, search your brain for any L2 words you have already learnt which you can associate with it in terms of meaning – any! If you know words in other languages that relate in meaning to the target word, add those in too. To enhance the effectiveness of this technique, explain (even in your own language) how the existing words relate to the new word.


New word: affluent

Related words I already know:

(1) money (an affluent person has lots of money)

(2) rich (an affluent person is rich)

(3) poor (an affluent person is not poor)

(4) Taylor Swift (she is very affluent)

A way to make such associations even stronger is to connect the target words with their antonyms and synonyms (if you know any).

The learning principle at play here is that the neural connections between words which are closely associated in meaning are stronger.

As a younger learner, either alone – yes, geeky me! – or with friends learning the same language, I would do a word-chain challenge. This consisted of linking two words very distant in meaning through a chain of words logically associated with each other.

Example: link ‘old lady’ and ‘pollution’

Word chain: old lady – pet – kitten – cat – canned food – aluminium – non-biodegradable waste – pollution

If you do this with friends as a competition, give yourselves a time limit; the person who comes up with the longest word chain wins.

The potential for learning of any semantic association is strengthened when it is reinforced by sound; hence, as already suggested above, try to find words which alliterate and rhyme or chime as well as having semantic connections.

  7. Word activation in context

If you do not process a word or phrase receptively, or use it, within the first week of learning it, it is likely to be lost for ever. Hence, try to use it as often as you can. The best way would be with native speakers in face-to-face or phone conversations or online chat. What I used to do when no native speaker was available was to make up meaningful sentences using the new words – as many as possible – and then get a native speaker or one of my teachers to give me some feedback later.

To keep the memory trace of the new words alive and kicking over the weeks and months to come until they are fully acquired, you will need to practise over and over again at spaced intervals. Read the next point.

  8. Be mindful of memory decay

As you can see from the forgetting curve below, most of the forgetting occurs within the first 24 hours after you first process a word. Hence, this is when most of the memorization work has to be done; use as many of the above strategies as you can! During the remainder of the first week you should go over the target words again and again, a few minutes per word set. Better a few minutes per day than one hour once a week.
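The advice above can be turned into a simple review timetable. The sketch below is only one possible reading of it: the daily reviews during the first week follow the text, whereas the widening gaps afterwards (7, 14 and 28 days) and the function name are my own assumptions, not figures from any study cited here.

```python
from datetime import date, timedelta


def review_schedule(first_seen, daily_days=7, later_gaps=(7, 14, 28)):
    """Return review dates: the first day, daily for a week, then widening gaps."""
    # Day 0 is when the words are first met; days 1..daily_days are short daily reviews.
    offsets = list(range(0, daily_days + 1))
    # Assumption: after the first week, each review comes after a longer gap.
    for gap in later_gaps:
        offsets.append(offsets[-1] + gap)
    return [first_seen + timedelta(days=d) for d in offsets]


for review_day in review_schedule(date(2024, 1, 1)):
    print(review_day.isoformat())
```

Each date stands for a few minutes of practice on that word set, so the total time is small; what the timetable protects is the spacing.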


  9. Get the pronunciation of the word as close to right as possible from the beginning

Words are activated in the brain by their sound, even when we are processing them silently as we read. This means that you must try to get their pronunciation right from day one, or you may confuse them in the future with other words that sound similar, with harmful consequences for your processing ability, even in reading comprehension! You could learn the IPA (the International Phonetic Alphabet), as this will allow you to work out the pronunciation of any lexical item you are learning by interpreting its phonetic transcription in the dictionary. I did and it was the best thing I have ever done for learning languages. There are plenty of free websites listing the IPA characters with recordings of how each of them is pronounced. What is important is that you do a lot of independent listening – don’t simply learn words through the written medium.

  10. Have a storage space for key vocabulary

What I do when I learn a language is create a vocabulary booklet which I divide into as many sub-topics as I can think of. Whenever I encounter a word or phrase I think is worth remembering, I write it down in as many sections of the booklet as I feel it may belong to. Example: the word ‘flight’ would fit in the ‘transport’ section as well as in the ‘holidays’ section. Once I have chosen the section(s), I select one or more existing words in that section that I associate the target word most closely with and write it next to them, explaining why the two words are associated. So, for instance, ‘flight’ would go next to ‘plane’ (a flight is a journey by plane) and next to ‘to fly’ (they sound similar and a flight is the result of flying) – again, the connections can be written out in the first language.

  11. Google search the target words

If you are a die-hard vocabulary geek like me, you may want to find out alongside which other words the target lexical items are most frequently used. The quickest way is to type the word into Google search, which will offer you a range of predictions, as shown in the picture below. This will give you an insight into some of the most frequent collocations that word is associated with and teach you a new word or two.


  12. Use songs

There are beautiful songs in every single world language. The beauty of the Internet is that there are plenty of websites with translations of those songs into your own language(s). Songs are extremely useful in terms of vocabulary learning as they repeat key words several times over and the music – especially when it is catchy – provides distinctive, memorable and recurring sound patterns which promote memorization.

What I do is focus on the refrain as it is the part of the song I am most likely to retain by virtue of its ‘catchiness’ and repetition in the song. Moreover, refrains typically contain ‘cool’ and ‘interesting’ words or phrases which often have high surrender value.

4. Concluding remarks

Training the students to learn autonomously is ideally what we should do day in, day out. However, it is not always easy or viable in view of the time constraints imposed by the courses we teach. The above strategies do have high surrender value, though, and are mostly ‘no brainers’. If we plan to impart them effectively, however, we must not limit our input to a one-off session. Otherwise, we will simply have raised our students’ awareness of their existence without developing their intention to adopt them, their expertise in deploying them and their self-efficacy as strategy users. Hence, we must keep the strategies ‘alive’ in our students’ focal awareness by embedding them in our daily teaching; by reminding the class to use them in preparation for a mini vocabulary test we are staging the following week; by eliciting their use in class every so often; by providing opportunities to experience success in the deployment of as many of those techniques as possible, etc. Thus, my workshop is but the beginning of a longer process that will unfold on and off over the next three or four months at least. For a principled framework on how to implement learner training (learning-to-learn), please refer to this blog.

It goes without saying that all of the above strategies can be used by teachers and material designers in their lesson planning and in creating instructional materials. I always do and I have based my whole website and most of my most popular vocabulary revision resources (e.g. here)  on the above principles.

13 commonly made mistakes in vocabulary instruction


In this post I will concern myself with thirteen very common pitfalls of vocabulary instruction and with ways in which they can be easily pre-empted.

Mistake 1 – Relying heavily on shallow encoding practices

As already mentioned in many previous posts of mine, a to-be-learnt word lingers in our Sensory Memory for no longer than two or three seconds immediately after we hear it. Thus, in order to commit it effectively to Long-term Memory, we must perform some form of rehearsal. Rehearsal involves either ‘shallow’ or ‘deep’ processing.

In shallow processing we use repetition or matching a word to a visual cue. In deep processing, on the other hand, the brain performs problem-solving operations which require more attentional investment and higher order thinking (e.g. analysis and evaluation) and are meaning-orientated. Typical vocabulary teaching activities of this kind include:

  • Matching synonyms
  • Matching antonyms
  • Odd one out
  • Matching word and definition
  • Providing the definition of a word
  • Sorting words into semantic categories
  • Creatively finding associations between seemingly semantically unrelated words
  • Working out the meaning of a word using the surrounding linguistic context

Deep processing is more likely to result in deeper learning than shallow processing because (1) it requires more cognitive investment on the part of the learner and, more importantly, (2) it creates more and stronger associations between the to-be-learnt word and existing information and words in Long-Term Memory. The latter point is of paramount importance, as failure to retrieve a word (forgetting) is usually cue-dependent, i.e. the brain cannot find the required word not because it has vanished from Long-Term Memory, but because it ‘cannot find its way to it’ in the absence of effective contextual cues (physical or psychological elements that were present at the time of learning the word but are absent at the time of recall).

Example: if you taught your students ten words using some of the very entertaining Linguascope games (e.g. matching words to pictures; word dictation; spelling games), they will have performed lots of fun activities for 10-15 minutes. True. However, you will have engaged your students in 100% shallow encoding; the number of contextual cues you will have provided them with will have been very limited (as all they did was word-recognition work); and the associations with previously learnt L2 vocabulary will be zero, as Linguascope does not present the words in context.

On the other hand, imagine asking your students to: (1) match the target words with their antonyms and synonyms; (2) sort them into different thematic categories or in terms of size or importance; (3) use them to solve a problem (e.g. working out the meaning of a sentence); (4) fulfill a communicative goal (e.g. booking a holiday or simply interviewing a peer); (5) complete gapped sentences meaningfully; (6) create a poem or song in the target language. Your students will be processing the words in terms of meaning and will build hundreds of associations with other L2 words, with other existing information in their brains (e.g. their knowledge of the world) and with many other contextual cues (e.g. their peers, the website used to book the holiday, the things that inspired the song or poem). Last but not least, they will have put serious thought into these activities, not just mindlessly matched words to images and sounds as happens in most online vocabulary learning websites (e.g. Quizlet, Memrise, etc.).

I created my (free) website out of the need to involve my students in less fun but more cognitively challenging deep processing activities. And it has paid enormous dividends in terms of vocabulary learning.

It goes without saying that with absolute beginner learners it is not always straightforward to create activities that promote deep processing.

Mistake 2 –  Limited contextualized practice

You will surely have noticed, whilst doing a Google search, that as you type a sentence Google offers you a range of predictions as to how that sentence is going to end. You will also have noticed how those predictions are gradually narrowed down as you get closer to the end. In other words, based on its users’ behavior, Google has worked out what you are statistically more likely to type next. According to existing Cognitivist models of language production, this is what our brain does, too. Based on the probability that you will utter words Y and Z after word X, your brain automatizes and speeds up language production. So, if you have said ‘Quel âge as-tu?’ (‘How old are you?’) 100 times and ‘Quel âge a-t-il?’ (‘How old is he?’) only 10, it is highly probable that the sentence stem ‘Quel âge’ will automatically retrieve ‘as-tu’ rather than ‘a-t-il’.
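The frequency effect described above can be illustrated with a toy calculation. This is only a sketch: the function name `continuation_probabilities` and the tally structure are my own assumptions for illustration, not a model taken from the research, and real language production is obviously far more complex than counting sentences.

```python
from collections import Counter

# Hypothetical tally of how often a learner has produced each sentence.
# The counts (100 and 10) are the ones used in the example above.
utterances = Counter({
    "Quel âge as-tu?": 100,
    "Quel âge a-t-il?": 10,
})

def continuation_probabilities(stem, counts):
    """Relative frequency of each continuation of `stem` in the tally."""
    matches = {
        sentence[len(stem):].strip(): n
        for sentence, n in counts.items()
        if sentence.startswith(stem)
    }
    total = sum(matches.values())
    return {continuation: n / total for continuation, n in matches.items()}

probs = continuation_probabilities("Quel âge", utterances)
most_likely = max(probs, key=probs.get)
print(most_likely, round(probs[most_likely], 2))
```

With these counts the stem ‘Quel âge’ is followed by ‘as-tu?’ roughly nine times out of ten, which is why that continuation is the one most likely to be retrieved automatically.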

The implications of this for language teaching and learning in general are enormous, but beyond the scope of this blog. In terms of vocabulary acquisition, the main implication is that vocabulary items MUST NOT be taught as discrete items or in the very limited range of phrases or contexts in which textbooks usually present them. If we do, we are merely teaching the Audiolingual way, i.e. through the relentless memorization of the same words/phrases over and over. That is why it is important to:

(1) teach the target words as contextualized in as wide as possible a range of written or aural comprehensible input which models the target vocabulary (e.g. narrow reading and listening). This can be done even with beginners, provided that the texts used are short and accessible. This is the most important part of teaching vocabulary as it models how words relate to and combine with each other in the target language;

(2) integrate grammar/syntax instruction and sentence combining into the teaching of vocabulary so as to increase the generative power of the target lexis;

(3) teach a variety of verb + noun collocations (not always the same one or two verbs);

(4) involve the students in a lot of structured and semi-structured communicative practice which requires them to use the target vocabulary in as wide as possible a range of linguistic contexts;

(5) try, as much as possible, when teaching new vocabulary, to provide opportunities for students to use it with previously learnt lexis so as to kill two birds with one stone; on the one hand you will recycle old vocabulary, on the other you will provide a further context to ‘anchor’ the new words to.

In much of the vocabulary teaching I have observed in 25 years, target words (mostly nouns!) are taught for most of the lesson as discrete items and/or within the same basic phrases or patterns. Little modelling in context actually occurs, and when it does happen it is limited to one text or two. This has created a generation of students who know, at best, lots of isolated words but often do not know how to interpret/use them in more challenging receptive/productive contexts. Remember the famous saying: you shall know a word by the company it keeps.

Mistake 3. The ‘so what?’ effect

A lot of vocabulary learning these days is divorced from a real-life communicative purpose due to an overreliance on Quizlet and similar digital tech tools. Humans are goal-orientated beings, hence their motivation and cognition are aroused by problem solving and by the attainment of a goal. The most effective way to learn vocabulary is by activating it in order to carry out several real-life tasks in the context of interactional activities. The ‘so what?’ effect, when compounded by discrete and out-of-context word teaching, exacerbates the perception by learners that language lessons are just about memorizing words for memorization’s sake and do not have much relevance to the real world.

Mistake 4. Misunderstanding of what progression means in terms of vocabulary acquisition

Often teachers from various parts of the world approach me on social media asking me to give them ideas or help them prepare for an imminent lesson observation by a line-manager. Their main worry: showing progression. However, progression in vocabulary acquisition as measured within a specific lesson is a construct of questionable validity.

Firstly, because the same students who show evidence of learning will lose, in the absence of reinforcement, 40% of what they have learnt an hour after the lesson (I will return to this point below), 60% 24 hours later and 80% six days later. Vocabulary acquisition does not occur within one lesson; hence, claiming that by the end of the lesson students have demonstrably learnt the target vocabulary rests on a flawed assumption.


Secondly, as pointed out above, learning words as discrete entities does not mean acquiring them. You need to be able to understand or use them in context for the attainment of a communicative goal, or that knowledge has little value.

Thirdly, progression in vocabulary acquisition refers to being able to understand/produce the target lexis successfully across as wide as possible a range of contexts (familiar and unfamiliar), at high speed (fluency) and with a high degree of accuracy. Hence teachers ought to measure all of these dimensions of vocabulary learning before claiming that the target vocabulary has been acquired.

Yet many lesson observers in a typical British secondary school will require from their observee-teachers that most of the students demonstrate by the end of the lesson the ability to recall accurately orally and/or in writing most of the target words (mostly in isolation); they will see it as the ultimate evidence of learning. And most teachers, too, will agree that this indeed is their main preoccupation.

This leads to a neglect of all the other very important dimensions of progression alluded to above, to the detriment of effective language acquisition.

Mistake 5. Homework timelines

From what I said above about how forgetting occurs, it is evident that setting vocabulary learning homework for Thursday when you have just taught new words on Monday is not very smart if you know that the vast majority of your students will do it on Wednesday night – as it means they will have forgotten 60 % of what they learnt by then.

Solution: if you teach in a high-tech school like mine you can split up the vocabulary learning homework in two and ask them to send it to you in two installments (e.g. via Google classroom). So, using the Monday/Thursday scenario above, one part of the HW will be due on Monday eve and the other one on Wednesday.
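To make the arithmetic behind the Monday/Thursday scenario concrete, here is a rough sketch. It reuses the three forgetting figures quoted under Mistake 4 and, as a simplifying assumption of mine, treats them as a step function (the loss stays at the last checkpoint reached). The class figure of 20 words and the elapsed hours are purely illustrative.

```python
# Forgetting checkpoints quoted earlier in this post:
# (hours since the lesson, fraction forgotten without reinforcement).
FORGETTING_CHECKPOINTS = [(1, 0.40), (24, 0.60), (144, 0.80)]

def fraction_forgotten(hours_elapsed):
    """Fraction forgotten at the latest checkpoint already reached.

    Treating the three figures as a step function is an assumption made
    for illustration; real forgetting is a smooth curve.
    """
    forgotten = 0.0
    for hours, fraction in FORGETTING_CHECKPOINTS:
        if hours_elapsed >= hours:
            forgotten = fraction
    return forgotten

def words_still_known(words_taught, hours_elapsed):
    """Estimated number of words retained after `hours_elapsed` hours."""
    return round(words_taught * (1 - fraction_forgotten(hours_elapsed)))

WORDS_TAUGHT = 20  # illustrative figure for a Monday-morning lesson

monday_evening = words_still_known(WORDS_TAUGHT, 8)    # about 8 hours later
wednesday_night = words_still_known(WORDS_TAUGHT, 58)  # about 58 hours later
print(monday_evening, wednesday_night)
```

Under these assumptions a student who starts the homework on Monday evening still has about 12 of the 20 words to build on, whereas one who leaves everything to Wednesday night has about 8. Splitting the homework into two installments means the first round of practice happens before most of the loss has occurred.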

Mistake 6. Not planning which level of acquisition you aim at in a lesson

Much ineffective vocabulary teaching stems from not deciding which level/facet of vocabulary acquisition (of the ones mentioned above) one is focusing on. In planning a lesson it is important to decide whether one wants to focus on receptive skills rather than productive ones or on both. Is it just listening for modelling and/or comprehension you want to focus on in lesson one (today) because you want to focus on speaking and pronunciation in lesson 2 (tomorrow)? Is it only the grammatical usage of the target adjectives you are mainly concerned about? Or are you focusing on enhancing speed of retrieval (fluency)?

I also usually decide which 10-15 of the 20-25 words I typically aim to teach in a given lesson will be in my students’ focal awareness and which 10-15 will be in their peripheral awareness. This is another important decision to take in order to pre-empt student cognitive overload.

Mistake 7 – Using audio-tracks to introduce new words

Using audio-tracks to introduce new words has become common practice in many classrooms these days. This can be justified when the teacher does not have a good target language pronunciation; however, when she does, this ought to be avoided. The teacher must clearly show how each new target word is pronounced and get her students to imitate her mouth movements, especially with sounds that are more notoriously challenging, such as the French ‘in’ and the ‘en’ sounds in the words ‘singe’ and ‘serpent’.

This pronunciation-visibility issue is often compounded by the fact that recordings  tend to pronounce the target words at native speed. This can be detrimental when dealing with novice learners whose decoding skills are poor and would benefit from the pronunciation being slowed down in order to render the sounds more intelligible.

Mistake 8 – Using word-lists and mats/sentence builders with students with poor decoding skills

Often students are provided with word lists and talking/writing mats packed with unfamiliar lexical items. I, for one, love using writing mats and have come up with an instructional sequence based on their deployment that I implement quite frequently in my lessons (outlined in a previous blog). If teachers have trained their students extensively in L2 decoding skills there will be no problem as they will be able to convert most of the words into sound fairly accurately. However, in most secondary schools this is not the case.

This can be very harmful since, as I have explained in several blogs, correct or near-correct pronunciation of L2 words is of crucial importance to successful L2 acquisition and performance (Walter, 2008). The main reason is that memory is sound-mediated, so successful recall of L2 words and their meaning requires their accurate phonological encoding.

In many lessons I have observed teachers usually pronounced the words and got students with no-decoding-skill training to repeat them aloud a few times before using the words on the lists/mats. However, since words linger in Working Memory for only a few seconds, only a few gifted learners could actually pronounce the words correctly in the subsequent oral tasks. The rest experienced cognitive overload. Hence the teacher ended up having to correct the same students on the same mistakes over and over again for the whole duration of the lesson. At the end of the lesson the pronunciation of the new words was still generally quite poor.

A possible solution: when one is using word-lists and writing mats one may want to model those words extensively through lots of listening and micro-listening tasks. As far as listening is concerned, the easiest zero-preparation way to do this is (a) to utter short accessible sentences and ask the students to write their meaning on MWBs or (b) micro-dictation/transcription tasks. Narrow listening tasks require more preparation but yield excellent results. As for the micro-listening tasks, please refer to this post:

An even better solution: teach decoding skills from the very early stages of instruction so as to avoid these problems when you later provide your students with longer and more complex vocabulary lists for independent learning. An effective L2 decoder is a more effective autonomous learner on many counts. Unsurprisingly, research has evidenced a correlation between good decoding skills and the pursuit of language study at GCSE (i.e. it is the students with more effective decoding skills who usually choose to continue to study MFL after year 9).

Mistake 9. Presenting and practising new vocabulary mostly in its written form

As already discussed in my previous blog ‘Nine research facts about pronunciation’ L2 graphemes (letters) automatically activate L1 pronunciation. Hence exposure to L2 words in their written form ought to be avoided as much as possible with beginner learners who have not developed a stable representation of the L2 phonological system. When new lexical items are indeed presented, they should be presented through visual aids or gestures first and then in their written form or simultaneously in both.  I prefer the former modus operandi.

Mistake 10. Causing cognitive overload

This issue arises in many scenarios I have witnessed. Here are four common ones.

(1) the teacher is overambitious and aims at teaching too much vocabulary, without deciding on the receptive vs productive / core vs peripheral dichotomies. The result is poor overall recall.

(2) (with novice learners) the teacher selects complex words which pose a series of important challenges in terms of pronunciation and/or grammar (e.g. word order and agreement). Given what we said about the importance of pronunciation and the limitation of Working Memory capacity in terms of phonological storage, teachers must select the target words carefully. When faced with polysyllabic words containing challenging phonemes, one must deploy strategies to make them more accessible to learners both receptively and productively (e.g. ‘chunking’). My colleague Dylan Vinales uses humour, body language and focus on muscle memory as a way to make the pronunciation of such words ‘stick’. Here is a short clip demonstrating the very simple, minimal-preparation way in which he does it:

(3) the teacher selects a lot of cognates in the belief that they are easy to pick up. However, whilst cognates are easy to learn receptively (especially in reading), they can pose serious cognitive challenges in certain aspects of production, especially pronunciation and writing, causing processing inefficiency issues. I experienced this first hand at the early stages of my Spanish learning; Italian and Spanish being so close, I would often misspell words which differed by only one letter. This phenomenon is referred to by psycholinguists as ‘cross association’. Obviously, this issue does not have a major negative impact on acquisition.

(4) two or more near-homophones are taught in the same lesson. This, too, can cause cross association. This happened to me yesterday in my year 7 French class. A week earlier I had finished a unit on animals in which I had taught ‘grenouille’ (frog), and as we were asking each other what we thought about different rooms in the house, my student Abi asked me: ‘Tu aimes le grenouille?’ (do you like the frog?) when he actually meant to ask ‘Tu aimes le grenier?’ (do you like the attic?). He had cross-associated ‘grenier’ and ‘grenouille’ due to the common stem the two words share.

Mistake 11. Not focusing sufficiently on the form of words

When we acquire vocabulary, we tend to learn the meaning first. Form, i.e. a word’s morphology, its sound and its syntactic properties, emerges later. In deep-orthography languages (e.g. French, English, Chinese), spelling and decoding skills emerge later still, for obvious reasons considering how difficult it is for learners to match sound to print.

Whenever I conduct workshops, teachers ask me why their students have issues with spelling. The ‘Je m’apple’ error is often cited as an example of the students’ deficit in this area. I always answer with the following question: how much time do you spend on teaching spelling?

The four different dimensions of knowing a word mentioned above (morphology, sound, meaning and syntax) are stored in four different parts of the brain. Our job is to connect them skillfully through a balanced amount of practice in all four areas. Research (e.g. Boers, 2021) points out that one area that is massively neglected by teachers is the oral form of words. In light of this finding, the current push for the teaching of phonics can only be a good development.

Mistake 12. Staging productive retrieval practice activities without sufficient rehearsal

For retrieval practice to be effective, it needs to be staged after much rehearsal of the receptive sort. These days, there is a dangerous tendency to stage retrieval practice activities too early due to a misunderstanding of Robert Bjork’s notion of ‘desirable difficulties’, i.e. the idea that making the students try to retrieve the target vocabulary from memory is conducive to learning by virtue of the cognitive effort it involves. Whilst retrieval practice activities are key to learning, and have been common practice in language learning for centuries, they should be staged only when you believe that they will result in a high success rate; otherwise they will be counterproductive. Hence, it is key to stage a lot of receptive retrieval activities, carefully graded in terms of difficulty, and to scaffold the initial stages of retrieval practice before involving the students in oral and written retrieval practice.

Mistake 13. Not attaining fluent retrieval

A vocabulary item can only be said to have been acquired when learners can access it automatically and across a wide range of contexts. Automatic access is key. This can be achieved, according to much research (see Segalowitz, 2010; Nation et al, 2016), through (1) systematic revisiting of vocabulary across the units of work and (2) a deliberate type of training which fosters fast and effortless (fluent) retrieval. Such training, more popular in the TEFL than in the MFL/WL world, involves engaging learners in tasks eliciting repeated processing of the target items and their recognition and/or production under taxing time constraints. Unfortunately, many teachers stop at the ‘mastery’ stage of knowing a word, i.e. when the students demonstrate a high rate of successful recall in an end-of-unit vocab test. At this stage, though, vocabulary is often not yet entrenched and, subject to the laws of decay and disuse, it will likely be forgotten a few weeks down the line unless we constantly revisit it using a wide range of tasks. Using the same task (e.g. a Quizlet), as is often done, to practise the same vocabulary set is not a good idea, since memory is context-dependent. Hence, practice with the same task enhances memory which is context-specific but not readily transferable to other tasks (the so-called Transfer Appropriate Processing principle).

Concluding remarks

Some of the above mistakes are more serious than others and may have a more long-lasting detrimental impact on vocabulary acquisition and language learning in general. Vocabulary learning being one of the most important aspects of language acquisition, teachers need to be mindful of the issues discussed in this post. The most important mistakes, in my view, pertain to four areas. Firstly, the bad habit of not contextualizing the teaching of lexis and wasting too much classroom time on discrete-word teaching (which can be flipped). Secondly, the failure to get the students to learn the words by using them orally or in interactional writing for real-life communication. Thirdly, the insufficient amount of listening practice devoted to modelling good pronunciation and, fourthly, the very limited focus devoted to decoding skills, one of the most important sets of lifelong learning skills a linguist may ever wish to have.

You can find more on vocabulary teaching and learning in my book ‘The language teacher toolkit’ , co-authored with Steve Smith and available for purchase on


13 key steps to successful vocabulary teaching.


The following are the principles that underpin vocabulary teaching in my everyday practice. The reader may want to refer to my article “How the human brain stores and organizes vocabulary and implications for the EFL/MFL classroom” for the theoretical background to my approach.

1. Select the vocabulary based on :

  • Learnability (how easy or challenging lexis is in terms of length, pronunciation, spelling, meaning, grammar, word order in a sentence etc.);
  • Frequency;
  • Relevance to students’ interests, background, culture/sub-culture;
  • Semantic relatedness (the more strongly semantically inter-related the target words are, the stronger the chances of retention)

Since I usually create my own vocabulary teaching resources (see the work-out section of or ) it is easy for me to keep track of the target lexis through each step of the lesson. If you do not make your own resources, especially if you are a novice teacher, it may be useful to draw up a list of the words you intend the students to learn to ensure that systematic recycling does occur throughout the lesson.

2. Decide which lexical items you are planning for students to learn receptively (for recognition only) and productively (for use in speech and writing) – ‘receptive learning’ being obviously easier.

3. Decide on how ‘deep’ your teaching of the target lexis is going to go. In other words, which levels of knowing a word you are going to teach. Nation (1990) identified the following dimensions of knowing a word:

Learner knows:

  1. Spoken form of a word;
  2. Written form of a word;
  3. Grammatical behavior of a word;
  4. Collocational behavior of a word;
  5. Frequency of a word;
  6. Stylistic appropriateness of a word;
  7. Concept meanings of a word;
  8. Association words have with other related words.

4. The number of words that you select per lesson will depend largely on the students you are teaching and how systematically you want the target lexis (every single item) to be recycled. There is a myth that one should teach 7+/-2 words per lesson. This rule of thumb is based on a misunderstanding of Miller’s (1956) law, which posits that Working Memory can only hold and rehearse 7+/-2 digits at any one time. But Working Memory span has nothing to do with how many words one can learn in a lesson. In my experience, with an able group (i.e. students with highly efficient working memories) one can aim at as many as 20-25 words/lexical chunks receptively (especially if the lexis includes cognates) and around 10 to 15 productively. This is on condition that the words are recycled frequently and systematically and that as many retrieval cues as possible are provided (by building in lots of semantic associations);

5. In order to avoid the risk of cross-association, avoid selecting items which are similar in sound and/or spelling with younger or less able learners;

6. Ensure that the lexical items selected include a good balance of nouns and verbs – as I discussed in previous posts, there is an unhealthy tendency in many MFL classrooms for vocabulary teaching to be noun-driven.

7. Plan for several recycling opportunities throughout the lesson through various modalities (e.g. listening, speaking, reading, writing, body gestures). Ensure students process receptively and productively each and every target lexical item 8 to 10 times during a given lesson – more if possible! Since research clearly shows that learners notice adjectives and adverbs less, greater attention should be given to these two word-classes;

8. Ensure the recycling opportunities include activities which involve:

  • Higher order thinking skills / Depth of processing (e.g. odd one out; matching with synonyms; inferencing meaning from context, etc.) – the deeper and more elaborate the level of semantic analysis of the target lexis, the stronger the chances of ensuring retention;
  • Communicative activities involving information gaps and negotiation of meaning (surveys; find someone who; find who does what; etc.)– a substantial body of research indicates that these activities significantly enhance vocabulary acquisition;
  • Work on orthography;
  • Work on phonological awareness;
  • Work on the words’ grammar;
  • Work on the words’ collocational behavior;
  • Semantic and phonetic associations with previously learnt lexis;
  • Inferential strategies (e.g. understanding texts in which the target lexis is instrumental in grasping the meaning of unfamiliar words);
  • Creating associations with each individual learner’s personal life experiences;
  • A competitive element (games) ;
  • Personal investment / self-reliance (e.g. using dictionaries; creative use of the target words).

Give as much emphasis as possible to the correct pronunciation of the target lexis from the very early stages as vocabulary recall is phonologically mediated. Also, ensure that when you teach vocabulary work is as student-centred as possible in order to maximize the level of individual cognitive investment in the learning process. Do remember that retention is more likely to occur when learning involves deep levels of processing and substantial personal investment.

9. Ensure that words are practised in context not in isolation – hence, if you are staging games or other ludic activities, ensure that they involve the processing or deployment of the target lexis within meaningful sentences. Since exposure to the target lexis through the listening medium is often neglected in the typical UK classroom, ensure that students get plenty of aural practice.

10. When using visuals ensure they are as unambiguous as possible. If using visuals to present new lexis, make sure that students are not exposed to the spelling of the words until after you have practised their pronunciation a few times;

11. Draw on the distinctiveness principle as much as possible to ensure that, through visuals, anecdotes, jokes or special effects, the most challenging vocabulary items are made to stand out and become memorable;

12. Occasionally – not in every lesson – select a strategy or set of memory strategies (e.g. the keyword technique) to model to and train students in. If you do teach memory strategies, ensure that you recycle and scaffold practice in those strategies in several subsequent lessons to keep them in the learners’ focal awareness for as long as you deem necessary for uptake to occur;

13. Plan for systematic and distributed (a little bit every day rather than a lot in one go) practice/recycling of the target lexis in homework and future lessons. Remember Ebbinghaus’ curve (figure 1, below), mapping out humans’ rate of forgetting, and set homework accordingly so as to prevent memory decay. The fact that your students WILL forget up to around 67% of what they ‘learnt’ in a lesson after one day should prompt you to plan your recycling carefully. Figure 2 shows how I do it for grammar and vocabulary, i.e. a spreadsheet that lists vertically the items to recycle and horizontally each week of the present term; the sheet allows you to keep track of how often you have been teaching a given set of vocabulary and to plan for future recycling.

Figure 1 – The rate of human forgetting


Figure 2 – Recycling tracking sheet
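
For readers who prefer a script to a spreadsheet, the layout described in step 13 (items listed vertically, weeks horizontally) can be sketched as follows. This is not my actual sheet: the vocabulary sets, the six-week term and the function names are all hypothetical, chosen only to illustrate the idea.

```python
WEEKS = 6  # hypothetical number of teaching weeks in the term

# Hypothetical log: vocabulary set -> weeks in which it was taught or recycled.
recycling_log = {
    "animals": [1, 2, 4],
    "rooms in the house": [2, 3],
    "free-time verbs": [3],
}

def build_sheet(log, weeks):
    """Return text rows: one row per item, one column per week ('x' = recycled)."""
    width = max(len(item) for item in log)
    header = " " * width + " | " + " ".join(f"W{w}" for w in range(1, weeks + 1))
    rows = [header]
    for item, taught in log.items():
        cells = " ".join(" x" if w in taught else " ." for w in range(1, weeks + 1))
        rows.append(f"{item:<{width}} | {cells}")
    return rows

def weeks_since_last_recycled(log, current_week):
    """How many weeks ago each item was last taught or recycled."""
    return {item: current_week - max(taught) for item, taught in log.items()}

print("\n".join(build_sheet(recycling_log, WEEKS)))
overdue = weeks_since_last_recycled(recycling_log, current_week=6)
print(overdue)
```

The second function is the part that helps with planning: any set that has not been revisited for several weeks is a candidate for the next homework or starter activity.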


Here are ten commonly made mistakes in vocabulary instruction that every EFL / MFL teacher ought to look out for: 10 commonly made mistakes in EFL vocabulary instruction

Does too much noun-orientated foreign language teaching hinder our students’ learning?


When one observes language lessons, browses through textbook vocab lists, schemes of work, published worksheets / Powerpoints and specialised websites, one cannot help but notice that, with very few exceptions, nouns make up the overwhelming majority of the target vocabulary. Look at the vocabulary website most widely subscribed to by British schools in the UK and around the world, for instance; it teaches hardly any verbs (as in: their meaning), adjectives, adverbs or function words. When it does, it is as part of formulaic units, set phrases such as ‘regarder la télé’, ‘je me couche’ or ‘je joue au foot’.

The emphasis on nouns is acceptable at the early stages of foreign language instruction, when the students are not familiar with verb conjugations, tenses and aspect, may not have mastered modal verbs like ‘vouloir’, ‘pouvoir’ and ‘devoir’ and have poor control over word-order and syntax in general. However, as language learners progress along the proficiency continuum and acquire more knowledge of and control over the mechanics of the language, learning verbs becomes imperative. The reader should note that I am not arguing in favour of teaching masses of verb conjugations (although I personally think it is important); what I mean here is learning the meaning of verbs, i.e. that ‘manger’ means ‘to eat’, ‘courir’ ‘to run’, ‘nettoyer’ ‘to clean’, etc.

The benefits for language learners of possessing a wide repertoire of verbs (even when one does not master conjugations perfectly) are self-evident in terms of enhanced receptive and expressive power and, consequently, more effective autonomous competence. There are also other benefits for learning which may explain how the practice of not emphasizing the learning of verbs may be correlated with lower levels of proficiency.

One benefit relates to adverbs, one word-class which is, in my experience, underrepresented in the oral and written output of foreign language learners. The learning of many adverbs goes hand in hand with exposure to / use of verbs, as adverbs are mostly used with verbs. Hence, the more frequent the exposure to / use of verbs, the greater will be the chances of learner uptake.

Another benefit relates, at least with German, French, Spanish and Italian learning, to the acquisition of another word-class: prepositions. Very often A-level students find it hard to select the correct preposition to use before infinitives. Since prepositions are not semantically salient features, they require, in order to be effectively learnt, much more emphasis than teachers usually give them. I believe that, if verbs were given more emphasis, the prepositions they usually collocate with would be acquired more effectively and effortlessly (sparing the students the hassle of having to look them up incessantly in dictionaries).

A greater benefit of a more verb-based instructional input is the greater power it gives learners to access more complex texts. Currently, most GCSE level reading and listening materials tend to be relatively poor in terms of variety of verbs – certainly poorer than authentic target language materials. Thus, course-book / materials writers are sometimes forced to ‘doctor’ authentic texts to simplify them or to create noun-ridden texts with very few unfamiliar verbs (often translated in the margin) in order to facilitate comprehension. A wider repertoire of verbs may allow foreign language students not only to produce more complex output, but also to access genuinely authentic material in the target language, with a positive wash-back effect on learning.

The greatest benefit of learning more verbs, however, relates to another aspect of verb acquisition: the mastery of conjugations and tenses. Giving more emphasis to the learning of the meaning of verbs may also result in the learners electing to use them more often in their output, thereby increasing the chances of receiving positive/negative feedback on their use (e.g. that a verb ending is wrong); the greater exposure to such feedback may increase their focus on verb endings and their declarative knowledge of verb inflections. Moreover – to go back to the point made in the previous paragraph – being able to access more complex texts will also mean greater exposure to more complex use of verbs, tenses and moods; this may bring about improvements in their own mastery of verbs, tenses and moods, especially if teachers exploit the texts effectively in the pre-reading/-listening phases.

The Cambridge International Examinations (CIE) board seems to have acknowledged the need for more verb-based instruction in their new IGCSE syllabus. In order to obtain a full score in one of the essay-writing assessment traits, the candidate now needs to produce 18 different verb forms correctly (out of 140 words). This means that only the first instance of a given verb form is counted; e.g. if the learner writes ‘je fais’ three times, it will only score one tick, whereas until last year it would have scored three. A sign that CIE acknowledges the importance of verbs as a determinant of higher proficiency? I think so.
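The ‘first instance only’ counting rule can be sketched in a few lines of Python. This is merely an illustration of the principle as described above, not the board’s actual marking procedure: the function names, the input format (a list of verb forms a marker has already judged correct) and the normalisation of case and spacing are all my own assumptions.

```python
FULL_SCORE_FORMS = 18  # number of different correct verb forms quoted above


def count_verb_ticks(correct_verb_forms):
    """Return one tick per distinct verb form (hypothetical helper).

    correct_verb_forms: verb forms already judged correct by a marker,
    listed in the order in which they appear in the essay.
    """
    seen = set()
    for form in correct_verb_forms:
        # Assumption: ignore case and extra spaces so that
        # 'Je fais' and 'je  fais' count as the same form.
        seen.add(" ".join(form.lower().split()))
    return len(seen)


def reaches_full_score(correct_verb_forms):
    """True if the essay contains enough different verb forms."""
    return count_verb_ticks(correct_verb_forms) >= FULL_SCORE_FORMS


essay_forms = ["je fais", "je fais", "je fais", "je vais", "nous avons mangé"]
print(count_verb_ticks(essay_forms))    # 3 ticks rather than 5
print(reaches_full_score(essay_forms))  # False
```

Under the old way of counting, the same essay would have earned five ticks; under the new one, repeating ‘je fais’ adds nothing, which is precisely what pushes learners towards a wider repertoire of verbs.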

And what about adjectives? I think adjectives get more emphasis than verbs, although not as much as they deserve. This is mainly because physical and character description is a topic which receives a lot of emphasis from the very beginning of language instruction in England. Moreover, the National Curriculum made it compulsory to deploy adjectives in order to attain the old Level 4. However, in the realm of adjectives, too, one cannot help but notice the narrowness of scope of the adjective pool taught in British course-books. The most highly downloaded of the worksheets I have uploaded on the TES connect website ( ) and the most visited pages on my website ( , the ‘work-outs’ section) are the ones dealing with adjectives, a sign that teachers do feel straitjacketed by the textbook in use.

What is the way forward? The implications for teaching and learning are obvious:

  • Course-books and teachers should place much more emphasis than they currently do on verbs in terms of the comprehensible input they provide and of the output they intend to elicit from students.
  • Verbs (as in: their meaning) should be explicitly taught and practised extensively across all topics, as often as possible, not necessarily in their full conjugations (although it would be desirable). The remark made by some colleagues that it is difficult to find visual stimuli to associate to all verbs when presenting them is not necessarily true. The verb trainer section of demonstrates that this is not the case.
  • Listening and reading comprehension activities should be based on texts containing as wide a range of verbs as possible. The use of parallel texts of the kind found on can be very helpful in this respect, too. Many of the activities on provide great practice, too.
  • Verbs should not be taught as discrete units only; their use should be modelled through as wide as possible a range of contexts. This does not mean simply teaching them as part of unanalyzed lexical chunks; learners need to learn how to use the verbs flexibly across contexts.
  • When the students are developmentally ready, they should receive practice in conjugating verbs through a mix of activities involving self-marking online conjugators (e.g. or ) and gap-activities / translations (scores of these can be found on or on the excellent ).
  • Adjectives and function words (e.g. prepositions and adverbs) should receive more emphasis and be extensively recycled, too.

In conclusion, nouns seem to dominate a lot of foreign language learning, mainly, I suspect, because they play a crucial role in the comprehension of input and in communication. But verbs are equally important; just as in first language acquisition, a rich and accurate deployment of verbs is a marker of higher proficiency and allows learners to engage with more complex texts and to produce higher level language. The recommendations I put forward above are based on common sense and on my experience as a teacher and do benefit learners, especially in the long term. After all, many of the post-GCSE gaps that teachers have to bridge at A-level relate to a great extent to the curriculum deficits highlighted in this article.

How lexis is stored and organized in our brains and implications for the MFL classroom

During my MA TEFL at Reading University, nearly 20 years ago, I stumbled upon a book called ‘Words in the Mind’ by Aitchison (1986; but latest edition: 2012). That book changed the way I teach vocabulary forever because understanding the way our brain stores, organizes and forgets the words we learn meant being able to come up with strategies to speed up and consolidate lexical learning. In this article, I intend to share some of the knowledge I acquired from that book and through many other subsequent readings (e.g. McCarthy, 1990; Eysenck, 2000; Nation, 2001; Macaro, 2007) and how it can enhance L2 vocabulary acquisition. Although I intend to discuss the implications for the classroom, I will do so very concisely, for reasons of space, and will elaborate on them in a future article. Before discussing how vocabulary is organized in Long-term Memory (LTM), one needs to understand a few important facts about it.

1. Long-term memory (LTM) and Spread of Activation

As you may know, once information is learnt, it is stored in LTM, a vast neural network connecting every single piece of information we have acquired in our lives. Thus, in actual fact, our LTM makes us what we are as it contains all our emotional and sensorial experiences, every cognitive and motor skill we have learnt and, basically, all we know about the world, including lexis and grammar rules.

The LTM ‘space’ where we store lexical items is referred to as the ‘Mental Lexicon’. Contrary to what scientists believed in the past, any information that makes it to LTM is stored there permanently, and forgetting does not occur due to decay of the memory trace (see below).

When we need to translate a given ‘thought’ (or ‘proposition’, as psycholinguists call it) into words, the brain fires electrical impulses which travel at very high speed through LTM’s neural pathways in search of the words that match that thought. During this process, every single word associated with that thought receives activation.

2.1 How first language words are organized in our brain

When a lexical item is stored in LTM, the brain does not place it in just any random place along our neural networks. Insight from research on the slip-of-the-tongue phenomenon and aphasia indicates that the neural connections between the lexical items in our mental lexicon are determined by specific associative mechanisms which involve the physical aspect of a word as well as the metalinguistic, semantic, sociolinguistic and emotional domains.

2.2 Physical associations

Words are associated at the ‘physical level’ based on their spelling (graphemic level) and sound (phonological representation). Thus, words that look and sound similar (alliterate, rhyme and chime with each other) are more likely to be very strongly associated. Consequently, when our brain (our Working Memory) attempts to retrieve the word ‘dog’ from LTM, for example, and activation spreads in order to ‘fetch’ it, all the monosyllabic words starting with ‘d’ and ending in ‘g’ will receive strong activation (e.g. Doug, dig, dug, etc.). Interestingly, even the anagram of ‘dog’, ‘god’, will be highly activated.

This phenomenon explains slip-of-the-tongue errors, which are basically ‘computing mistakes’, often due to processing inefficiency, whereby instead of retrieving the word we need, we retrieve a ‘near homophone’. That’s why alliterations, rhymes, para-rhymes and other phonetic devices used by prose writers and poets are so effective: they reinforce the impact of two words in a text which are already related in terms of meaning and which receive greater emphasis through their phonological connection.

2.3 Semantic Association (Field theory)

Words are very strongly linked to each other based on their meaning (Field theory). Synonyms and other words that refer to items frequently associated in real life will also receive strong activation during the retrieval process. Going back to the ‘dog’ example, words like ‘pet’, ‘bone’, ‘puppy’, ‘tail’ and ‘bite’, amongst others, will be activated during the retrieval process, each receiving more or less activation in our brain depending on: (1) how often I will have processed (receptively or productively) those words in conjunction with the word ‘dog’ in the past; (2) how frequently, in my personal life, the items those words refer to are associated with the notion of ‘dog’.

Semantic associations will also be affected by the connotative meaning that a specific culture or sub-culture attaches to a word. Thus, whereas the word ‘fox’ is associated both in Italian and English with the notion of ‘shrewdness’ and consequently with the related nouns and adjectives, the word ‘chicken’ will be related to cowardice in English but to gullibility in Italian.

2.4 Linguistic context

This point sort of relates to the previous one but deserves separate treatment because it specifically refers to the linguistic contexts in which two or three given words are used in a specific language and which may differ across languages. So, for instance, the word ‘dog’ will bring about different associations in an English native speaker’s brain compared to, say, an Italian native speaker’s by virtue of the set phrases/idioms in which it is found in each language. An English person will associate ‘dog’ with the phrase ‘a dog’s life’ or ‘to work like a dog’, for example; an Italian, on the other hand, will associate it with the idiom ‘solo come un cane’ (‘as lonely as a dog’) or ‘fa un freddo cane’ (‘it’s freezing’ or literally: ‘it’s dog cold’).

2.5 Word-class

Words are also organized by word-class, adjectives with adjectives, nouns with nouns, etc.

2.6 Emotional and sensorial connections

Every lexical item is also strongly associated with personal experiences and memories stored in our Episodic Memory. So if we had a very traumatic experience in our life which involved a dog (being bitten or scared by one when we were small, for example), ‘dog’ will evoke strong negative emotions, and words describing objects, people or feelings related to that traumatic experience will receive strong activation.

Words will also be associated with sensorial perceptions (taste, smell, images, etc.)  based on one’s life experiences.

3. The foreign language mental lexicon

In a fluent foreign language learner with a sizeable vocabulary repertoire, the way words are stored in their L2 mental lexicon will be pretty much the same, except that there is another very important association, the one between an L2 word and its L1 (and L3, L4, etc.) translation(s). So the word ‘dog’ in the brain of a speaker of Italian, French and German will be connected with the words ‘chien’, ‘cane’, ‘Hund’, etc. Consequently, when spread of activation occurs in search of the word for ‘dog’ in one language, say French, all the words in the other languages will be activated too (Parallel activation theory); all languages one speaks will be activated simultaneously with different levels of activation, with the language in use being the most activated and the weaker language(s) being the least activated. This explains the phenomenon whereby some learners, when experiencing cognitive-processing issues in the target language, will retrieve an L1 word instead of its target language equivalent.

When the foreign language learner is not fluent, there will be fewer L2-to-L2 word connections, as the mental lexicon will be smaller and many of the other connections that we discussed above might not have been formed as yet – since the learner might not have internalized the word-class of all the words they acquired and/or their meaning might be fuzzy. This means that when spread of activation occurs, fewer linguistic items will be activated.

The fact that in a less fluent learner with a relatively small vocabulary repertoire there are fewer and weaker connections of the kind outlined above, and therefore fewer neural pathways, majorly affects recall, in that the more connections we have, the more likely we are to retrieve any word we need successfully and with little cost to Working Memory efficiency. Why? Because the successful retrieval of a word depends on two factors: (a) the strength of the memory trace, that is, how often we have processed that word, and (b) the use of an effective cue which helps Working Memory find that information in the brain; the more connections a word has with other information stored in LTM, the greater the chances of its successful recall will be.

4. How forgetting happens

In order to better understand the implications for teaching and learning one needs to be familiar with the notion of ‘Cue-dependent forgetting’.

4.1 Cue-dependent forgetting

The reason why we often fail to retrieve a word that we learnt usually has less to do with a weakening of the memory trace than with a failure to find that word. The factors that determine such failure relate to the context in which that word was encoded (‘learnt’), as that very context provides the cues crucial to its retrieval. For example: if we learn a word highlighted in red on our teacher’s whiteboard whilst sitting near a specific classmate, the colour red, the teacher’s whiteboard and that classmate have the potential to be effective retrieval cues for that word. The absence of these three factors may prevent recall of the same word.

In the context of vocabulary learning, this implies that the more associations are created by the foreign language learner in learning a word, the more likely s/he will be to remember it, because each association will have the potential to serve as a retrieval cue.
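The logic of this point can be made concrete with a deliberately simplistic Python sketch. It is a toy illustration of cue overlap, not a model of how memory actually works; the cue labels, the ‘exam hall’ scenario and the function name `usable_cues` are all invented for the purpose.

```python
# Toy illustration only: a word is stored together with the cues that
# were present when it was encoded ('learnt').
sparse_encoding = {"red ink", "whiteboard", "classmate nearby"}

# The same word, learnt with extra associations deliberately created
# by the learner (hypothetical examples).
rich_encoding = sparse_encoding | {
    "rhyme with a known word",
    "picture",
    "personal memory",
}


def usable_cues(encoded_cues, retrieval_context):
    """Return the encoding cues that are also available at retrieval time."""
    return encoded_cues & retrieval_context


# A new context in which none of the original classroom cues is present.
exam_hall = {"picture", "black ink"}

print(usable_cues(sparse_encoding, exam_hall))  # set()
print(usable_cues(rich_encoding, exam_hall))    # {'picture'}
```

With only the three classroom cues, nothing in the new context can trigger the word; with the richer encoding, at least one association survives the change of context.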

4.2 Forgetting from consolidation

Another possible reason why we forget is that when we take in new information, a certain amount of time is necessary for changes to the nervous system to take place – the consolidation process – so that it is properly recorded. If this consolidation process is not completed we will lose the information. As I have already pointed out in my article ‘The fundamentals of vocabulary teaching’ (elsewhere on this blog), without rehearsal of the target vocabulary, 60% of it will be forgotten within 48 hours of having ‘learnt’ it. For this reason we need to recycle the information over and over again until this information is stored permanently in LTM.

5. Pedagogic implications

In view of the way words are organized in our brain, these may be some useful teaching strategies:

  • In any given lesson we ought to teach words that are as closely related as possible at semantic and grammatical level. This is often done by textbooks.
  • When teaching new words, in order to facilitate their storage and recall, teachers should try as much as possible to hook them to previously learnt lexis which alliterates, chimes or rhymes with the new vocabulary. This can be turned into a game whereby students are given the task to find (under time constraints) a rhyming or alliterating word for the new target vocabulary;
  • We should also ensure that, from the early stages of acquisition, students are aware of the word class an item belongs to. This will provide the learner with an added retrieval cue in the recall process. For instance, students could be asked to categorize the target words into Adjectives, Nouns, Adverbs, etc. or to brainstorm as many of the words they learnt on the day as possible in those categories;
  • As many opportunities as possible should be found for learners to relate words, especially the challenging ones, to their personal and emotional life. For instance, whilst learning colours the students may be asked to match each colour to an emotion or physical state. Or, when learning food, ask the learners to say which fruit, pastry, drink, etc. they identify with and why (e.g. a ‘raviolo’ because I am full of goodness);
  • The learners should also be involved in activities requiring them to perform more elaborate semantic associations (deep processing) between the new target vocabulary and previously learnt lexis. For instance, by asking students to create ‘lexical chains’, i.e. given two words quite far apart in meaning, learners need to produce an associative chain of lexis that links those two items logically or pseudo-logically (for example: old lady, cats, cat food, cans, aluminium, factories, pollution). This activity can be fun and does not require knowledge of complex vocabulary;
  • Activities involving semantic analysis of words, such as odd one out, definitions games, sorting vocabulary into semantic categories, matching lexical items of similar or opposite meanings, should also be performed as they create further associations, although less explicit than the ones envisaged in the previous point (see for self-marking online examples of these);
  • Teachers should be careful when teaching cognates that are graphemically or phonologically very close in the two languages. L2 cognates of this sort can be ‘tricky’ as they are so closely associated with their L1 translation that they can give rise, under processing-inefficiency conditions, to the retrieval of the first language form. I often experience this phenomenon (called cross-association) myself when speaking or writing in Spanish;
  • Finally – this has more to do with forgetting than word storage – teachers and learners should ensure they go back/recycle the target vocabulary across as many contexts as possible and as often as possible until it has been fully acquired – especially during the two days following the initial uptake, when most of the forgetting usually occurs.

As I have already mentioned above, I will discuss the classroom implications in greater depth in a future blog post in which I will also suggest a vast array of vocabulary building activities.

Useful follow-up to this article can be found here:

(1) Ten commonly made mistakes in L2-vocabulary instruction 

(2) Thirteen steps to effective vocabulary instruction 

Please note: more on the above can also be found in the book I have recently co-authored with Steve Smith: The Language Teacher Toolkit, available for purchase at 

The fundamentals of L2 vocabulary teaching

This article aims to answer the following questions:

  • What does ‘learning a word’ actually mean? When can we be satisfied that a student has actually learnt a given vocabulary item?
  • How can we enhance our students’ recall of the target vocabulary? How can we ensure that they do not forget what we taught one, five, ten and twenty lessons ago?
  • How can we effectively embed vocabulary instruction in the teaching of morphology and syntax? How can one ensure that vocabulary learning does not take over and that the whole lesson is not simply about learning to recognize or, at best, recall lexical items in isolation, but also about deploying them through a range of functional and notional contexts in ways which are communicatively effective as well as morphologically and syntactically accurate?

1. What do we mean by ‘learning a word’?

1.1 Levels of vocabulary acquisition

Learning a word or lexical phrase involves more than memorizing its spelling, pronunciation and denotative meaning if one aims to use that word or phrase effectively in the real world (e.g. when grappling with an online article, attempting to understand a native speaker talking to you in the streets of Paris or when writing an application letter overseas). Acquiring an L1 or L2 lexical item, be it a word or a lexical phrase, also involves mastering the following:

  • its morphology (e.g. if it is a French or Italian adjective, how is it affected by the gender and number of the subject?)
  • its word class (e.g. being aware that a word is a noun rather than a verb)
  • its other (denotative) meanings – in the case of polysemic words (e.g. ‘macchina’ in Italian means ‘machine’ but also ‘car’)
  • any connotative meaning that word may have (e.g. in English, is ‘chicken’ being used to refer to an animal, or is it used metaphorically to describe a coward?)
  • its collocation(s) (e.g. when learning the French for ‘go horse-riding’ one must be aware of the fact that in French ‘horse-riding’ is preceded by the verb “Faire”, to do, rather than “Aller”, to go)
  • how the meaning of a lexical item changes when it is used in combination with other words (e.g. as part of an idiomatic phrase)
  • its register, that is, knowing in which contexts it is appropriate or inappropriate to use a given word
  • any cultural ‘value added’ (e.g. knowing that a ‘cow’ in India is considered a sacred animal)

Hence, the practice of teaching words as discrete items, divorced from any communicative and cultural context, is not only limited but often flawed and misleading. This is less the case with denotative words such as ‘chair’ or ‘football’ than with words which are polysemic and/or loaded with connotative meaning(s). In what follows I shall focus first on how to maximize the recall of the basic aspects of vocabulary acquisition, that is, the memorization of the denotative meaning of a lexical item and of its spelling and pronunciation. In the second section of this paper I will concentrate on how to deal with the higher order levels of lexical learning.

1.2 Recognition vs Recall

Vocabulary acquisition goes through two stages. The first stage of acquisition is when the learner can recognize the word through its audio and/or visual representation. The second stage involves being able to recall the lexical item and reproduce it verbally, either in its oral or written form.

Implications for the MFL classroom: in planning a lesson and defining the outcomes, decide which vocabulary items you intend the students to store in their mental lexicon as receptive vocabulary and which ones you want them to use actively and with what degree of accuracy, in their speaking and writing.

1.3 Level of accuracy and processing efficiency

Accuracy and speed of retrieval are two other very important dimensions of vocabulary acquisition. The faster an individual can retrieve the correct L2 word from Long-term Memory, the more fluent and effective s/he will be in communicating the intended meaning. A vocabulary item that is, so to speak, ‘fully’ acquired will be retrieved by the learner without hesitation and with little cost in terms of Working Memory processing efficiency across a wide range of contexts. Obviously, recalling an item in isolation at relatively high speed and with good accuracy is easier than doing that whilst you are holding a conversation across various topics. For learners, novices especially, it can be a very tall order.

Implications for the MFL classroom: make sure you include in your lessons/units of work plenty of opportunities for students to practise words in as many contexts as possible.

2. The fundamentals of vocabulary teaching.

2.1 Recycling and reviewing

Figure 1, below, illustrates very clearly why recycling is important. The human rate of forgetting is such that we already lose around 25% of what we attempt to commit to memory 30 minutes after having rehearsed it in Working Memory (henceforth WM). Seven days later, if no regular reinforcement has occurred, we will have lost 80% of it. One month later, we will have forgotten virtually everything. That is why distributed practice is important and constitutes a more powerful way to consolidate memory than massed practice (i.e. four sessions of 15 minutes a week, every other day, are better than two sessions of thirty minutes two days apart).
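The comparison in brackets above can be laid out as a small Python sketch. It does not model forgetting; it simply shows that the two plans cost the same amount of time while differing in how many times, and over how many days, the words are revisited. The day numbers (0 = first session) are my own reading of the example and the function name `summarise` is invented.

```python
# Each plan is a list of (day, minutes) sessions; day 0 is the first session.
distributed = [(0, 15), (2, 15), (4, 15), (6, 15)]  # every other day
massed = [(0, 30), (2, 30)]                         # two days apart


def summarise(plan):
    """Return total time, number of sessions and days covered by a plan."""
    days = [day for day, _ in plan]
    return {
        "total_minutes": sum(minutes for _, minutes in plan),
        "sessions": len(plan),
        "span_days": max(days) - min(days),
    }


print(summarise(distributed))
# {'total_minutes': 60, 'sessions': 4, 'span_days': 6}
print(summarise(massed))
# {'total_minutes': 60, 'sessions': 2, 'span_days': 2}
```

Both plans take an hour, but the distributed one gives the learner twice as many occasions to retrieve the words, spread across the whole week in which most of the forgetting would otherwise occur.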

Implications for the classroom: Well, first of all, one should ensure that the words within a given lesson are recycled over and over again with several mini-check points every now and then to verify uptake and identify problem areas. Secondly, the students must be given plenty of opportunities to practise those words at home. Thirdly, words should be methodically recycled not just within the same unit, but across units – although this seems pretty obvious, I have rarely seen this happen in any of the institutions I have worked at: whenever one has completed a unit of work, the items taught in that unit should constantly and systematically be revisited in the context of every single unit of work to come. For instance, if we have just covered ‘staying healthy’ and we are moving to the topic of ‘holidays’, we could recycle some of the health-related vocabulary just learnt by discussing whether the food at the hotel the students were staying at was healthy and why; how the hotel’s menu could be made healthier; how healthy the students were during the holiday and what they are planning to do to get back into shape after two or three weeks of reckless eating and drinking, etc.

2.2 Factors facilitating recall

According to research, an exceptionally able student needs to have processed a word 4 times to learn it at its most basic level, an average student 14-15 times. However, although the Latins used to say ‘Repetita juvant’, “repetitions help”, it is not simply how many times one comes across or repeats aloud a vocabulary item which seems to be crucial in enhancing its recall. The following factors play a very important role in determining how efficient and effective memorization will be.

1. Shallow vs Deep processing – The more complex the cognitive operations involved in the learning process, i.e. the deeper the processing, the stronger the memory trace will be. On the shallow-to-deep processing continuum we find, at one extreme, repeating word-lists aloud – the most classical example of shallow rehearsal. At the other end of the spectrum we find problem-solving activities where the brain has to think laterally and creatively (e.g. creating a complex mnemonic). Examples of problem-solving activities commonly found in textbooks are sorting/categorizing activities, odd man out, riddles, definition games, etc. ( has a great variety of these).

Implications for the MFL classroom: the foreign language teacher should try as much as possible to involve students in forms of deeper processing in order to speed up the learning process. This means going beyond the textbook page, as very few MFL textbooks designed for the British curriculum provide sufficient recycling for the words they aim to teach (that is why I created : )

2. Spread of associations – Human forgetting is often cue-dependent; that is to say, the words may be in our Long Term Memory (henceforth LTM), but we have lost the ‘access code’, so to speak, to get to them. Research has clearly shown that the greater the number of associations/connections that we create at the physical (e.g. graphemic, phonemic, etc.), semantic and emotional level with pre-existing material in LTM, the greater the chances will be for the target item to be retrieved successfully and efficiently in the future. The explanation for this is that when WM (Working Memory) is trying to fish out the word we need from LTM, all the words related to it in meaning, spelling, sound and word class become activated automatically, especially those that are closer in meaning and start and end with the same letters.

Implications for the MFL classroom: teachers should include as many opportunities as possible in their lessons for new vocabulary to be linked to previously learnt vocabulary so as to create as many connections as possible. The more elaborate (deeper) the connections, the better. Point 3, below, mentions other forms of associations which widen the range of possible connections we can make.

3. Synergy of stimuli – empirical evidence has shown (e.g. Paivio, 1981) that using different stimuli synergistically to appeal to various senses simultaneously may enhance recall. This is why a lot of us have used or still use flashcards. But this explains also why videos, by combining sound, images and often the spelling of the target words are even more powerful. Getting the students to respond to a video introducing new language items by emulating the movements they see on the screen would enhance the power of the video a notch further.

Implications for the MFL classroom – (1) On presenting words with denotative meaning for the first time, try to use a video combining sound, picture and written form of the word; (2) When a word appears challenging, get the students to create a mnemonic which combines as many stimuli as possible. For instance, for the Italian word ‘occhiali’ (= glasses), one could picture in one’s mind a big pair of OCCHI (= eyes) with ALI (= wings) flying towards a pair of glasses and choose a suitable background music. I have used this technique personally a few times and it has always been very effective.

4. Distinctiveness – distinctiveness refers to whatever makes the encoding (learning) of a given item in LTM ‘stand out’, ‘special’, more ‘vivid’. The factors making an item distinctive could be purely accidental (e.g. the teacher fell from a chair whilst teaching that item); intrinsic to that item (e.g. the target L2 word sounds funny, or like a swear word in one’s native language); or related to personal, emotional circumstances surrounding the learning of that item that make it stand out (e.g. the teacher showed a picture whilst teaching that item, which evoked personal memories or triggered some strong emotions in the learner).

Implications for the MFL classroom: teachers should try to make the presentation of more challenging words as memorable as possible and/or teach the students to make them so, as they try to learn them independently, by associating them with (a) powerful images; (b) items or situations in their lives which stir strong emotions; (c) humorous anecdotes, etc. The way we pronounce words as we model their pronunciation can make a huge difference in terms of their distinctiveness, too. It is not rare for students to complain about how dull the voice of their teacher or of the actor on the recording is – surely, dullness is the antonym of distinctiveness.

5. Personal/affective investment – this refers to the processing of the to-be-learnt item that taps into our affective world, our own personal experiences related to it and its relevance to our lives.

Implications for the MFL classroom: include activities which involve a degree of personal response. For instance, when teaching adjectives, ask the students to use them to describe their best friend, favourite cousin, pets, etc.

6. Target-item learnability – One dimension of a word’s learnability refers to the intrinsic challenges posed by the word to Working Memory (WM). A word is more difficult to process, and therefore to learn, when it is hard to pronounce (Baddeley, 2005); when it is so similar to an L1 item as to cause ‘cross-association’; and when it is long (WM capacity is quite limited, as it can only process between 5 and 9 items at any one time: Miller’s ‘magic number’). Other threats to learnability arise when the target items are not seen by the learners as relevant to their interests/goals and when their meaning is fuzzy or unclear. Moreover, abstract words, which are more connotative in meaning, generally tend to be less easily learnt. The word class the item falls into will also affect its learnability; for instance, function words (e.g. prepositions, indirect object pronouns, etc.) are going to be harder to recall as they are less semantically salient. Finally, the extent to which the words taught in a lesson are semantically related will affect their intrinsic learnability.

Implications for the MFL classroom: (1) when selecting which vocabulary items to teach, consider the threats to learnability posed by the first language of the student and devise some strategy to enhance their learnability using the tips above; (2) when more than one word exists in the target language for an item, choose the one that is more learnable, especially if it is more frequent than the others anyway; (3) you may want to teach the learners to break up longer and/or more challenging words into chunks in order to make it easier and more efficient for the articulatory loop in WM to process the item; when possible, break the word up into chunks which resemble words in the students’ first or second language.

7. Focal attention on the target item – although this factor is the most important, I left it for last because it is the most obvious of them all: for any effective learning to occur, the students must be focused on the target stimulus. All of the above will be meaningless if the students are distracted, as interference during rehearsal is the most lethal cause of forgetting. The most important fact to note is that any given piece of information does not last in WM for longer than 15-30 seconds without rehearsal; if any disruption to attention occurs and no further rehearsal of that item takes place, forgetting by interference will occur.

Implications for the MFL classroom: obvious, but not easy: make your teaching as motivating, engaging and stimulating as possible.
(to be continued)

Five things I do when I correct my students’ essays

My Ph.D. study, Conti (2004) (as cited in Macaro, 2004 and 2005; Ko Yin Sun, 2009; Goonshooly, 2012; Barjesteh, 2014; Cohen and Macaro, 2014, etc.), has provided me with great insight into the strengths and limitations of error correction. The following is a very concise list of what I believe to be the most important strategies to deploy in the treatment of surface-level errors in foreign language writing.

0. Caveat

Please note, this is something teachers can afford to do when they have a relatively light timetable or with specific students who are particularly problematic and need a lot of attention. I wouldn’t recommend this approach with every single class and student of yours as it is very time-consuming. In language instruction the focus should be on teaching, not on fixing.

1. Focus on the most important issues

There is no point in focusing on every single error you find in your students’ writing when you are giving individual feedback on their essays. There is only so much attention a student can invest in the remedial learning process. Select only a few errors (3 to 5) at a time using the following criteria:

  1. Errors that can be treated – no point in focusing an absolute beginner learner on mistakes involving the use of the pluperfect … Only treat errors for which the learner is developmentally ready;
  2. Errors that seriously impact understanding – these errors are the most important to deal with as they mislead the reader;
  3. Errors that keep recurring and seem impervious to correction – these errors need a lot of attention because once an error is fossilized it is very difficult to eradicate. Since these errors require a lot of work, try and prioritize the ones which, in your professional judgement, are more important (e.g. the ones that might penalize a student in a forthcoming exam);
  4. Errors that the learner would like to eradicate – it is my belief, controversial amongst some colleagues, that the learner should have a say as to what they should address in their remedial learning process. The rationale for this is that, since s/he is going to be the main agent in this process, the fact that s/he chooses which errors to target may enhance their intentionality to eradicate error.

Do not address, in individual feedback, errors that are common to most of the class, as they can be the focus of a series of remedial lessons for the class as a whole.

2. Find out what the root causes of error REALLY are

One common mistake teachers make in the corrective process is to give all errors the same blanket treatment (be it direct/indirect correction with or without explanation or editing instruction) as if they were all caused by the same cognitive process(es). It is a bit like what some doctors do when they prescribe a broad-spectrum antibiotic for any kind of infection.

Errors can be caused by either (a) a declarative knowledge failure (the learner does not know the rule) or (b) a procedural knowledge failure (the learner does know the rule and can self-correct, but did not apply it correctly or forgot to apply it in a given context because of processing inefficiency issues – e.g. cognitive overload, interference, etc.). It is important to identify the correct source of an error before dismissing it as a ‘careless’ mistake. There is usually more to an error than meets the eye.

In my study I used a number of research tools to investigate my subjects’ errors and the best one was definitely asking the students to edit the essays they wrote under think-aloud protocol conditions (i.e. they verbalized their thoughts as they attempted to correct). The knowledge I gained from that process was crucial to the success of my error treatment experiment.

3. ‘Make it personal’

In my opinion, like any other type of instruction, error correction is greatly enhanced by making it as personal a process as possible, especially when we are dealing with weaker and/or less confident learners. One-to-one conferences are the best way to start the never-ending dialogue between teacher and student that the corrective process should really become. Using the page or the audio track as an interface between the student and the teacher makes the process much more distant and impersonal; human contact, on the other hand, especially in the presence of judiciously gauged motivational feedback, can do wonders for a student’s self-efficacy and intentionality.

Let us not forget that the teacher’s role in the success of any remedial learning is crucial, just as it is in any other kind of instruction. I often use the analogy of the person who wants to lose/gain weight in the gym. If you look at the rates of people who carry on training after the first three or four sessions, those with a personal trainer/life coach are less likely to drop out by a whopping 50%! Why? Because a lot of us need encouragement, reminders, praise and, sometimes, a good telling-off…

When embarking on the remediation process, the teacher needs to take on a role akin to that of a ‘personal trainer’ since, as I shall explain below, errors are not eradicated in one go; it may take months or, in certain cases, when an error is fossilized, even years. S/he will have to remind, prod, encourage and push the learner to keep working on the target mistakes.

It goes without saying that like every personal trainer the corrector must be inspiring and empathetic both emotionally and cognitively with the correctee.

4. Ensure there is a serious and sustained cognitive investment on the learner’s part

Several studies, including mine, have identified lack of student cognitive/personal investment in the error treatment as a major determinant of the failure of corrective interventions. Student writers often do not look at the teacher’s corrective feedback and, when they do, they engage with it superficially and do not follow it up. Teachers often do the same: they do a one-off remedial lesson on finding masses of students making the same mistake, then they move on. What I learnt in the course of my investigation is that for error to be eradicated (as mentioned above) both teachers and students must work hard. The students must put a lot of effort into the process at many levels: research, self-study, writing practice, self-monitoring and introspection.

Scaffolding the feedback-handling process in order to involve the student actively is crucial in this respect. Feedback-handling activities that students may be asked to perform on receiving feedback include: explaining the teacher’s correction; hypothesizing why the mistake was made; describing the rule that was broken; producing student-generated examples of that rule across various contexts; producing a mini-lesson to deliver to a group of peers, etc.

In my study, all of my informants reported drawing great benefits from such activities, as they enhanced their self-knowledge as to the mistakes they were more likely to make, to the point that they reported looking for those mistakes without much thinking before handing in their written pieces.

Another ingenious way of involving the students in the corrective process is to ask them to step in before the essay is even completed and the feedback given. How? By asking them to annotate in the margin, whilst writing the essay, any doubt they may have about the deployment of a grammar structure or lexical item. I use this technique a lot and it pays great dividends. This technique, which I call LIFT (Learner Initiated Feedback Technique), is dealt with in greater detail in a dedicated post of mine on this blog (‘L.I.F.T. – an effective writing-proficiency and metacognition enhancer’).

5. Provide extensive practice

Many interventionist studies which involved editing instruction have failed, whilst others have succeeded in enhancing grammatical and/or lexical accuracy, depending on their duration and intensity. As already hinted above, learners need extensive practice to eradicate the target errors. Why? Because in the learners’ Interlanguage system the wrong and correct representations of a grammar rule that has not been fully or correctly learnt coexist and often have equal weight (or, when the wrong form is fossilized, the wrong one will have greater weight). This entails that when the brain needs to apply that specific grammar structure, the correct and the incorrect representations will compete for retrieval. Extensive practice (highly monitored at the beginning) is required for the correct representation of the rule to acquire greater weight, until its memory trace has become strong enough to win the retrieval ‘competition’.

The extensive practice envisaged should occur:

  1. across a wide range of semantic contexts;
  2. in syntactically simple sentences to start with, moving gradually to more complex and longer chunks of text;
  3. in highly monitored performances (such as non-timed essays/translations) to start with and at a later stage, in the context of less monitored ones (such as timed essays or oral conversation).

Teachers are very busy people and one cannot always do all of the above as well as one would like to. However, these strategies can make a serious difference, in my personal experience, when applied to error treatment consistently. If you do not have the time to do all of the above with every single student you teach, I suggest implementing these strategies at least with the neediest of your learners, or with the ones that currently, in your opinion, are not gaining much benefit from your corrective feedback.

I deal with the issue of correction much more extensively in a research-based article of mine (‘Why teachers should not bother correcting errors in their students’ writing (not the traditional way at least)’).

More on this topic in the book I co-authored with Steve Smith: ‘The Language Teacher Toolkit’.