There is no blogpost of mine which does not mention Working Memory (WM) at some point. Why? Because effective language processing and learning largely depend on how well Working Memory performs. In fact, apart from automatic processes – which bypass WM’s attentional control – all conscious processing of information (visual, auditory, etc.) occurring in the human brain is performed by WM. Whether our students are reading or listening to target language input, translating a passage into French, planning an essay or performing an oral task, it will be WM that does most or all of the work.
Let us consider reading a target language text. It is WM that matches any lexis in the text with its meaning (by retrieving it from Long Term Memory). And what if we struggle with that text? Every single operation the brain performs in an attempt to decode will take place in WM, too. In the case of vocabulary learning, any rehearsal we perform in an attempt to commit the words/phrases we are trying to learn to Long-term Memory (e.g. repeating aloud) will be performed in WM, which will temporarily hold that information for as long as we repeat it. In speaking and writing, all the operations involved in ‘translating’ ideas (or ‘propositions’ as psychologists call them) into words and evaluating their accuracy will occur in WM, too.
These are but a few examples of how cognition occurs in WM. With the above in mind, it goes without saying that knowing how WM works can help foreign language instructors devise strategies to teach more effectively. The following are eight important facts about WM and their implications for L2 learning that all foreign language teachers should bear in mind when planning and delivering the curriculum, assessing and providing feedback on learner performance.
- The structure of WM
As the picture below shows, WM, which is located in the prefrontal cortex of the brain, is made up of three main components:
- A visuospatial (i.e. Graphic/Visual) sketchpad which activates areas near the visual cortex of the brain and allows us to hold images, including the graphic images of words, ‘alive’ in WM so that they are available for processing;
- A phonological loop which uses Broca’s area as a kind of ‘inner voice’ that repeats word sounds to hold them in WM;
- A central executive which regulates the flow of information in and out of the phonological loop and the visuospatial sketchpad, whether that information comes from the perceptual organs or from Long-Term Memory. The central executive is basically in charge of orchestrating all the processes occurring in WM.
So, for example, when we read a target language word or phrase, the visuospatial sketchpad will hold its graphic image, the phonological loop its sound (if we are pronouncing it) and the central executive will match it to any existing information in Long-term Memory in an attempt to make sense of it. If a match is found, the process will stop there; otherwise, if the word/phrase is new, the central executive will call upon a range of interpretive processes as well as resources from Long-Term Memory in order to attempt to decode it.
2.1. There are two distinct memory systems in the human brain
WM is one of the two systems which memory is made of. The other one, Long-Term Memory, is the ‘place’ along the brain’s neural networks where memories are stored permanently and cannot be deleted except by disease, physical damage or intervention affecting the prefrontal cortex. It is after rehearsal in WM that information passes into Long-Term Memory.
2.2. WM is a temporary storage ‘facility’
Whether it is processing input from the outside world or retrieving material from Long-Term Memory, WM will hold any information only for a few seconds. After that, spontaneous decay will set in, unless one makes a conscious effort to keep it there by focusing a considerable amount of one’s attentional resources on it through what we call ‘rehearsal’ (shallow or deep). Distinctiveness (how much it stands out) and high relevance (how much it matters to us) of input can also result in the stimulus staying in WM longer. This has enormous implications for foreign language instruction and learning across all macro-skills and for any teaching in general.
Take, for example, oral recasts: the teacher responds to a student’s erroneous utterance by interrupting his/her conversational flow and recasting (i.e. reformulating) the utterance correctly. At that point, the student has only a few seconds to notice, process and make sense of the teacher’s correction (as the correction will very soon decay from Working Memory) whilst s/he is supposed to restart the conversation or attend to another student’s input. Research shows that this is unlikely to result in learning unless the student has a much bigger and more efficient WM than average. Should teachers stop recasting? Maybe so, reserving any feedback on, or treatment of, the errors noticed in learner input for later on in the lesson.
Another implication refers to listening. Often MFL students sit through listening tasks which require them to identify details in a text spoken at native speaker speed. With the above in mind it is clear how this task can be a very tall order for novice-to-intermediate learners, as they have to hold on to information they hear by actively rehearsing it (through the phonological loop) to prevent decay whilst the listening track is still playing. Being a listening task, the learner’s WM will be rehearsing it by engaging the phonological loop; thus, if the learner’s pronunciation is not too good, s/he will find it very hard to rehearse the information s/he hears thereby slowing down the whole process. Hence the need for teachers to implement approaches to listening instruction which lessen the cognitive load on learners (e.g. narrow listening) and include focus on micro-listening skills (see my article on micro-listening enhancers).
There are obviously many more implications for teachers as far as the temporariness of WM storage is concerned, too many to deal with in this article. The most important relates to the issue of distinctiveness of teacher input: the more distinctive (e.g. engaging, outstanding, impressive, particularly funny) teacher input is, the more likely it is to linger for longer than the 1-2 seconds it would normally stay in WM and to pass into Long-Term Memory. That is also why engaging students in the semantic analysis of a target word/phrase (what psychologists call ‘elaboration’) is more likely to result in learning: such analysis, by involving deeper processing, will require the learner to hold the word in WM for longer than 1-2 seconds whilst engaging the brain in higher order thinking (which strengthens retention).
2.3 WM has limited channel capacity
WM has a very limited capacity or memory span. According to Miller (1956), it cannot contain more than 7+/- 2 items at the same time (i.e. between 5 and 9). More recent estimates concede that Miller’s number may be true of university populations but not of the average person; they put WM’s capacity at 4 to 5 items at the same time. WM’s channel capacity is affected by genetic factors (some individuals’ WM is bigger than others’) and by motivation.
The number of words WM can hold at any given time is phonologically determined (for instance, Chinese speakers can hold more words in WM than English speakers because in Mandarin each word is a syllable). This means that a novice foreign language learner will be able to hold fewer words in WM than s/he does in his/her mother tongue, as s/he will pronounce the words more slowly. The more rapidly a foreign language speaker can utter a word or phrase, the less space in their working memory it will take.
The phonology-dependent nature of vocabulary learning and the limitations of the phonological loop also mean that words that are long and contain complex target language sounds cannot be processed efficiently and are therefore not learnt ‘properly’. Hence, work on phonics from the very early days of instruction is paramount.
One implication of this issue for MFL teaching and learning is that in order to increase MFL learners’ WM processing efficiency in a foreign language, they must receive extensive speaking practice. Such practice will also impact their listening skills in that, as already explained above, whilst listening the learner needs to hold in his/her phonological loop fairly big chunks of target language in order to comprehend the text.
Another implication relates to writing and speaking. Novice L2 English learners will find it hard to produce longer or complex sentences accurately in languages like French, Italian, German or Spanish, as most or all of their WM’s channel capacity will be taken up by the retrieval of the L2 lexis required to form those sentences and little space will be left to focus on less salient grammar features such as adjectival and verb endings, function words and syntactic order.
Finally, to enhance learner memory span, teachers may want to train students with poorer WM in the use of mnemonics such as the Key Word technique or other associative memory techniques. Research shows that through the effective use of mnemonic strategies WM’s digit span can even be increased tenfold.
Another strategy to increase WM’s capacity is chunking the target information. This consists in organizing a number of items which would normally be too big for WM to hold into manageable units. An example of this is the way we memorize a phone number; by memorizing 0176324167 as 017 632 4167 we basically reduce 10 units to 3, thereby greatly reducing the cognitive load. Imagine learning the phrase ‘appareils électroménagers’ – almost impossible for a novice’s phonological loop to cope with. By chunking it into ‘appa / reils / électro / ménagers’ even a novice can cope with pronouncing and memorizing it.
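For readers who like to see the idea spelled out, the phone-number example can be sketched in a few lines of Python. This is only an illustration of the arithmetic of chunking, not a model of how WM actually works; the function name `chunk` and the chunk sizes are my own assumptions.

```python
def chunk(items, sizes):
    """Split a sequence into consecutive chunks of the given sizes.

    `chunk` and `sizes` are hypothetical names used for illustration only.
    """
    if sum(sizes) != len(items):
        raise ValueError("chunk sizes must add up to the length of the input")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(items[start:start + size])
        start += size
    return chunks


phone = "0176324167"
phone_chunks = chunk(phone, [3, 3, 4])

# Ten separate digits become three units to hold in mind.
print(" ".join(phone_chunks))
print(len(phone), "units reduced to", len(phone_chunks))
```

The same grouping could be applied to the syllable groups of a long word, with the teacher choosing where the breaks fall.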
2.4 Storage in WM is ‘fragile’
When items are stored in WM they can be easily lost due to interference from competition with other items (divided attention) or interference from environmental factors (e.g. noise). Anxiety, worry and self-concern during performance can also cause divided attention and WM memory loss.
The obvious implication is that our teaching should bring about as much arousal in our students as possible so as to keep the target language input in their focal awareness at all times.
Another implication is that, apart from the obvious sources of distraction which pertain to students’ misbehavior or environmental factors, teachers must try to minimize any other source of distraction. A frequent source of distraction, in this day and age, is learning languages through the digital medium or producing a digital artefact as part of projects in the target language.
2.5 Error is often caused by WM processing inefficiency
When we are carrying out complex tasks WM may have to juggle several tasks at the same time. Based on points 2.3 and 2.4 above, the ‘multi-tasking’ that WM has to do can cause information processing or retrieval to slow down and/or result in performance error. Anxiety can have a detrimental effect in this regard, too.
The application of declarative knowledge (i.e. intellectual knowledge of L2 grammar) in speaking and listening performance is likely to cause processing inefficiency as WM needs to apply every rule consciously. Imagine, in talking about what you did yesterday in French, having to apply every step to forming the Perfect Tense of ‘Aller’ one by one as compared to simply saying ‘je suis allé’. Hence the very long pauses and hesitation when a novice-to-intermediate speaker has solid declarative knowledge of the language but little control over the speaking medium, due to lack of practice.
The implications for teaching are obvious and refer to the issues I have dealt with extensively in previous blogs. On the one hand teachers must focus their efforts on developing students’ cognitive control over the target language; on the other, they need to try as much as possible to lessen the cognitive load on students’ WM by (a) pitching the tasks they involve students in to the right level of cognitive/linguistic challenge; (b) prepping the students before each target language task through activities which recycle the language items they will need in the execution of that task; (c) keeping anxiety out of the classroom as much as possible.
Also, in order to facilitate WM processing efficiency, students may have to be taught strategies that can compensate for lack of procedural competence. For instance, teachers may raise learners’ awareness of how their WM’s processing inefficiency can cause them to make specific mistakes (e.g. agreement mistakes in writing) and model editing strategies to identify and/or prevent such mistakes (e.g. through mnemonics).
2.6 Forgetting is caused by WM failure to access the required information (cue-dependent forgetting)
Memory is context-dependent; in other words, the environment in which one learns a given language item can later help one recall that item. Hence, when we do not remember something, it is not because that information is no longer stored in Long-Term Memory, but rather because we are not using the right cue to retrieve it. So, for instance, if my teacher has used a picture of Arnold Schwarzenegger to teach the word ‘Musculoso’ in Spanish, that picture will facilitate my recall of that word.
Here, too, training students in the use of memory strategies to prevent cue-dependent forgetting can be extremely helpful.
2.7 There may be a link between poor WM and depression
Recent research has evidenced a link between poor WM and depression. It has found that people with a highly efficient WM have a more positive outlook on life and are generally more self-confident. Individuals with poor WM tend to be more prone to anxiety and to brood and sulk more over things.
The implications for teachers are very obvious: minimize the potential sources of anxiety for students who fall into this category. Don’t presume that this issue affects only children with special educational needs. Research shows clearly that depression amongst adolescents has risen substantially in the last decade or so. Hence one has to be very mindful of this issue and handle it with much emotional and cognitive empathy.
2.8 An efficient WM is a good predictor of academic success including MFL learning
Alloway and Alloway (2009) actually found that WM is a better predictor of future academic success than IQ. They found that “working memory is not a proxy for IQ but rather represents a dissociable cognitive skill with unique links to academic attainment”. Students with poor working memory do badly across all or most subjects, including foreign languages. In fact, more recent theories of language aptitude include WM as an important factor affecting success in foreign language learning.
In conclusion, MFL teaching should concern itself from the very early stages of instruction with the development of processing efficiency. A big and efficient WM allows for faster recall and processing, for more accurate performance and more ‘noticing’. This is a very important issue if one considers that WM is first and foremost the gateway to Long-Term Memory – where all the knowledge we have about a language and the world is permanently stored.
‘Noticing’ new key target language features, as Schmidt (1990) posits, propels our students’ learning forward, but only if they make the connection between what they notice and the system they have been building in their Long-Term Memory (their Interlanguage). Often this connection must be made under Real Operating Conditions (ROC) as they interact orally with an expert speaker, watch a video or listen. For this to happen in these contexts, when they operate under considerable communicative pressure, their WM must be highly efficient.
Teachers should heed the above recommendations in their daily practice and ensure that lessons are as much about developing students’ WM processing efficiency (cognitive control), what I call ‘horizontal progression’, as they are about vertical progression, i.e. ‘jumping’ from one level of linguistic challenge to a higher one, for the sake of being able to say “we have covered three tenses” or “we have created complex sentences”. Vertical progression without horizontal progression creates a very unstable system, like a tall building without strong foundations.
Finally, raising the students’ awareness of how WM works can be very useful in enhancing their learning and their metacognition. I have several short sessions with my KS3 classes where I summarize the key features of memory and how WM works. The teacher must create the right context for these sessions and make them as simple, visual and engaging as possible. I was so proud when, last week, a year 8 girl said to another who was finding a word difficult to pronounce: “You have to chunk it” and actually modelled the chunking to her classmate. Ultimately, the more students know about how their mind works, the more they will feel in control of their learning.