The ugly truth about school-based Modern Language teaching


(with Steve Smith)

I was recently criticised by some of Stephen Krashen’s fans for something that to me and many other teachers is a sad given: MFL teachers operating in secondary schools simply have no time to teach languages the way they should ideally be taught. Time and syllabus constraints force teachers into extremely tight schedules which do not allow for the extensive listening and reading practice that, as much research indicates, every language learner benefits from before engaging in real-life-like speaking.

If I had five hours of contact time a week I would teach entirely differently from the way I teach now. This would be my recipe: lots of daily receptive exposure to compelling aural and written input; plenty of oral interaction through fun and challenging communicative activities (even more than the 30 minutes per lesson I do now); engaging multimedia project-based learning; drama and art activities; cultural awareness-raising through videos and realia; exciting enquiry-based grammar learning.

The problem is that, for teachers working in England who must effectively prepare their students for GCSE and A-Level examinations, all of the very desirable activities above simply cannot be done as often as one would like. We all know that. Hence, effective teaching in our context is not merely about applying what we know best benefits language acquisition; it is first and foremost about making the most of the time we have available to build our students’ linguistic competence, self-confidence and motivation, adapting what we know about human language acquisition to the context we operate in.

The American army knew this all too well when they had to prepare their troops linguistically for the Normandy invasion in 1944. Surely they could not afford to put their soldiers through hours and hours of receptive learning through engaging stories in the belief that languages are best learnt subconsciously through exposure to comprehensible input (as many Americans in Dr Krashen’s camp – my critics – believe). Hence they devised an approach which was drill-based: lots of repetition through controlled tasks aimed at practising phrase after phrase to death until the phrases were so embedded in their soldiers’ memory that they became spontaneous. In this approach, grammar was taught through robotic repetition and manipulation of small parts of sentences, e.g. I play tennis, my mother plays tennis, my father doesn’t play tennis, we play tennis.

Although ideologically I do not agree with this method at all, and it is not the way I learnt the seven languages I am fluent in and the other seven I speak less well, I see the merit of aspects of this approach in the beginning phase of all learning, the parroting stage of classroom-based acquisition. Lots of drilling undeniably does help embed the core vocabulary and grammar structures. And it can be made fun, too, with a bit of imagination – e.g. my receptive drills in the game room at  or my oral communicative drills. And if the phrases and words we embed in the drills consist of lexical items and sentences which can be very useful in the real world and are taught and practised within typical real-life communicative contexts, all the better still!

The truth is that every method language researchers and educationists have come up with in the last fifty to sixty years or so is effective in its own way, each of them addressing a different stage or facet of the complex process that language acquisition is. To say ‘my method is better than yours’ is preposterous. Yet proponents of each method do just that, sometimes inspired by a genuine passion for and belief in the validity of their approach, more often than not driven by a business or political agenda.

We, as school-based teachers, have historically been the victims of this state of affairs, decade after decade, subjected to fads which were not a faithful reflection of each new method, but rather the botched-up adaptation of often-sound theories and methodologies by governments and their consultants, who reshaped them to fit the target cultural, political and socio-economic context, mindful less of our needs or our students’ than of their own agendas.

The result is a teaching profession whose pedagogic beliefs – whether we are aware of it or not – are often a hybrid of all the methodological approaches it has been exposed to in the last forty years or so, whether through word of mouth, readings, CPD, government policies, etc. So many of us are advocates of the Communicative approach whilst teaching grammar like the Romans or the Greeks used to 2,000 years ago; believe that reading extensively for pleasure will subconsciously result in learning whilst training our students for reading comprehension tests that teach little; advocate the importance of oral interaction and listening whilst making most lessons about reading and writing, or embrace enquiry-based learning tasks where students barely ever speak; say one should tolerate error and that mistakes are ‘good’ (as CLT preaches) but then make a huge fuss about them by excessively focusing students on correction (through D.I.R.T., stamps and time-consuming dialogic practices).

Eclecticism or pedagogic hypocrisy? Neither, in my opinion. The ugly truth is that a lot of us are confused and disoriented; overloaded with government and school policy requirements which change way too often and too quickly; flooded with information coming from different camps; misinformed by CPD sessions which squeeze years of researching and theorizing into one or two PowerPoint slides; galvanized by keynote speakers who excite us with great ideas which are difficult to translate into our classroom practice.

Hence, as I always ‘preach’ in my posts, the need for (a) a clear understanding of modern language pedagogy, so as to be able to grasp the state of the art of educational pedagogy beyond the political agendas of the different factions and fads; (b) a basic reference framework based on that understanding that will enable us to approach lesson and curriculum planning, assessment and feedback in a no-nonsense, practical and principled way.

Having such an understanding and such a framework – which in my case is MARS + EAR (see my blogposts on this) – has made my everyday lesson planning much easier and hassle-free, and when questioned by my superiors it has allowed me to provide them with a clear rationale for my pedagogic strategies and choices, rooted in Skill Theory and neuroscience. Maybe not perfect, but working well for me. Incidentally, it was interesting to see how Rachel Hawkes and others – who had never publicly advocated Skill Theory principles before – have recently published a paper which reflects all of the views I have expressed in my blog in the last year or so. It means that, after all, some English MFL ‘influencers’ have finally decided to embrace neuroscience…

The path to becoming a better teacher does come through reflectivity, as most of today’s CPD gurus preach. But understanding the basic neuroscience facts about language acquisition and developing your own framework fuels and structures that reflectivity, and significantly reduces the occurrence of the cognitive block that many teachers who contact me through social media tell me they often experience when they plan lessons. It also reduces the likelihood that your planning is driven by the activities/resources you find rather than the much healthier opposite scenario, i.e. you choosing the activities/resources to best serve your planning.

Steve Smith and I wrote our book ‘The Language Teacher Toolkit’ to provide our colleagues with such an understanding of Modern Language pedagogy and with such a principled teaching framework. Interestingly, we came to it from totally different camps, Steve being a believer in the importance of comprehensible input, whilst I am a Skill-theory fan; still, we could come to an agreement on what constitutes a useful, pragmatic, ‘fadless’ and hassle-free approach to language teaching. Other bloggers, such as Sara Cottrel of  and Justin Slocum Bailey of Indwelling languages, have also been pursuing the same noble intent.

No, I am not merely trying to plug our book. My point is that once you have a clear understanding of the basic processes that regulate  human learning, are aware of the core research facts and regularly reflect on your classroom experience in the light of that understanding and that awareness, you will have a powerful pedagogic compass to orientate yourself through the jungle of bastardised pedagogic messages – like the ones I discussed in my previous post – which make our daily professional life so much more challenging and confusing.

In conclusion, the ugly truth that Modern Languages teachers have to contend with, day in day out, is that time, logistics, syllabus constraints and government policies prevent them from teaching the way one ideally should. Educationists and researchers rarely recognize that, detached as they are from our world and more concerned with plugging their fads than with the often harsh reality of bog-standard state schools. Curriculum designers, examination boards and textbook authors do attempt to incorporate the new methodologies and fads in their work, but they often do so superficially, to the detriment of sound pedagogy, giving rise to belief systems and practices which teachers often have to adhere to uncritically and which often clash with one another and with common sense. The result is the current state of affairs: an overloaded and overworked teaching profession that is often confused as to what constitutes best pedagogic practice, disorientated as it is by mixed messages coming from multiple directions. This may affect teachers’ efficacy, thereby eroding their self-confidence, motivation and, ultimately, their well-being.

The solution: getting a better understanding of pedagogy so that you can make an informed choice as to which method to apply where, when and with whom; so that you build instructional sequences based on a method rather than a hunch; so that you do not let tasks and games you know or have found guide your teaching instead of your know-how; so that you can tell SLT why they got it all wrong.

The Language Teacher Toolkit is available here, on

Why I teach the way I teach. The Skill-Theory principles which underpin my teaching approach

Fig. 1 – The most influential Skill-Theory account of language acquisition (Anderson, 1983)

1. Introduction

In a previous post, I argued that every language teacher, both novice and expert, should ask themselves the question “How do I believe that languages are learnt?” as a starting point for a deep and productive reflection on their own teaching practice.

The answer to that question is key, as without a clear and solid set of pedagogic principles our curriculum planning and design and every other decision that affects teaching and learning in our classroom will be random and haphazard or based on ‘hunches’. Imagine choosing a course-book, creating assessment procedures and materials, or deciding to integrate Information Technology or Generic-skill learning in our teaching without having formed an opinion as to how languages are best taught and learnt. Would you believe me if I told you that I have seen this done, time and again, even in some of the best schools in the world?

As I suggested in that post, teachers and language departments should identify the set of pedagogic principles that truly constitute the tenets of their teaching philosophy and classroom approach and draw on them to ‘frame’ their long-, medium- and short-term planning, their discussions on teaching and learning (e.g. the ones that occur after a lesson observation), their assessment and any big decision of theirs that may significantly impact teaching and learning. Having such a framework will warrant coherence and fairness in peer and student assessment. It will also give the course administrators a better idea of what Modern Language (ML) teaching and learning is about in the institution they manage.

This is my own personal answer to the question “How do I believe that languages are learnt?”, or rather part of it, as I will narrow the scope of this post to the main tenets of my approach to ML teaching, which are borrowed from Skill Theory. Hence I will leave out other major influences on my personal pedagogy (e.g. Schmidt’s Noticing Hypothesis, Bandura’s Self-efficacy theory, Selinker’s Interlanguage hypothesis, McClelland and Rumelhart’s Connectionism, etc.).

2. My set of guiding principles

2.1  Skill Theory – the (very) bare bones

Whilst it integrates elements from several SLA theories, my approach is rooted in Cognitive-psychology-based accounts of instructed second language acquisition, especially what Applied Linguists call Skill Theory (as laid out in Anderson, 1994; Johnson, 1996; DeKeyser, 1998; Jensen, 2007). I stress the word ‘instructed’ for a reason: I do not believe that Skill Theory provides an accurate account of how languages are learnt in naturalistic environments.

In a nutshell, Skill Theorists observe that every complex task humans learn is made up of several layers of sub-tasks. For instance, driving a car requires a driver to pay attention to the road and take important decisions as to where to turn, how fast to go, when to brake; however, whilst taking these decisions, the driver is carrying out multiple ‘lower-order’ tasks such as changing gear, physically pushing the brakes, operating the indicator, etc.

Skill theorists observe that lower-order tasks are performed subconsciously, without requiring the brain’s Working Memory to pay much conscious attention to them (or, as they say, they only occupy subsidiary awareness). This, in their view, points to an adaptive feature of the brain: in order to be able to focus solely on the most important aspect(s) of any complex task, the brain has, over the course of evolution, learnt to automatize the less complex tasks.

This is because, based on current models of Working Memory (e.g. Baddeley, 1999), the brain has very little cognitive space to devote to any given task. For instance, when it comes to numbers, Working Memory can only process 7 ± 2 digits at any one time (Miller, 1956). In simpler terms, the only way for the brain to multi-task effectively and efficiently is to automatize the less complex sub-tasks.

Fig. 2 – Working Memory as conceived by Baddeley (1999)


Skill Theorists argue that the same applies to language learning. A language learner needs to automatize lower order skills so as to free up space in Working Memory in order to execute more complex tasks requiring the application of higher order skills. Example: you cannot form the perfect tense if you cannot form the past participle of a verb and have not learnt the verb ‘to have’. Hence, the aim of language teaching is to train language learners to automatize the knowledge that the instructor provides explicitly to them (i.e. the knowledge of how a rule is formed). Once automatized, it will not require the brain’s conscious attention and the learner will have more space in their Working Memory to deal with the many demands that a language task poses to them.

Imagine having to produce a sentence and having to think simultaneously (in real time!) about the message you want to convey, the most suitable vocabulary to convey it through, tense, verb endings, word order, agreement, etc.: an impossible task for a novice, whose mistakes will be due mainly to (cognitive) overload. Such a task would be a fairly easy one for an advanced learner, as s/he will have automatized most of the grammar- and syntax-related tasks and will only have to focus on the message and the lexical selection.

This automatization process is long and requires a greater focus on fluency and lots of scaffolding in the initial phase; negative feedback (correction) also plays an important role.

A final point: Skill theorists (e.g. DeKeyser, 1998) propose that Communicative Language Teaching which integrates explicit grammar instruction and a focus on skill-automatization constitutes, to date, the most effective ML teaching methodology.

2.2 Skill-Theory principles and their implications for teaching and learning

2.2.1 Principle 1: language skills are acquired in the same way as any other human skill

The main point Skill-theory proponents make is that languages are learnt in much the same way as humans acquire any other skill (e.g. driving a car, cooking, painting). This sets Skill Theory apart from other influential schools of thought, which view language skills as a totally unique set of skills whose functioning is regulated by innate mechanisms that formal instruction cannot impact (the so-called Mentalist approaches). This is a hugely important premise as it endorses what Applied Linguists call a strong interface position, i.e. the belief that whatever is learnt consciously (e.g. a grammar rule) can become automatized, i.e. executable subconsciously, through practice.

2.2.2 Principle 2: In instructional settings where the L2 grammar is taught explicitly, grammar acquisition involves the transformation of Declarative into Procedural knowledge

Whatever we learn is stored in the brain in one of two forms: (1) Declarative Knowledge, the explicit knowledge of how things work, which is applied consciously (like knowing all the steps involved in the formation of the perfect tense), or (2) Procedural Knowledge, the knowledge we acquire by doing and that we use to perform a specific task automatically, without thinking (like knowing how to ride a bike).

Example: I have declarative knowledge of the English  perfect tense when I can explain the rule of its formation and application. I have procedural knowledge of it when I can use it without knowing the rule (e.g.  because I have picked it up whilst listening to English songs or interacting with English native speakers).

Declarative knowledge has the advantage of having generative power, e.g.: if I learn the rule of perfect tense formation for French regular verbs I will be able to apply it to every single regular verb I come across. On the other hand, Procedural knowledge is limited only to the regular perfect forms I learn.

An advantage of Procedural Knowledge is that it is fast. So, a beginner who was taught ten perfect verb forms by rote learning can apply all of them instantly without thinking. Another beginner who was taught the rule of perfect tense formation, will have to apply each step of the rule one by one, which will slow down production.

According to Skill Theorists, the aim of any skill instruction, including Modern Language teaching, is to enable Declarative Knowledge to become Procedural (or Automatic). In the context of grammar learning, this means that a target rule which is initially applied slowly, step by step, occasionally referring to conjugation tables, will be applied – after much practice of the kind described in 2.2.6 below – instantly, with little cost in terms of Working Memory processing efficiency.

It should be noted that our students pick up Procedural knowledge all the time in our lessons when we teach them unanalysed chunks such as classroom instructions or formulaic language. Whilst teaching such chunks should not be discouraged, Skill Theorists do believe that, in view of their limited generative power, instruction should not excessively rely on rote learning.

2.2.3 Principle 3: The human brain has limited cognitive space for  processing language, so it automatizes lower order receptive and productive skills in order to free up space and facilitate performance

When we learn to drive, we need to learn basic skills such as how to switch on the engine, change gear, press the clutch, turn on the wipers, operate the brakes, etc. before we actually take to the road. Once the lower order operations and skills listed above have been automatized, or at least routinized to the extent that we do not have to pay attention to them (by-passing Working Memory’s attentional systems), we can safely focus wholly on the higher order skills which will allow us to take the split-second decisions that will prevent us from getting lost, crashing into other cars or breaking the traffic laws whilst dealing with our children messing about in the back seats.

This is what the brain does, too, when learning languages. Because Working Memory has very limited space available when executing any task, the brain has learnt to automatize lower order skills so that, by being performed ‘subconsciously’, they free up cognitive space. So, for instance, if I am an advanced L2 speaker who has routinized accurate L2 pronunciation, grammar and syntax to a fairly high degree, I will be able to devote more conscious attention (Working Memory space) to the message I want to put across. On the other hand, if I still struggle with pronunciation, word order, irregular verb forms and sequencing tenses, most of my attention will be taken up by the mechanics of what I want to say rather than the meaning; this will slow me down and limit my ability to think through what I want to say due to cognitive overload.

In language teaching this important principle translates as follows: in order to enable our students to focus on the higher order skills involved in L2 comprehension and production we need to ensure that the lower-order ones have been acquired or performance will be impaired. Here are a few scenarios which illustrate what I mean.

Example 1: a student who struggles with pronunciation and decoding skills in English (i.e. being able to match letters and combinations of letters with the way they are sounded) will find it difficult to comprehend aural input from an English native speaker as they will not be able to identify the words they hear with the phonological representation they have stored in their brain. Hence, listening instruction ought to concern itself with automatizing those skills first (read here why and how).

Example 2: for a student who has not routinised Masculine, Feminine and Neuter endings in German, applying the rules of agreement in real-time talk will be a nightmare. The same student will take forever to write a sentence containing a few adjectives and nouns because his/her brain’s (Working Memory’s) capacity will be taken up by decisions such as what agrees with what, what the correct ending is and what the word order is; by having to deal with these lower order decisions s/he will lose track of the higher order issue: generating a meaningful and intelligible sentence.

Example 3: if you teach long words (e.g. containing three syllables or more) to a beginner who has not automatized the pronunciation of basic target language phonemes, his/her Working Memory will struggle to process them (because of Phonological Loop overload), which will impair rehearsal and the words’ commitment to Long-Term Memory.

Example 4: you cannot hope for a student of French or Italian to be able to acquire the Perfect tense if they have not automatized the formation of the verbs ‘to be’ and ‘to have’ and of the Past Participle. Yet, often we require our students to produce under time constraints Perfect tense forms a few minutes after modelling the formation of the Past Participle.

Hence, teaching ought to focus, much more than it currently does, on the automatization of lower order skills (or micro-skills, as we may also call them) across all four language skills. In this sense, progression within a lesson should mainly refer to the ability of our students to produce the target L2 item with greater ease, speed and accuracy (horizontal progression), rather than moving from a level of grammar complexity to a higher one, from using two adjectives in a sentence to using five or from using only one tense to using three (vertical progression).

The progression I believe teachers should prioritize is of the horizontal kind. We should concern ourselves with vertical progression only if and when horizontal progression has achieved automatization of the target L2 item.

Most of the failures our students experience in our lessons are due to focusing on vertical progression too soon, mostly because of teachers’ rush to cover the syllabus and/or ineffective recycling.

2.2.4 Principle 4: Acquisition is a long, painstaking process whose end-result is highly-routinized, consistently accurate performance (which approximates, but rarely matches, native-speaker performance)

Automatization is a very long process. Think about a sport, hobby or other activity you excel at. How long it took you to get there. How much practice, how many mistakes, how much focus. Every skill takes huge amounts of practice in order for it to be automatized, lower order skills usually taking less time than higher order ones as they require simpler cognitive operations (there are exceptions though, e.g., in language learning, the acquisition of rules governing items which are not salient such as articled prepositions in French, Spanish or Italian).

The process is long for a reason: whenever a given L2 grammar rule is fully acquired, it gives rise to a cognitive structure (called by Anderson, 2000, a ‘production’) which can never be modified. As a result, the brain is very cautious and requires a lot of evidence that whatever rule we apply in our performance is correct. Hence we need to use a specific grammar rule lots of times and receive lots of positive feedback on it before a permanent production is formed and incorporated.

Do not forget, also, that when a learner is figuring out if their grasp and usage of a given L2 grammar rule is correct s/he might have two or even more possible hypotheses about how it may work and try them concurrently, awaiting positive or negative feedback to confirm or discard them. Hence, the brain needs to make sure that one of the hypotheses it is testing about how a given language item works ‘prevails’ so to speak over the others substantially before ‘accepting’ to incorporate it as a permanent structure. In the absence of negative feedback – hence the importance of correction, especially in the initial stages of instruction – the brain might store more than one form.

Example: a student keeps using (1) ‘j’ai allé’ and (2) ‘je suis allé’ interchangeably to mean ‘I went’ in French; if he does not receive or heed regular corrective feedback pointing to (2) as the correct one and does not use (2) in speaking and writing often enough to routinize it, (1) and (2) will still compete for retrieval in his brain.

2.2.5 Principle 5: the extent to which an item is acquired depends largely on the range and frequency of its application (i.e. across how many contexts I can use it accurately and automatically)

A tennis player who is able to perform a back-hand shot only from one specific point of the tennis court cannot be said to have acquired mastery of back-hand shooting. Evidently, the more varied and complex the linguistic and semantic contexts in which I can successfully apply a given grammar rule or vocabulary item, the greater will be the extent of its acquisition.

Example: whilst learning the topic ‘animals’, student X has practised the word ‘dog’ over and over again for three weeks, but only in the contexts ‘I have a dog’, ‘my dog is called Rex’, ‘Mark has a dog’, ‘I like dogs because they are cute and playful’, ‘we have a dog in the house’. Student Y, on the other hand, has been given plenty of opportunities to practise the word ‘dog’ in association with all the persons of the verb ‘to have’, with many more verbs (e.g. feed, groom, love, walk, etc.), with a wider range of adjectives new and old (good, bad, loyal, funny, lazy, greedy, etc.) and other nouns (I have a dog and a turtle, a dog and a cat, etc.). Student Y will have built a more wide-ranging and complex processing history for the word ‘dog’, which will warrant more neural associations in Long-term memory and, consequently, greater chances of future recall and transferability across semantic fields and linguistic contexts.

Consequently, language teachers must aim at recycling each core target item across as many linguistic and semantic contexts as possible. For instance, if I am teaching the perfect tense in term 3 and I have covered four different semantic areas prior to that, I would ensure that that tense is recycled across as many of those areas as possible. In a nutshell: the extent to which the target L2 items have been acquired by our students will be largely a function of their processing history with those items.

In conclusion, the more limited the input we provide them with and the output we demand of them, the less deeply we are likely to impact their learning.

2.2.6 Principle 6: Acquisition is about learning to comprehend and produce language faster under Real Operating Conditions

The five principles laid out above entail that for language acquisition to occur, effective teaching must aim at enabling the learners to understand and produce language under real life conditions or, as Skill-Theorists say ‘Real Operating Conditions’ (ROC). This changes the focus of instruction from simply passing the knowledge of how grammar works and what vocabulary means (Declarative Knowledge) to enabling students to apply it quickly and accurately (Procedural knowledge) by providing lots of training in fluency. Hence, for grammar to be acquired we must go beyond lengthy grammar explanations, gap-fill exercises and quizzes. E.g.: students must be asked to use the grammar in speaking and writing under time pressure.

Training students to be fluent across all four skills means scaffolding instruction much in the same way as one would do in tennis or football coaching. First, one would start by working on automatizing the micro-skills, as already discussed above. Secondly, one would focus on routinizing the higher-order skills by providing an initial highly structured support which is gradually phased out. This translates itself, in my classroom practice as follows:

(1) An initial highly controlled phase which includes modelling, receptive processing and structured production – During this phase the target L2 item is practised in a controlled environment. The phase starts with lots of comprehensible input through the aural and written media. The target grammar/vocabulary is recycled extensively before the students engage in production.

A structured production phase ensues. The input given and the output demanded are highly controlled and the chances of error are minimised by providing lots of scaffolding (e.g. vocab lists, grammar rule reminders, writing mats, dictionaries, etc.) and guidance and by imposing no time constraints. Example (speaking practice in the present tense): highly structured role-play in the present tense only, where each student has to translate their respective lines from the L1 to the L2 or is given very clear L1 prompts; the language is simple and the students are very familiar with the verbs to be conjugated; verb tables are available on the desk.

(2) A semi-structured expansion phase – This phase is about consolidation and recycling and cuts across all the topics subsequently taught. So, for instance, if the teacher has introduced the French negatives in Term 1 under the topic Leisure, s/he will recycle them throughout the subsequent terms as part of the topics taught in those terms, for as long as s/he sees fit. This will ensure that the target structure/vocabulary is systematically recycled in combination with old and new material.

During this phase, the support is gradually reduced. The input provided and the output expected are more challenging but the teacher still designs the activities with a specific set of vocabulary and grammar structures in mind. Some form of support is still available. Example (speaking practice in the present tense): interview in the present tense across a range of familiar topics. Prompts for questions and answers are provided by the teacher (in the L1 or L2). The students are given some time to look at the prompts and think about the answers. Prompts look like this:

Partner 1: ask where Partner 2 usually goes at the week-end

Partner 2: answer providing three details of your choice relating to sport

This phase ends when the teacher feels the students can produce the target structure/vocabulary without support.

(3) An autonomous phase – Here the support is removed. Examples (speaking practice in the present tense): (1) Students are shown pictures and are recorded and assessed as they describe them. The task may elicit a degree of creativity and the use of communication strategies to make up for lack of vocabulary. (2) Students are asked to have a conversation about the target topic with only a vague prompt as a cue (e.g. talk about your hobbies). They generate questions and answers impromptu under time constraints. The conversation is recorded and assessed.

(4) A routinization phase – In this phase, the only concern is speed of delivery. The teacher focuses on training the students to produce language ‘fast’, under R.O.C. (real operating conditions), i.e. real-life conditions, across various topics and in spontaneous conversations. In this phase the production activities of choice will be oral translation drills and communicative activities (e.g. general conversations, simulations, more complex picture tasks) under time constraints. The tasks will not limit themselves to topic X or Y; rather, they will tap into various areas of human experience at once.

It must be stressed that the four phases above may stretch over a period of several months.

3. Concluding remarks

A lot of L2 teaching nowadays concerns itself with the passing on of grammar and other declarative knowledge of the target language. Such knowledge stays in our students’ brains as declarative because way too often teachers are obsessed with vertical progression at all costs. This attitude, though, short-circuits and straitjackets learning, preventing the learners from truly automatizing the grammar structures and vocabulary we aim to teach them.

L2 students’ failure to acquire what we teach them, and eventually their disaffection with the learning process, are often due to the inadequate amount of horizontal progression we allow for in our classrooms. Automatization ACROSS ALL FOUR SKILLS, i.e. the ability to apply the core L2 items in the performance of tasks rapidly, fluidly and accurately, should take priority in the classroom over activities which build intellectual knowledge (e.g. lengthy grammar explanations and gap-fills), concern themselves with producing artefacts (e.g. iMovies) or simply entertain (e.g. games and quizzes).

Grammar is currently taught in many classrooms through teacher-led explanations followed by gap-fills. This does not lead to automatization and fluency. Grammar structures ought to be taught in the context of interaction which mimics real life, first through communicative (highly structured) drills, then through activities which increasingly allow the students more creativity and freedom in terms of output choice.

Vocabulary ought to be recycled through as many linguistic contexts as possible, shying away from the almost behaviouristic tendency  one observes in many language classrooms to teach and practise the target words in isolation or almost exclusively in the same unambitiously narrow range of phrases (a tendency encouraged by current ML textbooks and many popular specialised websites, e.g. the tragically unambitious Linguascope).

In conclusion, effective ML teaching, as viewed by Skill theory, concerns itself with

  • the micro-skills needed by the students to carry out the complex tasks teachers often require their students to perform. In many contexts, e.g. listening instruction, such micro-skills (e.g. decoding skills) are grossly neglected, often leading to failure and learner disaffection;
  • providing the students with opportunities to automatize everything they are taught before the class move on to another set of grammar rules, vocabulary or learning strategies;
  • building a wide-ranging processing history so that many neural connections are built between a new target item and as many ‘old’ items as possible through real-time language exposure/use;
  • fluency, i.e. the ability to perform each target L2 item as rapidly and accurately as possible;
  • skill-building rather than knowledge-building. Knowledge building is only the starting point of acquisition; that is why error correction that merely informs of the error and cryptically states the rule is considered as having very limited impact on learning.

For those interested in finding out more, please check out this online article by Jensen (2007).

References and suggested bibliography

Anderson, J.R. (1987). Skill acquisition: Compilation of weak-method problem solutions. Psychological Review, 94(2), 192-210.

Anderson, J.R. et al. (1994). Acquisition of procedural skills from examples. Journal of Experimental Psychology, 20, 1322-1340.

DeKeyser, R.M. (1998). Beyond focus on form: Cognitive perspectives on learning and practicing second language grammar. In C. Doughty and J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 42-63). New York: Cambridge University Press.

Jensen, E. (2007). Introduction to brain-compatible learning, 2nd edn. Thousand Oaks, CA: Corwin Press.

Johnson, K. (1996). Language Teaching and Skill learning. Oxford: Blackwell.

Schneider, W. & Shiffrin, R. (1977). Controlled and automatic information processing.

10 common shortcomings of secondary curriculum design and textbooks in the UK


Please note: this post was written in collaboration with Steve Smith. Many thanks to Dylan Vinales of Garden International School, too, for the thought-provoking discussion we had on the topic prior to writing this.


In this post I will concern myself with issues in typical secondary school MFL curriculum design as evidenced by the schemes of work – and the textbooks these are often based on – which in my view seriously undermine the effectiveness of foreign language instruction in many British secondary schools.

Effective curriculum design is as crucial to successful MFL instruction as effective classroom delivery is, and must be based on sound pedagogy and skilful planning. As I intend to discuss in this post, much curriculum planning and textbook writing flouts some of the most fundamental tenets of sound foreign language pedagogy and neglects important dimensions of language acquisition. Although Steve Smith – with whom I am currently writing ‘The MFL teacher handbook’ – noted in his blog that the new editions of some British textbooks are actually addressing some of the issues I am about to discuss, there is still much scope for improvement.

Issue 1 – Coverage vs Time available

Schemes of work are typically over-ambitious as they often reflect the structure of the textbook adopted; they usually aim to cover a given topic (i.e. a chapter / module in the textbook) in 6-7 weeks. This does not allow the students to truly acquire the target material, especially when it comes to grammar structures. As I have shown in a number of previous posts, grammar structures which involve ending manipulations/agreement and differ substantially from their L1 equivalent may take months to internalize. Another problem is that schemes of work – when based on textbooks – often devote only one or two lessons to each of the five or six sub-topics that make up the unit in hand and then move on to the next sub-topic. This often does not allow for sufficient recycling.

Solution – obvious: teach less but in greater depth; recycle more.

Issue 2 – Fluency: the neglected objective

In previous blogs I pointed out how effective foreign language teaching ought to aim at developing fluency across all four skills, especially in areas where speed of processing is paramount to being an effective communicator: oral interaction and interpersonal writing (e.g. instant messaging). Fluency was defined in a previous post as the ability to produce intelligible oral or written speech in response to a stimulus at high speed. This is a crucial skill for students to develop if we want to enable them to use the target language in the real world, especially in the workplace. Yet fluency rarely – if ever – features explicitly as a goal in UK MFL departments’ schemes of work. Hence, teachers neither plan for fluency development nor are allocated adequate resources and training to teach fluency. Nor do they formally assess fluency.

Moreover, the issue highlighted in the previous paragraph often works against the attainment of fluency as rushing through a unit entails neglecting horizontal progression. Without sufficient horizontal progression fluency cannot be obtained.

Solution – Plan for the attainment of fluency. Include activities to develop speech automatization and opportunities for its assessment.

Issue 3 – Topic compartmentalization / Lack of recycling

Schemes of work – even those that are not based on textbooks – rarely recycle adequately. Many colleagues – obviously not language teachers – ask me why I have uploaded over 1,600 teaching resources in two years and why I created a whole website devoted mainly to vocabulary teaching. The answer is that textbooks and schemes of work usually compartmentalize teaching: in term 1a one teaches topic X, in term 1b topic Y, in term 2a topic Z, etc. Once a topic or structure has been covered, it is rarely consciously and systematically recycled in later units. I have had to produce my own worksheets and online resources to guarantee the necessary recycling; it has paid off, but teachers, as overloaded with work as they already are, should not have to do this.

Solution: include in the schemes of work a section in each unit headed ‘recycling opportunities’ and include activities aiming at consolidating old material. Also, make sure that each end of unit assessment tests students on material covered in previous units – or even previous years.

Issue 4 – What about communicative functions?

Most UK textbooks and MFL departments more or less explicitly adopt a weak communicative notional/functional syllabus with a variable focus (i.e. functions/notions + grammar). However, they usually patently neglect to focus adequately on important communicative functions. A glance at Finocchiaro and Brumfit’s (1983) classification of communicative functions will clarify what I mean. Much typical British secondary school teaching focuses mainly on Referential functions and on only a few Interpersonal functions, whereas many Interpersonal and Imaginative functions are hardly touched on. Moreover, many important Personal functions are grossly neglected, too – although, I am sure you will agree, they are crucial in daily life.

In PBL-based schemes of work this issue is worsened by the nature of the approach adopted which focuses on the attainment of a product rather than interpersonal communication.

Communicative functions are pivotal to effective target language proficiency. They are way more important than many other things textbooks teach.

Solution: use Finocchiaro and Brumfit’s taxonomy to fill the gaps in this area that you will identify in your schemes of work. Make sure that you recycle functions over and over again throughout the year.

Issue 5 – The 2 neglected word-classes

Textbooks, schemes of work and specialized websites focus mainly on nouns and – tragically – neglect verbs and adjectives, and hence also the adverbs which are derived from adjectives. Verbs, as I pointed out in previous blogs, are essential in order to acquire a high level of autonomous speaking competence (spontaneous talk). One of the reasons for this neglect, I suspect, is that state-school English learners are notoriously bad at conjugating verbs; hence, textbooks dumb down their comprehensible input and target vocabulary by including only a few essential and often more ‘learnable’ verbs.

Solution: include lists of target verbs in the schemes of work. Use Quizlet or Memrise to create your own online activities to drill them in (in the infinitive). You could use my verb trainer – the pictures help the students learn the verb meaning as they conjugate – or my Work-outs.

Issue 6 – How about improvisation?

Schemes of work are usually planned around specific topics, which, in England, repeat themselves every year – how boring! However, autonomous speaking competence (spontaneous talk) is about being able to talk ‘across topics’ so to speak; to be able to have a ‘natural’ conversation with a speaker of the target language which is not bound to a specific topic or sub-topic but touches different aspects of human life and experiences. MFL departments – at least to my knowledge – never really plan for this. Yet, nearly everyone these days states that spontaneous talk is high on their agenda.

Solution: plan for one or two lessons every now and then – maybe in between half-terms? – which are entirely dedicated to talking, reading, listening and writing in the target language without being tied down to a specific topic. A very easy-to-set-up task is a general conversation task where the students ask each other a wide variety of questions covering several topics, including some that have never been covered before – but that the students possess the linguistic tools to talk about.

Issue 7 – Grammar, the ‘poor sister’

This point is so obvious that I will not dwell too long over it. British textbooks devote a ridiculously small amount of space to grammar and to its recycling. Teachers have to toil on a daily basis to resource grammar teaching.

Solution: teach more grammar and recycle it to death (see my previous post ‘16 tips for effective grammar teaching’).

Issue 8 – Intercultural competence

Textbooks and schemes of work often include sections about ‘La Francophonie’ or other facts about the target language civilization. However, one very important dimension of cultural awareness is nearly always missing: how to avoid culture shock or other ‘faux pas’ and, more generally, how to train students to deal with target language native speakers in a way which is culture-sensitive and can foster effective integration. In an era where the labour market is so globalized, intercultural competence has become an important lifelong learning skill which our students need to be equipped with.

Solution: cultural awareness teaching should be more about the (cross-cultural) skills than the facts.

Issue 9 – Variety of topics

Every year, from year 6/7 to year 11, English teenagers keep learning about the same blocked topics, often relearning the same words. Here again, textbooks play an important role. As I tweeted earlier on today, most English textbooks seem to replicate the Metro textbook blueprint.

Solution: try new topics or combinations of topics. Prioritize topics teenagers are really interested in like relationships, entertainment, gadgets, social media, fashion, etc, rather than house chores or pets…

Issue 10 – Teaching sequences

The ‘Metro textbook blueprint’ is evident in all its successors not only in terms of the topics which receive more emphasis, but also in the way they sequence grammar structures. In a future post Steve and I will propose how we believe grammar structures should be sequenced and the rationale for it. There are many things we believe textbook writers and curriculum designers in the UK should change. One thing that springs to mind, for instance, is modal verbs (e.g. Vouloir, Pouvoir, Devoir in French). One wonders why they are always introduced quite late when they are so important in everyday communication and have very high surrender value. Imagine how ‘handy’ they can be to beginner learners before they even start conjugating verbs, followed as they are by infinitives. Moreover, their acquisition earlier on would partly address issue 5 by enabling the students to use many verbs at will quite easily.

Solution: Consider the surrender value and learnability of the target grammar structures. Would learning them earlier or later facilitate acquisition in your opinion? If so, don’t wait for the textbook sequence to teach them.


Some of the shortcomings in typical secondary school MFL curriculum and course-book design I have just discussed are much more important than others. My pet hates are the lack of recycling, the insufficient focus on oral fluency, the neglect of verbs and adjectives and the sketchy and superficial approach to grammar. The reader should note that I have deliberately not dealt with the teaching of lifelong learning skills, as I believe that, MFL teacher contact time being so limited, most of them are best taught explicitly and separately from the foreign language curriculum – unless, of course, they overlap with the aims of the course (e.g. independent enquiry skills, problem solving, intercultural communication, effective communication, empathy, resilience).

Your greatest priority as a curriculum designer – and every teacher to a certain extent is one – should definitely be the systematic recycling of the target vocabulary, grammar and communicative functions and the allocation of sufficient time for deep encoding to occur. This will entail doing away with the one chapter per half-term approach, a tragic legacy of the Metro-based Schemes of Work.

Six writing research findings that have impacted my teaching practice


Every now and then I post concise summaries of research findings from studies I come across in my quest for empirical evidence which supports or negates my intuitions or experiences as a language teacher and learner. As I have mentioned in a previous post (‘Ten reasons why you should not trust ground-breaking educational research’), much of the research evidence out there is far from being conclusive and irrefutable, due to flaws in design, data elicitation and analysis procedures which often undermine both its internal and external validity. However, when three or more reasonably well-crafted studies (however small) find concurring evidence which challenges commonly held assumptions and/or resonates with our own ‘hunches’ or experiences about teaching and learning, it is reasonable to assume that ‘there is no smoke without fire’.

The following studies have been picked based on the above logic. They are small and less than perfect in design, but do reflect my professional experience and indicate that the validity of some dogmata many teachers hold about language teaching and learning may be questionable.

1. Baudrand-Aertker (1992) – Effects of journal writing on L2-writing proficiency

21 students of French in the third year at a high school in Louisiana were asked to keep a journal over a nine-month period. They were required to write two entries per week at least and were not engaged in any other type of writing tasks for the whole of the duration of the study. The teacher responded to the students’ journal entries focusing only on content – not on form. Using a pre-/post-test design Baudrand-Aertker found that:

  • The students’ written proficiency improved significantly as evidenced by the post-test and their own perception;
  • The students felt that the journals helped them improve their overall mastery of the target language;
  • The students reported positive attitudes towards the activity;
  • The vast majority of the students did not want to be corrected on their grammatical mistakes when engaging in journal writing.

Although this study has important limitations in that there was no control group against which to compare the effects of the independent variable, I find the results interesting and I intend to give journal-writing a try myself next year.

2. Cooper and Morain (1980) – Effects of sentence combining instruction

The researchers investigated the effect of grammar instruction involving sentence combining tasks on the essay writing of 130 third-quarter students of French. The subjects were divided into two groups: the experimental group received 60 to 150 minutes of instruction per week through sentence combining exercises whilst the control group was taught ‘traditionally’ through workbook exercises. The experimental group outperformed the control group on seven of the nine measures of syntactic complexity adopted. Although the study did not look at the overall quality of the informants’ essays but only at their syntactic complexity, its findings are very interesting and have encouraged me to incorporate sentence combining tasks more regularly in my teaching strategies. Here is a discussion of the merits of sentence combining instruction and how it can be implemented.

3. Florez-Estrada (1995) – Effects of interactive writing via computer as compared to traditional journaling

In this small-scale study (28 university students of Spanish) Florez-Estrada compared a group of learners exchanging e-mail and chatting online with native-speaking partners with another group of students engaged in interactive paper writing with their teachers. The researcher found that the computer group outperformed the control group on the accuracy of key grammar points such as preterite vs imperfect, ‘ser’ vs ‘estar’, ‘por’ vs ‘para’ and others. The findings of this study were echoed by another study of 40 students of German, Itzes (1940), which involved students in chatting via computer amongst themselves in the TL. A notable feature of this study is that the students chose the topics they wanted to chat about. These two studies confirm findings from my own practice; I often use Edmodo or Facebook to create a slow student-initiated chat on given topics in which the whole class is involved, every student sharing their opinions/comments with their peers with the assistance of dictionaries. I have found this activity very beneficial even with groups of less able learners.

4. Nummikoski (1991) and Caruso (1994) – Effects of extensive L2-reading on L2-writing proficiency as contrasted with written practice

Both studies investigated if L2 learners who are engaged in extensive L2-reading (with no writing instruction/practice) write more effectively than L2 learners who are involved in writing tasks but do no reading. The results of both studies show a significant advantage for the writing-only condition. These studies, which are by no means flawless, do challenge the commonly held assumption that we can improve our students’ writing proficiency by engaging them in extensive reading.

5. Martinez-Lage (1992) – Comparison of focus-on-form with focus-on-form-free writing

The researcher investigated the impact of two writing-task types on the writing output of 23 second-year university Spanish students. The same students were asked to write (a) typical assigned compositions and (b) dialogue journals in which they were told they would not be assessed on grammar accuracy. The surprising finding was that the syntactic complexity across both task types was equivalent but the focus-on-form-free task type (journal writing) was grammatically more accurate. I concur with Martinez-Lage on this one as I have tried this strategy myself with many of my AS groups over the years.

6. Hedgcock and Lefkowitz (1992) – Effect of peer feedback in L2 writing

The researchers studied 30 students in an accelerated first-year college French class, who wrote two essays involving three separate drafts. The experimental group was involved in peer feedback (essays were read aloud to each other and oral feedback was given), whilst the other group received written teacher feedback. In terms of performance from the first to the second essay both groups made significant improvements, but in different areas: the peer-feedback group got worse in grammar but did better on content, organization and vocabulary; for the teacher-feedback group, exactly the opposite was true. It should be noted that a previous study by Piasecki (1988), which adopted a very similar design but lasted much longer (8 weeks) and involved 112 third-year high school students of Spanish, found no significant differences between the two conditions. This confirms my reservations about using peer feedback as an effective way to correct learner output and as a blanket corrective strategy; in my opinion it may work quite well with certain groups of individuals with highly developed grammar knowledge and critical thinking skills but not with others.

What is the most effective approach to foreign language instruction? – Part 1

Introduction – Of metaphors teachers live by and pedagogy ‘evangelists’

Every single one of us lives by metaphors, behavioural templates which we acquire through our interaction with the environment we grow up and live in. The language learning metaphors that are at the heart of our teaching come to a large extent from our experiences as language learners. These images of learning are so strongly embedded in our cognition that according to researchers it takes years of training and teaching practice to replace them with new templates; in certain cases, they are even impervious to  ‘conditioning’, despite the demands of teacher trainers, course administrators or students – I have observed this phenomenon first-hand time and again in most of the schools I have worked at.

Our beliefs about L2 learning play an enormous role in determining what teachers we will become and our response to any new methodology that we are asked to adopt. Some individuals will reject new instructional approaches in the belief that if they are such good linguists and their teachers’ approach worked so well for them, why should it not work for their own students? Some others – as I did, for instance, during and after my PGCE – will integrate elements of their existing belief system with the new methodology (-ies) to create a sort of personalized ‘hybrid’ – a ‘syncretistic’ approach. Others, instead – what I call the ‘radical converts’ – will espouse the new methodology with some kind of fanaticism, often becoming zealous evangelists of their new pedagogic ‘dogmata’.

It is the third attitude that one must be wary of: the blind allegiance to any approach that claims to have found a universal pedagogical fit for every learner. Any such claim will be unfounded because every learner brings to bear on the learning process a range of genetic and acquired individual variables that play an important role in language aptitude as well as in the cognitive/emotional response to teachers and their methodology. Whilst some guiding principles may be ‘universal’ in that they refer to general mechanisms that regulate human cognition across age, race, gender, G.I. factor and language aptitude, their implementation will ALWAYS be conditioned by contextual variables.

Consequently, I am not going to play the ‘know-all L2-pedagogue’ here and tell teachers what the best approach is. After all, if your students are happy, motivated and learning lots, you have found the best approach already. You may want to enhance and vary your repertoire of teaching strategies, but if the vast majority of your students are getting where you want them to be in the time and with the resources that you have been allocated by your course administrators, you do not need anyone to tell you how to teach; unless someone throws a spanner in the works, that is, and tells you that you must ‘integrate’ new technology, life-long learning skills, etc. into your healthy and balanced teaching ecosystem…

Psychology, however, does give us some clear indication of how humans acquire cognitive skills. So, if one believes, as it is logical to presume, that language acquisition involves the same processes and mechanisms involved in the acquisition of any other cognitive ability, it is possible to identify some core pedagogical principles as crucial to any form of explicit foreign language instruction. Moreover, there is some sound empirical research evidence out there that should inform our teaching; to claim that it is conclusive and irrefutable would be preposterous, but to ignore it because it is not would be irresponsible. After all, what teachers must do with research evidence is make an informed choice and ask themselves: do these findings resonate with me and my past experiences? Is it worth trying this out? And, after trying it out: did it work? If it didn’t, you can modify it or reject it altogether and look elsewhere.

Thirteen pedagogic principles rooted in Cognitive psychology

The following are the pedagogical principles rooted in Cognitive psychology theory and research that worked for me. I am no evangelist, thus I am not positing them as Gospel truths: these are merely some of the beliefs I formed in more than two decades of primary, secondary and tertiary MFL teaching, researching and, most importantly, reflecting on my own practice and listening to my students.

I am not concerning myself explicitly with the most important issue – motivation. It goes without saying that no methodology will ever be effective unless the teacher brings about a high level of cognitive and emotional arousal in his/her learners and develops their self-efficacy.

Finally, let me reiterate that the principles below are based on the epistemological assumption that language skills are acquired in the same way as any other cognitive human skill.

  1. Practice makes perfect – Every language skill and item, in order to be acquired, is subject to the ‘Power Law of Practice’ (Anderson, 2000). Hence Listening, Speaking, Reading, Writing, Translation/Interpreting, Grammar and any other skills must all be practised extensively. This entails that any instructional approach (e.g. Grammar Translation and PBL) which does not emphasize all four skills in a balanced manner is defective. Instruction can be successful only through extensive practice and recycling of the kind envisaged in the next two points.
  2. Recycling must start from day one – forgetting starts occurring immediately after a given item has passed into Long-term Memory (Anderson and Jordan, 1998). As the diagram below clearly shows, after 19 minutes one loses 40 % of what was recalled at time 0; after 9 hours, 56 % and after 6 days, 75 %. Recycling is imperative and must be of the spaced, distributed kind (a bit every so often) not of the massed kind (a lot of it once a week). Moreover, recycling must start on the same day something has been learnt. Instruction must model independent vocabulary learning habits which focus on autonomous recycling; it must also be mindful of human forgetting rate and provide for consolidation accordingly.


  3. Effective language learning = high levels of cognitive control – A language item can be said to be acquired only when it can be performed accurately and efficiently (with little hesitation) under real time conditions in unmonitored execution (e.g. spontaneous conversation). This means that acquisition occurs along a conscious to automatic continuum; it starts from a declarative stage where the knowledge about a specific language item is applied slowly under the brain’s conscious control and it ends when the execution of that item is fully automatic and bypasses working memory (Johnson, 1996). Instruction must involve extensive practice which starts with highly structured tasks (e.g. gap-fill or audiolingual drills) which become increasingly less structured with time and aim at developing cognitive control (the ability to perform effectively in real operating conditions).
  4. Production should always come after extensive receptive processing – Humans learn languages by imitating others’ linguistic input. Instruction should engage learners in masses of receptive practice before engaging them in production. Thus, ideally, extensive listening/reading practice (in the way of comprehensible input) should always precede speaking/writing practice. This rules out reading or listening comprehension tasks as valuable receptive practice, as these are tests, not effective sources of modelling; reading/listening for personal enjoyment or enrichment would be more conducive to learning in this regard.
  1. Cognitive overload should be prevented and controlled for – cognitive overload occurs when learners are engaged in tasks that pose challenging demands on their working memory. Teachers ought to prepare their students for a given task by facilitating their cognitive access to each level of challenge posed by that task. Thus, before reading a challenging text, the learners should be taught the key vocabulary and grammar points it contains and effective strategies to tackle it. Moreover, the text could be adapted to incorporate more contextual clues that may facilitate inference of unfamiliar lexis.
  1. Focus on micro-skills as much as you do on the macro- ones – To execute any task in the L2 (e.g. an unplanned role-play) effectively, the brain must acquire effective cognitive control over both the higher meta-components (e.g. generating meaning) and the lower order skills involved (e.g. pronunciation and intonation). By automatizing lower order language skills, the brain frees up space in the learner’s Working Memory, thereby facilitating processing efficiency and cognitive control and, consequently, performance – this is like learning to drive a car, whereby a driver automatizes the basic skills such as changing gear or accelerating so that s/he can focus on the road. Instruction must identify and systematically address every set of macro- and micro-skills that typical language tasks involve. Following on from (2), such micro-skills must be practised extensively, too.
  1. Learning is enhanced by depth of processing, distinctiveness of input and personal investment – Learning of any language item does not simply involve practice, but also depth of processing. Instruction must engage learners in semantic analysis and association in order to strengthen the memory trace and to increase the range of context-dependent cues at encoding which will enhance the recall of any target item. The distinctiveness of instructional input (how outstanding and memorable it is) is also an important learning-enhancing factor. Personal investment, i.e. how much the learning taps into an individual’s emotions and personal background, increases retention, too. Hence, in choosing topics and learning materials, learner opinions and tastes should always be taken into account (e.g. personalized reading-for-enjoyment activities).
  1. Grammar taught explicitly can be acquired – On condition that it is practised extensively, in context, and through masses of communicative practice which starts from controlled tasks and progresses through increasingly challenging unstructured ones. The process is a lengthy one so it may require training students to work on it independently, too. Implications: recycling is imperative and must occur mostly through the cognitive-control enhancement dimension, i.e. fewer gap-fills and written translations and more oral semi-structured and unstructured tasks. To enhance grammar acquisition the exceptions to the rule governing an ‘X’ structure should be taught before the dominant rule, e.g. irregular before regular forms (see my article ‘Irregular before regular…’ for the psycholinguistic rationale for this approach).
  1. Corrective feedback is important, especially at the early stages of instruction – However, in order to be effective it must be processed by the brain long and deeply enough for it to be rehearsed in Working Memory and stored permanently in Long-term Memory. Hence, any feedback practice on an erroneously executed ‘X’ item must:
  • Be distinctive;
  • Engage learners in deep processing;
  • Recycle the corrective feedback;
  • Be carried out through various means in order to provide more contextual cues for its recall;
  • Not limit itself to treating the symptom (i.e. the error) but also and more importantly the root cause (whether lack of knowledge, processing inefficiency, etc.)
  • Bring about learner intentionality to eradicate the error (i.e. motivate them to address the error in the future in a sustained effort to eliminate it).

(Conti, 2004)

  1. Learning strategies can be taught – On condition that a persuasive rationale for their instruction is provided; that they are modelled and scaffolded effectively and are practised very extensively through a variety of contexts (Cohen, 1998; Macaro, 2007)
  1. Metacognition should be modelled regularly – enhancing learner metacognition is imperative as a learner who knows how to learn and perform best is a learner who is bound to be more successful. Research shows clearly that highly metacognizant individuals are more successful at L2 learning (Macaro, 2007). Ideally, teaching should regularly scaffold holistic and task specific metacognition by prompting students to monitor and evaluate every level of their language learning and performance. The same approach concisely outlined in point 9 applies here.
  1. Individual variables must be assessed at the beginning of instruction – Learner individual factors may inhibit or facilitate learning. Ideally, at the beginning of instruction it may be helpful (but not always viable, I know…) to obtain as much information as possible about the following student characteristics:
  • Previous history as language learners;
  • Personality traits;
  • Learning strategies;
  • Learning preferences (NOT learning styles – but rather how one enjoys learning)
  • Language proficiency across all skills;
  • Language aptitude;
  • Personal interests;
  • Processing efficiency (e.g. how well learners process language);

    This is very time consuming and does require quite a lot of resources and expertise.

  1. Sources of divided attention must be controlled for – This is the most obvious learning principle (Eysenck, 1988); that is why I placed it last. In a lot of UK state school classrooms, to expect every student to be focused 100 % of the time is unrealistic. However, in settings where behavior management is not an issue, teachers should endeavour to minimize any distraction stemming from sources which are directly under their control. One of them is the excessive manipulation of digital media (e.g. app smashing), which hijacks learners’ finite attentional resources away from language processing. Digital media can be effective target language learning enhancers, but must be used judiciously to expand, not shrink, learning.

In conclusion, as already stated, the above list is by no means exhaustive. It only includes some of the many pedagogic principles which, in my opinion, ought to underlie any instructional approach regardless of the educational setting and espoused theory. Unfortunately, something important is missing: how should one implement the above principles in curriculum design, lesson planning and across all four macro-skills? Some of the answers can be found in the other articles on this blog. More answers will be provided in the sequel to this article in the very near future, in which I will concern myself with how those principles should inform pedagogy vis-à-vis the four macro-skills, grammar, translation and learning strategy instruction.

Nine interesting foreign language research findings you may not know about


In this post I am going to share with the reader a very succinct summary of 9 pieces of research I have recently come across which I found interesting and which have impacted my classroom practice in one way or another. They are not presented in any particular order.

  1. Green and Hecht (1992) – Explicit grammar instruction and teaching of aspect

Green and Hecht investigated 300 German learners of English. They asked them to correct 12 errors in context and to offer an explanation of the rule. Most interesting finding: the students could correct 78 % of the errors but could not provide an explanation for more than 46 % of the grammar rules that referred to those errors. They identified a set of rules that were hard to learn (i.e. most students did not recall them) and a set of easy rules (the vast majority of them could recall them successfully). Their implications for teaching: the explicit teaching of grammar may actually not work for all grammar items. For example, aspect (e.g. Imperfect vs Preterite in Spanish) would be more effectively taught, according to them, through exposure to masses of comprehensible input (e.g. narrative texts) rather than through the use of PPTs or diagrams on the classroom whiteboard/screen – in fact Blyth (1997) and Macaro (2002a) demonstrated the futility of drawing horizontal lines interrupted by vertical ones to indicate that the perfect tense ends the action.

My conclusions: I do not entirely agree with Blyth and Macaro that explicit explanation of grammar in the realm of aspect does not work and I do like diagrams (although they do not work with all of one’s students). However, I do agree with Green and Hecht (1992) that the best way to teach aspect is through exposure to masses of comprehensible input containing examples of aspect in context. The grammar explanation and production phase may be carried out at a later stage.

  1. Milton and Meara (1998) – Comparative study of vocabulary learning between German, English and Greek students aged 14-15 years.

197 students from the three countries studying similar syllabi for the same number of years were tested on their vocabulary. The findings were that:

1. The British students’ score was the worst (averaging at 60 %). According to the researchers, they showed a poor grasp of basic vocabulary;

2. They spent less time learning and were set lower goals than their German and Greek counterparts;

3. 25 % of the British students scored so low (after four years of MFL learning) that the researchers questioned whether they had learnt anything at all.

The authors of the study also found that British learners are not necessarily worse in terms of language aptitude; rather, they questioned the effectiveness of MFL teaching in the UK.

My conclusions: this study is quite old and the sample they used may not be indicative of the overall British student population. If it were, though, representative of the general situation in Britain, teachers may have to – as I have advocated in several previous blogs of mine – consciously recycle words over and over again, not just within the same units, but across units.

Moreover, a study of 850 EFL learners by Gu and Johnson (1996) may indicate an important issue underlying our students’ poor vocabulary retention; they found that students who excelled in vocabulary size were those who used three metacognitive strategies in addition to the cognitive strategies used by less effective vocabulary learners: selective attention to words (deciding to focus on certain words worth memorizing), self-initiation (making an effort to learn beyond the classroom and the exam system) and deliberate activation of newly-learnt words (trying out those words independently to obtain positive or negative feedback as to the correctness of their use). Teaching should aim, in other words, at developing learner autonomy and motivation to apply all of these strategies independently outside the classroom.

  1. Knight (1994) – Using dictionaries whilst reading – effects on vocabulary learning

Knight gave her subjects a text to read on a computer. One group had access to electronic dictionaries whilst the other did not. She found that those who used the dictionary, rather than relying on guessing strategies alone, scored higher in a subsequent vocabulary test. This and other previous (Luppescu and Day, 1993) and subsequent studies (Laufer & Hadar, 1997; Laufer & Hill, 2000; Laufer & Kimmel, 1997) suggest that students should not be barred from using dictionaries in lessons. These findings are important for 1:1 (tablet or PC) school settings considering the availability of free online dictionaries (e.g.

  1. Anderson and Jordan (1998) – Rate of forgetting

Anderson and Jordan set out to investigate the number of words that could be recalled by their informants immediately after initial learning, and 1 week, 3 weeks and 8 weeks thereafter. They identified a learning rate of 66 %, 48 %, 39 % and 37 % respectively. The obvious implication is that, if immediately after learning the subjects could recall only 66 % of the target vocabulary, consolidation should start then and continue (at spaced intervals – through recycling in lessons or as homework) for several weeks. At several points during the school year, I remind my students of Anderson and Jordan’s study and show them the following diagram. It usually strikes a chord with a lot of them:


  1. Erler (2003) – Relationship between phonemic awareness and L2 reading proficiency

Erler set out to investigate the obstacles faced by learners of French as a foreign language in England. She studied 11-12 year olds. She found that there was a strong correlation between low levels of phonemic awareness and poor reading skills (especially word recognition skills). She concluded that explicit training and practice in the grapheme-phoneme system (i.e. how letters/combinations of letters are pronounced) of French would improve L1-English learners’ reading proficiency in that language. This finding corroborates other findings by Muter and Diethelm (2001) and Comeau et al (1999). The implication is that micro-listening enhancers of the kind I discussed in a previous blog (e.g. ‘Micro-listening skills tasks you may not do in your lessons’) or any other teaching of phonics should be performed in class much more often than is currently done in many UK MFL classrooms.

Please note: pronunciation teaching and decoding-skill instruction are not the same thing. Pronunciation is about understanding how sounds are produced by the articulators, whilst teaching decoding skills means instructing learners on how to convert letters and combinations of letters into sound. Also, effective decoding-skill instruction occurs in communicative contexts (whether through receptive or productive processing), not simply through matching sounds with gestures and/or phonetic symbols.

  1. Feyten (1991) – Listening ability as predictor of success

Feyten investigated the possibility that listening ability may be a predictor of success in foreign language learning. The researcher assessed the students at pre-test using a variety of tasks and measures of listening proficiency. After a ten-week course she tested them again (post-test) and found that there was a strong correlation between listening ability and overall foreign language acquisition, i.e.: the students who had scored high at pre-test did better at post-test not just in listening, but also in written grammar, reading and vocabulary assessment. Listening was a better predictor of foreign language proficiency than any other individual factor (e.g. gender, previous learning history, etc.).

My implications: we should take listening more seriously than we currently do. Increased exposure to listening input and more frequent teaching of listening strategies are paramount in the light of such evidence. Any effective baseline assessment at the outset of a course ought to include a strong listening comprehension component; the latter ought to include a specific decoding-skill assessment element.

  1. Graham (1997) – Identification of foreign language learners’ listening strategies

This study investigated the listening strategies of 17-year-old English learners of German and French. Amongst other things, Graham found the following issues undermining their listening comprehension. Firstly, they were slow in identifying key items in a text. Secondly, they often misheard words or syllables and transcribed what they believed they had heard, thereby getting distracted. Graham’s conclusions were that weaker students overcompensated for lack of lexical knowledge by overusing top-down strategies (e.g. spotting key words as an aid to grasp meaning).

My implications are that Graham’s research evidence, which echoes findings from Mendelsohn (1998) and other studies, should make us wary of getting students to over-rely on guessing strategies based on key-word recognition. Teachers should focus on bottom-up processing skills much more than they currently do, e.g. by practising (a) micro-listening skills; (b) narrow listening or any other listening instruction methodology which emphasizes recycling of the same vocabulary through comprehensible input (N.B. not necessarily through videos or audio-tracks; it can be teacher-based, in the absence of other resources); (c) listening with transcripts – whole, gapped or manipulated in such a way as to focus learners on phoneme-grapheme correspondence.

  1. Polio et al. (1998) – Effectiveness of editing instruction

Polio et al. (1998) set out to investigate whether additional editing instruction – the innovative feature of the study – would enhance learners’ ability to reduce errors in revised essays. 65 learners on a university EAP course were randomly assigned to an experimental and a control group who wrote four journal entries each week for seven weeks. Whereas the control group did not receive any feedback, the experimental group was involved in (1) grammar review and editing exercises and (2) revision of the journal entries, both of which were followed by teacher corrective feedback. On each of the pre- and post-tests, the learners wrote a 30-minute composition which they were asked to improve in 60 minutes two days later. Linguistic accuracy was calculated as the ratio of error-free T-units to the total number of T-units in the composition.
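As a rough illustration of the accuracy measure just described, the ratio can be computed as in the sketch below. The function name and the sample counts are hypothetical and are not taken from Polio et al.’s data.

```python
def t_unit_accuracy(error_free_t_units: int, total_t_units: int) -> float:
    """Return the proportion of T-units in a composition that contain no errors."""
    if total_t_units <= 0:
        raise ValueError("a composition must contain at least one T-unit")
    if not 0 <= error_free_t_units <= total_t_units:
        raise ValueError("error-free T-units must lie between 0 and the total")
    return error_free_t_units / total_t_units


# Hypothetical composition: 12 of its 20 T-units are error-free.
score = t_unit_accuracy(12, 20)
print(f"{score:.2f}")
```

A score closer to 1 means a larger share of the composition’s T-units were error-free.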

The results suggested that the experimental group did not outperform the control group. The researchers conjectured that the validity of their results might have been undermined by the assessment measure used (T-units) and/or the relatively short duration of the treatment. They also hypothesised that the instruction the control group received might have been so effective that the additional practice for the experimental group did not make any difference.

The implications of this study are that editing instruction may need to last longer than seven weeks in order to be effective. Thus, the one-off editing sessions that many teachers run after spotting common errors in their students’ essays, in order to address the underlying grammar issues, are absolutely futile unless they are followed up by extensive and focused practice with lots of recycling.

  1. Elliott (1995) – Effect of explicit instruction on pronunciation

Elliott set out to investigate the effects of improving learner attitude toward pronunciation and of explicitly teaching pronunciation on his subjects (66 L1 students of Spanish). He compared the experimental group (which received 10-15 minutes of instruction per lesson over a semester) with a group of students whose pronunciation was corrected only when it impeded understanding. The results were highly significant, both in terms of improved accent and of attitude (92 % of the informants being positive about the treatment). The experimental group outperformed the control group.

Implications: this study, which corroborates evidence from several others (e.g. Elliott, 1997; Zampini, 1994), confirms that explicit pronunciation instruction is more effective than implicit instruction whereby L2 learners are expected to learn pronunciation simply by exposure to comprehensible input. Arteaga’s (2000) review of US Spanish textbooks found that only 4 out of 10 include activities attempting to teach pronunciation. I suspect that the figure may be even lower in the UK. In the light of Elliott’s findings, this is quite appalling, as the mastery of phonology is a catalyst not only of reading ability but also of listening and speaking proficiency, and plays an enormous role in Working Memory’s processing efficiency in general (see my blog: ‘Eight important facts about Working Memory’).

How the brain acquires foreign language grammar – A Skill-theory perspective

Caveat: Being an adaptation of a section of a chapter in my Doctoral thesis, this is a fairly challenging article which may require solid grounding in Applied Linguistics and Cognitive Theories of Skill Acquisition.

1. L2-Acquisition as skill acquisition: the Anderson Model

The Anderson Model, called ACT* (Adaptive Control of Thought), was originally created as an account of the way students internalise geometry rules. It was later developed as a model of L2-learning (Anderson, 1980, 1983, 2000). The fundamental epistemological premise of adopting a skill-development model as a framework for L2-acquisition is that language is considered as governed by the same principles that regulate any other cognitive skill. A number of scholars, such as McLaughlin (1987), Levelt (1989), O’Malley and Chamot (1990) and Johnson (1996), have put forward persuasive arguments in favour of this notion.

Although ACT* constitutes my espoused theory of L2 acquisition, I do not endorse Anderson’s claim that his model alone can give a completely satisfactory account of L2-acquisition. I do believe, however, that it can be used effectively to conceptualise at least three important dimensions of L2-acquisition which are relevant to the type of explicit MFL instructional approaches implemented in many British schools: (1) the acquisition of grammatical rules in explicit L2-instruction, (2) the developmental mechanisms of language processing and (3) the acquisition of Learning Strategies.

 Figure 1: The Anderson Model (adapted from Anderson, 1983)


The basic structure of the model is illustrated in Figure 1, above. Anderson posits three kinds of memory: Working Short-Term Memory (WSTM), Declarative Memory and Production (or Procedural) Memory. Working Memory shares the same features discussed in previous blogs (see ‘Eight important facts about Working Memory’) while Declarative and Production Memory may be seen as two subcomponents of Long-Term Memory (LTM). The model is based on the assumption that human cognition is regulated by cognitive structures (Productions) made up of ‘IF’ and ‘THEN’ conditions. These are activated every single time the brain is processing information; whenever a learner is confronted with a problem the brain searches for a Production that matches the data pattern associated with it. For example:

IF the goal is to form the present perfect of a verb and the person is 3rd singular/

THEN form the 3rd singular of ‘have’

IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed /

THEN form the past participle of the verb

The creation of a Production is a long and careful process since Procedural Knowledge, once created, is difficult to alter. Furthermore, unlike declarative units, Productions control behaviour, thus the system must be circumspect in creating them. Once a Production has been created and proved to be successful, it has to be automatised in order for the behaviour that it controls to happen at naturalistic rates. According to Anderson (1985), this process goes through three stages: (1) a Cognitive Stage, in which the brain learns a description of a skill; (2) an Associative Stage, in which it works out a method for executing the skill; (3) an Autonomous Stage, in which the execution of the skill becomes more and more rapid and automatic.

In the Cognitive Stage, confronted with a new task requiring a skill that has not yet been proceduralised, the brain retrieves from LTM all the declarative representations associated with that skill, using the interpretive strategies of Problem-solving and Analogy to guide behaviour. This procedure is very time-consuming, as all the stages of a process have to be specified in great detail and in serial order in WSTM. Although each stage is a Production, the operation of Productions in interpretation is very slow and burdensome as it is under conscious control and involves retrieving declarative knowledge from LTM. Furthermore, since this declarative knowledge has to be kept in WSTM, the risk of cognitive overload leading to error may arise.

Thus, for instance, in translating a sentence from the L1 into the L2, the brain will have to consciously retrieve the rules governing the use of every single L2-item, applying them one by one. In the case of complex rules whose application requires performing several operations, every single operation will have to be performed in serial order under conscious attentional control. For example, in forming the third person of the present perfect of ‘go’, the brain may have to: (1) retrieve and apply the general rule of the present perfect (have + past participle); (2) perform the appropriate conjugation of ‘have’ by retrieving and applying the rule that the third person of ‘have’ is ‘has’; (3) recall that the past participle of ‘go’ is irregular; (4) retrieve the form ‘gone’.

Producing language by these means is extremely inefficient. Thus, the brain tries to sort out the information into more efficient Productions. This is achieved by Compiling (‘running together’) the productions that have already been created so that larger groups of productions can be used as one unit. The Compilation process consists of two sub-processes: Composition and Proceduralisation. Composition takes a sequence of Productions that follow each other in solving a particular problem and collapses them into a single Production that has the effect of the sequence. This process lessens the number of steps referred to above and has the effect of speeding up the process. Thus, the Productions

P1 IF the goal is to form the present perfect of a verb / THEN form the simple present of have

P2 IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed / THEN form the past participle of the verb

would be composed as follows:

P3 IF the goal is to form the present perfect of a verb / THEN form the present simple of have and THEN the past participle of the verb
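The composition step can be pictured with a small toy production system. This is only a sketch under my own assumptions: the `Production` class, the `compose` and `run` helpers and the wording of the facts are hypothetical illustrations, not part of ACT* or of Anderson’s own implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    name: str
    conditions: frozenset  # facts that must be in working memory (the IF side)
    actions: tuple         # what gets performed (the THEN side)
    adds: frozenset        # facts left in working memory after firing


def compose(first, second):
    """Collapse two productions that fire in sequence into a single one."""
    return Production(
        name=f"{first.name}+{second.name}",
        # Conditions that `first` itself satisfies no longer need checking.
        conditions=first.conditions | (second.conditions - first.adds),
        actions=first.actions + second.actions,
        adds=first.adds | second.adds,
    )


def run(productions, facts):
    """Fire each matching production once; return (actions, matching cycles)."""
    memory, pending = set(facts), list(productions)
    performed, cycles = [], 0
    while True:
        match = next((p for p in pending if p.conditions <= memory), None)
        if match is None:
            return performed, cycles
        pending.remove(match)
        memory |= match.adds
        performed.extend(match.actions)
        cycles += 1


GOAL = "goal: present perfect"
HAVE_DONE = "form of 'have' just formed"

p1 = Production("P1", frozenset({GOAL}),
                ("form the simple present of 'have'",), frozenset({HAVE_DONE}))
p2 = Production("P2", frozenset({GOAL, HAVE_DONE}),
                ("form the past participle",), frozenset())
p3 = compose(p1, p2)

print(run([p1, p2], {GOAL}))  # same two actions, two matching cycles
print(run([p3], {GOAL}))      # same two actions, one matching cycle
```

The composed rule performs the same actions as P1 followed by P2 but needs a single match against working memory, which is the speed-up Composition is meant to deliver.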

An important point made by Anderson is that newly composed Productions are weak and may require multiple creations before they gain enough strength to compete successfully with the Productions from which they are created. Composition does not replace Productions; rather, it supplements the Production set. Thus, a composition may be created on the first opportunity but may be ‘masked’ by stronger Productions for a number of subsequent opportunities until it has built up sufficient strength (Anderson, 2000). This means that even if the new Production is more effective and efficient than the stronger Production, the latter will be retrieved more quickly because its memory trace is stronger.

The process of Proceduralisation eliminates clauses in the condition of a Production that require information to be retrieved from LTM and held in WSTM. As a result, proceduralised knowledge becomes available much more quickly than non-proceduralised knowledge. For example, the Production P2 above would become

IF the goal is to form the present perfect of a verb

THEN form ‘has’ and then form the past participle of the verb

The processes of Composition and Proceduralisation will eventually produce, after repeated performance:

IF the goal is to form the present perfect of ‘play’ / THEN form ‘has played’

For Anderson it seems reasonable to suggest that Proceduralisation only occurs when LTM knowledge has achieved some threshold of strength and has been used some criterion number of times. The mechanism through which the brain decides which Productions should be applied in a given context is called by Anderson Matching. When the brain is confronted with a problem, activation spreads from WSTM to Procedural Memory in search of a solution – i.e. a Production that matches the pattern of information in WSTM. If such matching is possible, then a Production will be retrieved. If the pattern to be matched in WSTM corresponds to the ‘condition side’ (the ‘if’) of a proceduralised Production, the matching will be quicker, with the ‘action side’ (the ‘then’) of the Production being deposited in WSTM and made immediately available for performance (execution). It is at this intermediate stage of development that most serious errors in acquiring a skill occur: during the conversion from Declarative to Procedural knowledge, unmonitored mistakes may slip into performance.

The final stage consists of the process of Tuning, made up of the three sub-processes of Generalisation, Discrimination and Strengthening. Generalisation is the process by which Production rules become broader in their range of applicability thereby allowing the speaker to generate and comprehend utterances never before encountered. Where two existing Productions partially overlap, it may be possible to combine them to create a greater level of generality by deleting a condition that was different in the two original Productions. Anderson (1982) produces the following example of generalization from language acquisition, in which P6 and P7 become P8

P6 IF the goal is to indicate that a coat belongs to me THEN say ‘My coat’

P7 IF the goal is to indicate that a ball belongs to me THEN say ‘My ball’

P8 IF the goal is to indicate that object X belongs to me THEN say ‘My X’
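The step from P6 and P7 to P8 can be sketched as follows. The tuple representation of a Production and the helper names `generalise` and `say` are hypothetical choices made for illustration only; they are not Anderson’s notation.

```python
def generalise(p, q, variable="X"):
    """Return a broader production, or None if p and q differ in more than one term."""
    (cond_p, act_p), (cond_q, act_q) = p, q
    if len(cond_p) != len(cond_q) or len(act_p) != len(act_q):
        return None
    # Collect the pairs of terms on which the two productions disagree.
    differences = {(a, b) for a, b in zip(cond_p + act_p, cond_q + act_q) if a != b}
    if len(differences) != 1:
        return None
    specific_term = next(iter(differences))[0]

    def abstract(terms):
        # Replace the one differing term with a variable.
        return tuple(variable if t == specific_term else t for t in terms)

    return abstract(cond_p), abstract(act_p)


def say(production, obj, variable="X"):
    """Apply a generalised production to a new object."""
    _condition, action = production
    return " ".join(obj if t == variable else t for t in action)


p6 = (("indicate possession", "owner: me", "coat"), ("My", "coat"))
p7 = (("indicate possession", "owner: me", "ball"), ("My", "ball"))
p8 = generalise(p6, p7)

print(p8)
print(say(p8, "book"))  # an utterance neither P6 nor P7 could produce
```

The generalised rule can be applied to an object that appeared in neither of the original Productions, which is the sense in which Generalisation lets the speaker produce utterances never before encountered.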

Discrimination is the process by which the range of application of a Production is restricted to the appropriate circumstances (Anderson, 1983). These processes would account for the way language learners over-generalise rules but then learn over time to discriminate between, for example, regular and irregular verbs. This process would require that we have examples of both correct and incorrect applications of the Production in our LTM.

Both processes are inductive in that they try to identify from examples of success and failure the features that characterize when a particular Production rule is applicable. These two processes produce multiple variants on the conditions (the ‘IF’ clause(s) of a Production) controlling the same action. Thus, at any point in time the system is entertaining as its hypothesis not just a single Production but a set of Productions with different conditions to control the action.

Since they are inductive processes, Generalization and Discrimination will sometimes err and produce incorrect Productions. As I shall discuss later in this chapter, there are possibilities for Overgeneralization and useless Discrimination, two phenomena that are widely documented in L2-acquisition research (Ellis, 1994). Thus, the system may simply create Productions that are incorrect, either because of misinformation or because of mistakes in its computations.

ACT* uses the Strengthening mechanism to identify the best problem-solving rules and eliminate wrong Productions. Strengthening is the process by which better rules are strengthened and poorer rules are weakened. This takes place in ACT* as follows: each time a condition in WSTM activates a Production from procedural memory and causes an action to be deployed and there is no negative feedback, the Production will become more robust. Because it is more robust it will be able to resist occasional negative feedback and also it will be more strongly activated when it is called upon:

The strength of a Production determines the amount of activation it receives in competition with other Productions during pattern matching. Thus, all other things being equal, the conditions of a stronger Production will be matched more rapidly and so repress the matching of a weaker Production (Anderson, 1983: 251)

Thus, if a wrong Interlanguage item has acquired greater strength in a learner’s LTM than the correct L2-item, when activation spreads the former is more likely to be activated first, giving rise to error. It is worth pointing out that, just as the strength of a Production increases with successful use, there is a power-law of decay in strength with disuse.
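The competition between a stronger wrong form and a weaker correct one can be sketched as below. The numbers, the additive update rule and the decay exponent are assumptions chosen purely for illustration; they are not Anderson’s equations.

```python
class Rule:
    """A Production reduced to the form it outputs and its current strength."""

    def __init__(self, action, strength=1.0):
        self.action = action
        self.strength = strength


def select(rules):
    """The strongest matching rule wins the competition."""
    return max(rules, key=lambda rule: rule.strength)


def practise(rule, success, gain=1.0, penalty=0.5):
    """Strengthen a rule after successful use, weaken it after negative feedback."""
    if success:
        rule.strength += gain
    else:
        rule.strength = max(0.0, rule.strength - penalty)


def decayed(strength, idle_periods, exponent=0.5):
    """Power-law decay: strength shrinks as (1 + idle time) ** -exponent."""
    return strength * (1 + idle_periods) ** -exponent


# Hypothetical learner: the wrong Interlanguage form is currently the stronger one.
wrong = Rule("has went", strength=5.0)
correct = Rule("has gone", strength=2.0)
print(select([wrong, correct]).action)  # the error surfaces first

for _ in range(4):                      # four successful uses of the correct form
    practise(correct, success=True)
print(select([wrong, correct]).action)  # the correct form now wins

print(decayed(4.0, idle_periods=3))     # strength left after a period of disuse
```

Under these assumptions the correct form only takes over once repeated successful use has pushed its strength past that of the entrenched error, and any rule left unused gradually loses strength.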

2. Extending the model: adding a ‘Procedural-to-Procedural route’ to L2-acquisition

One limitation of the model is that it does not account for the fact that sometimes unanalysed L2-chunks of language are acquired through rote learning or frequent exposure. This happens quite frequently in classroom settings, for instance with set phrases used in everyday teacher-to-student communication (e.g. ‘Open the book’, ‘Listen up!’). As a solution to this issue Johnson (1996) suggested extending the model by allowing for the existence of a ‘Procedural-to-Procedural route’ to acquisition whereby some unanalysed L2-items can be automatised with use, ‘jumping’, as it were, the initial Declarative Stage posited by Anderson.