Beyond imitation – Five L2 writing teaching techniques that work, yet few Modern Language teachers use

Please note: this post was co-authored by Steve Smith of 


  1. Introduction – Why modern language teachers need to re-evaluate their attitudes and strategies about teaching writing

In these writers’ experience, many foreign language classroom instructors’ attitudes and strategies about teaching writing are more product- than process-oriented. By this we mean that explicit writing instruction – when it does occur – tends to rely mostly on

(1) ‘imitation’ – the provision of lists of model phrases/sentences which in the best scenarios are ‘drilled in’ through gap-fill practice;

(2) explicit grammar instruction which is rarely contextualized in whole-discourse practice (e.g. essay writing);

(3) learning from feedback on written output (usually a piece of creative narrative or a discursive essay) – which usually occurs through annotations in the margins, lists of targets or, in the best scenarios, one-to-one conferences. Feedback usually tackles all the deficit areas in a single set of corrections or conference session;

(4) essay writing practice – often from day one, in the belief that practice makes perfect.

Some teachers train their students to compose in their native language first and then translate the resulting L1 output into the L2. This cumbersome process, however, should be firmly discouraged if teachers want students to attain some degree of fluency in target language writing and become more ‘spontaneous’ writers. Moreover, as Kobayashi and Rinnert (1992) found, this approach does indeed lead to more complex L2 output, but it also leads to many more errors than composing directly in the L2, for an obvious reason: the sentences language learners create when composing in their native language are usually too cognitively challenging and linguistically complex for their existing level of L2 proficiency. Hence, the translations are bound to be inaccurate. Finally, it is this kind of approach which encourages less resilient and committed students to ‘google-translate’.

None of the above focuses the learners explicitly on the process of writing intended as the transformation of concepts or ideas (or ‘propositions’ as cognitive psychologists call them) into words and syntactic structures.

One scenario in which the UK modern language classroom does focus on the process of writing is the teaching of the higher meta-components of essay composition, e.g. planning, prioritizing, organizing and evaluating ideas. In other words: content production and organization. To our knowledge, however, even this practice is not as frequent and systematic as it should be.

Another strategy is to engage students in L2 reading in the belief that the language items in the articles or narratives they read in class or as assignments will be internalized and eventually resurface in their written pieces. This is not an erroneous assumption if students do read frequently and extensively in the target language.

Our take on the above is that these approaches usually work with the more talented, committed, self-reliant and highly metacognizant language learners, especially those who are highly proficient writers in their first language. But what if the student is not a gifted L1 writer; does not memorize the lists of connectives, model phrases and key terms her teacher diligently prepared for her whilst writing her essay; does not read extensively outside the classroom; does not spend more than a few minutes’ time – as most students do – processing her teacher’s feedback and rarely refers to the targets set for her? How do we expect such students to improve their essay writing?

  2. Strategies suggested in previous posts

In previous posts Gianfranco tackled the issue by suggesting that:

(1) teachers practice writing instruction which addresses different communicative and discourse functions/skills as discrete items in each lesson or set of lessons. The assignments set would engage students in intensive practice of those functions. So, for example, if the function is ‘explaining’, instructors would teach a series of lessons on the relevant discourse markers (e.g. because, due to the fact that, etc.) – contextualized in the topic at hand – and provide in- and out-of-the-classroom practice in those discourse markers. This would still be ‘imitation’, though, if the teachers simply provided lists and asked the learners to ‘get on with it’ and fill in a cloze text or make up random sentences. It would focus on the process, however, if the students were asked to analyze the use of the discourse markers under study in model L2 texts, and to work out different ways to convey the message contained in a sentence using a range of discourse markers without significantly altering the meaning.

(2) teachers do not throw students in at the deep end by asking them to write essay after essay from day one; but rather, that classroom and out-of-the-classroom activities focus on micro-writing, i.e. the process of writing an introduction/conclusion or developing one of the ideas brainstormed in the idea-generation phase into a paragraph.

(3) teacher feedback focus not simply on providing a correct L2 alternative to the student’s erroneous output (product-based feedback), but attempt to address the cognitive causes of learner deficits by collaboratively investigating the processes that underlie those deficits (process-based feedback). This entails that feedback on a piece of writing may be provided over several sessions, each session focusing on different deficits identified in student output (e.g. one session on the relevance of some of the concepts selected by the students; one on the organization of the essays; one on sentence-level errors).

(4) parallel texts be used in order to raise learner awareness of the differences between L2 and L1 writing across a number of dimensions of the text. If done from the very early days of instruction, this kind of work can dispel the assumption held by many language learners that the L2 is but a literal, word-for-word translation of the L1.

  3. The writing process as transformation

In this post we shall tackle the issue from a different angle by focusing on one aspect of L2 writer proficiency development which modern language teachers often neglect: the development of linguistic variety (both in terms of vocabulary and grammar structures), clarity, concision and, most importantly, syntactic maturity (i.e. the ability to produce complex sentences). We chose these aspects of writer development because, as Phillips (1996) rightly notes:

Currently, theorists regard writing not as a product but as a continuous process of arranging and re-arranging words and syntactic structures until a writer finds the ones which best communicate the desired idea or message.

Syntactic maturity is based on the principle that mature writers tend to use more transformations in their writing and therefore write with more syntactic complexity. William Strong says “that syntactic growth (in terms of increased sentence length, depth of modification, and subordination) is a natural and inexorable feature of normal language development” (1986). In “An I-Search Perspective on Language/Composition Research” he identifies three indices of syntactic growth:

(1) increased noun modification by means of adjectives, relative clauses, and phrases;

(2) increased nominalization in clausal, infinitive, and gerund constructions;

and (3) increased depth of modification through embedding.

In the twenty-first-century classroom, more than ever, teachers need to identify methods for teaching writing which provide students with choice and flexibility (both lexical and structural). Why ‘more than ever’? Because in this day and age, the ‘cut and paste’ attitude to processing and sharing knowledge is rampant. Hence, our learners need to be equipped with the cognitive and linguistic tools to transform whatever knowledge they process into their own words effectively, not merely to avoid plagiarism, but also because transformation involves higher-order thinking and consequently deeper learning and greater ownership over the information being communicated.

If we, as teachers, accept this premise, then the predominantly imitative, model-based approach to writing currently in use in most UK modern language classrooms needs to be replaced by, or at least supplemented with, a more dynamic approach. Such an approach explicitly promotes and nurtures syntactic complexity by engaging learners in more than mere imitation (i.e. the sheer application of a pre-packaged model); it explicitly encourages the student writer to use the model phrases/sentences provided by teachers, L2 texts or reference materials in a transformational, creative and risk-taking fashion.

4. Beyond imitation

4.1 – Sentence-combining techniques

One set of techniques that does push writing instruction well beyond the boundaries of sheer imitation, and has a highly successful track record evidenced by scores of L1 and L2 writing research studies, is sentence combining, defined by Phillips (1996) as

A technique of putting strings of sentence kernels together in a variety of ways so that completed sentences possess greater syntactic maturity.

In her seminal review of L1 sentence combining studies Phillips (1996) concludes that

Most of the experiments on sentence combining relate sentence combining and cumulative sentence exercises to gains in syntactic maturity.

Mounting evidence indicates that L2 student writing, too, benefits from intensive sentence combining instruction (Cooper and Morain, 1980; Enginarlar, 1994; Riazi, 2002; Juffs et al., 2014).

4.1.1 Signaled combining.

Two sentences are provided, along with specific instructions for combining them. Here is an example I have used with a pre-intermediate class:

I have a sister (who)

Her name is Marie

The result would be:

I have a sister who is called Marie

Signaled combining is useful when one wants to drill in a particular grammar structure or connectives in a controlled linguistic environment.

4.1.2 Open sentence combining.

In this approach, the students are not cued. For example, the kernels below

I have a sister

My sister is called Marie

She is friendly, pleasant and helpful

I argue with her from time to time

She is too talkative

could be combined as

My sister, who is called Marie, is very friendly, pleasant and helpful, but from time to time I argue with her because she is too talkative

As Mellon (cited in Daiker, 1985) notes, open combining has the advantage of allowing students to learn a variety of ways ‘to transform sentences, make linguistic choices, experiment with structures and discern which sentences produce the most effective results in written language’.

4.1.3 The cumulative sentence

This approach has Robert Marzano, Joseph Lawlor, Terry Phelps, Nancy Swanson, and Dennis Packard amongst its strongest advocates. The concept of the cumulative sentence evolved from Christensen’s belief that written composition is an additive process in which a writer begins with a major idea and then adds to it so that the reader can grasp the meaning.

The cumulative sentence, says Christensen, “is the opposite of the periodic sentence. . . . It is dynamic rather than static, representing the mind thinking. The main clause exhausts the mere fact of the idea. . . . The additions stay with the main idea”. A cumulative sentence contains a main clause and several modifying clauses. Here is an example:

She came to our house

She came yesterday

She was dressed in black

She was accompanied by her brother

Her brother looked sad

These could be combined as:

She came to our house yesterday, dressed in black, accompanied by her brother, who looked sad

As Phillips points out, cumulative sentences encourage students to vary their output, add metaphoric descriptions, rephrase confusing periodic sentences into clearer ones and eliminate redundant elements.

4.1.4 Whole-discourse exercises

These are more challenging but more useful if we are trying to forge effective essay writers, as they do not confine syntactic transformation and manipulation to stand-alone sentences but contextualize them in the development of a concept or set of concepts. Whole-discourse exercises build on the previous techniques by presenting the students with various sets of sentence kernels (Gianfranco usually uses five or six sets); the task is to create a sentence out of each set and then group the resulting sentences cohesively into a meaningful and logically arranged paragraph.

Mellon (1985) says that whole-discourse exercises have two benefits. The first is that by freeing students from concern with content, whole-discourse exercises help students improve their syntactic manipulations. The second is that whole-discourse exercises help students improve writing both within and between sentences.

The way Gianfranco goes about creating whole-discourse exercises is by decombining a paragraph from a textbook and asking the students to recombine it. Students enjoy it and learn a lot of vocabulary in the process, too.

4.1.5 Decombining

Decombining can be used as a starting point for any of the recombining activities described above, not simply for whole-discourse exercises. In the absence of sentence combining exercises in published MFL materials, teachers can make their own by decombining sentences found in the coursebooks or L2 sources available to them.

However, decombining is a great learning activity for students, too, as by deconstructing texts they become more aware of the writing process, especially when they are required to analyze the choices made by the author. What Gianfranco normally does is ask the students to decombine a text in a given lesson, then go back to it two or three lessons later and have a go at recombining it – without having the original in front of them, obviously.

  1. Paraphrasing

Paraphrasing is an important skill to have and one which requires transformation. It develops students’ vocabulary by forcing them to use synonyms, and their grammar/syntax by requiring them to alter the sentence structure, often drastically (e.g. from active to passive voice); in more adventurous learners it may even encourage the use of metaphors, imagery, analogy and other rhetorical figures.

One very fruitful activity Gianfranco carries out with his A2 students is to ask them to paraphrase sentences which sound ambiguous or even obscure in an attempt to enhance their clarity (he provides the sentence in the L2 with the intended meaning that the author failed to express effectively next to it).

  1. Summarizing and Shrinking

Upper intermediate student-writers often lack concision. Summarizing is a very effective way to get students to learn to be concise, especially if they are given a word limit and are not allowed to repeat more than a very limited number of the language items included in the original text.

Shrinking, one of Gianfranco’s favourite writing activities, pushes the summarizing challenge a notch further by requiring the students to concentrate the meaning of a paragraph into a single sentence. A word or even character limit can be imposed, here, too. In the past Gianfranco has used Twitter for this activity – forbidding any word abbreviations/contractions or verb ellipses.
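For teachers who like to automate the checking, the two constraints mentioned above (a length limit and a cap on language items recycled from the original) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names, the 140-character default, the cap of three reused words and the idea of ignoring a list of function words are all hypothetical choices, not part of the activity as described.

```python
import re


def words(text):
    """Return the lowercased words of a text, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def check_shrink(original, attempt, max_chars=140, max_reused=3, ignore=()):
    """Check a 'shrunk' sentence against a length limit and a reuse limit.

    max_chars, max_reused and ignore are illustrative assumptions:
    set them to whatever limits you give your class.
    """
    ignored = {w.lower() for w in ignore}
    # Words that appear in both texts, minus any function words we ignore
    reused = (set(words(attempt)) & set(words(original))) - ignored
    return {
        "within_limit": len(attempt) <= max_chars,
        "reused_words": sorted(reused),
        "reuse_ok": len(reused) <= max_reused,
    }


# Hypothetical example based on the kernels used earlier in this post
original = "My sister, who is called Marie, is very friendly but she is too talkative."
attempt = "Marie is kind yet chatty."
result = check_shrink(original, attempt, ignore={"is"})
print(result)
```

A check like this only counts repeated word forms; it cannot judge whether the shrunk sentence preserves the meaning of the paragraph, which remains the teacher's call.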

6. Tips

  1. Do not spend too much classroom time on these activities – once you have modelled over a few lessons how to go about them (e.g. using think-aloud techniques), assign them as homework.
  2. Distributed practice is better than massed practice – It is better to do a little bit of the above every lesson, contextualized in the topic at hand, possibly after practising the vocabulary you will include in the sentences. Unless you have highly motivated or highly needy students, do not spend a whole lesson doing this – students may find it tedious. A lot of these activities make for excellent plenaries.
  3. Avoid cognitive overload – Unless you are working with very proficient L2 writers, do not use too much unfamiliar language in the to-be-combined/paraphrased/shrunk texts.
  4. Make it fun and/or competitive – sentence combining, paraphrasing and shrinking on mini whiteboards (MWBs) or on Twitter under timed conditions can easily be made fun and competitive.
  5. Match to ability – whereas all of the above can be used with any of your upper intermediate learners, only the easiest forms of sentence combining and paraphrasing (e.g. signaled combining) are suitable for your intermediate learners.
  6. Extensive modelling – Do provide a lot of modelling before engaging the students in the more open-ended of the above activities. There will be students who will find these activities very challenging. Since these are likely to be those who need this kind of practice the most, prepping them adequately so as to enhance their chances of success is paramount.

7. Concluding remarks

Much writing instruction in UK high schools occurs through the imitation of models, feedback on writing practice and explicit grammar instruction. However, not much explicit and systematic effort is made to develop variety, clarity, concision and syntactic maturity (the ability to produce sentences that are longer and contain complex subordination and deeper modification). Yet the attainment of these four goals is a must if we aim to forge effective writers and communicators in general.

Another reason to focus on the development of L2 learners’ ability to transform and manipulate language effectively is that a lot of 21st-century student learning occurs through the digital medium, mostly on the Internet. Hence, today’s language learners need, more than ever, to be able to transform whatever L2 knowledge they find in Internet-based sources into their own words, not only to avoid plagiarism but also to make it their own.

Sentence combining, Paraphrasing, Summarizing and Shrinking hold the potential to enhance these areas of L2 learner writing proficiency. The effectiveness of the sentence combining techniques discussed above is supported by a vast body of research evidence. As for the other three, we could not locate any substantial research evidence. However, they worked well for us – as language learners – and for many of our students over the years.

We believe that if students received systematic practice in activities of this kind from their pre-intermediate days all the way to GCSE (intermediate to upper intermediate level), the notoriously huge gap between learner writing proficiency at the end of KS4 (14-16 years old) and the level of competence needed at KS5 (16 to 18 years old) would be significantly reduced.

More on this topic can be found in ‘The Language Teacher Toolkit’ , the book Steve Smith and I co-authored, published on


Using translation as a language-proficiency-enhancing technique – A teaching sequence

Please note: this post was co-authored with Steve Smith of with some input from Dylan Vinales of GIS Kuala Lumpur

In a previous post Gianfranco provided a rationale for using translation in the MFL classroom, rooted in common sense and cognitive theory. In that post the point was made that translation, a ‘legacy method’ frowned upon by many language educators for several decades, is not simply a useful but a truly must-have skill if we are to prepare our students for life in the real world. Why?

Good translation skills can help you get scores of well-paid jobs, and people who know languages translate for others on a daily basis. In the Internet age, possessing effective translation skills has become all the more important as (a) sources of information are often not entirely faithful to the original version and things often get lost in translation; (b) reading for gist can get us into trouble – even legal trouble – when sharing something on social media or executing an online transaction; missing a crucial detail, such as failing to notice the negative nuance of a word, or being misled by a false-friend cognate (e.g. ‘disposable’, which to an Italian speaker evokes ‘disponibile’, ‘available’), a double negative, an unknown idiom or an obscure cultural reference, can cause us to misunderstand the important part of a text.

Someone might object: doesn’t (b) above refer to effective reading skills? Yes and no. Research (as reported by Macaro, 2007) shows that less proficient readers (like the ones we teach at GCSE level in Britain) do often translate into their mother tongue when grappling with more complex and challenging texts, rehearsing them in their working memory as they reconstruct meaning. I am a near-native speaker of English and French and still find this strategy very useful when dealing with very complex literary texts. It eases the cognitive load and the processing of more challenging concepts.

As far as the benefits of translation for language learning are concerned, please refer back to Gianfranco’s previous post. Here’s a concise summary of the ways translation can benefit the novice-to-intermediate foreign language classroom:

  1. Benefits of L2 to L1 translation (with dictionary)

– learning of new vocabulary in context;

– focus on detail – reading comprehensions do not require students to understand each and every word. Translations do. This often sparks off the use of a wider range of learning strategies (including dictionary use) than reading to answer comprehension questions does;

– because of the focus on detail, translation is more likely to bring about the Noticing of new L2 structures than reading for comprehension would. Noticing is posited by many cognitive researchers as the starting point of the L2 acquisition process (see Schmidt’s, 1990, Noticing hypothesis);

– greater cognitive investment in the processing of L2 texts than reading. This is not necessarily always the case, but often, due to the necessity of having to translate each and every word, the learner will invest more time and effort processing the target text;

– the greater cognitive investment just mentioned above may lead to deeper learning than reading for comprehension would bring about;

– practice in the use of dictionaries, a lifelong learning skill;

– requires minimum preparation but can have high impact if used adequately.


  2. Benefits of L1 to L2 translation (with dictionaries)

– enables teacher to ‘force’ students to focus on language items that other less structured writing tasks may allow students to avoid;

– allows teachers to recycle at will vocabulary and language structures that may not be used spontaneously by the students in other types of writing tasks;

– oral and written translations under time constraints are invaluable instruments for the assessment of oral/written fluency and constitute minimum-preparation starters/plenaries;

– elicits the use of lots of useful learning strategies and dictionary use;

– encourages greater focus on accuracy and on grammar and syntax – when it goes beyond word level;

– differentiation is easy.

The drawbacks are that some students do find translations boring, especially when they are long; some students may find them daunting; assessment is not always straightforward; and there are not many examples in the current literature of how to use translation for teaching.

Rationale for this post

This post is motivated by the many queries Steve Smith and I have received in the last four weeks from readers of our blogs asking how we would prepare students for GCSE-level translation tasks. Steve has already written a great post listing a vast array of ways in which translation can be used to enhance language learner proficiency. This post should be seen as complementary to Steve’s in that it aims to provide a teaching sequence based on various L1-to-L2 translation tasks rather than a list of discrete activities.

The teaching sequence

When using translation, as with any other learning technique, we have to ask ourselves the all-important question: what is it for? Is it to drill in new vocabulary or consolidate ‘old’ language items? Is it to assess students’ oral or written fluency? Is it to teach dictionary skills? Is it to impart learning strategies/translation skills? Or is it to focus on connotative language and its nuances?

The sequence below can be used to enhance/consolidate vocabulary, grammar, fluency and translation strategies across all four skills. More translation-task-based sequences will follow in future posts. It should be noted that the sequence does not necessarily have to take one lesson.

Please note that this sequence presupposes that the students have declarative knowledge of most of the grammar structures included in the target translation task and of part of the vocabulary.

Step 1 – Planning

(a) Prepare or select the translation task. Make sure it is not too long. It should not take more than 20-25 minutes maximum for your average student to complete.

(b) Prepare/select four or five texts very similar in length and linguistic content to the target translation task, with some comprehension questions. This is basically what I call a narrow-reading task, i.e. a series of comprehension tasks based on texts that are extremely similar to each other (see my post on narrow reading and listening on this blog). Make sure the tasks include finding in the texts the L2 equivalent of L1 words.

(c) (Optional) If you have the time, prepare three or four more (shorter) texts with the same features for listening comprehension (narrow listening). It seems like a lot of work but it isn’t. All you have to do is to slightly modify the texts you produced for reading comprehension purposes by changing two or three details here and there. Five minutes’ work.

(d) Identify the words/structures you expect the students will have problems with and prepare a set of sentences in the L1 and one in L2 which feature them. Important: make sure the sentences are as similar as possible in grammar/syntax to the kind of sentences found in the translation task. Gap or cut in half the sentences in the target language by removing the key items you want the students to focus on. The gapped sentences in the L2 will be for aural processing; the ones in the L1 for written translation purposes.
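If you keep your sentences in a file or spreadsheet, the gapping and cutting in half described in (d) can be done automatically. Below is a minimal Python sketch of one way to do it; the function names, the blank marker and the French example sentence are our own hypothetical choices for illustration, not a prescribed tool.

```python
import re


def gap_sentence(sentence, key_items, blank="_____"):
    """Replace each key word or phrase in the sentence with a blank.

    Matching is case-insensitive and only hits whole words/phrases,
    so gapping 'me' would not touch 'même'.
    """
    gapped = sentence
    for item in key_items:
        # Lookarounds make sure the item is not part of a longer word
        pattern = r"(?<!\w)" + re.escape(item) + r"(?!\w)"
        gapped = re.sub(pattern, blank, gapped, flags=re.IGNORECASE)
    return gapped


def cut_in_half(sentence):
    """Split a sentence into two halves at the middle word boundary."""
    sentence_words = sentence.split()
    middle = len(sentence_words) // 2
    return " ".join(sentence_words[:middle]), " ".join(sentence_words[middle:])


# Hypothetical L2 sentence and key item
sentence = "Je me suis levé à sept heures"
print(gap_sentence(sentence, ["me suis levé"]))
print(cut_in_half(sentence))
```

The gapped version is what the students see while you read the full sentence aloud in Step 4; the two halves can be used for matching or completion tasks.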

Step 2 – Word level teaching

This can be flipped. Using Quizlet, Memrise, etc., prepare a series of activities which drill in most of the key unfamiliar lexical items and the grammar structures included in the translation task.

Step 3 – Modelling of target language items through narrow reading

The modelling of the target language items occurs through narrow reading first as it is easier. Dictionaries are allowed. Narrow reading allows for recycling of the key target lexical and grammar items.

Step 4 – Eliciting selective attention to key items through listening with gap-fill

Use the gapped/cut-in-half sentences in the L2 that you prepared in Step 1 (d). You will utter the sentences at moderate speed (the purpose is modelling so speak clearly) to draw the students’ attention to the unfamiliar words/phrases you will have removed when you gapped them.

Step 5 – Reinforcing modelling through narrow listening

Same as Step 3 except that it is through the aural medium.

Step 6 (OPTIONAL) – Paying selective attention to the key target grammar items through grammaticality-judgement quizzes

Here you can stage a ‘Sentence auction’ whereby the students are presented with a number of sentences, some right, some wrong, containing the key items found in the target translation task. Each sentence has a price. Working in groups, the students must decide whether or not to buy the sentence the teacher wants to ‘sell’. If they refuse to buy when the sentence is wrong, they win the equivalent of the sentence price; the same happens if they buy a sentence when it is correct. Conversely, if they buy a wrong sentence or refuse to buy a correct one, they will lose money. The aim here is to focus the students on the kind of grammar mistakes that, in your experience, they are more likely to make in executing the target translation task.
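For anyone who wants to keep score on screen, the auction rules above boil down to a single comparison: a group gains the price when its decision matches the sentence's correctness and loses otherwise. Here is a minimal Python sketch; the function names are hypothetical, and we assume that the amount lost equals the sentence price, which the rules above do not specify.

```python
def auction_result(price, sentence_is_correct, bought):
    """Return the money a group wins (positive) or loses (negative).

    A group wins if it buys a correct sentence or refuses a wrong one.
    Assumption: the loss for a bad decision equals the sentence price.
    """
    right_call = bought == sentence_is_correct
    return price if right_call else -price


def score_group(decisions):
    """Total a group's balance over (price, is_correct, bought) rounds."""
    return sum(auction_result(price, correct, bought)
               for price, correct, bought in decisions)


# Hypothetical game of three rounds for one group
rounds = [
    (100, True, True),    # bought a correct sentence: wins 100
    (50, False, True),    # bought a wrong sentence: loses 50
    (30, False, False),   # refused a wrong sentence: wins 30
]
print(score_group(rounds))
```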

Step 7 – Sentence level translation

You can do this as a whole-class activity or in groups, turning it into a competition. Students translate the L1 sentences you prepared in Step 1 (d) under timed conditions. The student(s) making the fewest mistakes in each round win(s).

Step 8 – Translation task

You can go about this in two ways. 8a. If you want to assess fluency, do it under time constraints: break the text up into sentences and utter one sentence at a time. Equipped with mini whiteboards, the students translate each sentence into the TL in the time you have allocated. 8b. If you are not concerned about their ability to operate in exam conditions, allocate the time you deem necessary for them to complete the task. Dictionaries are allowed.

Step 9 – Follow-up

It would be ideal if you could set as homework a text which is extremely similar to the one done in Step 8.


This sequence does require some preparation time – about 45-60 minutes. However, we are confident the reader will see the advantages of the kind of recycling and selective attention to the key target items that this sequence brings about. The most important outcome of this sequence is that students, in our experience, get to the target task confident and prepared and usually do well. If a series of follow-ups of the kind envisaged in Step 9 occur, the gains obtained will become consolidated. The reader should note this is a ‘no-frills’ sequence, so to speak, devoid of fancy or flashy games; deliberately so, to be as low-effort as possible. However, we are sure there are ways to ‘spice it up’ and make it more engaging.

The writing skill most foreign language teachers don’t teach: interactional writing


1. What is writership?

Chatting online or texting via SMS, WhatsApp, etc. has become part and parcel of our daily life, the verb and noun ‘chat’ alluding to the fact that although we are writing we are in fact ‘talking’ to someone. Just as in a face-to-face conversation, when chatting online we have to respond to our interlocutor in real time if we want to ‘stay’ in the conversation and, most importantly, if we want to keep him or her engaged.

Since applied linguists refer to interactional listening in real-life face-to-face communication as ‘Listenership’, I will henceforth call the set of skills involved in interactional writing ‘Writership’.

Listenership and Writership have many similarities in terms of the cognitive processes they involve. There are, however, important differences, too. Besides the most obvious difference, i.e. the fact that communication does not happen through the oral medium, there is another important one: when texting or chatting we do not usually see our interlocutor. This means that the all-important non-verbal aspects of communication (e.g. the cues we get from our interlocutor’s body language) are missing – which often leads online chatters to use imagery as a compensatory strategy. This entails that effective writership must not simply include fluency (as in: speed of production) but also a level of mastery of TL vocabulary and discourse functions which makes up for the lack of non-verbal cues and pre-empts ambiguity.

2. Should we be concerning ourselves with interactional writing?

Whilst skyping with Steve Smith of last night we were talking about the time teachers should allocate to writing. Steve made a very important observation that echoed what I have always thought: writing is less important than listening, speaking and reading, as students are not very likely to use the target language after passing their GCSEs; hence, the answer was ‘not much time at all’ – writing can be done by students at home. But then it suddenly dawned on me that this was true of Steve’s generation and mine, but not of the current one.

In a highly inter-connected world where global communication happens within milliseconds on Facebook, Twitter, WhatsApp and SMS, many of our students are very likely to engage in interactional TL (target language) writing in the future – in fact many already do on a daily basis. This is another example of how emerging technologies affect not only the way we communicate in real life but also, inevitably, the way we teach foreign languages.

3. But how do we teach interactional writing?

First of all let us consider what it involves.

First and foremost, obviously, the ability to understand an interlocutor’s input.

Secondly, the ability to respond to that input in real time, maybe not necessarily at the same speed as one would do in oral interaction but still quite rapidly. In other words, effective writership requires writing fluency.

Thirdly, effective writership requires a high level of intelligibility of output – not necessarily grammatical accuracy and complexity. Spelling becomes more important than it is in essay writing in that the speed of the interactional exchange does not allow the interlocutor a lot of time for working out ambiguous items.

Fourthly, the command of a sizeable repertoire of high frequency lexical items, a fairly wide range of discourse functions, the basic tenses and communication strategies (e.g. ways to compensate for lack of vocabulary).

Fifthly, in dealing with a TL native speaker an effective interactional writer must be able to grasp cultural features in their input, including the jargon and abbreviations used in TL instant-messaging communication (e.g. knowing that in French LOL is MDR).

The obvious corollary of the above is that most of the traditional communicative activities we use to foster autonomous communicative competence (through both the oral and written medium) and listenership apply to the teaching of writership too (e.g. information gap tasks and role-plays). In fact the oral communicative practice that takes place in your classroom will have a major impact on learner writership.

These are some of the activities I use in the classroom to promote writership:

  1. As a starter or plenary I stand in front of the class and type questions in the TL on the classroom screen. The students, equipped with mini-boards or iPads, have two minutes to write an answer including three details. In order to differentiate I usually ask two questions, the second being an extension for the more fluent students. Accuracy is not a concern, but intelligibility is.
  2. Picture tasks. This is similar to the previous task, except that the stimulus the students have to respond to is visual. The rationale for this task is that (a) in social media students often do have to respond to a visual stimulus; (b) it taps into their creativity; (c) it may elicit language that transcends the boundaries of the topic-at-hand.
  3. ‘What is the question?’ tasks. Students are provided with a very short dialogue in the TL where the questions have been omitted. Their task is to provide the missing questions.
  4. Social media slow chat. Using Edmodo I ask my students to chat with each other about a given topic. I give out red cards to the chat-initiators and blue cards to the responders. The initiators are in charge of asking questions to any of the responders in the class. Every ten minutes the initiators and responders switch cards. The students are given a time limit to answer, which varies from group to group – this is hard to monitor, of course, but my students are usually honest. The reason why I use Edmodo rather than Twitter is that (a) Edmodo allows the teacher to edit mistakes (please note: I only correct major intelligibility mistakes); (b) it looks a lot like Facebook but it is safer; (c) the teacher has total control over everything that happens in the interaction. Last year I paired up my class with a class from an overseas school and we chatted on Edmodo for thirty minutes. It was a great experience that I intend to repeat this year. This activity is great as a prelude to oral activities as it allows the students more time to make the same communicative choices they will have to make in the oral interaction whilst still putting communicative pressure on them.
  5. Very short translations under time constraints; students need to translate the teacher’s input on mini-boards. Again, the focus here is on intelligibility and fluency rather than on 100% accuracy.
  6. Agree or disagree. A simple statement appears on the screen (e.g. Tennis is very enjoyable; I like it when it rains; the food in the canteen is great) and the students have to write a response on their mini-boards under time constraints.
  7. Fluency assessment. At key stages in the unfolding of a unit of work I use the activity described in point 1, above, but ask students a much broader question and give them a lot more time to answer it on paper or using Google Classroom. At the end of the allocated time I ask the students to stop and note down how many words they wrote. The time-to-word ratio gives me an indication of the levels of writing fluency in my class at that moment in time. I value this activity as fluency is an important pre-requisite of effective writership.
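
The time-to-word ratio used in the fluency assessment is simple arithmetic. As a minimal sketch only (the function name `words_per_minute`, the whitespace-based word count and the sample answer are illustrative assumptions, not an established measure), it could be worked out like this:

```python
def words_per_minute(text: str, minutes: float) -> float:
    """Return a rough writing-fluency rate: words written per minute.

    Words are counted by splitting on whitespace. This is an assumption;
    a teacher may prefer to count only intelligible TL words.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return len(text.split()) / minutes


# Hypothetical student answer, written in two minutes.
answer = "Le weekend je joue au tennis avec mes amis"
rate = words_per_minute(answer, 2)
print(f"{rate:.1f} words per minute")
```

Tracking this figure for the same class at several points in a unit gives a crude picture of how writing fluency is developing, as long as the task type and time allowance stay comparable.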


Emerging technologies, especially the internet and social media, have transformed the way we live and the way we use language in communication. On a daily basis I find myself chatting on social media in four different languages and I find the linguistic challenges this poses quite taxing, as it requires faster language processing ability and sociolinguistic competences that I do not always possess.

Whether we like it or not, the vast majority of our students communicate via social media or other forms of instant messaging. Hence, if we are to prepare them for communication in the real world this phenomenon cannot be ignored. Teaching interactional writing skills is therefore a must, in my opinion.

Teaching this set of skills also has the added benefit of preparing our students for oral communication, as it requires them to process language in real operating conditions whilst allowing more time than the oral medium does. In this sense, the attainment of effective writership may be seen not just as an end in itself but also as instrumental to the attainment of oral fluency, understood as the ability to retrieve information from Long-term Memory under communicative pressure. Do you currently work on developing TL writing fluency in your learners?

Why the reliability of UK Examination Boards’ assessment of A Level writing papers is questionable

The Language Gym


Often, our Year 12 or Year 13 students who have consistently scored high in mock exams or other assessments in the writing component of the A Level exam paper do significantly less well in the actual exam. And, when the teachers and/or students, in disbelief, apply for a remark, they often see the controversial original grade reconfirmed or, as has actually happened to two of my students in the past, even lowered. In the last two years, colleagues of mine around the world have seen this phenomenon worsen: are UK examination boards becoming harsher or stricter in their grading? Or is it that the essay papers are becoming more complicated? Or could it be that the current students are generally less able than the previous cohorts?

Although I do not discount any of the above hypotheses, I personally believe that the phenomenon is also partly due to serious issues…


Six writing research findings that have impacted my teaching practice


Every now and then I post concise summaries of research findings from studies I come across in my quest for empirical evidence which supports or negates my intuitions or experiences as a language teacher and learner. As I have mentioned in a previous post (‘ten reasons why you should not trust ground-breaking educational research’), much of the research evidence out there is far from being conclusive and irrefutable, due to flaws in design, data elicitation and analysis procedures which often undermine both its internal and external validity. However, when three or more reasonably well-crafted studies (however small) find concurring evidence which challenges commonly held assumptions and/or resonates with our own ‘hunches’ or experiences about teaching and learning, it is reasonable to assume that ‘there is no smoke without fire’.

The following studies have been picked based on the above logic. They are small and less than perfect in design, but do reflect my professional experience and indicate that the validity of some dogmata many teachers hold about language teaching and learning may be questionable.

1. Baudrand-Aertker (1992) – Effects of journal writing on L2-writing proficiency

21 students of French in the third year at a high school in Louisiana were asked to keep a journal over a nine-month period. They were required to write two entries per week at least and were not engaged in any other type of writing tasks for the whole of the duration of the study. The teacher responded to the students’ journal entries focusing only on content – not on form. Using a pre-/post-test design Baudrand-Aertker found that:

  • The students’ written proficiency improved significantly as evidenced by the post-test and their own perception;
  • The students felt that the journals helped them improve their overall mastery of the target language;
  • The students reported positive attitudes towards the activity;
  • The vast majority of the students did not want to be corrected on their grammatical mistakes when engaging in journal writing.

Although this study has important limitations in that there was no control group against which to compare the effects of the independent variable, I find the results interesting and I intend to give journal-writing a try myself next year.

2. Cooper and Morain (1980) – Effects of sentence combining instruction

The researchers investigated the effect of grammar instruction involving sentence combining tasks on the essay writing of 130 third quarter students of French. The subjects were divided into two groups: the experimental group received 60 to 150 minutes of instruction per week through sentence combining exercises whilst the control group was taught ‘traditionally’ through workbook exercises. The experimental group outperformed the control group on seven of the nine measures of syntactic complexity adopted. Although the study did not look at the overall quality of the informants’ essays but only at their syntactic complexity, its findings are very interesting and have encouraged me to incorporate sentence combining tasks more regularly in my teaching strategies. Here is a discussion of the merits of sentence combining instruction and how it can be implemented.

3. Florez-Estrada (1995) – Effects of interactive writing via computer as compared to traditional journaling

In this small scale study (28 university students of Spanish) Florez-Estrada compared a group of learners exchanging e-mail and chatting online with native-speaking partners with another group of students engaged in interactive paper writing with their teachers. The researcher found that the computer group outperformed the control group on the accuracy of key grammar points such as preterite vs imperfect, ‘ser’ vs ‘estar’, ‘por’ vs ‘para’ and others. The findings of this study were echoed by another study of 40 students of German, Itzes (1994), which involved students in chatting via computer amongst themselves in the TL. A notable feature of this study is that the students chose the topics they wanted to chat about. These two studies confirm findings from my own practice; I often use Edmodo or Facebook to create a slow student-initiated chat on given topics in which the whole class is involved, every student sharing their opinions/comments with their peers with the assistance of dictionaries. I have found this activity very beneficial even with groups of less able learners.

4. Nummikoski (1991) and Caruso (1994) – Effects of extensive L2-reading on L2-writing proficiency as contrasted with written practice

Both studies investigated whether L2 learners who are engaged in extensive L2-reading (with no writing instruction/practice) write more effectively than L2 learners who are involved in writing tasks but do no reading. The results of both studies show a significant advantage for the writing-only condition. These studies, which are by no means flawless, do challenge the commonly held assumption that we can improve our students’ writing proficiency by engaging them in extensive reading.

5. Martinez-Lage (1992) – Comparison of focus-on-form with focus-on-form-free writing

The researcher investigated the impact of two writing-task types on the writing output of 23 second-year university Spanish students. The same students were asked to write (a) typical assigned compositions and (b) dialogue journals in which they were told they would not be assessed on grammar accuracy. The surprising finding was that the syntactic complexity across both task types was equivalent but the focus-on-form-free task type (journal writing) was grammatically more accurate. I concur with Martinez-Lage on this one as I have tried this strategy myself with many of my AS groups over the years.

6. Hedgcock and Lefkowitz (1992) – Effect of peer feedback in L2 writing

The researchers studied 30 students in an accelerated first year college French class, who wrote two essays involving three separate drafts. The experimental group was involved in peer feedback (essays were read aloud to each other and oral feedback was given), whilst the other group received written teacher feedback. In terms of performance from the first to the second essay both groups made significant improvements, but in different areas: the peer-feedback group got worse in grammar but did better on content, organization and vocabulary; the teacher-feedback group did exactly the opposite. It should be noted that a previous study by Piasecki (1988), which adopted a very similar design but lasted much longer (8 weeks) and involved 112 third-year high school students of Spanish, found no significant differences between the two conditions. This confirms my reservations about using peer feedback as an effective way to correct learner output and as a blanket corrective strategy; in my opinion it may work quite well with certain groups of individuals with highly developed grammar knowledge and critical thinking skills but not with others.

The causes of learner errors in L2 writing – an attempt to integrate Skill-theory and mainstream accounts of Second Language Acquisition

A cognitive account of errors in L2-writing rooted in skill acquisition and production theory

1. Introduction

The purpose of this paper is to shed light on the cognitive sources of errors. An understanding of the psycholinguistic mechanisms that cause our students to err is fundamental if we aim to significantly enhance the (surface-level) accuracy of their written output. In what follows, I intend to take the reader through the cognitive processes underlying second language writing, mapping out in detail the stages and contexts in which mistakes are usually made. In order for the reader to fully comprehend the ensuing discussion, I will begin by outlining four key concepts in Cognitive psychology which are essential for an understanding of any skill-acquisition theory of language development and production. I will then proceed to concisely discuss the way humans acquire languages according to one of the most widely accepted models of second language acquisition (Anderson, 2000). Finally, I will provide an exhaustive account of the way we process writing, rooted in Cognitive theory and resulting from an integration of a number of models of monolingual and bilingual production. I shall then draw my conclusions as to the implications of the reviewed theories and research for an approach to error correction.

2. Key concepts in Cognitive psychology

Before engaging in my discussion of L2-acquisition and L2-writing, I shall introduce the reader to the following concepts, central to any Cognitive theory of human learning and information processing:

1. Short-term and Long-Term Memory

2. Metalinguistic Knowledge and Executive Control

3. The representation of knowledge in memory

4. Proceduralisation or Automatisation

2.1 Short-Term Memory and Long-Term Memory

In Information Processing Theory, memory is conceived as a large and permanent collection of nodes, which become complexly and increasingly inter-associated through learning (Shiffrin and Schneider, 1977). Most models of memory identify a transient memory called ‘Short-Term Memory’ which can temporarily encode information and a permanent memory or Long-Term Memory (LTM). As Baddeley (1993) suggested, it is useful to think of Short-Term Memory as a Working Short-Term Memory (WSTM) consisting of the set of nodes which are activated in memory as we are processing information. In most Cognitive frameworks, WSTM is conceived as the provision of a work space for decision making, thinking and control processes and learning is but the transfer of patterns of activation from WSTM to LTM in such a way that new associations are formed between information structures or nodes not previously associated. WSTM has two key features:

(1) fragility of storage (the slightest distraction can cause the brain to lose the data being processed);

(2) limited channel capacity (it can only process a very limited amount of information for a very limited amount of time).

LTM, on the other hand, has unlimited capacity and can hold information over long periods of time. Information in LTM is normally in an inactive state. However, when we retrieve data from LTM the information associated with such data becomes activated and can be regarded as part of WSTM.

In the retrieval process, activation spreads through LTM from active nodes of the network to other parts of memory through an associative chain: when one concept is activated other related concepts become active. Thus, the amount of active information resulting can be much greater than the one currently held in WSTM. Since source nodes have only a fixed capacity for emitting activation (Anderson, 1980), and this capacity is divided amongst all the paths emanating from a given node, the more paths that exist, the less activation will be transmitted to any one path and the slower will be the rate of activation (fan effect). Thus, additional information about a concept interferes with memory for a particular piece of information thereby slowing the speed with which that fact can be retrieved. In the extreme case in which the to-be-retrieved information is too weak to be activated (owing, for instance, to minimal exposure to that information) in the presence of interference from other associations, the result will be failure to recall (Anderson, 2000).
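
The fan effect described above can be illustrated with a toy calculation. This is a deliberately simplified sketch: the function name `activation_per_path` and the equal split of capacity across paths are assumptions made for illustration only, not Anderson's actual equations.

```python
def activation_per_path(capacity: float, n_paths: int) -> float:
    """Share of a node's fixed activation capacity sent down each path.

    Assumes the capacity is split equally among all outgoing paths,
    which is a simplification made for this illustration.
    """
    if n_paths < 1:
        raise ValueError("a node needs at least one outgoing path")
    return capacity / n_paths


# A concept linked to 2 facts versus the same concept linked to 10 facts.
few_associations = activation_per_path(1.0, 2)
many_associations = activation_per_path(1.0, 10)
print(few_associations, many_associations)
```

The more facts are attached to a concept, the smaller the share of activation each one receives, which is one way of picturing why retrieval of any single fact slows down.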

2.2 Metalinguistic knowledge and executive control (processing efficiency)

This distinction originated from Bialystok (1982) and its validity has been supported by a number of studies (e.g. Hulstijn and Hulstijn, 1984). Knowledge is the way the language system is represented in LTM; Control refers to the regulation of the processing of that knowledge in WSTM during performance. The following is an example of how this distinction applies to the context of my study: many of my intermediate students usually know the rules governing the use of the Subjunctive Mood in Italian; however, they often fail to apply them correctly in Real Operating Conditions, that is, when they are required to process language in real time under communicative pressure (e.g. writing an essay under severe time constraints; giving a class presentation; etc.). The reason for this phenomenon may be that, since WSTM’s attentional capacity is limited, its executive-control systems may not cope efficiently with the attentional demands of a task if we are performing in operating conditions where worry, self-concern and task-irrelevant cognitive activities make use of some of the available limited capacity (Eysenck and Keane, 1995). These factors may cause retrieval problems in terms of reduced speed of recall/recognition or accuracy. Thus, as Bialystok (1982) and Johnson (1996) assert, L2-proficiency involves a degree of control as well as a degree of knowledge.

2.3 The representation of knowledge in memory

Declarative Knowledge is knowledge about facts and things, while Procedural Knowledge is knowledge about how to perform different cognitive activities. This dichotomy implies that there are two ‘paths’ for the production of behaviour: a procedural and a declarative one. Following the latter, knowledge is represented in memory as a database of rules stored in the form of a semantic network. In the procedural path, on the other hand, knowledge is embedded in procedures for action, readily at hand whenever they are required, and it is consequently easier to access.

Anderson (1983) provides the example of an EFL-learner following the declarative path of forming the present perfect in English. S/he would have to apply the rule: use the verb ‘have’ followed by the past participle, which is formed by adding ‘-ed’ to the infinitive of a verb. S/he would have to hold all the knowledge about the rule formation in WSTM and would apply it each time s/he is required to form the tense. This implies that declarative processing is heavy on channel capacity, that is, it occupies the vast majority of WSTM attentional capacity. On the other hand, the learner who followed the procedural path would have a ‘program’, stored in LTM with the following information: the present perfect of ‘play’ is ‘I have played’. Deploying that program, s/he would retrieve the required form without consciously applying any explicit rule. Thus, procedural processing is lighter on WSTM channel capacity than declarative processing.

2.4 Proceduralisation or Automatisation

Proceduralisation or Automatisation is the process of making a skill automatic. When a skill becomes proceduralised it can be performed without any cost in terms of channel capacity (i.e. “memory space”): skill performance requires very little conscious attention, thereby freeing up ‘space’ in WSTM for other tasks.

3. L2-Acquisition as skill acquisition: the Anderson Model

The Anderson Model, called ACT* (Adaptive Control of Thought), was originally created as an account of the way students internalise geometry rules. It was later developed as a model of L2-learning (Anderson, 1980, 1983, 2000). The fundamental epistemological premise of adopting a skill-development model as a framework for L2-acquisition is that language is considered as governed by the same principles that regulate any other cognitive skill. Scholars such as McLaughlin (1987), Levelt (1989), O’Malley and Chamot (1990) and Johnson (1996) have produced a number of persuasive arguments in favour of this notion.

Although ACT* constitutes my espoused theory of L2 acquisition, I do not endorse Anderson’s claim that his model alone can give a completely satisfactory account of L2-acquisition. I do believe, however, that it can be used effectively to conceptualise at least three important dimensions of L2-acquisition which are relevant to this study: (1) the acquisition of grammatical rules in explicit adult L2-instruction, (2) the developmental mechanisms of language processing and (3) the acquisition of Learning Strategies.

 Figure 1: The Anderson Model (adapted from Anderson, 1983)


The basic structure of the model is illustrated in Figure 1, above. Anderson posits three kinds of memory: Working Memory, Declarative Memory and Production (or Procedural) Memory. Working Memory shares the same features previously discussed in describing WSTM, while Declarative and Production Memory may be seen as two subcomponents of LTM. The model is based on the assumption that human cognition is regulated by cognitive structures (Productions) made up of ‘IF’ and ‘THEN’ conditions. These are activated every single time the brain is processing information; whenever a learner is confronted with a problem the brain searches for a Production that matches the data pattern associated with it. For example:

IF the goal is to form the present perfect of a verb and the person is 3rd singular/

THEN form the 3rd singular of ‘have’

IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed /

THEN form the past participle of the verb
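
For readers who find it easier to think in terms of a program, the two Productions above can be sketched as condition-action pairs operating on the contents of working memory. This is only an illustrative toy, not Anderson's implementation: the `Production` class, the dictionary used as working memory, the `run` loop and the regular-verb shortcut (adding ‘-ed’) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Production:
    name: str
    condition: Callable[[dict], bool]  # the IF side
    action: Callable[[dict], None]     # the THEN side


def form_have(wm: dict) -> None:
    wm["output"].append("has")
    wm["have_formed"] = True


def form_participle(wm: dict) -> None:
    wm["output"].append(wm["verb"] + "ed")  # regular verbs only (assumption)
    wm["goal"] = None                       # the goal has been satisfied


productions = [
    Production(
        "P_have",
        lambda wm: wm["goal"] == "present_perfect"
        and wm["person"] == "3sg"
        and not wm.get("have_formed"),
        form_have,
    ),
    Production(
        "P_participle",
        lambda wm: wm["goal"] == "present_perfect" and bool(wm.get("have_formed")),
        form_participle,
    ),
]


def run(rules: List[Production], wm: dict) -> List[str]:
    """Fire the first matching production repeatedly until none matches."""
    fired = []
    while True:
        match = next((p for p in rules if p.condition(wm)), None)
        if match is None:
            return fired
        match.action(wm)
        fired.append(match.name)


working_memory = {"goal": "present_perfect", "person": "3sg", "verb": "play", "output": []}
fired = run(productions, working_memory)
print(" ".join(working_memory["output"]))
```

Each step has to be matched and fired separately, which mirrors the slow, serial character of processing before Productions are compiled.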

The creation of a Production is a long and careful process since Procedural Knowledge, once created, is difficult to alter. Furthermore, unlike declarative units, Productions control behaviour, thus the system must be circumspect in creating them. Once a Production has been created and proved to be successful, it has to be automatised in order for the behaviour that it controls to happen at naturalistic rates. According to Anderson (1985), this process goes through three stages: (1) a Cognitive Stage, in which the brain learns a description of a skill; (2) an Associative Stage, in which it works out a method for executing the skill; (3) an Autonomous Stage, in which the execution of the skill becomes more and more rapid and automatic.

In the Cognitive Stage, confronted with a new task requiring a skill that has not yet been proceduralised, the brain retrieves from LTM all the declarative representations associated with that skill, using the interpretive strategies of Problem-solving and Analogy to guide behaviour. This procedure is very time-consuming, as all the stages of a process have to be specified in great detail and in serial order in WSTM. Although each stage is a Production, the operation of Productions in interpretation is very slow and burdensome as it is under conscious control and involves retrieving declarative knowledge from LTM. Furthermore, since this declarative knowledge has to be kept in WSTM, the risk of cognitive overload leading to error may arise.

Thus, for instance, in translating a sentence from the L1 into the L2, the brain will have to consciously retrieve the rules governing the use of every single L2-item, applying them one by one. In the case of complex rules whose application requires performing several operations, every single operation will have to be performed in serial order under conscious attentional control. For example, in forming the third person of the Present perfect of ‘go’, the brain may have to: (1) retrieve and apply the general rule of the present perfect (have + past participle); (2) perform the appropriate conjugation of ‘have’ by retrieving and applying the rule that the third person of ‘have’ is ‘has’; (3) recall that the past participle of ‘go’ is irregular; (4) retrieve the form ‘gone’.

Producing language by these means is extremely inefficient. Thus, the brain tries to sort out the information into more efficient Productions. This is achieved by Compiling (‘running together’) the productions that have already been created so that larger groups of productions can be used as one unit. The Compilation process consists of two sub-processes: Composition and Proceduralisation. Composition takes a sequence of Productions that follow each other in solving a particular problem and collapses them into a single Production that has the effect of the sequence. This process lessens the number of steps referred to above and has the effect of speeding up the process. Thus, the Productions

P1 IF the goal is to form the present perfect of a verb / THEN form the simple present of have

P2 IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed / THEN form the past participle of the verb

would be composed as follows:

P3 IF the goal is to form the present perfect of a verb / THEN form the present simple of have and THEN the past participle of the verb
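
Composition can be pictured as chaining the actions of two Productions that fire one after the other. The sketch below is illustrative only: the `Production` class and the `compose` function are hypothetical names, and conditions and actions are kept as plain text rather than executable rules.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Production:
    name: str
    condition: str            # the IF side, kept as plain text
    actions: Tuple[str, ...]  # the THEN side, an ordered sequence of steps


def compose(first: Production, second: Production, name: str) -> Production:
    """Collapse two productions that fire in sequence into a single one.

    The new production keeps the condition of the first and performs
    the actions of both, in order.
    """
    return Production(name, first.condition, first.actions + second.actions)


p1 = Production(
    "P1",
    "the goal is to form the present perfect of a verb",
    ("form the simple present of 'have'",),
)
p2 = Production(
    "P2",
    "the goal is to form the present perfect of a verb and 'have' has just been formed",
    ("form the past participle of the verb",),
)
p3 = compose(p1, p2, "P3")
print(p3)
```

Note that `p1` and `p2` still exist after `compose` is called; the new Production is added alongside them rather than replacing them.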

An important point made by Anderson is that newly composed Productions are weak and may require multiple creations before they gain enough strength to compete successfully with the Productions from which they are created. Composition does not replace Productions; rather, it supplements the Production set. Thus, a composition may be created on the first opportunity but may be ‘masked’ by stronger Productions for a number of subsequent opportunities until it has built up sufficient strength (Anderson, 2000). This means that even if the new Production is more effective and efficient than the stronger Production, the latter will be retrieved more quickly because its memory trace is stronger.

The process of Proceduralisation eliminates clauses in the condition of a Production that require information to be retrieved from LTM and held in WSTM. As a result, proceduralised knowledge becomes available much more quickly than non-proceduralised knowledge. For example, the Production P3 above would become

IF the goal is to form the present perfect of a verb

THEN form ‘has’ and then form the past participle of the verb

The process of Composition and Proceduralisation will eventually produce after repeated performance:

IF the goal is to form the present perfect of ‘play’ / THEN form ‘has played’

For Anderson it seems reasonable to suggest that Proceduralisation only occurs when LTM knowledge has achieved some threshold of strength and has been used some criterion number of times. The mechanism through which the brain decides which Productions should be applied in a given context is called Matching by Anderson. When the brain is confronted with a problem, activation spreads from WSTM to Procedural Memory in search of a solution – i.e. a Production that matches the pattern of information in WSTM. If such matching is possible, then a Production will be retrieved. If the pattern to be matched in WSTM corresponds to the ‘condition side’ (the ‘if’) of a proceduralised Production, the matching will be quicker, with the ‘action side’ (the ‘then’) of the Production being deposited in WSTM and made immediately available for performance (execution). It is at this intermediate stage of development that most serious errors in acquiring a skill occur: during the conversion from Declarative to Procedural knowledge, unmonitored mistakes may slip into performance.

The final stage consists of the process of Tuning, made up of the three sub-processes of Generalisation, Discrimination and Strengthening. Generalisation is the process by which Production rules become broader in their range of applicability, thereby allowing the speaker to generate and comprehend utterances never before encountered. Where two existing Productions partially overlap, it may be possible to combine them to create a greater level of generality by deleting a condition that was different in the two original Productions. Anderson (1982) produces the following example of Generalisation from language acquisition, in which P6 and P7 become P8:

P6 IF the goal is to indicate that a coat belongs to me THEN say ‘My coat’

P7 IF the goal is to indicate that a ball belongs to me THEN say ‘My ball’

P8 IF the goal is to indicate that object X belongs to me THEN say ‘My X’
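
The step from P6 and P7 to P8 can be mimicked with a few lines of code. This is a rough sketch under stated assumptions: Productions are represented as a condition dictionary plus an action string, the function name `generalise` is hypothetical, and the variable is always called ‘X’.

```python
from typing import Optional, Tuple


def generalise(
    cond_a: dict, action_a: str, cond_b: dict, action_b: str
) -> Optional[Tuple[dict, str]]:
    """Merge two productions that differ in exactly one condition value.

    The differing value is replaced by the variable 'X' in both the
    condition and the action. Returns None when no such merge is possible.
    """
    if cond_a.keys() != cond_b.keys():
        return None
    differing = [key for key in cond_a if cond_a[key] != cond_b[key]]
    if len(differing) != 1:
        return None
    key = differing[0]
    general_action = action_a.replace(cond_a[key], "X")
    if general_action != action_b.replace(cond_b[key], "X"):
        return None
    general_cond = dict(cond_a)
    general_cond[key] = "X"
    return general_cond, general_action


p6 = ({"goal": "indicate that something belongs to me", "object": "coat"}, "My coat")
p7 = ({"goal": "indicate that something belongs to me", "object": "ball"}, "My ball")
p8 = generalise(p6[0], p6[1], p7[0], p7[1])
print(p8)
```

The same mechanism, applied where the two examples are not really parallel, is what produces over-generalised rules.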

Discrimination is the process by which the range of application of a Production is restricted to the appropriate circumstances (Anderson, 1983). These processes would account for the way language learners over-generalise rules but then learn over time to discriminate between, for example, regular and irregular verbs. This process would require that we have examples of both correct and incorrect applications of the Production in our LTM.

Both processes are inductive in that they try to identify from examples of success and failure the features that characterize when a particular Production rule is applicable. These two processes produce multiple variants on the conditions (the ‘IF’ clause(s) of a Production) controlling the same action. Thus, at any point in time the system is entertaining as its hypothesis not just a single Production but a set of Productions with different conditions to control the action.

Since they are inductive processes, Generalization and Discrimination will sometimes err and produce incorrect Productions. As I shall discuss later in this chapter, there are possibilities for Overgeneralization and useless Discrimination, two phenomena that are widely documented in L2-acquisition research (Ellis, 1994). Thus, the system may simply create Productions that are incorrect, either because of misinformation or because of mistakes in its computations.
ACT* uses the Strengthening mechanism to identify the best problem-solving rules and eliminate wrong Productions. Strengthening is the process by which better rules are strengthened and poorer rules are weakened. This takes place in ACT* as follows: each time a condition in WSTM activates a Production from procedural memory and causes an action to be deployed and there is no negative feedback, the Production will become more robust. Because it is more robust it will be able to resist occasional negative feedback and also it will be more strongly activated when it is called upon:
The strength of a Production determines the amount of activation it receives in competition with other Productions during pattern matching. Thus, all other things being equal, the conditions of a stronger Production will be matched more rapidly and so repress the matching of a weaker Production (Anderson, 1983: 251).
Thus, if a wrong Interlanguage item has acquired greater strength in a learner’s LTM than the correct L2-item, when activation spreads the former is more likely to be activated first, giving rise to error. It is worth pointing out that, just as the strength of a Production increases with successful use, there is a power-law of decay in strength with disuse.
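The competition between a strong erroneous form and a weaker correct one can be made concrete with a toy simulation. The sketch below is an assumption-laden illustration rather than ACT* itself: the names `retrieve`, `strengthen` and `decayed`, the gain and penalty values, the decay exponent and the ‘goed’/‘went’ pair are all invented here for the example.

```python
from dataclasses import dataclass


@dataclass
class Production:
    output: str      # form the Production generates
    correct: bool    # whether the form is the target L2 form
    strength: float


def retrieve(candidates):
    """Competition during matching: the strongest Production wins."""
    return max(candidates, key=lambda p: p.strength)


def strengthen(p, negative_feedback, gain=1.0, penalty=0.25):
    """Reinforce a Production after use, or weaken it on negative feedback.

    gain and penalty are arbitrary illustrative values.
    """
    if negative_feedback:
        p.strength *= 1.0 - penalty
    else:
        p.strength += gain


def decayed(strength, time_unused, exponent=0.5):
    """Power-law decay of strength with disuse (exponent is illustrative)."""
    return strength * (1.0 + time_unused) ** -exponent


wrong = Production("goed", correct=False, strength=5.0)
right = Production("went", correct=True, strength=2.0)

first = retrieve([wrong, right]).output  # the stronger, erroneous form wins

for _ in range(3):
    strengthen(wrong, negative_feedback=True)   # the error is corrected
    strengthen(right, negative_feedback=False)  # the correct form is used successfully

later = retrieve([wrong, right]).output
print(first, later)  # goed went
```

Because the penalty is proportional, a single correction barely dents a strong Production, which mirrors the claim that robust Productions resist occasional negative feedback; only repeated correction, combined with successful use of the competing form, reverses the outcome.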
4. Extending the model: adding a ‘Procedural-to-Procedural route’ to L2-acquisition
One limitation of the model is that it does not account for the fact that unanalysed L2-chunks of language are sometimes acquired through rote learning or frequent exposure. This happens quite frequently in classroom settings, for instance with set phrases used in everyday teacher-to-student communication (e.g. ‘Open the book’, ‘Listen up!’). As a solution to this issue, Johnson (1996) suggested extending the model by allowing for the existence of a ‘Procedural-to-Procedural route’ to acquisition whereby some unanalysed L2-items can be automatised with use, ‘jumping’, as it were, the initial Declarative Stage posited by Anderson. In classroom settings where instruction is grammar-based, however, only a minority of L2-items will be acquired this way.

5. Bridging the ‘gap’ between the Anderson Model and ‘mainstream’ second language acquisition (SLA) research

As already pointed out above, a number of theorists believe that Anderson provides a viable conceptualisation of the processes central to L2-acquisition. However, ACT* was intended as a model of acquisition of cognitive skills in general and not specifically of L2-acquisition. Thus, the model rarely concerns itself explicitly with the following phenomena documented by SLA researchers: Language Transfer, Communication Strategies, Variability and Fossilization. These phenomena are relevant to secondary school settings for the following reasons. Firstly, Language Transfer and Communication Strategies constitute common sources of error in the written output of intermediate L2-learners. Variability, on the other hand, refers to the phenomenon, particularly evident in the writing of beginner to intermediate learners, whereby learners produce a given structure correctly in certain contexts and incorrectly in others. Finally, Fossilization is often put forward as a possible explanation of the recurrence of erroneous Interlanguage forms in learner production. Although these phenomena can be accounted for in Anderson’s framework, I believe that a discussion of mainstream SLA theories and research will enhance the reader’s understanding of their nature and implications for L2 teaching. It should be noted that for reasons of relevance and space my discussion will be concise and focus only on the aspects which are most relevant to the present study.

5.1 Language Transfer

This phenomenon refers to the way prior linguistic knowledge influences L2-learner development and performance (Ellis, 1994). The occurrence of Language Transfer can be accounted for by applying the ACT* framework since, as Anderson asserts, existing Declarative Knowledge is the starting point for acquiring new knowledge and skills. In a language-learning situation this means drawing on knowledge about previously learnt languages both in order to understand the mechanisms of the target language and to solve a communicative problem. In this section, I shall draw on the SLA literature in order to explain how, when and why Language Transfer occurs and with what effects on learner written output.

As Odlin (1989) points out, Language Transfer can be positive, facilitating L2-performance. This is often the case with students of mine who studied French or Spanish and are able to transfer their knowledge of these languages advantageously to Italian because Romance languages share a large number of cognates and grammatical rules. However, Language Transfer can also be negative, resulting in erroneous L2-output. For instance, over-confidence in the fact that Italian and French/Spanish are similar may prompt a learner with L3-French to apply the rules of the French Subjunctive in the deployment of the Italian Subjunctive. This strategy will be effective in some contexts but unsuccessful in others.
Transfer can also result in the avoidance or the over-production of L2-structures. For example, several intermediate Japanese learners of Italian I taught in the past avoided using relative clauses because these do not exist in their L1. On the other hand, they over-used the definite article because, being totally unfamiliar with the concept of definite article in their language and noticing that Italians use it frequently, they thought that they were less likely to err if they used it all the time.
Transfer can occur as a deliberate Compensatory Strategy: a learner’s conscious attempt to fill a gap in his/her L2-knowledge (Faerch and Kasper, 1983). This phenomenon is particularly recurrent when the distance between the learner’s L1/L3 and the target language is perceived as small (e.g. Spanish and Italian). Transfer can also occur subconsciously (Poulisse, 1990). When used as a Compensatory Strategy, Transfer can give rise to ‘Foreignization’ and ‘Code-switching’ errors. The former refers to the conscious alteration of L1- or L3-words to make them ‘sound’ target-language-like. For instance, not knowing the Italian for ‘rice’ (= riso), a French learner may add an ‘o’ to the French word ‘riz’ in the hope that the resulting ‘rizo’ will be correct. Code-switching, instead, consists in the conscious or subconscious use of unaltered L1-/L3-words/phrases when an L2-word is required. Both types of error are more likely to happen in spoken language, especially when a learner is under communicative pressure or does not have access to dictionaries or other sources of L2-knowledge. However, I have personally observed this phenomenon also in the writing of many L2-student writers, especially at the level of connectives (e.g. the French conjunction ‘et’ instead of the Italian ‘e’).
Transfer may affect any level of L2-learner output. As far as the areas of language use more relevant to the present study are concerned (syntax, morphology and lexis), Ringbom (1987) reports evidence from Ringbom (1978) and other studies (e.g. Sjoholm, 1982) that L1-Transfer affects lexical usage more than it does syntax or morphology. Of these two, it appears that morphology is the less affected area. The following factors appear to determine the extent to which Language Transfer occurs:
(1) Perceived language distance: the closer two languages are perceived to be, the more likely Transfer is to occur (see Sjoholm, 1982);
(2) Learning environment: it appears that Transfer is more likely to occur in settings where the naturalistic input is lower (Odlin, 1989);
(3) Levels of monitoring: Gass and Selinker (1983) observe that careful, monitored learner output usually contains fewer instances of Transfer errors;
(4) Learner-type: learners who take more risks and are more meaning-oriented tend to transfer less than form-focused ones (Odlin, 1989);
(5) Task: some tasks appear to elicit greater use of Transfer (Odlin, 1989). This appears to be the case for L1-into-L2 translation, including the approach, typical of many beginner L2-learners, whereby an L2-essay is produced first in the L1 and then translated word by word;
(6) Proficiency: as the Anderson Model and many other Cognitive models (e.g. deBot, 1992) posit, the starting point of acquisition is the L1, which is gradually replaced by the target language as more and more L2-items are acquired. Thus, Transfer is more likely to occur at the early stages of development than at the advanced ones. This is borne out by a number of studies (e.g. Taylor, 1975; Liceras, 1985; Major, 1987). Kellerman (1978), however, found that a number of Transfer errors occur only at advanced stages.
5.2 Communication Strategies
Due to space constraints, my discussion of Communication Strategies (CSs) will be limited to the basic issues and levels of language (i.e. grammar, lexis and orthography) relevant to this study. Corder (1978) defined a CS as follows:

a systematic technique employed by a speaker to express his meaning when faced with some difficulty. Difficulty in this definition is taken to refer uniquely to the speaker’s inadequate command of the language in the interaction (Corder, 1978: 8)

A number of taxonomies of CSs have been suggested. Most frameworks (e.g. Faerch and Kasper, 1983) identify two types of approaches to solving problems in communication: (1) avoidance behaviour (avoiding the problem altogether); (2) achievement behaviour (attempting to solve the problem through an alternative plan). In Faerch and Kasper’s (1983) framework, the two different approaches result respectively in the deployment of (a) reduction strategies, governed by avoidance behaviour, and (b) achievement strategies, governed by achievement behaviour.

Reduction strategies can affect any level of writing from content (Topic avoidance) to orthography (Graphological avoidance). Most CS studies, however, have focused on lexical items. Achievement strategies (Faerch and Kasper, 1983) correspond to Tarone’s (1981) concept of Production Strategies and to Corder’s (1978, 1983) Resource expansion strategies. By using an achievement strategy, the learner attempts to solve problems in communication by expanding his communicative resources (Corder, 1978) rather than by reducing his communicative goal (functional reduction). Faerch and Kasper (1983) identify two broad categories of achievement strategies: Compensatory and Non-linguistic. The Compensatory strategies relevant to the present study are:
(1) Code switching (see 5.1 above)

(2) Interlingual transfer (see 5.1 above)

(3) Inter-/intralingual transfer, i.e. a generalization of an IL rule is made but the generalization is influenced by the properties of the corresponding L1-structures (Jordens, 1977)

(4) IL based strategies. These include:

(i) Generalization: the extension of an item to an inappropriate context in order to fill the ‘gaps’ in the learner’s plans. One type of generalization relevant to the present study is Approximation, that is, the use of a lexical item to express only an approximation of the intended meaning.

(ii) Word coinage: the learner’s creative construction of a new IL word.

5.3 Variability: the occurrence of unsystematic errors
Variability in learner language refers to the phenomenon whereby a given structure is produced correctly in certain contexts and incorrectly in others. As Ellis (1994) observed, this phenomenon is very common in the early stages of acquisition and may rapidly disappear. The Anderson model can be used to account for Variability as follows: firstly, as Anderson posits, two or more Productions which refer to different hypotheses about the use of a structure can co-exist in a learner’s LTM before the onset of the Discrimination process. These Productions compete for retrieval and, if they have more or less equal strength, may be used alternately at a given stage of development as the learner is testing their effectiveness through the trial-and-error process which characterizes the early stages of learning.
Secondly, if amongst the Productions relative to a given structure, Production ‘X’ based on the correct rule is much weaker than Production ‘Y’ based on an incorrect rule, Production ‘Y’ is likely to be retrieved first when a learner is not devoting sufficient conscious attention to it and his/her brain ‘runs on automatic’. The lack of attention is usually determined by processing inefficiency, that is, the incapacity of WSTM to cope with the demands that the task poses on its attentional system (Bygate, 1988). Processing inefficiency issues in writing are more likely to arise in unplanned and/or unmonitored production (Krashen, 1977, 1981), especially when the L2-learner is under severe time constraints / communicative pressure (Polio, Fleck and Leder, 1998).
A third cause of Variability refers to what above I called the ‘Procedural-to-Procedural route’ to acquisition: aspects of the usage of a structure may have been acquired by a learner through the rote learning of or exposure to set L2-phrases (e.g. classroom phrases). Thus, in cases where that structure is well beyond that learner’s stage of development and s/he does not have any declarative knowledge of it, s/he will deploy that structure correctly within the context of those set phrases while being likely to make mistakes with it in other contexts.
5.4 Fossilization

In the SLA literature, Fossilization (or Routinization) refers to the phenomenon whereby some IL forms keep reappearing in a learner’s Interlanguage ‘in spite of the learner’s ability, opportunity and motivation to learn the target language…’ (Selinker and Lamendella, 1979: 374). An error can become fossilized even if L2-learners possess correct declarative knowledge about that form and have received intensive instruction on it (Mukattash, 1986).

Applying the Anderson Model, Fossilization can be explained as the Proceduralisation of an erroneous form through frequent and successful use. As already discussed, Productions that have been proceduralised are very difficult to alter, which would explain why some theorists believe that Fossilization is a permanent state (Lamendella, 1977; Mukattash, 1986). For applied linguists working in the Skill-theory paradigm, errors can be de-fossilized, but only after a lengthy and painstaking process of re-learning of the correct form through targeted monitoring and practice in real operating conditions (Johnson, 1996).
Several models (biological, acculturational, interactional, etc.) have been proposed to account for the development of Fossilization in L2-learning. Interactional models state that the interaction between the learner and other L2-speakers determines whether a component of the learner’s Interlanguage system is reinforced contributing to Fossilization. One such model, Tollefson and Firn’s (1983), posits that an overemphasis on conveyance of meaning in the classroom may, in the absence of cognitive feedback, promote fossilization.
On this issue, Johnson (1996) also asserts that linguistic survival is often achieved by a form of pidgin and that encouraging this type of communication in the language classroom is a practice conducive to fossilization. Skehan (1994) and Long (1983) also make the point that communicative production might lead to the development of reduction strategies resulting in pidginogenesis and fossilization.
6. A Cognitive account of the writing processes: the Hayes and Flower (1980) model

Hayes and Flower’s (1980) model of essay writing is regarded as one of the most effective accounts of writing available to date (Eysenck and Keane, 1995). As Figure 1 below shows, it posits three major components:

1. Task-environment,

2. Writer’s Long-Term Memory,
3. Writing process.

Figure 1: The Hayes and Flower model (adapted from Hayes and Flower, 1980)

The three components can be described as follows: (1) the Task-environment includes the writing assignment (the topic, the target audience, and motivational factors) and the text produced so far; (2) the Writer’s LTM provides factual knowledge and skill/genre-specific procedures; (3) the Writing Process consists of the three sub-processes of Planning, Translating and Reviewing.

The Planning process sets goals based on information drawn from the Task-environment and Long-Term Memory (LTM). Once these have been established, a writing plan is developed to achieve those goals. More specifically, the Generating sub-process retrieves information from LTM through an associative chain in which each item of information retrieved functions as a cue to retrieve the next item of information and so forth. The Organising sub-process selects the most relevant items of information retrieved and organizes them into a coherent writing plan. Finally, the Goal-setting sub-process sets rules (e.g. ‘keep it simple’) that will be applied in the editing process. The second process, Translating, transforms the information retrieved from LTM into language. This is necessary since concepts are stored in LTM in the form of Propositions, not words. Flower and Hayes (1980) provide the following examples of what propositions involve:

[(Concept A) (Relation B) (Concept C)]

[(Concept D) (Attribute E)], etc.

Finally, the Reviewing processes of Reading and Editing have the function of enhancing the quality of the output. The Editing process checks that discourse conventions are not being flouted, looks for semantic inaccuracies and evaluates the text in the light of the writing goals. Editing has the form of a Production system with two IF-THEN conditions:

The first part specifies the kind of language to which the editing production applies, e.g. formal sentences, notes, etc. The second is a fault detector for such problems as grammatical errors, incorrect words, and missing context. (Hayes and Flower, 1980: 17)

 When the conditions of a Production are met, e.g. a wrong word ending is detected, an action is triggered for fixing the problem. For example:

CONDITION 1: (formal sentence)

CONDITION 2: first letter of sentence lower case

ACTION: change first letter to upper case

(Adapted from Hayes and Flower, 1980: 17)
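The example production above can be read as a small rule: two tests on the input, followed by a fix when both succeed. The Python sketch below is only an illustration of that reading; the function name `edit_sentence` and the `register` labels are assumptions made here, not part of Hayes and Flower’s model.

```python
def edit_sentence(sentence: str, register: str) -> str:
    """Apply one Editing Production to a sentence."""
    applies = register == "formal"   # first test: the kind of language
    fault = sentence[:1].islower()   # second test: the fault detector
    if applies and fault:
        # Action: change the first letter to upper case.
        return sentence[0].upper() + sentence[1:]
    return sentence


print(edit_sentence("the essay is due on Friday.", "formal"))  # The essay is due on Friday.
print(edit_sentence("buy milk, call Anna", "notes"))           # buy milk, call Anna
```

The second call leaves the text untouched because the production does not apply to notes, even though the same ‘fault’ is present.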

Two important features of the Editing process are: (1) it is triggered automatically whenever the conditions of an Editing Production are met; (2) it may interrupt any other ongoing process. Editing is regulated by an attentional system called the Monitor. Hayes and Flower do not provide a detailed account of how it operates. Unlike Krashen’s (1977) Monitor, a control system used solely for editing, Hayes and Flower’s (1980) device operates at all levels of production, orchestrating the activation of the various sub-processes. This allows Hayes and Flower to account for two phenomena they observed. Firstly, the Editing and the Generating processes can cut across other processes. Secondly, the existence of the Monitor enables the system to be flexible in the application of goal-setting rules, in that through the Monitor any other process can be triggered. This flexibility allows for the recursiveness of the writing process.

7. Extending the model: Cognitive accounts of the translating sub-processes and insights from proofreading research

Hayes and Flower’s model is useful in providing teachers with a framework for understanding the many demands that essay writing poses on students. In particular, it helps teachers understand how the recursiveness of the writing process may cause those demands to interfere with each other causing cognitive overload and error. Furthermore, by conceptualising editing as a process that can interrupt writing at any moment, the model has a very important implication for a theory of error: self-correctable errors occurring at any level of written production are not always the result of a retrieval failure; they may also be interpreted as caused by detection failure. However, one limitation of the model for a theory of error is that its description of the Translating and Editing sub-processes is too general. I shall therefore supplement it with Cooper and Matsuhashi’s (1983) list of writing plans and decisions along with findings from other L1-writing Cognitive research, which will provide the reader with a more detailed account. I shall also briefly discuss some findings from proofreading research which may help explain some of the problems encountered by L2-student writers during the Editing process.

7.1 The translating sub-processes

Cooper and Matsuhashi (1983) posit four stages, which correspond to Hayes and Flower’s (1980) Translating: Wording, Presenting, Storing and Transcribing. In the first stage, the brain transforms the propositional content into lexis. Although at this stage the pre-lexical decisions the writer made at earlier stages and the preceding discourse limit lexical choice, Wording the proposition is still a complex task: ‘the choice seems infinite, especially when we begin considering all the possibilities for modifying or qualifying the main verb and the agentive and affected nouns’ (Cooper and Matsuhashi, 1983: 32). Once s/he has selected the lexical items, the writer has to tackle the task of Presenting the proposition in standard written language. This involves making a series of decisions in the areas of genre and grammar. In the area of grammar, Agreement and Tense will be the main issues.
The proposition, as planned so far, is then temporarily stored in Working Short Term Memory (WSTM) while Transcribing takes place. Propositions longer than just a few words will have to be rehearsed and re-rehearsed in WSTM so that parts of them are not lost before the transcription is complete. The limitations of WSTM create serious disadvantages for unpractised writers. Until they gain some confidence and fluency with spelling, their WSTM may have to be loaded up with letter sequences of single words or with only 2 or 3 words (Hotopf, 1980). This not only slows down the writing process, but it also means that all other planning must be suspended during the transcription of short letter or word sequences.

The physical act of transcribing the fully formed proposition begins once the graphic image of the output has been stored in WSTM. In L1-writing, transcription occupies subsidiary awareness, enabling the writer to use focal awareness for other plans and decisions. In practised writers, transcription of certain words and sentences can be so automatic as to permit planning the next proposition while one is still transcribing the previous one. An interesting finding with regard to these final stages of written production comes from Bereiter, Fine and Gartshore (1979), who investigated L1-writers aged 10-12. They identified several discrepancies between learners’ forecasts in think-aloud and their actual writing. 78% of such discrepancies involved stylistic variations. Notably, in 17% of the forecasts, significant words were uttered which did not appear in the writing. In about half of these cases the result was a syntactic flaw (e.g. the forecasted phrase ‘on the way to school’ was written ‘on the to school’). Bereiter and Scardamalia (1987) believe that lapses of this kind indicate that language is lost somewhere between storage in WSTM and grapho-motor execution. These lapses, they also assert, cannot be described as ‘forgetting what one was going to say’ since almost every omission was reported on recall: in the case of ‘on the to school’, for example, the author not only intended to write ‘on the way’ but claimed later to have written it. In their view, this is caused by interference from the attentional demands of the mechanics of writing (spelling, capitalization, etc.), the underlying psychological premise being that a writer has a limited amount of attention to allocate and that whatever is taken up with the lower-level demands of written language must be taken from something else.

In sum, Cooper and Matsuhashi (1983) posit two stages in the conversion of the preverbal message into a speech plan: (1) the selection of the right lexical units and (2) the application of grammatical rules. The unit of language is then deposited in WSTM awaiting translation into grapho-motor execution. This temporary storage raises the possibility that lower-level demands affect production in two ways: (1) by causing the writer to omit material during grapho-motor execution; (2) by leading the writer to forget higher-level decisions already made. Interference resulting in WSTM loss can also be caused by lack of monitoring of the written output due to devoting conscious attention entirely to planning ahead, while leaving the process of transcription to run ‘on automatic’.

7.3 Some insights from proofreading research

Proofreading theories and research provide us with the following important insights into the mechanisms that regulate essay editing. Firstly, proofreading involves different processes from reading: when one proofreads a passage, one is generally looking for misspellings, words that might have been omitted or repeated, typographical mistakes, etc., and as a result, comprehension is not the goal. When one is reading a text, on the other hand, one’s primary goal is comprehension. Thus, reading involves construction of meaning, while proofreading involves visual search. For this reason, in reading, short function words, not being semantically salient, are not fixated (Paap, Newsome, McDonald and Schvaneveldt, 1982). Consequently, errors on such words are less likely to be spotted when one is editing a text concentrating mostly on its meaning than when one is focusing one’s attention on the text as part of a proofreading task (Haber and Schindler, 1981). Missed errors are likely to decrease even further when the proofreader is forced to fixate on every single function word in isolation (Haber and Schindler, 1981).

It should also be noted that some proofreaders’ errors appear to be due to acoustic coding. This refers to the phenomenon whereby the way a proofreader pronounces a word/diphthong/letter influences his/her detection of an error. For example, if an English learner of L2-Italian pronounces the ‘e’ in the singular noun ‘stazione’ (= train station) as [i] instead of [e], s/he will find it difficult to differentiate it from the plural ‘stazioni’ (= train stations). This may impinge on her/his ability to spot errors with that word involving the use of the singular for the plural and vice versa.
The implications for the present study are as follows. Firstly, learners may have to be trained to go through their essays at least once focusing exclusively on form. Secondly, they should be asked to pay particular attention to those words (e.g. function words) and parts of words (e.g. verb endings) that they may not perceive as semantically salient.

7.4 Bilingual written production: adapting the unilingual model

Writing, although slower than speaking, is still processed at enormous speed in mature native speakers’ WSTM. The processing time required by a writer will be greater in the L2 than in the L1 and will increase at lower levels of proficiency: at the Wording stage, more time will be needed to match non-proceduralized lexical materials to propositions; at the Presenting stage, more time will be needed to select and retrieve the right grammatical form. Furthermore, more attentional effort will be required in rehearsing the sentence plans in WSTM; in fact, just like Hotopf’s (1980) young L1-writers, non-proficient L2-learners may be able to store in WSTM only two or three words at a time. This has implications for Agreement in Italian, in view of the fact that words more than three or four words distant from one another may still have to agree in gender and number. Finally, in the Transcribing phase, the retrieval of spelling and other aspects of the writing mechanics will take up more WSTM focal awareness.

Monitoring too will require more conscious effort, increasing the chances of WSTM loss. This is more likely to happen with less expert learners: since their attentional system has to monitor levels of language that are normally automatized in the mature L1-speaker, it will not have enough channel capacity available, at the point of utterance, to cope with lexical/grammatical items that have not yet been proceduralised. This also implies that Editing is likely to be more recursive than in L1-writing, interrupting other writing processes more often, with consequences for the higher meta-components. In view of the attentional demands posed by L2-writing, the interference caused by planning ahead will also be more likely to occur, giving rise to processing failure. Processing failure/WSTM loss may also be caused by the L2-writer pausing to consult dictionaries or other resources to fill gaps in their L2-knowledge while rehearsing the incomplete sentence plan in WSTM. In fact, research indicates that although, in general terms, composing patterns (sequences of writing behaviours) are similar in L1s and L2s, there are some important differences.
In his seminal review of the L1/L2-writing literature, Silva (1993) identified a number of discrepancies between L1- and L2-composing. Firstly, L2-composing was clearly more difficult. More specifically, the Transcribing phase was more laborious, less fluent, and less productive. Also, L2-writers spent more time referring back to an outline or prompt and consulting dictionaries. They also experienced more problems in selecting the appropriate vocabulary. Furthermore, L2-writers paused more frequently and for longer, which resulted in L2-writing occurring at a slower rate. As far as Reviewing is concerned, Silva (1993) found evidence in the literature that in L2-writing there is usually less re-reading of and reflecting on written texts. He also reported evidence suggesting that L2-writers revise more, before and while drafting, and in between drafts. However, this revision was more problematic and more of a preoccupation. There also appears to be less auditory monitoring in the L2, and L2-revision seems to focus more on grammar and less on mechanics, particularly spelling. Finally, the text features of L2-written texts provide strong evidence suggesting that L2-writing is a less fluent process involving more errors and producing – at least in terms of the judgements of native English speakers – less effective texts.
8. Conclusion: Implications for teaching and learning
In the above I have discussed my espoused theories of L2-acquisition and L2-writing. I started by focusing on Anderson’s (1980, 1982, 1983, 2000) account of how language structures are acquired and language processing develops. Drawing on SLA research I then discussed some important phenomena and processes involved in the aetiology of error relevant to the present study. Finally, I discussed Hayes and Flower’s (1980) and Cooper and Matsuhashi’s (1983) models of written production and their implications for bilingual written production. The following notions emerging from my discussion must, in my view, provide the theoretical underpinnings of any remedial corrective approach to L2 writing errors:
(1) L2-acquisition occurs in much the same way as the acquisition of any other cognitive skill;

(2) the acquisition of a skill begins consciously with an associative stage during which the brain creates a declarative representation of Productions (i.e. the procedures that regulate that skill);

(3) it is an adaptive feature of the human brain to make the performance of any skill automatic in order to render its execution fast and efficient in terms of cognitive processing;

(4) automatisation can be a very lengthy process, since for a skill to become automatic it must be performed numerous times;

(5) the Productions that regulate a skill become automatised only if their application is perceived by the brain as resulting in positive outcomes;

(6) at a given stage in learner development, more than one Production relating to a given item can co-exist in his/her Interlanguage. These compete for retrieval. The Production with the strongest memory trace – not necessarily the correct one – will win;

(7) negative evidence as to the effectiveness of a Production determines whether it is going to be rejected by the brain or automatised;

(8) once a Production (including those giving rise to errors) is automatised, it is difficult to alter;

(9) errors may be the result of lack of knowledge or processing efficiency problems;

(10) learners use Language Transfer and Communication Strategies to make up for the absence of the appropriate L2-declarative knowledge necessary in order to realize a given communicative goal. These phenomena are likely to give rise to error.

(11) the writing process is recursive and can be interrupted by editing at any time;

(12) the errors in L2-writing relating to morphology and syntax occur mostly in the Translating phase of the writing process when Propositions are converted into language. They may occur as a result of cognitive overload caused by the interference of various processes occurring simultaneously and posing cognitive demands beyond the processing ability of the writer’s WSTM.

(13) editing for meaning involves different processes than editing for form. When editing for meaning the writer/editor is more likely to miss function words because they are less semantically salient.

These notions have important implications for any approach to error correction. One refers to Anderson’s assumption that the acquisition of L2-structures in classroom settings mostly begins at a conscious level with the creation of mental representations of the rules governing their usage. The obvious corollary is that corrective feedback should help learners create or restructure their declarative knowledge of the L2-rule system. Any corrective approach should therefore involve L2 students in grammar learning that entails cognitive restructuring and extensive practice. This means delivering a well-planned and elaborate intervention, not just a one-off lesson on a structure identified as a problem in a learner’s written piece.

Another important notion advanced by Anderson is that the automatization of a Production occurs only after it has been applied numerous times and with success (actual or perceived). This notion has three major implications for Error Correction.
 (1) Error Correction can play an important role in L2-acquisition since, in order to reject a wrong production, the learner needs lots of negative evidence that informs him/her of its incorrectness.

(2) Errors should be corrected consistently to avoid sending the learners confused messages about the correctness of a given structure.

(3) For Error Correction to lead to the de-fossilization of wrong Productions and the automatization of new, correct Productions, the former should occur in learner output as rarely as possible, whereas the latter should be produced as frequently as possible.

Consistent with these three notions, a teacher may want to invest a lot of effort in raising the learners’ awareness of their errors, be as consistent as possible in correcting them and, finally, encourage learners to practise the problematic structures as often as possible in and outside the context of the essays they will write.

Other implications refer to the concept of automatization. As discussed above, automatised cognitive structures are difficult to alter. It follows that Error Correction is more likely to be successful (in the absence of major developmental constraints) at the early stages of learning an L2-item, before ‘incorrect’ Productions have reached the ‘Strengthening’ stage of Acquisition. Thus, in order to prevent error fossilization or automatization, any corrective intervention should tackle the errors more prone to routinization (usually those referring to less semantically salient language items) as early as possible in the acquisition process.

Another set of implications relates to the causes and nature of learner errors. As discussed above, a number of errors result from L2-learners’ attempts to make up for their lack of correct L2-declarative knowledge through the deployment of the following problem-solving strategies:

(1) Communication Strategies: in the absence of linguistic knowledge of an L2-item a learner may deploy achievement strategies. As far as lexical items are concerned, s/he may deploy the following strategies leading to error: ‘Approximation’, ‘Coinage’ and ‘Foreignization’. In the case of grammar or orthography, learners will draw on existing declarative knowledge, overgeneralizing a rule (overgeneralization) or guessing;

(2) Use of resources: learners may use dictionaries or other sources of L2-knowledge (including people) incorrectly;

(3) L1- or L3-transfer;

(4) Avoidance.

Since these errors are extremely likely to occur in beginner and intermediate students’ writing, teachers should involve students in activities raising learner awareness of these issues and provide practice in ways of tackling them. For instance, as far as the above Communication Strategies are concerned, students should be trained to use dictionaries and other resources more frequently to prevent errors due to Approximation, Coinage and Foreignization. Secondly, as far as poor use of resources is concerned, learners must be made aware of the possible pitfalls of using dictionaries and textbooks and be trained to use these tools more effectively and efficiently. Thirdly, learners must be made aware of the issues related to excessive reliance on L1-/L3-Transfer and of negative Transfer (again, through effective learner training).

As discussed above, errors can also be caused by WSTM processing failure due to cognitive overload. Grammatical, lexical and orthographical errors will occur as a result of learners handling structures which have not been sufficiently automatized, in situations where the operating conditions in WSTM are too challenging for the attentional system to monitor all levels of production effectively. The implication for Error Correction is that learners should be made aware of which types of contexts are more likely to cause processing efficiency failure so that they may approach them more carefully in the future. Examples of such contexts may be sentences where the learner is attempting to express a difficult concept which requires new vocabulary and the use of tenses/moods s/he has not totally mastered; long sentences where items agreeing with each other in gender and/or number are located quite far apart from each other (not an uncommon occurrence in Italian); and situations in which the production of a sentence has to be interrupted several times because the learner needs to consult the dictionary. Remedial practice should provide learners with opportunities to operate in such contexts in order to train them to cope with the cognitive demands these place on processing efficiency in Real Operating Conditions.

Another important implication of my discussion for Error Correction refers to the notion that errors are not simply the result of a Translating failure, but also of an Editing failure. The failure to detect may be due to two factors. One relates to the goal-orientedness of the Production systems that regulate all levels of language processing: the brain is going to review the accuracy of every single aspect of the text only if it perceives that this is relevant to its goals in the production of the text. Thus, if the communication of content is the main goal the writer sets in an essay, the accuracy of function words is likely to become a secondary concern since they are not perceived as salient to the realisation of that goal. The other issue will be time. It is likely that lack of time will exacerbate this issue since it will force learners to prioritise certain aspects of their output in the Editing phase(s) over others. The implication for Error Correction is that it should aim at developing learner intentionality to be accurate at every level of the text. This may not be easy if accuracy does not feature prominently amongst the curriculum’s, teacher’s and/or student’s priorities.

Secondly, editing failure may be due to the fact that reading an essay to check and/or improve the quality of its content is different from proofreading aimed at checking non-semantic aspects of the output. As noted above, the former approach to text revision often results in the failure to detect errors with function words. The implication of this phenomenon for corrective approaches is that learner awareness of the importance of paying greater attention to function words when Editing essays should be raised. Moreover, as an editing strategy, learners should be advised to carry out the revision of their essay-drafts in two distinct phases: one aimed at checking the content and another focused exclusively on the accuracy of grammar, lexis and orthography.

Furthermore, editing failure may be caused by the same issue that caused learners to err in the first place, that is, processing efficiency. Thus, the contexts listed above (sentences that are long and/or complex and/or contain problematic structures, etc.) may pose problems for the learner’s ability to detect and/or self-correct errors. One way to tackle this issue in remedial teaching is to advise learners to be particularly careful in editing this kind of sentence and to approach it in a way that places less strain on their processing efficiency; for example, by concentrating first on the items that, based on the self-knowledge they will have developed as part of metacognitive training, they are more likely to get wrong in that kind of context (training in the Monitoring-Familiar-Errors strategy would help in this respect).

A final point refers to the implications of the phenomenon of Variability for the diagnostic phase of any error treatment. As discussed above, this phenomenon may confuse the teacher or the error analyst as to whether a learner knows a given structure or not, since s/he seems to get it right at times and wrong at others. The implication of this phenomenon for Error Correction is that teachers should investigate the causes of any occurrence of it in their learners’ writing in order to ascertain whether they relate to poor editing skills, partial knowledge of the target rule, etc. An appropriate action plan can then be decided on the basis of the causes identified.

Why do our L1-English learners of French/Spanish find it hard to acquire agreement rules? What can teachers do to facilitate the process?


In a previous post I already dealt with the dichotomy of declarative knowledge vs procedural knowledge and control. To put it in a nutshell, the former refers to knowing the set of ‘rules’ governing the use of a given target grammar or lexical structure (having its mental representation) whereas the latter refers to its effective application under real operating conditions (e.g. in spontaneous speech or writing under time constraints). As I have often reiterated in my posts, language learning ought to aim at bringing about high levels of target language control (as close as possible to automatization), whilst viewing declarative knowledge as the necessary starting point in an L2-learner’s journey towards acquisition.

In my experience, students find it relatively easy to grasp the rules underlying noun (or pronoun)-adjective agreement but rarely manage to acquire effective control, and even at A-level and University many mistakes continue to slip into performance (especially in oral output). Why is this? And what can be done about it?

In a previous post I discussed how agreement errors are often due to the fact that, in less expert L2 speakers/writers, whenever Working Memory experiences cognitive overload, the brain tends to focus only on the most semantically salient features of the output (the ones that convey most of the intended meaning) and neglects the features which do not contribute much to meaning. However, this is not the whole truth. The picture is much more complicated than that.

Let us look at the cognitive operations and knowledge involved in the process of applying adjectival agreement rules in the production of L2 French/Italian/Spanish/German output (i.e. speaking or writing) under real operating conditions (henceforth ROC). The L2 speaker/writer must:

  1. Retrieve the required French adjective;
  2. Remember to make it agree with the noun in terms of gender and number – which is not always straightforward as they may be relatively far from each other – separated by a copula and an intensifier, for instance;
  3. Know whether the noun is masculine or feminine;
  4. Know whether the adjective is irregular or regular;
  5. Apply the rule;
  6. (in speaking) pronounce it correctly / (in writing) spell it correctly.

These are quite a lot of cognitive operations to perform under ROC. To top it off, the cognitive load posed by these operations is exacerbated by the fact that there are often other permutations that the learner will have to execute in the same sentence (e.g. subject-to-verb agreement).
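
To make the six operations listed above more tangible, here is a minimal Python sketch of what a learner has to 'compute' each time s/he produces a French adjective. It is only an illustration under stated assumptions: the tiny lexicon, the names `NOUN_GENDER`, `IRREGULAR_ADJECTIVES` and `agree`, and the simplified regular rule (add -e for the feminine, add -s for the plural) are mine and cover only the few words shown, not French adjectives in general.

```python
# Hypothetical mini-lexicon (illustrative only): noun -> grammatical gender.
NOUN_GENDER = {"livre": "m", "garçon": "m", "maison": "f", "fille": "f"}

# Hypothetical table of irregular adjectives, keyed by (gender, number).
IRREGULAR_ADJECTIVES = {
    "travailleur": {
        ("m", "sg"): "travailleur",
        ("f", "sg"): "travailleuse",
        ("m", "pl"): "travailleurs",
        ("f", "pl"): "travailleuses",
    },
}


def agree(adjective, noun, number="sg"):
    """Return the form of `adjective` agreeing with `noun` (simplified rules)."""
    # Steps 1-2: the adjective has been retrieved and we remember to make it agree.
    # Step 3: know whether the noun is masculine or feminine.
    gender = NOUN_GENDER[noun]
    # Step 4: know whether the adjective is irregular or regular.
    if adjective in IRREGULAR_ADJECTIVES:
        return IRREGULAR_ADJECTIVES[adjective][(gender, number)]
    # Step 5: apply the (simplified) regular rule.
    form = adjective
    if gender == "f" and not form.endswith("e"):
        form += "e"
    if number == "pl" and not form.endswith(("s", "x")):
        form += "s"
    # Step 6: the returned string stands in for spelling the form correctly.
    return form


print(agree("petit", "maison"))        # petite
print(agree("petit", "livre", "pl"))   # petits
print(agree("travailleur", "fille"))   # travailleuse
```

Even in this toy version, every call needs two look-ups and up to two rule applications, which gives a feel for why the operation is costly when Working Memory is busy with everything else in the sentence.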

As pointed out in previous posts, if our learners keep making this kind of mistake day in day out whenever they engage in spontaneous or pseudo-spontaneous communication, the errors end up being fossilized (automatized) and incorporated permanently in their Interlanguage. This explains why a lot of L2 learners keep making those mistakes all the way up to university. So what can be done to fix this problem earlier on? Lots of old-fashioned drills? Or how about, as the ‘Krashenites’ amongst us would suggest, exposing the students to lots of comprehensible input and avoiding involving them in any language production until later stages in the instruction process? Neither of these solutions is, in my opinion, a bad idea. In fact, any sound approach to this issue would have to involve a bit of both.

To come up with an effective solution one should, in my opinion, first consider the three main psycholinguistic causes of the issue, which relate to the six operations listed above.

  1. The gender of nouns – the notion that words can be masculine and feminine (or neuter if one is learning German) is completely alien to a native speaker of English. Yet, how often and how firmly is this notion placed in the students’ focal awareness and ‘drummed in’ across all four language skills during the early stages of acquisition – and later on, too – by MFL instructors? Enhancing their focus on this notion does not simply involve teaching them the gender of the target nouns; it also involves changing their mindset, the way their cognition works. Hence, teachers must ensure, from the very early days of L2 learning, that students are constantly reminded of this concept, both explicitly (e.g. through work on noun morphology) and implicitly (e.g. through colour coding).
  2. Focus on word endings – the Anglo-Saxon brain is wired to focus on the beginning of words; hence, instinctively, an English native speaker would focus his/her attention on the opposite end to where s/he should be focusing it. This also entails another disadvantage: students may not learn much from any L2 written input they read since, by focusing mainly on the beginnings of words, they may not notice the endings in the texts at all. To get an Anglo-Saxon brain to invert the ‘instinctive’ focus of its attention is no easy task, especially with adult learners. This process will require extensive day-in-day-out scaffolding and practice.
  3. The saliency of agreement – this issue compounds the problem identified in point 1, in that an L1 English speaker is not only at odds with the notion of gender, but will also find the notion of agreement unfamiliar and redundant. Hence their brain will automatically place agreement low down in its list of attentional priorities. The challenge for teachers is to ensure that agreement is constantly in the learners’ focal awareness until it becomes ‘second nature’ – as it is for any French, Spanish, Italian or German native speaker. By making the application of agreement become ‘second nature’, I mean that whenever an adjective is retrieved by Working Memory, a ‘program’ (or Production, as it is called by Skill theorists) in the learner’s brain is automatically activated that operates something like this:


        If condition: if I use an adjective in a phrase/sentence…

       Then condition: …then I must make it agree with the noun it modifies


The speed at which the brain will activate the above Production (i.e. the extent of its Proceduralisation) will play a big role in determining how efficiently and effectively the agreement rule will be applied.
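
For readers who like to see things spelt out, the IF/THEN Production above can be sketched as a condition-action pair in Python. This is a toy illustration, not an implementation of any particular Skill-theory model: the class `Production`, the function `fire_matching` and the `agreement_checked` flag are hypothetical names I have chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Production:
    """A condition-action pair: IF the condition holds, THEN run the action."""
    name: str
    condition: Callable[[State], bool]
    action: Callable[[State], State]


def fire_matching(productions: List[Production], state: State) -> State:
    """Apply, in order, every production whose condition matches the state."""
    for production in productions:
        if production.condition(state):
            state = production.action(state)
    return state


# IF I use an adjective in a phrase/sentence
# THEN I must make it agree with the noun it modifies.
agreement = Production(
    name="adjective-agreement",
    condition=lambda s: "adjective" in s and not s.get("agreement_checked", False),
    action=lambda s: {**s, "agreement_checked": True},
)

with_adjective = fire_matching([agreement], {"adjective": "petit", "noun": "maison"})
without_adjective = fire_matching([agreement], {"noun": "maison"})

print(with_adjective["agreement_checked"])       # True
print("agreement_checked" in without_adjective)  # False
```

In a proficient speaker the condition is matched and the action fired without conscious effort; in a novice, the same rule has to be recalled and applied deliberately, which is exactly where the attentional cost lies.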

The implications for teaching are pretty obvious. MFL teachers must focus on developing processing efficiency under ROC (i.e. cognitive control), whilst addressing the three issues just discussed, by moving them into their learners’ focal awareness until, after day-in-day-out scaffolding (in the way of reminders) and practice (only a few minutes a day), they become automatic. I have already discussed fairly extensively how control can be enhanced in a previous post (“Control – the most neglected, yet most important factor in MFL grammar teaching”); as far as the other issues are concerned, here are a few possible teacher tactics. The reader should bear in mind that in my approach one should always start with receptive skills and move on to the productive ones at a later stage.

Focus on gender – here are some suggestions on how to focus students on gender:

  • Present masculine and feminine nouns always with the (indefinite/definite) article or any other determiner (e.g. mon/ma) and using different colour coding (this is common practice in many MFL classrooms);
  • When providing vocabulary lists, make sure that the masculine and feminine nouns are grouped separately (you may use colour-coding as background to enhance the contrast) ;
  • Model and practise extensively ‘rules of thumb’ which may work as ‘aide-memoires’ in the identification of the gender of nouns (e.g. nouns ending in ‘-ion’ are usually feminine). Engaging inductive activities can be staged in class whereby the students are given lists of words and, with the help of dictionaries, need to work out such rules of thumb by themselves.
  • After involving the students in a reading or listening-based activity, get them to identify (based on their determiners) the gender of a set list of nouns whose gender you want to focus on;
  • Involve the students, on a regular basis – I do one every single day – in quizzes based on gender identification, e.g. ‘odd one out’ tasks (given three nouns, spot the feminine one) and gap-fills (where the article must be inserted);
  • Give the students a short passage containing X number of mistakes with (the gender of) articles or other determiners and challenge them to find them under timed conditions with the help of the dictionary. This can be done as a way to practise the ‘rules of thumb’ modelled. Students usually enjoy this activity;
  • The classroom environment can be used as a way to remind the students of the issue and to display any rules of thumb modelled or worked out by the students.

Focus on adjectival endings

  • Colour-code feminine endings – as well as masculine endings when dealing with irregular adjectives (e.g. travailleur vs travailleuse);
  • When providing vocabulary lists include both the feminine and masculine endings of the target adjectives (it is tedious and time consuming but it pays off);
  • Listening activities involving focus on endings should be carried out regularly (e.g. minimal pairs, where the feminine and masculine forms of the same adjective are contrasted);
  • ‘Error hunt’ tasks where students need to identify a set number of agreement errors in a text – students usually enjoy this kind of activity;
  • Old-fashioned drills (e.g. multiple choice gap-fills; ending manipulation tasks; translations, etc.)
  • Give the students a checklist with the following guiding questions, for example, as a way to scaffold the focus on adjectival endings and agreement when they are producing written output, e.g.: (1) Which noun does the adjective refer to? (2) Is the noun feminine or masculine, singular or plural? How do you know? Have you double-checked, if in doubt? (3) Is the adjective regular or irregular? Have you double-checked, if in doubt?

Placing ‘agreement’ in the students’ focal awareness – As far as this issue is concerned, the above activities, if practised regularly, ‘should do the trick’. The above-mentioned activity involving work on ‘authentic’ essays written by previous cohorts of students containing numerous mistakes with adjectival agreement could be used as a reminder of how common this type of error is. Also, in setting targets as part of feedback on writing or speaking, one of the three or four targets identified should include adjectival agreement if it is a recurrent source of error in the output. A narrow-focus corrective approach, whereby the feedback and feed-forward on a student’s written output centres mainly on one or two issues only (see my article on this approach), could be implemented, too. In this approach, the student could focus solely on agreement issues for a few weeks so as to channel all of his/her attentional resources in the editing process on this aspect of grammatical accuracy.

In conclusion, the cognitive challenges posed by the acquisition and application of agreement rules are manifold. In this article I have endeavoured to outline a few. Most teachers do address such challenges, in my experience, but not consistently and extensively enough to prevent the resulting errors from recurring and becoming fossilized in their learners’ interlanguage. Some practitioners adopting strong CLT approaches may not feel that agreement errors are important enough to deserve the allocation of dedicated teaching time in each lesson. I can relate to this argument, as I agree that fluency should come before accuracy as a priority.

However, agreement mistakes, when they are recurrent, can be stigmatizing and irritating to native speaker readers or listeners and may be interpreted by them as signs of poor linguistic competence. Hence, I advocate that a few minutes’ work on the above issues should feature regularly at the early stages of L2 instruction until one feels that the learners have finally acquired a sufficiently high level of focal awareness of and control over this structure.

Why do learners – in the same essay – sometimes make an error in the use of a specific target-language structure and sometimes they don’t?

This morning, whilst correcting Spanish essays written by my Year 8 students (12-13 year olds if you are not familiar with the British school system), one mistake attracted my attention: a girl had written – on the same line, but within different sentences – ‘llevé una camiseta’ (I wore a T-shirt) and ‘llevo una camiseta’ (I wear a T-shirt) to mean, in both cases, ‘I wear a T-shirt’. When asked to self-correct, she noticed her mistake immediately and changed the ‘é’ in ‘llevé’ to ‘o’ (a sign that she had declarative knowledge of the first person of the Preterite and Present in Spanish).

But why would a student produce the present tense correctly in one sentence and not in another within the same essay? And how could my year 8 student get the verb right on the second instance when she had just got it wrong a few words before on the same line?

As I often do in my one-to-one corrective conferences, I asked the student in question why she thought she had made that mistake. She shrugged and said: “No idea, sir. Maybe I was tired.” A plausible explanation, considering that she had written a long essay and that the error occurred in the last paragraph. But how can tiredness cause such mistakes?

The answer to this question relates to what applied linguists call Interlanguage Variability, a widely documented phenomenon that causes frustration to a lot of teachers but which is actually a developmental feature of L2 acquisition. What is Interlanguage variability? What causes it?

To fully understand this phenomenon, the reader will benefit from getting acquainted with two important concepts: Interlanguage and Spread of Activation. I will take for granted that the reader is familiar with the concept of Working Memory, already explained in some detail in a previous post (see below: “Why do our learners get prepositions, articles and verb and adjectival agreement wrong?”). Please note that in what follows I will focus only on the causes of variability which, in my view, are more relevant to teachers operating in explicit foreign language instruction settings and that I will not venture into sociolinguistic theories such as Labov’s nor into nativist accounts of the phenomenon.


Interlanguage is the name given by Selinker (1972) to the internal representation the L2-learner builds of the target language system in his/her Long Term Memory. How does s/he build it? Mainly through hypothesis-testing, often using his/her dominant language as a reference framework in an attempt to decode and make sense of the target foreign language. Since L2-acquisition occurs through trial and error, Interlanguage is not an exact system, but rather an approximation of the target language system one is acquiring.

It should be noted that the cognitive and affective feedback the learner receives from target language speakers/knowers plays a pivotal role in the construction of the Interlanguage system, as it will ultimately determine which Interlanguage forms will be automatized and acquired. So, if a given Interlanguage form receives a lot of positive cognitive and affective feedback from the environment, it will eventually be internalized once the brain has repeatedly been given reassurance that it is accurate.

What often happens, though, during the early stages of L2 acquisition, is that learners do not always receive consistently negative/positive cognitive or affective feedback on their errors; and even when they do receive it, it doesn’t necessarily follow that they will internalize it. This happens for a number of reasons to do with the corrective approach used in the classroom (e.g. selective or no correction); its quality (e.g. ambiguous feedback); strong interference from their first language which makes the Interlanguage structure more resistant to correction; etc.

Moreover, when students engage in unmonitored L2-production (in or outside the classroom), as happens in the course of unstructured communicative activities, their output is likely to contain more errors.

Although errors made at this stage are not automatized immediately, they will not be discarded by the brain straight away either, especially when they are repeated several times – and errors due to L1 transfer are likely to occur quite frequently at the early stages of L2 learning. Hence, it is very common for the Interlanguage of an L2 learner to ‘contain’ more than one representation of a given target language structure: the correct one and one or more incorrect ones. Example: ‘I went’ in French is ‘je suis allé’; however, L2 students often say ‘j’ai allé’ at the early stages of French acquisition because they overgeneralize the dominant way of forming the Perfect Indicative in French. These two forms, ‘j’ai allé’ and ‘je suis allé’, often coexist in the Interlanguage of L2 learners of French and compete with one another for retrieval. I will come back to this example. Now, with this in mind, let us look at the concept of ‘spread of activation’.

Spread of activation and Variability from processing inefficiency

When we are attending to a task, like forming the Perfect tense of ‘Aller’ in French, as in the above example, Working Memory will have to retrieve from Long-term Memory the correct match for ‘I went’ in French. As Working Memory attends to this task, every single bit of information (lexis, grammar, imagery) related to the concept ‘I went’ stored in our Long-term Memory gets activated. ‘Electrical impulses’ run through semantic memory’s neural networks and the information or ‘nodes’ along the network get more or less activation based on the strength of their associations with the proposition we mean to ‘translate’ into French – the so-called ‘fan effect’. The items along the activated neural networks which will receive the greatest activation will be ‘j’ai allé’ and ‘je suis allé’ and, possibly, in my experience, ‘j’allé’. Which one of the three forms will be retrieved and used in the written/oral performance will depend on the ‘weight’ of each form (i.e. the strength of the memory trace) and on the context.

If the learner knows the correct French translation of ‘I went’ and Working Memory is not experiencing cognitive overload thereby having enough free space to monitor the output, even though s/he might have an initial moment of indecision due to the concurrent activation of the other two activated forms, s/he will be likely to apply the correct Interlanguage form. However, if his/her Working Memory is experiencing cognitive overload (processing inefficiency) due to a challenging task-in-hand, in the absence of close monitoring, any of the three forms may be retrieved (pretty much randomly) if their ‘weights’ are more or less equivalent. Hence the importance, at the early stages of learning, not to engage in overly unstructured oral or writing tasks.
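
The competition just described can be mimicked with a small Python simulation. Please treat it as a toy model built on my own assumptions: the ‘weights’ are invented numbers, the function name `retrieve` is hypothetical, and I simplify matters by assuming that monitoring always selects the correct form when Working Memory is not overloaded (in reality this is only likely, not certain).

```python
import random

# Invented 'weights' (strength of memory trace) for the three competing forms.
FORMS = {"je suis allé": 1.0, "j'ai allé": 1.0, "j'allé": 0.8}
CORRECT = "je suis allé"


def retrieve(forms, correct, overloaded, rng):
    """Pick the form that ends up in the learner's output.

    Not overloaded: monitoring is assumed to filter out the competitors.
    Overloaded: a form is drawn at random, in proportion to its weight.
    """
    if not overloaded and correct in forms:
        return correct
    candidates = list(forms)
    weights = [forms[form] for form in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so that repeated runs give the same draws
print(retrieve(FORMS, CORRECT, overloaded=False, rng=rng))  # je suis allé
sample = [retrieve(FORMS, CORRECT, overloaded=True, rng=rng) for _ in range(10)]
print(sample)  # a mix of correct and incorrect forms
```

Raising the weight of the correct form (which is what successful practice and consistent feedback are meant to achieve) makes it win more often under overload, without ever needing the competitors to be ‘deleted’.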

Variability as caused by formulaic language

Variability can also be caused by formulaic-language learning, that is to say, the acquisition of unanalyzed chunks or set phrases memorized without really knowing what each constituent of the phrase actually means or how the grammar rules which ‘hold’ them together actually work. Thus, if a learner uses ‘je suis allé au cinéma’ correctly in a written piece because s/he has learnt that sentence as an unanalyzed chunk, it will not mean that s/he has mastered the Perfect Tense of verbs requiring the auxiliary ‘Etre’. Hence, when, a few lines below, in the same essay, s/he translates ‘I went’ incorrectly in a different context (e.g. I went to the park) we should not be particularly surprised by the occurrence of variability.

Variability as caused by learner strategies

Variability can also be caused by the learner’s attempt at testing a specific hypothesis they have formulated about a given target language structure. Let us look at Muskaan’s hypothesis-testing strategy. Muskaan is a year 9 student of Spanish I teach who, today, told me that when she is not sure whether her assumptions about how to use a given structure are correct, she tries them all out deliberately in order to get feedback from me as to which one is correct. In the essay we were marking together today, for instance, she had used a conditional and an imperfect form to translate two very similar sentences which should both have required the imperfect. She wanted to test the hypothesis that, just as in English you would use the conditional in sentences like “when I was young I would play the guitar in my free time”, one can do the same in Spanish. In Muskaan’s case, the retrieval of the two concurrent Interlanguage forms is not automatic / subconscious, but is triggered by a deliberate risk-taking strategy.

Risk-taking is another frequent cause of variability in our learners’ output and a phenomenon that must not be discouraged as it has great potential for learning.

In conclusion, Variability is a completely normal phenomenon that should not cause us too much frustration, even when it seems to be caused by our teaching. The most important implication of this phenomenon for the MFL classroom is that we need to be cognitively empathetic with our learners when we find these kinds of mistakes and, while addressing the mistakes through appropriate remedial learning, we must not stigmatize the learners who make them. Secondly, teachers must give students enough time to monitor their output and encourage them to edit their written work carefully and in ways which lessen the cognitive load on their Working Memories (as the problem which triggered the error in production is likely to hinder its detection whilst proofreading). One such strategy is to have several runs through the same text, each one aimed at checking one particular type of item (e.g. first time, adjectival agreement; second time, verb agreement; third time, omissions of copulas; fourth time, ‘small function words’). Sentences that are particularly long and require complex processing should be dealt with by investing more time and focus.

Why do our learners often get prepositions, articles and verb endings wrong?


This article was prompted by a question (the one in the title) a colleague asked me recently on finding lots of mistakes in their students’ essays relating not solely to prepositions, but also to definite/indefinite articles, copulas (e.g. is and are) and other function words. The answer to that question is relatively simple, but in order to fully grasp its implications for classroom instruction, the reader first has to get acquainted with the concepts of Working Memory and Executive Control.

Working Memory

To put it as simply as possible – since one of my colleagues keeps complaining about the complex jargon in my blogposts – Working Memory is the space activated in our brain when we process information, what in the old days was called Short-Term Memory. Working Memory is a ‘buffer’ between the outside world and Long-Term Memory: it ‘holds’ any information we are trying to decode and retrieves from Long-Term Memory the information we need to carry out the task-in-hand (e.g. writing a sentence or understanding a text). So, for example, when we are writing a sentence in a foreign language, Working Memory is the ‘place’ along our neural network in which we actually construct that sentence (i.e. where we choose the words we need from our mental lexicon, arrange them together in a grammatically correct sequence, make sure the spelling is correct and edit the final product).

Working Memory has very limited channel capacity; in other words, it can only store a limited number of images, words and numbers at any one time and, unless we keep rehearsing it, the information will be lost after only a few seconds (a word stays in our brain for only 2-3 seconds unless we make a conscious effort to retain it through rehearsal). That is why, in order to keep a phone number in our head as we frantically try to key it into our phone, we need to repeat it in our heads (or rehearse it) a few times. Miller’s (1956) magic number 7+/- 2 indicates the number of digits we can hold in our Working Memory at any one time – a very small number indeed.

The challenges posed by foreign language writing

Writing in a foreign language is much harder than a lot of us may think, especially under communicative pressure. Let us have a closer look at how a sentence is produced in writing. First of all, it is important to point out that the starting point, in both the first and the second language, is a Proposition, in other words a representation in our brain (in Semantic Memory to be precise) of the concept or idea we are trying to convey. A Proposition, unlike what we may intuitively think, is not made up of words; thus, the brain has to translate it into words, whether we are operating in the first language or the second language.

According to Cognitive research (e.g. Cooper and Matsuhashi, 1983), the Translation process consists of four stages: Wording, Presenting, Storing and Transcribing. In the first stage, the brain transforms the Propositions into words (lexis). Although at this stage the pre-lexical decisions the writer made at earlier stages and the preceding discourse limit lexical choice, Wording the proposition is still a complex task: ‘the choice seems infinite, especially when we begin considering all the possibilities for modifying or qualifying the main verb and affected nouns’ (Cooper and Matsuhashi, 1983: 32).

Once s/he has selected the lexical items needed, the writer has to tackle the task of Presenting the proposition in standard written language. This involves making a series of decisions in the areas of genre and grammar. In the area of grammar, Agreement and Tense will be the main issues, especially in languages like French or German where a lot of permutations are required.

The proposition, as planned so far, is then temporarily stored in Working Memory while Transcribing takes place. Propositions longer than just a few words will have to be rehearsed and re-rehearsed in Working Memory so that parts of them are not lost before the transcription is complete.

The limitations of Working Memory create serious disadvantages for unskilled writers. Until they gain some confidence and fluency with spelling, their Working Memory may have to be loaded up with the letter sequences of single words or with only 2 or 3 words (Hotopf, 1980). This not only slows down the writing process, but also means that all other planning must be suspended during the transcription of short letter or word sequences.

The physical act of Transcribing the fully formed proposition begins once the graphic image of the output (what the sentence physically looks like) has been stored in Working Memory.

In L1-writing the decisions taken at any of the four stages outlined above are taken automatically, thereby occupying little or no space at all in Working Memory. However, in L2-writing, especially in beginner to intermediate writers, every decision will take up a lot of Working Memory space, making the process slow, cumbersome and difficult to monitor because it happens mostly consciously.

Hence, the adaptive response of the brain, especially in beginner writers, is to prioritize the most important features of each proposition (the principle of ‘Saliency first’ being at play here), i.e. the items that are most important in terms of conveying the intended meaning. The most semantically salient elements will include mainly Nouns, Verbs and Adjectives. Function words, which carry considerably less meaning, will be relatively neglected by Working Memory’s attentional systems as, let’s face it, even if the writer gets them wrong, they won’t impede comprehension massively (example: whether I say, in French, ‘je vais au cinéma’ or ‘je vais à la cinéma’, I will be readily understood by a reader/listener).

This phenomenon is exacerbated by linguistic distance between the first language and the target foreign language. For instance, grammatical gender (masculine and feminine) and verb endings are not likely to be perceived as salient by an English native speaker (as they are absent or far less prominent in their language), which means that they are likely to be less monitored.

The less proficient the foreign language writer is and the less time s/he has to monitor his/her output, the more likely s/he will be to make mistakes with function words. Hence, errors are bound to be even more frequent in oral performance, where the self-monitoring capacity of Working Memory is drastically reduced compared to the written medium.

As the learner becomes more proficient, his/her ability to juggle the demands posed to his/her Working Memory by the processes outlined above will increase. This is due to the fact that, with a lot of writing practice in the target foreign language, a lot of sub-processes become automatized and require only peripheral attention, freeing up Working Memory space. This enhanced processing efficiency will also allow for more accuracy in the production of less salient features, unless Error Fossilization throws a spanner in the works.

The danger of fossilization

When errors go unmonitored a bit too often, they become automatized and are very difficult to ‘unlearn’ or eradicate. Mukattash (1986) found that, despite many remedial interventions, such errors cannot be eliminated at all from L2 learners’ Interlanguage. This phenomenon, called Fossilization by Selinker (1972), is obvious in a lot of foreign language speakers, especially when it comes to pronunciation; that is why, according to Selinker, only 5% of foreign language speakers can be said to sound 100% native-like. The second language of the rest will always contain some fossilized item. A very good friend of mine, for instance, speaks perfect English, with accurate pronunciation and grammar and an impressive lexical repertoire wider than that of an average native speaker; however, he cannot help pronouncing the ‘p’ in the word ‘psychology’ (influence of his first language: Italian), despite many corrections. Such is the power of Fossilization.

Function words and any other less salient L2 features (e.g. gender, plural and verb endings and minor pronunciation inaccuracies) are particularly amenable to fossilization as they are more likely to go unmonitored and uncorrected. Therefore, the danger is that when learners do not get enough negative (cognitive) feedback at the early stages of L2 acquisition, they are likely to fossilize mistakes with the above L2 structures and to keep making these mistakes all the way to A-Level and university – as I have often witnessed in my university lecturer days.

Communicative language teaching, especially in its strong version, by prioritizing fluency over accuracy, often leads to fossilization (and pidginization), especially when students are asked to perform in unstructured oral practice at a level of proficiency they are not developmentally ready for and under communicative pressure (Skehan, 1994).

Implications for MFL teaching


The implications for the MFL classroom are manifold but hinge mainly on the teacher’s pedagogy and on the course end-goals. If we are teaching GCSE level students and we are happy for them to make a few minor mistakes as long as they can convey their intended meaning effectively, we should not worry too much about error and we can exercise a relatively high degree of tolerance. However, if we are dealing with individuals who want to make language their career and one day become interpreters, translators or teachers, then the attitude has to be less lax and mistakes with articles, prepositions, copulas and gender agreement WILL matter.

If we do want to address this issue radically, we need to keep students focused on the importance of accuracy from the very early stages of language acquisition whilst keeping the main focus of our teaching on the development of fluency. This is not easy, even for experienced teachers. Editing instruction – through games, quizzes and other fun activities – should become part of almost every lesson (through snappy starters or plenaries, for example) to remind students of the importance of accuracy and to raise their awareness of which mistakes are more likely to occur at their current level of proficiency.

More importantly, the written tasks we involve our students in must be pitched at the correct level, especially in terms of the cognitive challenges they pose to an inexperienced writer. If they are not, we are likely to engender more error than we and the students can effectively deal with in the remedial phase. Fluency, as I said above, has priority, it is true; however, fluent output that is rife with errors can be stigmatizing and irritating for the reader/interlocutor, and we need to be aware of that in a global era in which our learners are more likely than ever before to use the target language in the workplace.

Finally, Error correction – or rather Error remediation – can also play an important role if it engages the learners in a sustained, long-term self-monitoring process, initially moderated by the teacher, which focuses them on their most frequent mistakes.