12 strategies for enhancing intermediate language students’ writing

In this post I suggest strategies for enhancing modern-language students’ chances of succeeding at writing, based on cognitive models of L2 written production and skill acquisition.

I will start by outlining the processes which underlie writing, as laid out in Hayes and Flower’s (1980) model of essay writing and Cooper and Matsuhashi’s (1983) account of the Translating processes. Both models provide extremely useful insights into the issues that undermine written performance, thereby cueing us to the instructional approaches that may help us ‘fix’ or, ideally, prevent those issues.

It should be noted that whereas in a previous post I discussed the implications of these models for advanced-level students, here I will concern myself with students preparing for the England and Wales GCSE examination, who typically fall into the Lower to Upper Intermediate proficiency bands. Please note that Steve Smith and I have dealt more thoroughly with this topic in our book, ‘The Language Teacher Toolkit’ (here).

If you are not interested in the theory behind my approach, go straight to section 4, where I lay out the twelve suggested minimal-preparation/high-impact instructional strategies.

2. Understanding higher order writing processes: Planning, Organising, Goal-setting and Editing

The Hayes and Flower model concerns itself mainly with the higher order writing processes. It posits three main components:

(1) the Task-environment, which includes the Writing Assignment (the topic, the target audience, and motivational factors) and the text one is producing;

(2) the Writer’s Long-term memory (LTM), which provides factual knowledge and skill/genre specific procedures;

(3) the Writing Process, which consists of the three sub-processes of Planning, Translating and Reviewing and occurs in Working Memory.

Figure 1: The Hayes and Flower Model of writing


The Planning process sets goals (e.g. ‘I need to write about the pros of using the car’) based on information drawn from the Task-environment (e.g. the essay title: ‘Discuss the pros and cons of using the car’) and Long-Term Memory (our background knowledge and our previous experience with similar essay titles).

Once the goals have been established, a writing plan is developed to achieve those goals. More specifically, the Generating sub-process retrieves information from LTM through an associative chain in which each item of information or concept retrieved functions as a cue to retrieve the next item, and so forth (e.g. cars use fossil fuels > fossil fuels cause CO2 emissions > CO2 emissions pollute, etc.).

The Organising sub-process selects the most relevant items of information retrieved and organizes them into a coherent writing plan.

Finally, the Goal-setting sub-process sets rules (e.g. ‘keep it simple’, ‘avoid long lists of nouns’, ‘add in more tenses and opinions’, etc.) that will be applied in the Editing process.

The second macro-process, Translating, transforms the information retrieved from LTM into language. This is necessary since concepts are stored in LTM in the form of Propositions (‘concepts’/‘imagery’), not words – a language that some refer to as ‘Brainese’. Flower and Hayes (1980) provide the following examples of what propositions involve: [(Concept A) (Relation B) (Concept C)] or [(Concept D) (Attribute E)], etc.

Finally, the Reviewing processes of Reading and Editing have the function of enhancing the quality of the written output. The Editing process checks that grammar rules and discourse conventions are not being flouted, looks for semantic inaccuracies and evaluates the text in the light of the writing goals. Two important features of the Editing process are: (1) it is triggered automatically whenever a fault is detected; (2) it may interrupt any other ongoing process.

Editing is regulated by an attentional system called the Monitor. This system seems to be more active in certain individuals than others, a phenomenon that has led Stephen Krashen to categorize language learners into ‘High Monitors’ and ‘Low Monitors’. In my PhD study I identified a number of factors that appeared to render some of my subjects more sensitised to formal accuracy than others; the most decisive of them were personality, motivation and previous history as learners. In particular, I found that students who had been taught in learning settings where attention to form had been emphasized were generally higher and more effective monitors.

Hayes and Flower’s model is useful in providing teachers with a framework for understanding the many demands that essay writing places on students. In particular, it helps teachers understand how the recursiveness of the writing process (i.e. the fact that the writer goes back and forth reviewing and editing) may cause those demands to interfere with each other, causing cognitive overload and error. For instance, it suggests that if a student’s attention is constantly absorbed by the lower-order demands of written production, such as dealing with spelling or grammatical concerns, her higher order processes will suffer, and vice versa, with potentially catastrophic consequences for both levels of the text. This is one major reason why so many sixth formers underperform – unsurprisingly so, as how can one plan and write an essay effectively when one is still worrying about basic grammar issues such as agreement and word order, vocabulary choice and even spelling?

Hence, the first implication for teaching is that the lower order processes must be routinized as much as possible through a specific instructional focus on core micro-skills (e.g. spelling, agreement, word order, etc.).

3. Understanding lower order writing processes: Translating ‘brainese’ into the target language

To truly understand the problems that students experience in translating the propositional content (the ideas) into the target language we need to look elsewhere, e.g. at Cooper and Matsuhashi’s (1983) account below, which has more important implications for teaching. Cooper and Matsuhashi (1983) posit four stages, which correspond to Hayes and Flower’s (1980) Translating: Wording, Presenting, Storing and Transcribing.

In the first stage, the brain transforms the propositional content into vocabulary. Although at this stage the pre-lexical decisions the writer made at earlier stages and the preceding discourse limit vocabulary choice, Wording the proposition is still a complex task: ‘the choice seems infinite, especially when we begin considering all the possibilities for modifying or qualifying the main verb and the agentive and affected nouns’ (Cooper and Matsuhashi, 1983: 32).

Once she has selected the lexical items, the writer has to tackle the task of Presenting the proposition in standard written language. This involves making a series of decisions in the areas of genre, grammar and syntax. In the area of grammar, Agreement, Word-order and Tense will be the main issues for L1-English learners of target languages like French, German, Italian or Spanish.

The proposition, as planned so far, is then temporarily stored in Working Memory while Transcribing takes place. Propositions longer than just a few words will have to be rehearsed and re-rehearsed in Working Memory so that parts of them are not lost before the transcription is complete – which will require a substantial amount of attentional capacity.

The limitations of Working Memory create serious disadvantages for unpractised second language writers. Until they gain some confidence and fluency with spelling, their Working Memory may have to be loaded up with letter sequences of single words or with only two or three words (Hotopf, 1980). This not only slows down the writing process, but also means that all other planning must be suspended during the transcription of short letter or word sequences. It will also make it hard for the brain to deal with words that are grammatically related but are located quite far from each other in a sentence, e.g. ‘Mi madre es inglesa pero parece más italiana que mi padre que es alto y rubio’. With sentences like this one, Working Memory will struggle, as it will be able to deal with chunks of only two or three words at a time, which entails that by the time it gets to process the end-part of the sentence it might miss the fact that ‘italiana’ has to agree with ‘madre’.

The physical act of transcribing the fully formed proposition begins once the graphic image of the output (i.e. its spelling) has been stored in Working Memory. In L1 writing, Transcribing occupies subsidiary awareness, enabling the writer to use focal awareness for other plans and decisions. However, this is not the case for the unpractised L2 writer: she has a limited amount of attention to allocate, and whatever is taken up by the lower level demands of written language must be taken from something else.

This means that linguistic features perceived by the brain as less salient, such as function words, word-endings and copulas (e.g. ‘is’) are likely to be the first victims of Working-Memory loss as caused by divided attention as they are not essential for communication. This is a widely documented phenomenon amongst Novice-to-Intermediate L2 writers.

In sum, Cooper and Matsuhashi (1983) posit two main stages in the conversion of the preverbal message into a speech plan: (1) the selection of the right lexical units and (2) the application of grammatical rules. The unit of language is then deposited in STM awaiting translation into grapho-motor execution. This temporary storage raises the possibility that lower level demands affect production as follows:

(1) causing the writer to omit material during grapho-motor execution (i.e. the physical act of writing) – the most typical mistake in this phase is when students omit copulas (e.g. ‘is’);

(2) leading to forgetting higher-level decisions already made. Interference resulting in WSTM loss can also be caused by lack of monitoring of the written output due to devoting conscious attention entirely to planning ahead, while leaving the process of transcription to run ‘on automatic’.

4. Challenges and implications for the teaching of writing

As the above model clearly suggests, there is much more to writing than meets the eye and it is only through exploring the students’ cognitive processes through Think-aloud procedures that Hayes and Flower could produce the above-discussed model.

The most glaring challenge refers to the many demands that the L2-student writer’s Working Memory must juggle simultaneously as she produces written output, as most performance deficits will stem from inadequate ‘juggling’. Working Memory having very limited cognitive ‘space’ to process all these demands at the same time, teachers need to ensure that as many of these processes as possible are gradually routinised throughout the course in preparation for the exam. The more routinised the processes are, the less cognitive space they will require as they will be carried out subconsciously with negligible cognitive demands on the writer’s attention system.

To tackle the above challenge teachers should:

– address each of the above processes and the skills/strategies needed to master them;

– (due to the time constraints imposed by the course) prioritise the processes and skills needing more attention (Planning based on a correct understanding of the task brief? Idea generation? Lexical search? Syntax? Spelling?) in their long-, medium- and short-term planning, based on the identified needs of the target students;

– ensure they create opportunities for practising those skills/strategies over and over again throughout the two years of the GCSE course.

These are, for example, the priorities I have identified with my current group of year 11 students and have been addressing in my teaching, as listed in ascending order of importance:

(9) Understanding the exam brief (accurate interpretation of exam briefs)

(8) Performing noun-adjective agreement accurately

(7) Conjugating irregular verbs accurately across all target tenses

(6) Selecting the correct tense

(5) Retrieving the target vocabulary rapidly and accurately in production (an important aspect of fluency)

(4) Using connectives and other discourse markers effectively to organize discourse coherently and cohesively

(3) Monitoring production of small function-words (the least visible item to the editing eye of the L2 student)

(2) Building complex and accurate sentences under time constraints (i.e. under Real Operating Conditions such as an exam)

(1) Self-monitoring (ability to edit based on knowledge of their most common production issues)

Prioritising (a must!) and keeping the above in my focal awareness in my planning has improved my teaching substantially, as some of the above skill-sets, especially (1) to (5), are the most important with the group of students I am currently teaching. Once I had identified my top five priorities, I structured my teaching accordingly and addressed them systematically in every unit of work I planned in the last year and a bit.

Here are twelve minimal-prep/high-impact instructional strategies that worked for me:

4.1 Practise understanding task briefs – First and foremost, and obviously, students must be given plenty of practice in understanding writing task briefs. At the early stages of preparation for the exams, this is better done by divorcing this activity from the actual essay writing. Give students a series of briefs and help them tackle the task by (a) modelling top-down inference strategies (e.g. using key words); (b) teaching them the core vocabulary typically occurring in briefs alongside the topic-specific lexis that you will teach anyway as part of the course. When you run out of past-exam task briefs in the target language (e.g. French), translate the ones found in exams in other languages (e.g. German, Spanish, Italian).

4.2 Increase task familiarity – Task familiarity has a number of major positive effects on students’ performance, some pertaining to the affective and some to the cognitive sphere. Firstly, from a cognitive point of view, familiarity with the task will speed up the planning and organizing sub-processes: by practising the same task-type over and over, the brain will identify and ultimately acquire schemata (cognitive patterns) which it will be able to apply to those tasks in the future. Secondly, and obviously, increased familiarity results in higher expectancy of success and lower anxiety levels.

Many teachers increase task familiarity by getting their students to practise with as many target writing tasks as possible; e.g. if the students have to write 140-word essays in response to a four-bullet-point exam brief, they will get their students to write many essays of this sort. However, this approach can be greatly enhanced by giving students as many opportunities as possible to process model essays receptively, across as many topics as possible. Producing model essays of this sort can be quite time-consuming but pays enormous dividends – please note: by model essays I mean high quality essays containing linguistic material with high surrender value.

Obviously, simply asking the students to read model essays will not be enough. The students must be encouraged to notice specific desirable features, such as the use of connectives, the deployment of specific idioms or set phrases, etc. This can be done (a) through metalinguistic questions on the texts (e.g. why is the imperfect used in line 10?; how many adjectives can you spot in paragraph two?; etc.); (b) by asking the students to translate specific items in the text you want to draw their attention to. Using model essays with their translation alongside (i.e. parallel texts) can be particularly useful when you want to engage students in metalinguistic analysis of the text.

Another dimension of task-familiarity pertains to knowing the evaluative criteria set by the Examination Board. The students ought to know in as much detail as possible how the examiner is going to mark their essay and what their baseline at the beginning of the course is as benchmarked against those criteria. This will start their process of self-monitoring vis-à-vis the identified deficits. The model essays alluded to above will be used to set aspirational goals based on the desirable features they will contain, whilst less ‘good’ essays will be shown to enhance student awareness of common pitfalls of ineffective writing. To further enhance such awareness a range of A*, A, B, C, D and E grade essays may be shown to the students who, working in groups, will rate them using the Examination Board criteria. Groups compare each other’s grading and discussion ensues – an AFL classic.

4.3 Increase the students’ planning efficiency – At GCSE level, organization is not a major issue; however, planning the essay before writing by brainstorming what is known and structuring it accordingly will lighten the cognitive load, especially with less proficient learners.

In the course of an interventionist study carried out with Professor Macaro in six Oxfordshire comprehensive schools with year 10 students of French, we used the following planning strategy: once they had understood the brief, the students were asked to brainstorm as many words, phrases and sentences associated with the various points in the brief as they could recall off the top of their heads, whatever they might be. They would then use the brainstormed items to generate ideas and plan the composition.

This strategy correlated with higher levels of success at post-test, especially with weaker students. The rationale for its success: the brainstorming starts a series of associations which stimulates idea generation; the words and phrases retrieved whilst brainstorming reassure the students that they can write something about each sub-topic; and in many cases all the students have to do is connect the various bits on their mind map into a meaningful and cohesive whole.

4.4 Practise smart coverage – Of course teaching masses of vocabulary is a must; however, it is impossible to cover the whole of the syllabus in the little time allocated by the course. By practising smart coverage you will optimize vocabulary teaching. To achieve this you may:

(a) teach high surrender value items and structures. This means: (1) identify as many words, phrases and sentences as possible that can be used whatever the essay title might be (e.g. connectives; high frequency adjectives such as those expressing likes, dislikes and emotions; high frequency verbs such as modal verbs; key tenses; key phrases/idioms such as ‘there are’, ‘to top it off’, ‘what I like/dislike’, ‘the best thing is’; and core structures such as ‘after doing something’, ‘I think/don’t think it is’, etc.); (2) recycle them to death in every single unit of work you are teaching; (3) create opportunities for practising those items and structures across all four skills, day in, day out.

(b) recycle as much as possible of the core vocabulary listed by the examination board in their specification in your schemes of work. I have a box in every single unit/sub-unit I will teach this year, called ‘Recycling opportunities’, in which I list the words and structures processed in previous units which I will recycle in the present unit and explain how it can be done.

(c) focus on verbs much more than you currently do. Whilst nouns are the most important word class in terms of survival communicative skills, verbs significantly increase an L2 speaker/writer’s autonomous communicative competence and expressive power in academic settings. Students are usually equipped with a very limited range of target-language verbs, partly because teachers are afraid their students will not be able to conjugate them. Teach your students the infinitive of the core verbs as you would teach any other lexical item; once your students know how to conjugate the modal verbs (want, can and must) or any other verbs requiring the infinitive (e.g. ‘il faut’ in French or ‘hay que’ in Spanish) across the main tenses, they will be able to use those infinitives across a wide range of contexts.

4.5 Masses of ‘smart’ receptive processing – As I reiterate in every post of mine, bombarding students with lots of listening and reading is fundamental to enhancing their acquisition. However, when using receptive processing to enhance essay writing, whether we use model essays as suggested above or other written or aural texts, we need to ensure that we direct the students’ attention to the items we want them to incorporate in their writing. In section 4.2 I discussed two strategies we can use to achieve that.

But what is even more important is that we give them opportunities to use whatever vocabulary or structures we expose them to in the receptive input in student-generated productive output after they have processed it. So, if over two or three receptive activities we have encouraged them to notice the use of the pattern ‘After doing something’, we will ensure that they practise it productively afterwards in two or three subsequent tasks. Way too often, this does not happen; all that remains is a few examples scribbled on the whiteboard.

4.6 Address the core micro-skills – According to much research into the acquisition of French and Spanish as L2s, noun-adjective, subject-verb, article-noun and any other form of agreement are acquired (i.e. highly routinized) quite late in the language learning process. Hence, they require much more practice than they currently receive in ML classrooms. In view of what we said about the limited storage capacity of Working Memory, these skills ought to be practised day in, day out. The minimal-preparation way of dealing with this is devoting five minutes of your lesson to (a) receptive processing activities, such as grammaticality judgement quizzes (three or more options are given, e.g. ‘un chat blanc / une chat blanc / une chat blanche’, and students choose the right answer, explaining why) and partial dictations (students are given ‘Ma souris est ________’ and the teacher utters ‘ma souris est blanche’); (b) productive ones, such as gap-fills (e.g. ‘une souris _____ (blanc)’) and mini-whiteboard translations (the teacher utters a sentence in English and students translate it into the target language).

Sadly, because agreement and conjugations are not perceived as salient by the Anglo-Saxon brain, they are usually the first things to be overlooked by novice L2 student writers experiencing cognitive overload.

Ideally, practice in these micro-skills should occur in Primary or at least in the early years of Secondary. Students who have acquired such skills will be more fluent as they will have more cognitive space available in Working Memory to devote to higher order decisions regarding syntax, lexical selection and even planning and content organization.

4.7 Focus on small function words – Small function words such as prepositions, conjunctions and articles are not semantically salient, hence they are another frequent victim of cognitive overload. Sadly, they are also one of the items in the syllabus that teachers and textbooks neglect. This is a problem if we consider that prepositions are usually the one set of items that even advanced learners struggle with: they often escape the rules of thumb we teach them, most of them are polysemous, and their usage frequently differs across languages and is not always based on common sense.

Solution: (a) when you do your mini-whiteboard translation practice, make sure your sentences include prepositions; (b) when you model grammar/syntax through sentence builders, have a column which includes prepositions; (c) do partial dictations where the missing items are function words; (d) use typographic devices to highlight the occurrence and interesting use of a small function word; (e) raise students’ awareness of the fact that they are likely to overlook these words in the editing phase of their essay writing, and of the importance of having a run-through which focuses solely on them before handing in their piece; (f) last but not least: teach them how to use prepositions in context.
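For teachers who like to automate their resource-making, the partial dictations in point (c) can be prepared with a few lines of Python. What follows is only a minimal sketch: the function name `gap_function_words` and the very short list of Spanish function words are my own illustrative assumptions, not part of any published tool, and you would want to extend the list to suit your language and class.

```python
import re

# Hypothetical, deliberately tiny list of Spanish function words (an assumption
# made for illustration only); extend or replace it for your own classes.
FUNCTION_WORDS = {"a", "de", "en", "con", "por", "para", "y", "pero", "que"}


def gap_function_words(sentence, function_words=FUNCTION_WORDS, gap="_____"):
    """Blank out every function word in a sentence.

    Returns the gapped sentence (for the students' handout) and the list of
    removed words in order of appearance (the teacher's answer key).
    """
    answers = []

    def blank(match):
        word = match.group(0)
        if word.lower() in function_words:
            answers.append(word)  # keep the original word for the answer key
            return gap
        return word  # content words are left untouched

    # \w+ matches whole words, including accented letters such as 'ú'
    gapped = re.sub(r"\w+", blank, sentence)
    return gapped, answers


gapped, answers = gap_function_words(
    "Voy al cine con mi hermano y vuelvo a casa en autobús."
)
print(gapped)
print(answers)
```

The students receive the gapped version while you read out the full sentence; the answer key lets them self-check afterwards. Contracted forms such as ‘al’ are left in place here simply because they are not in the sample list.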

4.8 Work on students’ syntax – The students must be taught explicitly how to construct sentences; they will not just acquire it subconsciously through reading – as some ML teachers seem to believe – or by redrafting an essay incorporating their instructor’s corrections.

The best and easiest way to model sentence building is through carefully designed sentence builders like the one in the picture below. I teach grammar and syntax through them all the time thereby providing aural and visual input and presenting the grammar/syntax in context. This synergy between reading and listening truly is key to the success of this technique.

Figure 2 – Sentence builder


The following cognitive-comparison activity also models syntax through listening. Example: the teacher provides a syntactically incorrect L2 sentence (resulting from literal L1-to-L2 translation); she then utters the correct L2 version; students rewrite the wrong version of the sentence correctly.

Another useful task is ‘sentence puzzles’ – or any other task that requires students to rearrange sentences whose elements have been misplaced back into the correct syntactic order (intended meaning is provided). Such tasks enhance learner awareness of the importance of correct word order and can be used by instructors for inductive teaching by modelling correct syntax through the feedback given at the end of the task. A great follow-up to sentence builders.

Transformational writing techniques like the one described and discussed here are immensely useful, too. My favourite one is sentence recombining. Example: students are given the sentences ‘My brother is annoying’ / ‘He talks too much’ / ‘He can be helpful at times’ and the following three words to merge the three sentences into one: although, because, also. Possible solution: ‘Although my brother is annoying because he talks too much, he can also be helpful at times.’

This is part of an exercise I gave yesterday to my year 10 Spanish class: merge the following sentences into one: ‘Mi barrio es ruidoso’ / ‘Mi barrio está sucio’ / ‘La gente es simpática’. Solution given by one student: ‘Mi barrio es ruidoso y está sucio, sin embargo la gente es muy simpática’.

Sentence recombining activities are useful because they involve reading comprehension and modelling, train students in the manipulation of syntax and focus them on form and word order, whilst involving a degree of creativity, i.e. deeper processing of information (which enhances retention).

4.9 Forge fast and accurate spellers – Spelling can require focal awareness in novice writers, thereby undermining the accuracy of higher levels of their output. Hence, the importance of forging fast and accurate ‘spellers’ cannot be overemphasized if we are addressing deficits in our students’ writing performance. When the accuracy issues pertain to the spelling of agreement and conjugation endings, the above problems are compounded by grammatical flaws in the output.

Minimal preparation activities include: old school dictations (keep them frequent but short), both full and partial; anagrams and gapped words from which the problematic letters/letter combinations have been omitted.

Daily low-stake spelling challenges (I avoid calling them tests) can be useful in enhancing the students’ focus on this level of writing if you are dealing with particularly sloppy or careless individuals.

4.10 Improve your students’ fluency – Fluency refers to the speed at which students can produce accurate writing. Get students used to writing under timed conditions in response to an image or a brief similar to the bullet points found in the exam tasks. You may ask them to use a set number of tenses, connectives or opinions as you see fit. Tell them that the more words they write and the wider their variety, the better. When you stage this kind of activity, grammatical accuracy should not be a concern, as the main focus is speed and variety of vocabulary, exactly as it would be in speaking activities. Do correct their main errors if you believe they will benefit from it, but do not penalize them for making them.

4.11 Do more interactive speaking in class – At GCSE level the linguistic content and register of the students’ oral output will not be different from the written one, unless they are writing a formal letter. Hence, if their spelling is good enough, any work on their speaking fluency will result in gains in written fluency.

4.12 Promote effective editing and self-monitoring – Errors are often context-dependent. It is not by telling students they have made a mistake and asking them to self-correct, or by providing a correction with a rule explanation, that we will stop them making that mistake again. If the mistakes relate to item ‘X’ (that they know the rule for) in context ‘Y’, they will only eliminate that mistake by practising the use of ‘X’ in context ‘Y’ time and again. What we can do, however, as an alternative to training them not to make recurrent mistakes again, is to train them to self-monitor effectively by raising their awareness of the mistakes they make most often and by asking them to create a personalised checklist of mistakes to look out for in each and every essay. The students should be taught to review the essay (not ‘read’ it) by going through it several times, each time looking for a different issue so as to avoid cognitive overload. Example: noun-adjective agreement first, small function words second, omissions of verbs third, word order fourth, verb endings fifth, etc.

To sensitize students to the importance of editing and bring the issue of accuracy into their focal awareness, teachers can also stage frequent short and snappy error hunts in which students need to find errors in model sentences provided by the teacher. A fun way of doing this is to write correct and incorrect model sentences on post-its that you number and scatter around your classroom or the MFL corridor. Students are given a set amount of time (I usually give them 10-15 minutes) to spot the post-its with the mistakes and correct them. The student with the most successful corrections wins.

Much more could be said about how to use feedback on error but I will abstain as I have already written way too much on this issue. The reader is invited to read previous blogs on error correction (here) if they want to know more – or our book, of course.

5. Concluding remarks

As I always reiterate on this blog, an understanding of the cognitive processes which underlie students’ performance in given contexts and skills is crucial if we want to address their deficits. In this post I have outlined those processes in their broad lines and suggested ways in which instruction can address them through a systematic and repeated long-term effort.

The main message is that we have to help our student-writers to automatize as many of the processes involved in the writing task as possible, especially those unfolding in the Translating phase so as to enable them to operate with  as light a cognitive load as possible. This will reduce divided attention freeing up more space in their Working Memory to deal with the aspects of production they find more challenging.

Ideally, most of the strategies I discussed above would be carried out from the very early stages of instruction. However, this rarely happens in English-based ML instruction, the result being very sloppy KS4 (14-16) and even KS5 (16+) students who have not been effectively trained to write fluently and accurately and who exhibit little cognitive control and flexibility.

A final note: I do not believe that writing should take much of our classroom learning time; most of it should be flipped, unless we are modelling important writing strategies or we are carrying out one-on-one feedback/feedforward activities with our students. In my approach, however, as outlined in this post and, in greater detail, here, much modelling of writing and syntax can and should be done through listening-as-modelling activities as well as reading and interactive speaking. The key, however, for enabling receptive processing to effectively feed into student writing is to explicitly and vigorously foster Noticing.

L.I.F.T. – an effective writing-proficiency and metacognition enhancer


Many years ago, as an L2 college student writer of English and French I often had doubts about the accuracy of what I wrote in my essays, especially when I was trying out a new and complex grammar structure or an idiom I had heard someone use.  However, the busy and under-paid native-speaker university language assistants charged with correcting my essays rarely gave me useful feedback on those adventurous linguistic exploits of mine. They simply underlined or crossed out my mistakes and provided their correct alternative. As an inquisitive and demanding language learner I was not satisfied. I wanted more.

So, I decided to try out a different approach: in every essay of mine I asked my teachers questions about things I was not sure about, in annotations I would write in the margin of my essays (e.g. should I use ‘with’ or ‘by’ here?; should this be ‘whose’ or ‘which’?), eagerly awaiting their replies – which I regularly got. Knowing my teachers were busy, I would focus only on five or six things I had particular issues with, and only after looking through my books and dictionaries in search of clues as to whether I was right or wrong.

This process ‘forced’ my teachers to give me more feedback than I had been getting; consequently, not only did I learn more, but I also became more ‘adventurous’ and ‘daring’ in my writing. This strategy really helped me a lot.

Later on in life, when I became a foreign language teacher, I recycled this strategy with my students. I call it L.I.F.T. (Learner Initiated Feedback Technique). Although I had been using it for a long time already, a few years ago I decided to put its effectiveness to the test by conducting a little experiment. I used L.I.F.T. with one of two groups of able 14-year-olds I was teaching, whilst I used traditional error correction with the other. The students in the L.I.F.T. group were asked to underline anything they were not sure about and write a question in the margin explaining briefly what their problem or doubt was; one condition I set was that they had to research the issues they were asking me about using web-based resources (e.g. the www.wordreference.com forums). I used exactly the same teaching materials and covered the same topics with both groups.

When I compared how the two groups evolved over the time of the ‘experiment’, what I found was interesting: they both made more or less the same type and number of mistakes; however, the group I had tried L.I.F.T. with had generally written longer and more complex sentences using more ambitious grammar structures and idioms. Moreover, I found that the questions the students were asking in their annotations had become increasingly complex, a sign that they were becoming more inquisitive, ambitious and risk-taking. Why?

One reason refers, I think, to the fact that learners, especially the less confident ones, usually tend to avoid structures or idioms they are not sure about. However, L.I.F.T counteracts this avoidance behavior as it encourages them to try new things out and take risks; knowing the teacher is encouraging and endorsing this kind of risk-taking by ‘pushing’ them to ask for feedback elicits the use of this technique even more.

Moreover, since I made it clear to them that they had to try and solve the problems by themselves first and then write down their questions, my students reported doing more independent study than before, especially the less committed ones. I also felt that they became more inquisitive as a result of the process, as they were asking themselves and me more questions about grammar and vocabulary usage that Google had no answer for (not a straightforward and easy-to-find one, at least).

Finally, quite a few of them reported paying more attention to my corrective feedback than before as they had requested it in the first place!

Another benefit was that, as part of the process, giving feedback became more interesting and enjoyable for me because it felt like a real dialogue between the students and myself, especially with the more adventurous and ambitious linguists; not just a top-down approach totally directed and owned by the teacher. Also, because I felt that – most of the time, not always – they had indeed tried to answer the questions themselves, I put more effort into it.

Also, and more importantly, this process provided me access to some of my students’ thinking process and to the kind of hypotheses they formulated about how French worked.

Recently I discovered a study by Andrew Creswell (2000) who used a very similar approach with higher proficiency students than mine and reported very similar gains. His students reacted very favourably to the technique and he states that training learners in this technique created ‘a context in which students were able to work responsibly’.

I strongly recommend this technique not only for the benefits reported above, but also because if your students eventually do incorporate this technique into their learning strategies repertoire, they will acquire a powerful life-long metacognitive strategy that they might transfer to other domains of their learning and professional life.

More on this and on my approach to language teaching and learning in the book I co-authored with Steve Smith, ‘The Language Teacher Toolkit’, available here.

Six writing research findings that have impacted my teaching practice


Every now and then I post concise summaries of research findings from studies I come across in my quest for empirical evidence which supports or negates my intuitions or experiences as a language teacher and learner. As I have mentioned in a previous post (‘ten reasons why you should not trust ground-breaking educational research’), much of the research evidence out there is far from being conclusive and irrefutable, due to flaws in design, data elicitation and analysis procedures which often undermine both its internal and external validity. However, when three or more reasonably well-crafted studies (however small) find concurring evidence which challenges commonly held assumptions and/or resonates with our own ‘hunches’ or experiences about teaching and learning, it is reasonable to assume that ‘there is no smoke without fire’.

The following studies have been picked based on the above logic. They are small and less than perfect in design, but do reflect my professional experience and indicate that the validity of some dogmata many teachers hold about language teaching and learning may be questionable.

1. Baudrand-Aertker (1992) – Effects of journal writing on L2-writing proficiency

21 students of French in the third year at a high school in Louisiana were asked to keep a journal over a nine-month period. They were required to write two entries per week at least and were not engaged in any other type of writing tasks for the whole of the duration of the study. The teacher responded to the students’ journal entries focusing only on content – not on form. Using a pre-/post-test design Baudrand-Aertker found that:

  • The students’ written proficiency improved significantly as evidenced by the post-test and their own perception;
  • The students felt that the journals helped them improve their overall mastery of the target language;
  • The students reported positive attitudes towards the activity;
  • The vast majority of the students did not want to be corrected on their grammatical mistakes when engaging in journal writing.

Although this study has important limitations in that there was no control group against which to compare the independent variable’s effects, I find the results interesting and I intend to give journal-writing a try myself next year.

2. Cooper and Morain (1980) – Effects of sentence combining instruction

The researchers investigated the effect of grammar instruction involving sentence combining tasks on the essay writing of 130 third quarter students of French. The subjects were divided into two groups: the experimental group received 60 to 150 minutes of instruction per week through sentence combining exercises whilst the control group was taught ‘traditionally’ through workbook exercises. The experimental group outperformed the control group on seven of the nine measures of syntactic complexity adopted. Although the study did not look at the overall quality of the informants’ essays but only at their syntactic complexity, its findings are very interesting and have encouraged me to incorporate sentence combining tasks more regularly in my teaching strategies. Here is a discussion of the merits of sentence combining instruction and how it can be implemented.

3. Florez-Estrada (1995) – Effects of interactive writing via computer as compared to traditional journaling

In this small-scale study (28 university students of Spanish) Florez-Estrada compared a group of learners exchanging e-mail and chatting online with native-speaking partners with another group of students engaged in interactive paper writing with their teachers. The researcher found that the computer group outperformed the control group on the accuracy of key grammar points such as preterite vs imperfect, ‘ser’ vs ‘estar’, ‘por’ vs ‘para’ and others. The findings of this study were echoed by another study of 40 German students, Itzes (1994), which involved students in chatting via computer amongst themselves in the TL. A notable feature of this study is that the students chose the topics they wanted to chat about. These two studies confirm findings from my own practice; I often use Edmodo or Facebook to create a slow student-initiated chat on given topics in which the whole class is involved, every student sharing their opinions/comments with their peers with the assistance of dictionaries. I have found this activity very beneficial even with groups of less able learners.

4. Nummikoski (1991) and Caruso (1994) – Effects of extensive L2-reading on L2-writing proficiency as contrasted with written practice

Both studies investigated if L2 learners who are engaged in extensive L2-reading (with no writing instruction/practice) write more effectively than L2 learners who are involved in writing tasks but do no reading. The results of both studies show a significant advantage for the writing-only condition. These studies, which are by no means flawless, do challenge the commonly held assumption that we can improve our students’ writing proficiency by engaging them in extensive reading.

5. Martinez-Lage (1992) – Comparison of focus-on-form with focus-on-form-free writing

The researcher investigated the impact of two writing-task types on the writing output of 23 second-year university Spanish students. The same students were asked to write (a) typical assigned compositions and (b) dialogue journals in which they were told they would not be assessed on grammar accuracy. The surprising finding was that the syntactic complexity across both task types was equivalent but the focus-on-form-free task type (journal writing) was grammatically more accurate. I concur with Martinez-Lage on this one as I have tried this strategy myself with many of my AS groups over the years.

6. Hedgcock and Lefkowitz (1992) – Effect of peer feedback in L2 writing

The researchers studied 30 students in an accelerated first year college French class, who wrote two essays involving three separate drafts. The experimental group was involved in peer feedback (essays were read aloud to each other and oral feedback was given), whilst the other group received written teacher feedback. In terms of performance from the first to the second essay both groups made significant improvements, but in different areas: the peer-feedback group got worse in grammar but did better on content, organization and vocabulary; the teacher-feedback group did exactly the opposite. It should be noted that a previous study by Piasecki (1988), which adopted a very similar design but lasted much longer (8 weeks) and involved 112 third-year high school students of Spanish, found no significant differences between the two conditions. This confirms my reservations about using peer feedback as an effective way to correct learner output and as a blanket corrective strategy; in my opinion it may work quite well with certain groups of individuals with highly developed grammar knowledge and critical thinking skills but not with others.

The causes of learner errors in L2 writing – an attempt to integrate Skill-theory and mainstream accounts of Second Language Acquisition

A cognitive account of errors in L2-writing rooted in skill acquisition and production theory

1. Introduction

The purpose of this paper is to shed light on the cognitive sources of errors. An understanding of the psycholinguistic mechanisms that cause our students to err is fundamental if we aim to significantly enhance the (surface-level) accuracy of their written output. In what follows, I intend to take the reader through the cognitive processes underlying second language writing, mapping out in detail the stages and contexts in which mistakes are usually made. In order for the reader to fully comprehend the ensuing discussion, I will begin by outlining four key concepts in Cognitive psychology which are essential for an understanding of any skill-acquisition theory of language development and production. I will then proceed to concisely discuss the way humans acquire languages according to one of the most widely accepted models of second language acquisition (Anderson, 2000). Finally, I will provide an exhaustive account of the way we process writing, rooted in Cognitive theory and resulting from an integration of a number of models of monolingual and bilingual production. I shall then draw my conclusions as to the implications of the reviewed theories and research for an approach to error correction.

2. Key concepts in Cognitive psychology

Before engaging in my discussion of L2-acquisition and L2-writing, I shall introduce the reader to the following concepts, central to any Cognitive theory of human learning and information processing:

1. Short-term and Long-Term Memory

2. Metalinguistic Knowledge and Executive Control

3. The representation of knowledge in memory

4. Proceduralisation or Automatisation

2.1 Short-Term Memory and Long-Term Memory

In Information Processing Theory, memory is conceived as a large and permanent collection of nodes, which become complexly and increasingly inter-associated through learning (Shiffrin and Schneider, 1977). Most models of memory identify a transient memory called ‘Short-Term Memory’ which can temporarily encode information and a permanent memory or Long-Term Memory (LTM). As Baddeley (1993) suggested, it is useful to think of Short-Term Memory as a Working Short-Term Memory (WSTM) consisting of the set of nodes which are activated in memory as we are processing information. In most Cognitive frameworks, WSTM is conceived as the provision of a work space for decision making, thinking and control processes and learning is but the transfer of patterns of activation from WSTM to LTM in such a way that new associations are formed between information structures or nodes not previously associated. WSTM has two key features:

(1) fragility of storage (the slightest distraction can cause the brain to lose the data being processed);

(2) limited channel capacity (it can only process a very limited amount of information for a very limited amount of time).

LTM, on the other hand, has unlimited capacity and can hold information over long periods of time. Information in LTM is normally in an inactive state. However, when we retrieve data from LTM the information associated with such data becomes activated and can be regarded as part of WSTM.

In the retrieval process, activation spreads through LTM from active nodes of the network to other parts of memory through an associative chain: when one concept is activated other related concepts become active. Thus, the amount of active information resulting can be much greater than the one currently held in WSTM. Since source nodes have only a fixed capacity for emitting activation (Anderson, 1980), and this capacity is divided amongst all the paths emanating from a given node, the more paths that exist, the less activation will be transmitted to any one path and the slower will be the rate of activation (fan effect). Thus, additional information about a concept interferes with memory for a particular piece of information thereby slowing the speed with which that fact can be retrieved. In the extreme case in which the to-be-retrieved information is too weak to be activated (owing, for instance, to minimal exposure to that information) in the presence of interference from other associations, the result will be failure to recall (Anderson, 2000).
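
The fan effect described above can be illustrated with a toy calculation. This is only a minimal sketch under simplifying assumptions of my own: the node's capacity is split evenly across its paths, and the names `activation_per_path` and `is_recalled`, as well as all the numbers, are hypothetical and not taken from Anderson's equations.

```python
def activation_per_path(source_capacity: float, n_paths: int) -> float:
    """Split a node's fixed activation capacity evenly across its outgoing paths.

    Even division is a simplifying assumption made for this sketch.
    """
    if n_paths < 1:
        raise ValueError("a node needs at least one outgoing path")
    return source_capacity / n_paths


def is_recalled(source_capacity: float, n_paths: int,
                trace_strength: float, threshold: float) -> bool:
    """Recall succeeds only if the activation reaching the target,
    scaled by the strength of its memory trace, reaches the threshold."""
    received = activation_per_path(source_capacity, n_paths) * trace_strength
    return received >= threshold


# The same weak trace is recalled when the concept has few associations
# but not when activation is spread over many competing paths.
few_associations = is_recalled(1.0, n_paths=2, trace_strength=0.5, threshold=0.2)
many_associations = is_recalled(1.0, n_paths=10, trace_strength=0.5, threshold=0.2)
```

With two paths the weak item is still retrieved; with ten paths the same item falls below the threshold, which is the 'failure to recall' case mentioned above.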

2.2 Metalinguistic knowledge and executive control (processing efficiency)

This distinction originated from Bialystok (1982) and its validity has been supported by a number of studies (e.g. Hulstijn and Hulstijn, 1984). Knowledge is the way the language system is represented in LTM; Control refers to the regulation of the processing of that knowledge in WSTM during performance. The following is an example of how this distinction applies to the context of my study: many of my intermediate students know the rules governing the use of the Subjunctive Mood in Italian; however, they often fail to apply them correctly in Real Operating Conditions, that is, when they are required to process language in real time under communicative pressure (e.g. writing an essay under severe time constraints; giving a class presentation; etc.). The reason for this phenomenon may be that, WSTM’s attentional capacity being limited, its executive-control systems may not cope efficiently with the attentional demands of a task if we are performing in operating conditions where worry, self-concern and task-irrelevant cognitive activities take up some of the available capacity (Eysenck and Keane, 1995). These factors may cause retrieval problems in terms of reduced speed or accuracy of recall/recognition. Thus, as Bialystok (1982) and Johnson (1996) assert, L2-proficiency involves a degree of control as well as a degree of knowledge.

2.3 The representation of knowledge in memory

Declarative Knowledge is knowledge about facts and things, while Procedural Knowledge is knowledge about how to perform different cognitive activities. This dichotomy implies that there are two ‘paths’ for the production of behaviour: a procedural and a declarative one. Following the latter, knowledge is represented in memory as a database of rules stored in the form of a semantic network. In the procedural path, on the other hand, knowledge is embedded in procedures for action, readily at hand whenever they are required, and it is consequently easier to access.

Anderson (1983) provides the example of an EFL-learner following the declarative path of forming the present perfect in English. S/he would have to apply the rule: use the verb ‘have’ followed by the past participle, which is formed by adding ‘-ed’ to the infinitive of a verb. S/he would have to hold all the knowledge about the rule formation in WSTM and would apply it each time s/he is required to form the tense. This implies that declarative processing is heavy on channel capacity, that is, it occupies the vast majority of WSTM attentional capacity. On the other hand, the learner who followed the procedural path would have a ‘program’, stored in LTM with the following information: the present perfect of ‘play’ is ‘I have played’. Deploying that program, s/he would retrieve the required form without consciously applying any explicit rule. Thus, procedural processing is lighter on WSTM channel capacity than declarative processing.

2.4 Proceduralisation or Automatization

Proceduralisation or Automatization is the process of making a skill automatic. When a skill becomes proceduralised it can be performed at little or no cost in terms of channel capacity (i.e. ‘memory space’): skill performance requires very little conscious attention, thereby freeing up ‘space’ in WSTM for other tasks.

3. L2-Acquisition as skill acquisition: the Anderson Model

The Anderson Model, called ACT* (Adaptive Control of Thought), was originally created as an account of the way students internalise geometry rules. It was later developed as a model of L2-learning (Anderson, 1980, 1983, 2000). The fundamental epistemological premise of adopting a skill-development model as a framework for L2-acquisition is that language is considered to be governed by the same principles that regulate any other cognitive skill. Several scholars, such as McLaughlin (1987), Levelt (1989), O’Malley and Chamot (1990) and Johnson (1996), have produced persuasive arguments in favour of this notion.

Although ACT* constitutes my espoused theory of L2 acquisition, I do not endorse Anderson’s claim that his model alone can give a completely satisfactory account of L2-acquisition. I do believe, however, that it can be used effectively to conceptualise at least three important dimensions of L2-acquisition which are relevant to this study: (1) the acquisition of grammatical rules in explicit adult L2-instruction, (2) the developmental mechanisms of language processing and (3) the acquisition of Learning Strategies.

 Figure 1: The Anderson Model (adapted from Anderson, 1983)


The basic structure of the model is illustrated in Figure 1, above. Anderson posits three kinds of memory: Working Memory, Declarative Memory and Production (or Procedural) Memory. Working Memory shares the same features previously discussed in describing WSTM, while Declarative and Production Memory may be seen as two subcomponents of LTM. The model is based on the assumption that human cognition is regulated by cognitive structures (Productions) made up of an ‘IF’ side (the conditions) and a ‘THEN’ side (the actions). These are activated every single time the brain is processing information; whenever a learner is confronted with a problem the brain searches for a Production that matches the data pattern associated with it. For example:

IF the goal is to form the present perfect of a verb and the person is 3rd singular/

THEN form the 3rd singular of ‘have’

IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed /

THEN form the past participle of the verb
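
The two Productions above can be mimicked with a very small rule interpreter. This is merely an illustrative sketch, not Anderson's implementation: the working-memory dictionary, the function names and the `IRREGULAR_PARTICIPLES` lookup are all assumptions of mine.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical lookup of irregular past participles; other verbs take '-ed'.
IRREGULAR_PARTICIPLES = {"go": "gone"}


@dataclass
class Production:
    name: str
    condition: Callable[[dict], bool]  # the IF side
    action: Callable[[dict], None]     # the THEN side


def needs_have(wm: dict) -> bool:
    return (wm["goal"] == "present perfect"
            and wm["person"] == "3sg" and "aux" not in wm)


def form_have(wm: dict) -> None:
    wm["aux"] = "has"
    wm["output"].append("has")


def needs_participle(wm: dict) -> bool:
    return (wm["goal"] == "present perfect"
            and "aux" in wm and "participle" not in wm)


def form_participle(wm: dict) -> None:
    verb = wm["verb"]
    wm["participle"] = IRREGULAR_PARTICIPLES.get(verb, verb + "ed")
    wm["output"].append(wm["participle"])


PRODUCTIONS = [
    Production("form 'have'", needs_have, form_have),
    Production("form participle", needs_participle, form_participle),
]


def produce(verb: str, person: str = "3sg") -> str:
    """Fire the first matching Production until none matches any more."""
    wm = {"goal": "present perfect", "person": person,
          "verb": verb, "output": []}
    while True:
        match = next((p for p in PRODUCTIONS if p.condition(wm)), None)
        if match is None:
            break
        match.action(wm)
    return " ".join(wm["output"])
```

Calling `produce("play")` fires the two rules in sequence, each one changing the contents of working memory so that the next one can match.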

The creation of a Production is a long and careful process since Procedural Knowledge, once created, is difficult to alter. Furthermore, unlike declarative units, Productions control behaviour, thus the system must be circumspect in creating them. Once a Production has been created and proved to be successful, it has to be automatised in order for the behaviour that it controls to happen at naturalistic rates. According to Anderson (1985), this process goes through three stages: (1) a Cognitive Stage, in which the brain learns a description of a skill; (2) an Associative Stage, in which it works out a method for executing the skill; (3) an Autonomous Stage, in which the execution of the skill becomes more and more rapid and automatic.

In the Cognitive Stage, confronted with a new task requiring a skill that has not yet been proceduralised, the brain retrieves from LTM all the declarative representations associated with that skill, using the interpretive strategies of Problem-solving and Analogy to guide behaviour. This procedure is very time-consuming, as all the stages of a process have to be specified in great detail and in serial order in WSTM. Although each stage is a Production, the operation of Productions in interpretation is very slow and burdensome as it is under conscious control and involves retrieving declarative knowledge from LTM. Furthermore, since this declarative knowledge has to be kept in WSTM, the risk of cognitive overload leading to error may arise.

Thus, for instance, in translating a sentence from the L1 into the L2, the brain will have to consciously retrieve the rules governing the use of every single L2-item, applying them one by one. In the case of complex rules whose application requires performing several operations, every single operation will have to be performed in serial order under conscious attentional control. For example, in forming the third person of the Present perfect of ‘go’, the brain may have to: (1) retrieve and apply the general rule of the present perfect (have + past participle); (2) perform the appropriate conjugation of ‘have’ by retrieving and applying the rule that the third person of ‘have’ is ‘has’; (3) recall that the past participle of ‘go’ is irregular; (4) retrieve the form ‘gone’.

Producing language by these means is extremely inefficient. Thus, the brain tries to sort out the information into more efficient Productions. This is achieved by Compiling (‘running together’) the productions that have already been created so that larger groups of productions can be used as one unit. The Compilation process consists of two sub-processes: Composition and Proceduralisation. Composition takes a sequence of Productions that follow each other in solving a particular problem and collapses them into a single Production that has the effect of the sequence. This process lessens the number of steps referred to above and has the effect of speeding up the process. Thus, the Productions

P1 IF the goal is to form the present perfect of a verb / THEN form the simple present of have

P2 IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed / THEN form the past participle of the verb

would be composed as follows:

P3 IF the goal is to form the present perfect of a verb / THEN form the present simple of have and THEN the past participle of the verb
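
Composition can be pictured as chaining the actions of two Productions that share a goal. The following is a rough sketch only; the `Production` class and the `compose` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    name: str
    goal: str        # the IF side, reduced here to the goal alone
    actions: tuple   # the THEN side, as an ordered sequence of steps


def compose(first: Production, second: Production, name: str) -> Production:
    """Collapse two Productions that fire in sequence into a single one.

    The second Production's extra condition ('have' has just been formed)
    is dropped because the first Production's action guarantees it.
    """
    if first.goal != second.goal:
        raise ValueError("only Productions serving the same goal are composed")
    return Production(name, first.goal, first.actions + second.actions)


P1 = Production("P1", "present perfect", ("form the simple present of 'have'",))
P2 = Production("P2", "present perfect", ("form the past participle of the verb",))
P3 = compose(P1, P2, "P3")

# Composition supplements the Production set; it does not replace P1 and P2.
production_set = [P1, P2, P3]
```

Note that `production_set` still contains P1 and P2 alongside P3, so the newly composed rule has to compete with the rules it was built from.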

An important point made by Anderson is that newly composed Productions are weak and may require multiple creations before they gain enough strength to compete successfully with the Productions from which they are created. Composition does not replace Productions; rather, it supplements the Production set. Thus, a composition may be created on the first opportunity but may be ‘masked’ by stronger Productions for a number of subsequent opportunities until it has built up sufficient strength (Anderson, 2000). This means that even if the new Production is more effective and efficient than the stronger Production, the latter will be retrieved more quickly because its memory trace is stronger.

The process of Proceduralisation eliminates clauses in the condition of a Production that require information to be retrieved from LTM and held in WSTM. As a result, proceduralised knowledge becomes available much more quickly than non-proceduralised knowledge. For example, the Production P3 above would become

IF the goal is to form the present perfect of a verb

THEN form ‘has’ and then form the past participle of the verb

The process of Composition and Proceduralisation will eventually produce after repeated performance:

IF the goal is to form the present perfect of ‘play’ / THEN form ‘has played’

For Anderson it seems reasonable to suggest that Proceduralisation only occurs when LTM knowledge has achieved some threshold of strength and has been used some criterion number of times. The mechanism through which the brain decides which Productions should be applied in a given context is what Anderson calls Matching. When the brain is confronted with a problem, activation spreads from WSTM to Procedural Memory in search of a solution, i.e. a Production that matches the pattern of information in WSTM. If such matching is possible, then a Production will be retrieved. If the pattern to be matched in WSTM corresponds to the ‘condition side’ (the ‘if’) of a proceduralised Production, the matching will be quicker, with the ‘action side’ (the ‘then’) of the Production being deposited in WSTM and made immediately available for performance (execution). It is at this intermediate stage of development that the most serious errors in acquiring a skill occur: during the conversion from Declarative to Procedural knowledge, unmonitored mistakes may slip into performance.

The final stage consists of the process of Tuning, made up of the three sub-processes of Generalisation, Discrimination and Strengthening. Generalisation is the process by which Production rules become broader in their range of applicability, thereby allowing the speaker to generate and comprehend utterances never before encountered. Where two existing Productions partially overlap, it may be possible to combine them to create a greater level of generality by deleting a condition that was different in the two original Productions. Anderson (1982) produces the following example of generalization from language acquisition, in which P6 and P7 become P8:

P6 IF the goal is to indicate that a coat belongs to me THEN say ‘My coat’

P7 IF the goal is to indicate that a ball belongs to me THEN say ‘My ball’

P8 IF the goal is to indicate that object X belongs to me THEN say ‘My X’
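
The step from P6 and P7 to P8 amounts to replacing the one element that differs between two otherwise identical Productions with a variable. Here is a minimal sketch; the tuple representation and the `generalise` function are assumptions made purely for illustration.

```python
def generalise(prod_a, prod_b, variable="X"):
    """Merge two Productions whose conditions differ in exactly one element.

    Each Production is a (condition, action) pair of tuples. Returns the
    generalised Production, or None if the two cannot be merged.
    """
    (cond_a, act_a), (cond_b, act_b) = prod_a, prod_b
    if len(cond_a) != len(cond_b):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(cond_a, cond_b)) if a != b]
    if len(diffs) != 1:
        return None
    i = diffs[0]
    const_a, const_b = cond_a[i], cond_b[i]
    new_cond = cond_a[:i] + (variable,) + cond_a[i + 1:]
    # Replace the differing constant with the variable in both actions.
    new_act_a = tuple(variable if t == const_a else t for t in act_a)
    new_act_b = tuple(variable if t == const_b else t for t in act_b)
    if new_act_a != new_act_b:
        return None
    return new_cond, new_act_a


P6 = (("belongs to me", "coat"), ("My", "coat"))
P7 = (("belongs to me", "ball"), ("My", "ball"))
P8 = generalise(P6, P7)
```

The resulting P8 carries the variable in both its condition and its action, so it applies to any object. Nothing in this sketch stops it from over-applying, which is why Discrimination is needed as a counterweight.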

Discrimination is the process by which the range of application of a Production is restricted to the appropriate circumstances (Anderson, 1983). These processes would account for the way language learners over-generalise rules but then learn over time to discriminate between, for example, regular and irregular verbs. This process would require that we have examples of both correct and incorrect applications of the Production in our LTM.

Both processes are inductive in that they try to identify from examples of success and failure the features that characterize when a particular Production rule is applicable. These two processes produce multiple variants on the conditions (the ‘IF’ clause(s) of a Production) controlling the same action. Thus, at any point in time the system is entertaining as its hypothesis not just a single Production but a set of Productions with different conditions to control the action.

Since they are inductive processes, Generalization and Discrimination will sometimes err and produce incorrect Productions. As I shall discuss later in this chapter, there are possibilities for Overgeneralization and useless Discrimination, two phenomena that are widely documented in L2-acquisition research (Ellis, 1994). Thus, the system may simply create Productions that are incorrect, either because of misinformation or because of mistakes in its computations.
ACT* uses the Strengthening mechanism to identify the best problem-solving rules and eliminate wrong Productions. Strengthening is the process by which better rules are strengthened and poorer rules are weakened. This takes place in ACT* as follows: each time a condition in WSTM activates a Production from procedural memory and causes an action to be deployed and there is no negative feedback, the Production will become more robust. Because it is more robust it will be able to resist occasional negative feedback and also it will be more strongly activated when it is called upon:
The strength of a Production determines the amount of activation it receives in competition with other Productions during pattern matching.Thus, all other things being equal, the conditions of a stronger Production will be matched more rapidly and so repress the matching of a weaker Production (Anderson, 1983: 251)
Thus, if a wrong Interlanguage item has acquired greater strength in a learner’s LTM than the correct L2-item, when activation spreads the former is more likely to be activated first, giving rise to error. It is worth pointing out that, just as the strength of a Production increases with successful use, there is a power-law of decay in strength with disuse.
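The Strengthening mechanism described above can be sketched in a few lines of Python. This is only a minimal illustration, not Anderson's implementation: the class `Production`, the functions `apply_feedback`, `decay` and `select`, and the numeric values (a gain of one unit per successful use, a 25% cut on negative feedback, a decay exponent of 0.5) are all assumptions chosen to make the behaviour visible.

```python
from dataclasses import dataclass


@dataclass
class Production:
    """A rule with a strength value; all names here are hypothetical."""
    name: str
    strength: float = 1.0


def apply_feedback(p, success, gain=1.0, penalty=0.25):
    """Strengthen on successful use, weaken on negative feedback (assumed values)."""
    if success:
        p.strength += gain
    else:
        p.strength *= 1.0 - penalty


def decay(p, idle_time, rate=0.5):
    """Power-law decay with disuse: strength * (1 + t) ** -rate (assumed form)."""
    p.strength *= (1.0 + idle_time) ** -rate


def select(candidates):
    """The strongest Production wins the competition for retrieval."""
    return max(candidates, key=lambda p: p.strength)


# An erroneous Interlanguage rule used 'successfully' five times...
wrong = Production("erroneous IL rule")
for _ in range(5):
    apply_feedback(wrong, success=True)

# ...competes with the correct rule, used successfully only once.
correct = Production("correct L2 rule")
apply_feedback(correct, success=True)

# A single correction is not enough to dislodge the stronger, wrong rule.
apply_feedback(wrong, success=False)
winner = select([wrong, correct])
print(winner.name, wrong.strength, correct.strength)
```

Under these assumed values the erroneous rule still wins after one correction; only repeated negative feedback or prolonged disuse (via `decay`) would bring its strength below that of the correct rule.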
4. Extending the model: adding a ‘Procedural-to-Procedural route’ to L2-acquisition
One limitation of the model is that it does not account for the fact that unanalysed L2-chunks of language are sometimes acquired through rote learning or frequent exposure. This happens quite frequently in classroom settings, for instance with set phrases used in everyday teacher-to-student communication (e.g. ‘Open the book’, ‘Listen up!’). As a solution to this issue, Johnson (1996) suggested extending the model by allowing for the existence of a ‘Procedural-to-Procedural route’ to acquisition whereby some unanalysed L2-items can be automatised with use, ‘jumping’, as it were, the initial Declarative Stage posited by Anderson. In classroom settings where instruction is grammar-based, however, only a minority of L2-items will be acquired this way.

5. Bridging the ‘gap’ between the Anderson Model and ‘mainstream’ second language acquisition (SLA) research

As already pointed out above, a number of theorists believe that Anderson provides a viable conceptualisation of the processes central to L2-acquisition. However, ACT* was intended as a model of acquisition of cognitive skills in general and not specifically of L2-acquisition. Thus, the model rarely concerns itself explicitly with the following phenomena documented by SLA researchers: Language Transfer, Communication Strategies, Variability and Fossilization. These phenomena are relevant to secondary school settings for the following reasons. Firstly, Language Transfer and Communication Strategies constitute common sources of error in the written output of L2-intermediate learners. Variability, on the other hand, refers to the phenomenon, particularly evident in the written output of beginner-to-intermediate learners, whereby learners produce a given structure correctly in certain contexts and incorrectly in others. Finally, Fossilization is often put forward as a possible explanation of the recurrence of erroneous Interlanguage forms in learner Production. Although these phenomena can be accounted for in Anderson’s framework, I believe that a discussion of mainstream SLA theories and research will enhance the reader’s understanding of their nature and implications for L2 teaching. It should be noted that for reasons of relevance and space my discussion will be concise and focus only on the aspects which are most relevant to the present study.

5.1 Language Transfer

This phenomenon refers to the way prior linguistic knowledge influences L2-learner development and performance (Ellis, 1994). The occurrence of Language Transfer can be accounted for by applying the ACT* framework since, as Anderson asserts, existing Declarative Knowledge is the starting point for acquiring new knowledge and skills. In a language-learning situation this means drawing on knowledge about previously learnt languages both in order to understand the mechanisms of the target language and to solve a communicative problem. In this section, I shall draw on the SLA literature in order to explain how, when and why Language Transfer occurs and with what effects on learner written output.

As Odlin (1989) points out, Language Transfer can be positive, facilitating L2-performance. This is often the case with students of mine who studied French or Spanish and are able to transfer their knowledge of these languages advantageously to Italian because Romance languages share a large number of cognates and grammatical rules. However, Language Transfer can also be negative, resulting in erroneous L2-output. For instance, over-confidence in the fact that Italian and French/Spanish are similar may prompt a learner with L3-French to apply the rules of the French Subjunctive in the deployment of the Italian Subjunctive. This strategy will be effective in some contexts but unsuccessful in others.
Transfer can also result in the avoidance or the over-production of L2-structures. For example, several intermediate Japanese learners of Italian I taught in the past avoided using relative clauses because these do not exist in their L1. On the other hand, they over-used the definite article because, being totally unfamiliar with the concept of definite article in their language and noticing that Italians use it frequently, they thought that they were less likely to err if they used it all the time.
Transfer can occur as a deliberate Compensatory Strategy: a learner’s conscious attempt to fill a gap in his/her L2-knowledge (Faerch and Kasper, 1983). This phenomenon is particularly recurrent when the distance between the learner’s L1/L3 and the target language is perceived as small (e.g. Spanish and Italian). Transfer can also occur subconsciously (Poulisse, 1990). When used as a Compensatory Strategy, Transfer can give rise to ‘Foreignization’ and ‘Code-switching’ errors. The former refers to the conscious alteration of L1- or L3-words to make them ‘sound’ target-language-like. For instance, not knowing the Italian for ‘rice’ (= riso), a French learner may add an ‘o’ to the French word ‘riz’ in the hope that the resulting ‘rizo’ will be correct. Code-switching, instead, consists in the conscious or subconscious use of unaltered L1-/L3-words/phrases when an L2-word is required. Both types of error are more likely to happen in spoken language, especially when a learner is under communicative pressure or does not have access to dictionaries or other sources of L2-knowledge. However, I have personally observed this phenomenon also in the writing of many L2-student writers, especially at the level of connectives (e.g. the French conjunction ‘et’ instead of the Italian ‘e’).
Transfer may affect any level of L2-learner output. As far as the areas of language use more relevant to the present study are concerned (syntax, morphology and lexis), Ringbom (1987) reports evidence from Ringbom (1978) and other studies (e.g. Sjoholm, 1982) that L1-Transfer affects lexical usage more than it does syntax or morphology. Of these two, it appears that morphology is the less affected area. The following factors appear to determine the extent to which Language Transfer occurs:
(1) Perceived language distance: the closer two languages are perceived to be, the more likely Transfer is to occur (see Sjoholm, 1982);
 (2) Learning environment: it appears that Transfer is more likely to occur in settings where the naturalistic input is lower (Odlin, 1989);
(3) Levels of monitoring: Gass and Selinker (1983) observe that careful, monitored learner output usually contains fewer instances of Transfer errors;
 (4) Learner-type: learners who take more risks and are more meaning-oriented tend to transfer less than form-focused ones (Odlin, 1989);
(5) Task: some tasks appear to elicit greater use of Transfer (Odlin, 1989). This appears to be the case for L1-into-L2 translation including the approach, typical of many beginner L2-learners, whereby an L2-essay is produced first in the L1 and then translated word by word.
(6) Proficiency: as the Anderson Model and many other Cognitive models (e.g. de Bot, 1992) posit, the starting point of acquisition is the L1, which is gradually replaced by the target language as more and more L2-items are acquired. Thus, Transfer is more likely to occur at the early stages of development than at the advanced ones. This is borne out by a number of studies (e.g. Taylor, 1975; Liceras, 1985; Major, 1987). Kellerman (1978), however, found that a number of Transfer errors occur only at advanced stages.
 5.2 Communication Strategies
Due to space constraints, my discussion of Communication Strategies (CSs) will be limited to the basic issues and levels of language (i.e. grammar, lexis and orthography) relevant to this study. Corder (1978) defined a CS as follows:

a systematic technique employed by a speaker to express his meaning when faced with some difficulty. Difficulty in this definition is taken to refer uniquely to the speaker’s inadequate command of the language in the interaction (Corder, 1978: 8)

A number of taxonomies of CSs have been suggested. Most frameworks (e.g. Faerch and Kasper, 1983) identify two types of approaches to solving problems in communication: (1) avoidance behaviour (avoiding the problem altogether); (2) achievement behaviour (attempting to solve the problem through an alternative plan). In Faerch and Kasper’s (1983) framework, the two different approaches result respectively in the deployment of (a) reduction strategies, governed by avoidance behaviour, and (b) achievement strategies, governed by achievement behaviour.

Reduction strategies can affect any level of writing from content (Topic avoidance) to orthography (Graphological avoidance). Most CS studies, however, have focused on lexical items. Achievement strategies (Faerch and Kasper, 1983) correspond to Tarone’s (1981) concept of Production Strategies and to Corder’s (1978, 1983) Resource expansion strategies. By using an achievement strategy, the learner attempts to solve problems in communication by expanding his communicative resources (Corder, 1978) rather than by reducing his communicative goal (functional reduction). Faerch and Kasper (1983) identify two broad categories of achievement strategies: Compensatory and Non-linguistic. The Compensatory strategies relevant to the present study are:
(1) Code switching (see 5.1 above)
(2) Interlingual transfer (see 5.1 above)

(3) Inter-/intralingual transfer, i.e. a generalization of an IL rule is made but the generalization is influenced by the properties of the corresponding L1-structures (Jordens, 1977)

 (4) IL based strategies. These include:

(i) Generalization: the extension of an item to an inappropriate context in order to fill the ‘gaps’ in the learner’s plans. One type of generalization relevant to the present study is Approximation, that is, the use of a lexical item to express only an approximation of the intended meaning.

(ii) Word coinage. This kind of strategy involves the learner in the creative construction of a new IL word.

 5.3 Variability: the occurrence of unsystematic errors
Variability in learner language refers to the phenomenon whereby a given structure is produced correctly in certain contexts and incorrectly in others. As Ellis (1994) observed, this phenomenon is very common in the early stages of acquisition and may rapidly disappear. The Anderson model can be used to account for Variability as follows: firstly, as Anderson posits, two or more Productions which refer to different hypotheses about the use of a structure can co-exist in a learner’s LTM before the onset of the Discrimination process. These Productions compete for retrieval and, if they have more or less equal strength, may be used alternately at a given stage of development as the learner is testing their effectiveness through the trial-and-error process which characterizes the early stages of learning.
Secondly, if amongst the Productions relative to a given structure, Production ‘X’ based on the correct rule is much weaker than Production ‘Y’ based on an incorrect rule, Production ‘Y’ is likely to be retrieved first when a learner is not devoting sufficient conscious attention to the structure and his/her brain ‘runs on automatic’. The lack of attention is usually determined by processing inefficiency, that is, the incapacity of WSTM to cope with the demands that the task places on its attentional system (Bygate, 1988). Processing inefficiency issues in writing are more likely to arise in unplanned and/or unmonitored Production (Krashen, 1977, 1981), especially when the L2-learner is under severe time constraints or communicative pressure (Polio, Fleck and Leder, 1998).
A third cause of Variability refers to what I called above the ‘Procedural-to-Procedural route’ to acquisition: aspects of the usage of a structure may have been acquired by a learner through the rote learning of, or exposure to, set L2-phrases (e.g. classroom phrases). Thus, in cases where that structure is well beyond that learner’s stage of development and s/he does not have any declarative knowledge of it, s/he will deploy that structure correctly within the context of those set phrases while being likely to make mistakes with it in other contexts.
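The first two causes of Variability can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that the chance of a Production being retrieved is proportional to its share of total strength. ACT* itself speaks of stronger Productions being matched more rapidly, so this proportional rule and the function name `retrieval_odds` are simplifying assumptions rather than part of the model.

```python
def retrieval_odds(strengths):
    """Map each Production name to its share of total strength (assumed rule)."""
    total = sum(strengths.values())
    return {name: s / total for name, s in strengths.items()}


# Two hypotheses of equal strength: the learner alternates between them.
early = retrieval_odds({"X (correct rule)": 2.0, "Y (incorrect rule)": 2.0})

# Y much stronger than X: the incorrect form surfaces when attention lapses.
later = retrieval_odds({"X (correct rule)": 1.0, "Y (incorrect rule)": 3.0})

print(early)
print(later)
```

In the first case each form is equally likely to surface, which matches the alternation seen during trial-and-error learning; in the second the incorrect form dominates unless conscious monitoring intervenes.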
 5.4 Fossilization

In the SLA literature, Fossilization (or Routinization) refers to the phenomenon whereby some IL forms keep reappearing in a learner’s Interlanguage ‘in spite of the learner’s ability, opportunity and motivation to learn the target language…’ (Selinker and Lamendella, 1979: 374). An error can become fossilised even if L2-learners possess correct declarative knowledge about that form and have received intensive instruction on it (Mukattash, 1986).

Applying the Anderson Model, Fossilization can be explained as the Proceduralisation of an erroneous form through frequent and successful use. As already discussed, Productions that have been proceduralised are very difficult to alter, which would explain why some theorists believe that Fossilization is a permanent state (Lamendella, 1977; Mukattash, 1986). For applied linguists working in the Skill-theory paradigm, errors can be de-fossilised, but only after a lengthy and painstaking process of re-learning of the correct form through targeted monitoring and practice in real operating conditions (Johnson, 1996).
Several models (biological, acculturational, interactional, etc.) have been proposed to account for the development of Fossilization in L2-learning. Interactional models state that the interaction between the learner and other L2-speakers determines whether a component of the learner’s Interlanguage system is reinforced, contributing to Fossilization. One such model, Tollefson and Firn’s (1983), posits that an overemphasis on conveyance of meaning in the classroom may, in the absence of cognitive feedback, promote Fossilization.
On this issue, Johnson (1996) also asserts that linguistic survival is often achieved by a form of pidgin and that encouraging this type of communication in the language classroom is a practice conducive to Fossilization. Skehan (1994) and Long (1983) also make the point that communicative production might lead to the development of reduction strategies resulting in pidginogenesis and Fossilization.
 6. A Cognitive account of the writing processes: the Hayes and Flower (1980) model

Hayes and Flower’s (1980) model of essay writing is regarded as one of the most effective accounts of writing available to date (Eysenck and Keane, 1995). As Figure 1 below shows, it posits three major components:

1. Task-environment,

2. Writer’s Long-Term Memory,
3. Writing process.

Figure 1: The Hayes and Flower model (adapted from Hayes and Flower, 1980)

These components work as follows: (1) the Task-environment includes the writing assignment (the topic, the target audience, and motivational factors) and the text produced so far; (2) the Writer’s LTM provides factual knowledge and skill/genre specific procedures; (3) the Writing Process consists of the three sub-processes of Planning, Translating and Reviewing.

The Planning process sets goals based on information drawn from the Task-environment and Long-Term Memory (LTM). Once these have been established, a writing plan is developed to achieve those goals. More specifically, the Generating sub-process retrieves information from LTM through an associative chain in which each item of information retrieved functions as a cue to retrieve the next item of information and so forth. The Organising sub-process selects the most relevant items of information retrieved and organizes them into a coherent writing plan. Finally, the Goal-setting sub-process sets rules (e.g. ‘keep it simple’) that will be applied in the editing process. The second process, Translating, transforms the information retrieved from LTM into language. This is necessary since concepts are stored in LTM in the form of Propositions, not words. Flower and Hayes (1980) provide the following examples of what propositions involve:

[(Concept A) (Relation B) (Concept C)]

[(Concept D) (Attribute E)], etc.

Finally, the Reviewing processes of Reading and Editing have the function of enhancing the quality of the output. The Editing process checks that discourse conventions are not being flouted, looks for semantic inaccuracies and evaluates the text in the light of the writing goals. Editing has the form of a Production system whose IF-THEN rules have a two-part condition:

The first part specifies the kind of language to which the editing production applies, e.g. formal sentences, notes, etc. The second is a fault detector for such problems as grammatical errors, incorrect words, and missing context. (Hayes and Flower, 1980: 17)

 When the conditions of a Production are met, e.g. a wrong word ending is detected, an action is triggered for fixing the problem. For example:

CONDITION: (formal sentence) first letter of sentence lower case

ACTION: change first letter to upper case

(Adapted from Hayes and Flower, 1980: 17)
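The example production above can be expressed as a small Python sketch. It is a hedged illustration of the IF-THEN idea, not code from Hayes and Flower: the class `EditingProduction`, its field names and the helper functions are hypothetical, introduced only to show a two-part condition (kind of language plus fault detector) triggering an action.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EditingProduction:
    """An IF-THEN editing rule; class and field names are hypothetical."""
    applies_to: str                    # kind of language, e.g. "formal sentence"
    has_fault: Callable[[str], bool]   # fault detector
    fix: Callable[[str], str]          # action triggered when both parts match


def starts_lowercase(sentence: str) -> bool:
    """Fault detector: the first character is a lower-case letter."""
    return sentence[:1].islower()


def capitalise_first(sentence: str) -> str:
    """Action: change the first letter to upper case."""
    return sentence[:1].upper() + sentence[1:]


CAPITALISE = EditingProduction("formal sentence", starts_lowercase, capitalise_first)


def edit(text: str, kind: str, productions) -> str:
    """Fire every production whose two-part condition is met."""
    for p in productions:
        if p.applies_to == kind and p.has_fault(text):
            text = p.fix(text)
    return text


print(edit("the car pollutes the air.", "formal sentence", [CAPITALISE]))
print(edit("buy milk", "notes", [CAPITALISE]))
```

The second call leaves the text untouched because the first part of the condition (the kind of language) is not satisfied, even though the fault detector alone would fire.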

Two important features of the Editing process are: (1) it is triggered automatically whenever the conditions of an Editing Production are met; (2) it may interrupt any other ongoing process. Editing is regulated by an attentional system called the Monitor. Hayes and Flower do not provide a detailed account of how it operates. Unlike Krashen’s (1977) Monitor, a control system used solely for editing, Hayes and Flower’s (1980) device operates at all levels of production, orchestrating the activation of the various sub-processes. This allows Hayes and Flower to account for two phenomena they observed. Firstly, the Editing and the Generating processes can cut across other processes. Secondly, the existence of the Monitor enables the system to be flexible in the application of goal-setting rules, in that through the Monitor any other process can be triggered. This flexibility allows for the recursiveness of the writing process.

 7. Extending the model: Cognitive accounts of the translating sub-processes and insights from proofreading research

Hayes and Flower’s model is useful in providing teachers with a framework for understanding the many demands that essay writing places on students. In particular, it helps teachers understand how the recursiveness of the writing process may cause those demands to interfere with each other, causing cognitive overload and error. Furthermore, by conceptualising editing as a process that can interrupt writing at any moment, the model has a very important implication for a theory of error: self-correctable errors occurring at any level of written production are not always the result of a retrieval failure; they may also be caused by detection failure. However, one limitation of the model for a theory of error is that its description of the Translating and Editing sub-processes is too general. I shall therefore supplement it with Cooper and Matsuhashi’s (1983) list of writing plans and decisions along with findings from other L1-writing Cognitive research, which will provide the reader with a more detailed account. I shall also briefly discuss some findings from proofreading research which may help explain some of the problems encountered by L2-student writers during the Editing process.

7.1 The translating sub-processes

Cooper and Matsuhashi (1983) posit four stages, which correspond to Hayes and Flower’s (1980) Translating: Wording, Presenting, Storing and Transcribing. In the first stage, the brain transforms the propositional content into lexis. Although at this stage the pre-lexical decisions the writer made at earlier stages and the preceding discourse limit lexical choice, Wording the proposition is still a complex task: ‘the choice seems infinite, especially when we begin considering all the possibilities for modifying or qualifying the main verb and the agentive and affected nouns’ (Cooper and Matsuhashi, 1983: 32). Once s/he has selected the lexical items, the writer has to tackle the task of Presenting the proposition in standard written language. This involves making a series of decisions in the areas of genre and grammar. In the area of grammar, Agreement and Tense will be the main issues.
The proposition, as planned so far, is then temporarily stored in Working Short Term Memory (henceforth WSTM) while Transcribing takes place. Propositions longer than just a few words will have to be rehearsed and re-rehearsed in WSTM for parts of it not to be lost before the transcription is complete. The limitations of WSTM create serious disadvantages for unpractised writers. Until they gain some confidence and fluency with spelling, their WSTM may have to be loaded up with letter sequences of single words or with only 2 or 3 words (Hotopf, 1980). This not only slows down the writing process, but it also means that all other planning must be suspended during the transcriptions of short letter or word sequences.

The physical act of transcribing the fully formed proposition begins once the graphic image of the output has been stored in WSTM. In L1-writing, transcription occupies subsidiary awareness, enabling the writer to use focal awareness for other plans and decisions. In practised writers, transcription of certain words and sentences can be so automatic as to permit planning the next proposition while one is still transcribing the previous one. An interesting finding with regard to these final stages of written production comes from Bereiter, Fine and Gartshore (1979), who investigated L1-writers aged 10-12. They identified several discrepancies between learners’ forecasts in think-aloud and their actual writing. 78% of such discrepancies involved stylistic variations. Notably, in 17% of the forecasts, significant words were uttered which did not appear in the writing. In about half of these cases the result was a syntactic flaw (e.g. the forecasted phrase ‘on the way to school’ was written ‘on the to school’). Bereiter and Scardamalia (1987) believe that lapses of this kind indicate that language is lost somewhere between storage in WSTM and grapho-motor execution. These lapses, they also assert, cannot be described as ‘forgetting what one was going to say’ since almost every omission was reported on recall: in the case of ‘on the to school’, for example, the author not only intended to write ‘on the way’ but claimed later to have written it. In their view, this is caused by interference from the attentional demands of the mechanics of writing (spelling, capitalization, etc.), the underlying psychological premise being that a writer has a limited amount of attention to allocate and that whatever is taken up with the lower level demands of written language must be taken from something else.

In sum, Cooper and Matsuhashi (1983) posit two stages in the conversion of the preverbal message into a speech plan: (1) the selection of the right lexical units and (2) the application of grammatical rules. The unit of language is then deposited in WSTM awaiting translation into grapho-motor execution. This temporary storage raises the possibility that lower-level demands affect production in two ways: (1) by causing the writer to omit material during grapho-motor execution; (2) by leading the writer to forget higher-level decisions already made. Interference resulting in WSTM loss can also be caused by lack of monitoring of the written output due to devoting conscious attention entirely to planning ahead, while leaving the process of transcription to run ‘on automatic’.

 7.3 Some insights from proofreading research

Proofreading theories and research provide us with the following important insights into the mechanisms that regulate essay editing. Firstly, proofreading involves different processes from reading: when one proofreads a passage, one is generally looking for misspellings, words that might have been omitted or repeated, typographical mistakes, etc., and as a result, comprehension is not the goal. When one is reading a text, on the other hand, one’s primary goal is comprehension. Thus, reading involves construction of meaning, while proofreading involves visual search. For this reason, in reading, short function words, not being semantically salient, are not fixated (Paap, Newsome, McDonald and Schvaneveldt, 1982). Consequently, errors on such words are less likely to be spotted when one is editing a text concentrating mostly on its meaning than when one is focusing one’s attention on the text as part of a proofreading task (Haber and Schindler, 1981). Errors are likely to decrease even further when the proofreader is forced to fixate on every single function word in isolation (Haber and Schindler, 1981).

It should also be noted that some proofreaders’ errors appear to be due to acoustic coding. This refers to the phenomenon whereby the way a proofreader pronounces a word/diphthong/letter influences his/her detection of an error. For example, if an English learner of L2-Italian pronounces the ‘e’ in the singular noun ‘stazione’ (= train station) as [i] instead of [e], s/he will find it difficult to differentiate it from the plural ‘stazioni’ (= train stations). This may impinge on her/his ability to spot errors with that word involving the use of the singular for the plural and vice versa.
The implications for the present study are twofold. Firstly, learners may have to be trained to go through their essays at least once focusing exclusively on form. Secondly, they should be asked to pay particular attention to those words (e.g. function words) and parts of words (e.g. verb endings) that they may not perceive as semantically salient.

7.4 Bilingual written production: adapting the unilingual model

Writing, although slower than speaking, is still processed at enormous speed in mature native speakers’ WSTM. The processing time required by a writer will be greater in the L2 than in the L1 and will increase at lower levels of proficiency: at the Wording stage, more time will be needed to match non-proceduralized lexical materials to propositions; at the Presenting stage, more time will be needed to select and retrieve the right grammatical form. Furthermore, more attentional effort will be required in rehearsing the sentence plans in WSTM; in fact, just like Hotopf’s (1980) young L1-writers, non-proficient L2-learners may be able to store in WSTM only two or three words at a time. This has implications for Agreement in Italian, since words more than three or four words apart may still have to agree in gender and number. Finally, in the Transcribing phase, the retrieval of spelling and other aspects of the writing mechanics will take up more WSTM focal awareness.

Monitoring too will require more conscious effort, increasing the chances of Short-term Memory loss. This is more likely to happen with less expert learners: since their attentional system has to monitor levels of language that in the mature L1-speaker are normally automatized, it will not have enough channel capacity available, at the point of utterance, to cope with lexical/grammatical items that have not yet been proceduralised. This also implies that Editing is likely to be more recursive than in L1-writing, interrupting other writing processes more often, with consequences for the higher meta-components. In view of the attentional demands posed by L2-writing, the interference caused by planning ahead will also be more likely to occur, giving rise to processing failure. Processing failure/WSTM loss may also be caused by the L2-writer pausing to consult dictionaries or other resources to fill gaps in their L2-knowledge while rehearsing the incomplete sentence plan in WSTM. In fact, research indicates that although, in general terms, composing patterns (sequences of writing behaviours) are similar in L1s and L2s, there are some important differences.
In his seminal review of the L1/L2-writing literature, Silva (1993) identified a number of discrepancies between L1- and L2-composing. Firstly, L2-composing was clearly more difficult. More specifically, the Transcribing phase was more laborious, less fluent, and less productive. Also, L2-writers spent more time referring back to an outline or prompt and consulting dictionaries. They also experienced more problems in selecting the appropriate vocabulary. Furthermore, L2-writers paused more frequently and for longer, which resulted in L2-writing occurring at a slower rate. As far as Reviewing is concerned, Silva (1993) found evidence in the literature that in L2-writing there is usually less re-reading of and reflecting on written texts. He also reported evidence suggesting that L2-writers revise more, before and while drafting, and in between drafts. However, this revision was more problematic and more of a preoccupation. There also appears to be less auditory monitoring in the L2, and L2-revision seems to focus more on grammar and less on mechanics, particularly spelling. Finally, the text features of L2-written texts provide strong evidence suggesting that L2-writing is a less fluent process involving more errors and producing, at least in terms of the judgements of native English speakers, less effective texts.
8. Conclusion: Implications for teaching and learning
In the above I have discussed my espoused theories of L2-acquisition and L2-writing. I started by focusing on Anderson’s (1980, 1982, 1983, 2000) account of how language structures are acquired and language processing develops. Drawing on SLA research, I then discussed some important phenomena and processes involved in the aetiology of error relevant to the present study. Finally, I discussed Hayes and Flower’s (1980) and Cooper and Matsuhashi’s (1983) models of written production and their implications for bilingual written production. The following notions emerging from my discussion must, in my view, provide the theoretical underpinnings of any remedial corrective approach to L2 writing errors.
 (1) L2-acquisition occurs in much the same way as the acquisition of any other cognitive skill;

(2) the acquisition of a skill begins consciously with a declarative stage during which the brain creates a declarative representation of Productions (i.e. the procedures that regulate that skill);

 (3) it is an adaptive feature of the human brain to make the performance of any skill automatic in order to render its execution fast and efficient in terms of cognitive processing;
(4) automatisation can be a very lengthy process, since for a skill to become automatic it must be performed numerous times;

(5) the Productions that regulate a skill become automatised only if their application is perceived by the brain as resulting in positive outcomes;

(6) at a given stage in learner development, more than one Production relating to a given item can co-exist in his/her Interlanguage. These compete for retrieval. The Production with the stronger memory trace, not necessarily the correct one, will win;

(7) negative evidence as to the effectiveness of a Production determines whether it is going to be rejected by the brain or automatised;

(8) once a Production (including those giving rise to errors) is automatised, it is difficult to alter;

(9) errors may be the result of lack of knowledge or processing efficiency problems;

(10) learners use Language Transfer and Communication Strategies to make up for the absence of the appropriate L2-declarative knowledge necessary in order to realize a given communicative goal. These phenomena are likely to give rise to error.

(11) the writing process is recursive and can be interrupted by editing at any time;

(12) the errors in L2-writing relating to morphology and syntax occur mostly in the Translating phase of the writing process when Propositions are converted into language. They may occur as a result of cognitive overload caused by the interference of various processes occurring simultaneously and posing cognitive demands beyond the processing ability of the writer’s WSTM.

(13) editing for meaning involves different processes than editing for form. When editing for meaning the writer/editor is more likely to miss function words because they are less semantically salient.

These notions have important implications for any approach to error correction. One refers to Anderson’s assumption that the acquisition of L2-structures in classroom-settings mostly begins at a conscious level, with the creation of mental representations of the rules governing their usage. The obvious corollary is that corrective feedback should help learners create or restructure their declarative knowledge of the L2-rule system. Any corrective approach should therefore involve L2 students in grammar learning that combines cognitive restructuring with extensive practice. This entails delivering a well-planned and elaborate intervention, not just a one-off lesson on a structure identified as a problem in a learner’s written piece.

Another important notion advanced by Anderson is that the automatization of a Production occurs only after it has been applied numerous times and with success (actual or perceived). This notion has three major implications for Error Correction.
(1) Error Correction can play an important role in L2-acquisition since, in order to reject a wrong Production, the learner needs lots of negative evidence that informs him/her of its incorrectness.

(2) Errors should be corrected consistently to avoid sending the learners confused messages about the correctness of a given structure.

(3) For Error Correction to lead to the de-fossilization of wrong Productions and the automatization of new, correct Productions, the former should occur in learner output as rarely as possible, whereas the latter should be produced as frequently as possible.

Consistent with these three notions, a teacher may want to invest a lot of effort in raising the learners’ awareness of their errors, to be as consistent as possible in correcting them and, finally, to encourage learners to practise the problematic structures as often as possible, both in and outside the context of the essays they will write.

Other implications refer to the concept of automatization. As discussed above, automatised cognitive structures are difficult to alter. It follows that Error Correction is more likely to be successful (in the absence of major developmental constraints) at the early stages of learning an L2-item, before ‘incorrect’ Productions have reached the ‘Strengthening’ stage of Acquisition. Thus, in order to prevent error fossilization or automatization, any corrective intervention should tackle the errors more prone to routinization (usually those referring to less semantically salient language items) as early as possible in the acquisition process.

Another set of implications relates to the causes and nature of learner errors. As discussed above, a number of errors result from L2-learners’ attempts to make up for their lack of correct L2-declarative knowledge through the deployment of the following problem-solving strategies:

(1) Communication Strategies: in the absence of linguistic knowledge of an L2-item, a learner may deploy achievement strategies. As far as lexical items are concerned, they may deploy the following strategies leading to error: ‘Approximation’, ‘Coinage’ and ‘Foreignization’. In the case of grammar or orthography, learners will draw on existing declarative knowledge, overgeneralizing a rule or guessing;

(2) Use of resources: learners may use dictionaries or other sources of L2-knowledge (including people) incorrectly;
(3) L1-or L3-transfer;

(4) Avoidance.

Since these errors are extremely likely to occur in beginner and intermediate students’ writing, teachers should involve students in activities raising learner awareness of these issues and provide practice in ways of tackling them. Firstly, as far as the above Communication Strategies are concerned, students should be trained to use dictionaries and other resources more frequently to prevent errors due to Approximation, Coinage and Foreignization. Secondly, as far as poor use of resources is concerned, learners must be made aware of the possible pitfalls of using dictionaries and textbooks and be trained to use these tools more effectively and efficiently. Thirdly, learners must be made aware of the issues related to excessive reliance on L1-/L3-Transfer and of negative Transfer (again, through effective learner training).

As discussed above, errors can also be caused by WSTM processing failure due to cognitive overload. Grammatical, lexical and orthographical errors will occur as a result of learners handling structures which have not been sufficiently automatized, in situations where the operating conditions in WSTM are too challenging for the attentional system to monitor all levels of production effectively. The implication for Error Correction is that learners should be made aware of which types of contexts are more likely to cause processing efficiency failure, so that they may approach them more carefully in the future. Examples of such contexts are: sentences where the learner is attempting to express a difficult concept which requires new vocabulary and the use of tenses/moods they have not totally mastered; long sentences where items agreeing with each other in gender and/or number are located quite far apart (not an uncommon occurrence in Italian); and situations in which the production of a sentence has to be interrupted several times because the learner needs to consult the dictionary. Remedial practice should provide learners with opportunities to operate in such contexts, in order to train them to cope with the cognitive demands these place on processing efficiency in Real Operating Conditions.

Another important implication of my discussion for Error Correction refers to the notion that errors are not simply the result of a Translating failure, but also of an Editing failure. The failure to detect errors may be due to two factors. One relates to the goal-orientedness of the Production systems that regulate all levels of language processing: the brain is going to review the accuracy of every single aspect of the text only if it perceives that this is relevant to its goals in the production of the text. Thus, if the communication of content is the main goal the writer sets in an essay, the accuracy of function words is likely to become a secondary concern, since they are not perceived as salient to the realisation of that goal. The other factor is time. Lack of time is likely to exacerbate this issue, since it will force learners to prioritise certain aspects of their output in the Editing phase(s) over others. The implication for Error Correction is that it should aim at developing learner intentionality to be accurate at every level of the text. This may not be easy if accuracy does not feature prominently amongst the priorities of the curriculum, the teacher and/or the student.

Editing failure may also be due to the fact that reading an essay to check and/or improve the quality of its content is different from proofreading aimed at checking non-semantic aspects of the output. As noted above, the former approach to text revision often results in the failure to detect errors with function words. The implication of this phenomenon for corrective approaches is that learner awareness of the importance of paying greater attention to function words when Editing essays should be raised. Moreover, as an editing strategy, learners should be advised to carry out the revision of their essay-drafts in two distinct phases: one aimed at checking the content and another focused exclusively on the accuracy of grammar, lexis and orthography.

Furthermore, editing failure may be caused by the same issue that caused learners to err in the first place, that is, processing inefficiency. Thus, the contexts that I listed above (sentences that are long and/or complex and/or contain problematic structures, etc.) may impair the learner’s ability to detect and/or self-correct the errors. One way to tackle this issue in remedial teaching is to advise learners to be particularly careful in editing these kinds of sentence and to approach them in a way that puts less strain on their processing efficiency; for example, by concentrating first on the items that, based on the self-knowledge they will have developed as part of metacognitive training, they are more likely to get wrong in that kind of context (training in the Monitoring-Familiar-Errors strategy would help in this respect).

A final point refers to the implications of the phenomenon of Variability for the diagnostic phase of any error treatment. As discussed above, this phenomenon may confuse the teacher or the error analyst as to whether a learner knows a given structure or not, since s/he seems to get it right at times and wrong at others. The implication of this phenomenon for Error Correction is that teachers should investigate the causes of any occurrence of it in their learners’ writing, in order to ascertain whether it stems from poor editing skills, partial knowledge of the target rule, etc. Once the causes have been identified, an appropriate action plan can be decided.

How process-driven instruction may enhance foreign language upper intermediate students’ essay writing

Say you are the coach of a team of novice football players. You wouldn’t throw them into a very difficult match straight away, without the necessary training, right? Surely you would make sure they received lots of practice in all the most crucial aspects of the game, from the most basic skills (e.g. passing, dribbling, tackling and shooting) to the most complex ones (e.g. defensive and offensive tactics). This is common sense, and it should apply to language teaching, too. Yet many of us, from the very early days of A Level or IB, ask students to write cognitively and linguistically demanding discursive essays of some length, without teaching them all of the skills necessary to accomplish that challenging task effectively.

Not long ago, at a conference I attended, a teacher was emphatically asserting the importance of getting students to write essays from the very first week of AS. “I ask them to write two essays every week,” he boasted, “and teach them lots of grammar.” When I asked him how they learnt to write essays, he replied: “By writing lots of them.” Fair point, if your students are gifted linguists who read a lot (in the target language), have highly refined critical thinking skills and are strong first language writers. But what if your students are not as exceptional?

In what follows, I advance the notion that any instructional approach to L2 essay writing should first and foremost be driven by a focus on the processes the task involves and should aim at equipping learners with the skills and strategies necessary to execute those processes effectively under real operating conditions (e.g. in the exam hall under exam conditions).

In order for the reader to fully understand the approach I advocate, let us look briefly at the cognitive processes involved in L2 essay writing, which I have already described in some detail in a previous article on this blog (‘Mapping out the foreign language writing process’).

The writing process

As the Hayes and Flower model (see Figure 1, below) shows, there are three major components to L2 essay writing:

  1. Task-environment – the task, defined by the essay requirements (essay title, audience, word limit, etc.), as well as any external resources one may want to use;
  2. Writer’s Long-Term Memory – the knowledge store in our brain from which the writer will retrieve any information relevant to the task;
  3. The writing process.

Figure 1: The Hayes and Flower model (adapted from Hayes and Flower, 1980)


The writing process, as the figure above shows, is very complex and recursive, not linear as one would intuitively expect. The Planning process sets goals based on information drawn from the Task-environment and Long-Term Memory (LTM). Once these have been established, a writing plan is developed to achieve those goals. More specifically, the Generating sub-process retrieves information from LTM through an associative chain in which each item of information or concept retrieved functions as a cue to retrieve the next item, and so forth. The Organising sub-process selects the most relevant items of information retrieved and organizes them into a coherent writing plan. Finally, the Goal-setting sub-process sets rules (e.g. ‘keep it simple’) that will be applied in the Editing process. The second process, Translating, transforms the information retrieved from LTM into language. This is necessary since concepts are stored in LTM in the form of Propositions (‘concepts’/‘imagery’), not as language in the usual sense (i.e. as words).

To execute all these processes when writing an argumentative essay in one’s first language is already quite a demanding task; doing it in a foreign language is even more challenging. Our fragile and limited Working Memory has to juggle demands from all these processes, which often occur simultaneously. For example, whenever you set goals for the content of the next paragraph in the essay, you may evaluate them, decide you are not happy with them and so re-plan. This means that Working Memory is loaded with lots of information. The resulting cognitive load becomes even ‘heavier’ when writing in a foreign language, when Working Memory must cope not only with idea generation, goal setting, organization and monitoring, but also with ‘translating’ propositions into L2 words and arranging them into grammatically correct sentences.

The translation process being particularly difficult to self-monitor for novice essay writers of intermediate proficiency, due to the demands posed by the higher meta-components, it is not surprising that these learners tend to rely massively on external resources and that the essay-writing process takes a lot of time (often with negative impact on motivation). Such reliance on external resources can often lead to plagiarism whether of the blatant or of the ‘smart’ sort (whereby the student picks bits from different sources and assembles them together intelligently, logically and cohesively). Although students are often blamed for such ‘unethical’ behavior, the truth of the matter is that they are often required to write essays when they are not developmentally ready for it both in terms of the higher order skills and of the lower ones.

Implications for L2-writing instruction 

The most obvious implication of the Hayes and Flower model for L2 essay writing instruction is that teachers should train their student writers to operate effectively at each level of the writing process, across all the different processes it involves. Hence, exactly as one would do in the football scenario outlined above, language teachers should teach in separate sessions the specific sets of skills students require to execute each writing sub-process. Thus, for instance, in one session or set of sessions they would stage activities aimed at practising idea generation and planning; in subsequent ones, they would work on evaluating the relevance of the ideas generated to the essay title; in others, they would focus on organization (coherence and cohesion); etc.

Parallel to this work on higher order cognitive skills, L2 writing instruction would also have to work on the ‘Translating’ process of essay production, thereby focusing on the language level: syntax/grammar and lexical development, as well as the functions and the ‘mechanics’ of written discourse. By functions of written discourse I mean acquiring the L2 discourse markers necessary to introduce and sequence information (tout d’abord, en plus, qui plus est, etc.), contrast ideas (e.g. en revanche), express a purpose (dans le but de, afin de, etc.), and so on. Practice in these skills can still be done in writing, but in less threatening and less cognitively demanding contexts. Just as in football training you would play shorter games, maybe using only one half of the pitch, so learner writers should be engaged in shorter and more controlled tasks in which they are required to focus on specific functions and grammar structures.

When working at both the ‘higher’ and ‘lower’ levels of essay composition, student writers should be provided with plenty of examples of good L2 writing, which they will be asked to analyze by focusing on the generation, evaluation and organization skills or on the linguistic features under study.

Through extensive discrete practice in each of the different sets of skills and discourse functions, the students will be able to execute each of the different processes involved in essay writing more effectively. This will lead, in turn, to greater processing efficiency and control over the overall essay writing process, with more ‘cognitive space’ available in the writer’s Working Memory to monitor the higher and lower levels of their output.

In conclusion, teachers, in my opinion, ought to rethink the way they teach L2 essay writing. Instruction should equip learner writers with process-specific skills which will enable them to execute the Planning, Goal-setting, Organizing, Self-monitoring and Translating processes effectively and efficiently (in terms of cognitive load). Hence the Schemes of Work should be adapted to include extensive instruction in each and every one of these specific skills. Students should be made aware of the Hayes and Flower model components as instruction focuses on each sub-process; this will enhance their task-related metacognition, i.e. their awareness of the skills each sub-process requires.

Ultimately, it is process, not product, that should determine our L2 writing pedagogical approach. Teaching L2 learners how to master each and every stage of the writing process effectively will forge not just A* students, but more competent, independent and adaptive life-long writers.

Enhancing L2-writing grammar accuracy through ‘narrow focus’

‘Narrow focus’, as I call it, is a technique I came up with whilst teaching a group of relatively weak and demotivated (mostly male) Year 10 IGCSE students of French as a foreign language. The first written piece they handed in to me being rife with grammar errors, and our examination board (CIE) awarding grades mainly based on structural accuracy, I was obviously quite worried and had to get ‘creative’.

As I have written in previous blogs, at this level of proficiency learner writers find it hard to juggle all the demands that the writing process places on their Working Memory. Errors mostly occur due to processing inefficiency, that is, the inability to monitor the accuracy of one’s output because attention is divided (e.g. between idea generation, organization and translation). As I wrote in my previous blog, ‘Mapping out the L2 writing process’, less proficient L2 learner writers, when suffering from cognitive overload, tend to focus on meaning and neglect function words (e.g. conjunctions, prepositions and articles) and any other words that are not semantically salient (e.g. copulas and auxiliaries). This was definitely the case with this group of students, who made frequent errors with verb endings, omission of auxiliaries (e.g. ‘je allé’, ‘il mangé’), omission of copulas (e.g. ‘il grand’), adjectival agreement, plural endings, prepositions, word order, etc. Too many mistakes for them to deal with simultaneously.

Normally, with more linguistically mature students, I would meet them in one-to-one conferences, talk them through their main errors and draft a personalised checklist containing six to eight mistakes to look out for when editing their next essays. But with this group of students it would have been asking too much. They had neither the maturity nor the motivation to cope with this approach.

I decided then, on setting the second written assignment of the year, to challenge them to get three, and only three, specific grammar structures right. I told them that I would not bother with the rest; that in my marking of the language level of the essay I would award points based solely on how accurately those three structures were deployed. Based on the nature of the essay, which was about an outing they had gone on in the past and the places they had visited, I asked them to focus on the perfect tense, the imperfect and adjectival agreement.
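For colleagues who like to keep their marking in a spreadsheet or script, here is a minimal Python sketch of how such a ‘narrow focus’ language mark could be tallied. It is only an illustration: the post does not say how the points were actually calculated, so the equal weighting of the three structures, the 10-point scale and all the names used (StructureTally, narrow_focus_score) are my assumptions.

```python
from dataclasses import dataclass


@dataclass
class StructureTally:
    """Teacher's count of correct/incorrect uses of one target structure."""
    name: str
    correct: int
    incorrect: int

    @property
    def accuracy(self) -> float:
        # Proportion of attempts that were correct (0.0 if never attempted).
        total = self.correct + self.incorrect
        return self.correct / total if total else 0.0


def narrow_focus_score(tallies, max_points=10):
    """Hypothetical scheme: mean accuracy over the attempted target
    structures, scaled to max_points. Errors elsewhere are ignored."""
    attempted = [t for t in tallies if t.correct + t.incorrect > 0]
    if not attempted:
        return 0.0
    mean_accuracy = sum(t.accuracy for t in attempted) / len(attempted)
    return round(mean_accuracy * max_points, 1)


# Example: one student's essay, marked only on the three target structures.
essay = [
    StructureTally("perfect tense", correct=8, incorrect=2),
    StructureTally("imperfect", correct=3, incorrect=1),
    StructureTally("adjectival agreement", correct=5, incorrect=5),
]
print(narrow_focus_score(essay))
```

Keeping a separate tally per structure also shows at a glance which of the three targets a student still struggles with, which may help when choosing the structures for the next cycle.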

In order to scaffold the process, the students were asked to code each instance of the three structures with different colours; they were also given a checklist – to be used in the editing phase – with reminders of the grammar rules governing those structures, which read something like this:

Perfect tense – you need two words: one is the verb AVOIR or ETRE and the other is the PAST PARTICIPLE of the verb; it indicates a completed action like ‘I fell’, ‘I left’, ‘the phone rang’.

Imperfect – description; continuous or repetitive action; telling the time; it indicates something ‘one used to do’.

Adjectival agreement – feminine ending if noun is feminine; add ‘-s’ for plural.

It should be pointed out that during the first cycle I focused the students on the above three structures over a period of four weeks, recycling them later on in the year when I felt necessary. Each subsequent ‘narrow focus’ cycle lasted about three to four weeks.

The rationale behind this approach was to move those three structures from their subsidiary awareness – where they had been until then – into their focal awareness. Interestingly, as my narrow focus experiment went on, not only did the level of accuracy in the three target structures addressed in each cycle increase, but the accuracy of the other structures deployed in their essays gradually improved substantially, too.

In the interviews I carried out with them, the students reported that the process triggered greater focus on formal accuracy in general and that being evaluated only on the content and on the execution of those three structures generated less anxiety. Half of the students reported that, although in each narrow-focus cycle they did mainly focus on the three new target structures, the structures they had focused on in the previous cycle were always somehow ‘at the back of their mind’, as one student said. In other words, for some of them the ‘narrow focus’ gradually became three new structures + three old ones, and maybe at a later stage three new structures + six old ones, etc.

In the actual exam paper, they all did brilliantly considering their starting point, and although linguistic maturation may have played a major role in it, I am confident that the ‘narrow focus’ approach played a significant part, too. I invite any colleagues to try this strategy out, if not with a whole class, with less confident pre-intermediate to intermediate students who may need to be focused on accuracy and may be lacking self-efficacy as L2 writers. In my view, this is a less threatening and more effective approach than other traditional remedial methods.