


[to be published in Theory and Practice in Functional-Cognitive Space, edited by María de los Ángeles Gómez González, Francisco José Ruiz de Mendoza Ibáñez and Francisco Gonzálvez-García (John Benjamins, 2014+)]

Cognitive functionalism in language education

Richard Hudson

University College London, United Kingdom

Abstract

Functional pressures on language are always cognitive, and cognitive pressures are always functional, so cognitivism and functionalism combine to explain the structure of lexicogrammar - the continuum of lexicon and grammar - and also the statistics of language usage. As an example, the paper shows how Word Grammar explains the difficulty of centre-embedding in terms of dependency syntax combined with a general cognitive principle of binding, and also the benefits of non-canonical word orders (such as extraposition) in the lexicogrammar. These reordering options are part of the formal academic language that children learn through education, and education should be guided by linguistic research. This is a research area that calls for far more effort and collaboration with other disciplines.

Keywords

Word Grammar, word order, education, syntax, children

1. Cognitive functionalism

The terms cognitive and functional are often combined, as in ‘functional-cognitive space’ (Gonzálvez-García and Butler 2006), ‘usage-based functionalist-cognitive models’ (Butler 2006) or ‘cognitive-functional linguistics’ (espoused by a number of university departments). This is a healthy development, but it is important to remember that each term names a distinct set of assumptions. In linguistics, cognitivism applies the insights of cognitive science, including cognitive psychology, to the study of language, on the assumption that language is subject to the same constraints and principles as other areas of cognition. Functionalism, on the other hand, seeks functional explanations for language in terms of general assumptions such as the principle of contrast (minimize ambiguity). Cognitivism need not seek functional explanations, and functionalism need not seek cognitive underpinnings. Nevertheless, it makes perfect sense to combine them because (as I shall argue below) functional pressures on language are always cognitive pressures, and the effects of cognition on language are always functional. This dual perspective is one of the attractions for me of Chris Butler’s work, along with his unflagging determination to listen, learn and understand his colleagues.

Functional pressures must always be cognitive for three reasons: it is only through cognition that they apply to language, it is only because language is an example of cognition that they apply at all, and they cover the full range of cognitive processes as applied to language. To show the significance of these three claims, imagine a functional analysis which is completely divorced from cognition, such as a branch of the mathematical theory of communication. This would analyse the elements of any communication, such as a message, a medium, a sender, a receiver and a code, and the properties that any code would have to have in order to allow efficient communication. There would be nothing in the analysis about the code’s users, its history or its social significance. The only questions would involve efficient communication: how to measure it, and how to design a code so as to maximize it.

In contrast, as soon as we bring cognition into the discussion the questions multiply. How easy is the code to learn? How does it change diachronically? What is its social significance as an important badge of group membership? How does it balance the needs of the speaker (e.g. for brevity) against those of the hearer (e.g. for explicitness)? Butler puts the complexities well in the following passage (Butler 2006:1):

“If we are to study language as communication, then we will need to take into account the properties both of human communicators and of the situations in which linguistic communication occurs. Indeed, a further important claim of functionalism is that language systems are not self-contained with respect to such factors, and therefore autonomous from them, but rather are shaped by them and so cannot be properly explained except by reference to them. Linguists who make this claim ... undoubtedly form the largest and most influential group of functional theorists. The main language-external motivating factors are of two kinds: the biological endowment of human beings, including cognition and the functioning of language processing mechanisms, and the sociocultural contexts in which communication is deeply embedded. We might also expect that a functionalist approach would pay serious attention to the interaction between these factors and the ways in which languages change over time, although in practice this varies considerably from one model to another.

The question of motivation for linguistic systems is, of course, not a simple one. Much of the formalist criticism of functionalist positions has assumed a rather naïve view of functional motivation, in which some linguistic phenomenon is explicable in terms of a single factor. Functionalists, however, have never seen things this way, but rather accept that there may be competing motivations, pulling in different directions and often leading to compromise solutions.”

This complex and sophisticated view of the pressures that shape languages has been expressed recently as ‘stable engineering solutions satisfying multiple design constraints, reflecting both cultural-historical factors and the constraints of human cognition.’ (Evans and Levinson 2009:1). For Evans and Levinson, the most significant property of language is its enormous diversity, which they hope to explain in relation to the multiple (and competing) design constraints. My only disagreement – a minor quibble about terminology – concerns their contrast between ‘cultural-historical’ and ‘the constraints of human cognition’: cultural-historical facts are themselves ultimately facts about human cognition. If the English word for ‘cat’ is CAT, this is only true because English speakers know it, act upon it and transmit it to the next generation. This is a very different kind of cognitive fact from the fact that working memory is limited, but cognitive it is nevertheless. I should therefore like to reword the quotation: ‘stable engineering solutions satisfying multiple cognitive design constraints, reflecting both variable cultural-historical knowledge and the permanent and universal constraints of human cognition.’ Similarly, Butler’s ‘sociocultural contexts’ are only relevant to the extent that they are part of speakers’ cognition.

If it is true that functional pressures are always cognitive, it is equally true that cognitive pressures are always functional, in the sense that they push language towards a better solution for one of the many competing design constraints. This claim is hard to test in the absence of a closed list of design constraints, so we might treat it as a premise to guide us in the search for design constraints: whenever we find a fact about language which seems to relate to cognition, we must find a design constraint to mediate between language and cognition. To take an elementary example, why does English rank the speaker above the addressee in the pronoun system, so that the presence of the speaker in a group forces the choice of we regardless of who else is in it? Even more interestingly, why do so many other languages do the same? True, some languages distinguish inclusive and exclusive pronouns for ‘we’, but (so far as I know) no language has a word for ‘you’ which may or may not include the speaker. Presumably the explanation lies in cognition, but it must include a design constraint such as the paramount importance of talking about oneself – a sad comment on human nature, perhaps, but apparently true.
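
To make the ranking concrete, the following minimal sketch (in Python; the names are invented and no claim is made about any particular grammar formalism) simply encodes the hierarchy just described: the speaker outranks the addressee, who outranks everyone else.

    def plural_pronoun(group, speaker, addressee):
        """Choose an English plural pronoun for a group of referents.
        The speaker's presence forces 'we' regardless of who else is in
        the group; otherwise the addressee forces 'you'; otherwise 'they'."""
        if speaker in group:
            return "we"
        if addressee in group:
            return "you"
        return "they"

    print(plural_pronoun({"me", "you", "Ann"}, "me", "you"))  # 'we': speaker outranks addressee
    print(plural_pronoun({"you", "Ann"}, "me", "you"))        # 'you'
    print(plural_pronoun({"Ann", "Bob"}, "me", "you"))        # 'they'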

If language is subject to functional pressures, what effects do these pressures have? If their effects are always cognitive, as I am suggesting, they must affect our minds first and foremost, and it is only via our minds that they affect our behaviour; so if I choose the word we rather than you to refer to a group including my addressee as well as myself, this is because my mind contains a ‘lexicogrammar’ which assigns each of these words a meaning which dictates this choice. (The term lexicogrammar, borrowed from Systemic Functional Grammar, is a very useful name for the continuum of lexicon and grammar which cognitive linguists have more recently rediscovered – Butler and Taverniers 2008.) The pressure shapes the lexicogrammar, which in turn affects our behaviour. But is it only via the lexicogrammar that functional pressures can affect our behaviour? The answer depends on how we define ‘lexicogrammar’, but there are some functional pressures whose effects clearly fall outside any familiar definition.

For example, if you and I are talking, we are more likely to understand each other if only one of us is talking at a time, for the simple reason that listening and talking compete for the same mental resources of attention. As with any pressure, this comes with a cost – a competing pressure that has to be balanced against it. If you are talking, and I have something to say, not only do I have to wait, but I also may have to take my place in a queue along with others who also have something to say. Consequently different communities develop different behavioural norms, ranging from complete anarchy to the rigid rules of committee meetings; and these norms affect our speaking behaviour in a striking way (Hudson 1996:133). But they cannot be part of the language system if this simply controls the ways in which words are combined, pronounced and interpreted. On the other hand, the rules for speaking or staying silent are equally clearly related to the language system, because they govern its use – when to use language and when not.

Some functional pressures clearly do affect the content of the language system, and others clearly don’t. But in between these two extremes, we find ‘weak’ pressures, where some kind of language behaviour is not actually dictated by the system, but is nevertheless typical throughout the community. An example that comes to mind is the use of directional expressions in English. If my wife is downstairs and asks me to join her, I believe I would say I’ll come down in a minute rather than simply I’ll come in a minute, even though the down is completely optional, and, in the situation concerned, completely uninformative. And I believe the same is true of any English speaker describing almost any movement or position which could be related to the deictic ‘here’. So in all the following examples, the bracketed expression is grammatically optional and situationally predictable, but nevertheless expected:

1) I went (over) to Ben’s place the other day.

2) It’s (up) in the spare bedroom.

3) I’m driving (down) to Cardiff tomorrow.

I have no research evidence to support this claim, but my hunch is that the bracketed words are much more likely to be uttered than omitted. What is supported by research is the idea that our learning of language is ‘usage-based’ (Barlow and Kemmer 2000, Bybee 2010, Hudson 2007b, Tomasello 2003), which means that we maintain a mental record of the statistical patterns in other people’s behaviour; so a statistical tendency in the speech we hear may become part of our own behaviour (with the obvious feedback effects on other speakers).

But why should English speakers show this particular pattern? It might be just an arbitrary pattern which we reinforce in each other, like the pronunciation patterns which are so well documented in quantitative sociolinguistics (Hudson 1996: chapter 5). But it is much more likely that we have created our own local ‘functional pressure’ to specify deictic locations and directions, regardless of the hearer’s needs. If so, this would be an example of a functional pressure being created by collective linguistic behaviour, and then being learned and applied by every novice speaker. It would be reflected in the lexicogrammar by the particles which are tailor-made for this precise purpose, but their use is not governed by categorial rules. How, then, do we decide whether or not to use them?

This question is very similar to the one that arises in quantitative dialectology. For example, given that we all have a choice between a velar and an alveolar nasal in the suffix -ing (as in walking or walkin’), how do we choose between them? Labov and his colleagues and followers have shown very clearly that each speaker’s choices reflect rather precisely the choice-patterns of the speakers who have served as their models, but there is no agreed cognitive model for the mechanism of choosing. What I have suggested elsewhere is that a model should take the form of a cognitive network with dynamic activation levels which trigger choices (Hudson 2007a). Once such a model is in place, it could be extended to non-categorial functional pressures such as the one discussed above. This is a major research challenge because it isn’t at all obvious how to build the network needed, but the project would certainly reveal a lot about the cognitive architecture behind human language.
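
As a purely illustrative sketch of what such a mechanism might look like (this is not Word Grammar’s actual network formalism, and all names and figures are invented for the example), the choice between two variants can be modelled as nodes whose activation is raised by observed usage and which are then sampled in proportion to that activation:

    import random

    class VariantChoice:
        """Toy model of a variable such as (ing): 'walking' vs 'walkin''.
        Observed tokens raise a variant's activation; production samples
        a variant in proportion to its current activation."""

        def __init__(self, variants):
            self.activation = {v: 1.0 for v in variants}  # small uniform starting level

        def observe(self, variant, weight=1.0):
            """Record a token of someone else's usage (usage-based learning)."""
            self.activation[variant] += weight

        def produce(self):
            """Sample a variant with probability proportional to activation."""
            variants = list(self.activation)
            weights = [self.activation[v] for v in variants]
            return random.choices(variants, weights=weights, k=1)[0]

    ing = VariantChoice(["walking", "walkin'"])
    for _ in range(80):
        ing.observe("walking")      # hearing mostly velar tokens...
    for _ in range(20):
        ing.observe("walkin'")      # ...and some alveolar ones
    print(ing.produce())            # choices now roughly mirror the 80/20 input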

The general challenge that linguistic theory faces is to relate functions to structures: how to build a model of language structure which takes account of functional pressures. The current proliferation of theories, including theories whose names contain the word functional, testifies to the difficulty of this project. One basic question is whether the functions might be so closely integrated into the system that they become part of it. Some theories do merge functions and structures in this way, but in my opinion it is a mistake; I shall consider two very different theories: Optimality Theory and Systemic Functional Grammar.

Optimality Theory is the extreme case because each functional pressure is represented directly as either a faithfulness constraint or a markedness constraint within the system (Newmeyer 2010); for instance, the process that inserts an epenthetic vowel in horses is triggered by the difficulty of pronouncing two adjacent sibilants. The trouble with building pressures into the system in this way is that it turns the pressures into concepts, so they only apply to the extent that speakers have the relevant concepts; but the fact is that adjacent sibilants (for instance) are hard to pronounce whether or not we ‘know’ this conceptually.

Systemic Functional Grammar keeps the functional pressures outside the system, but analyses the structure so that it reflects the functions closely. Both the paradigmatic system-networks and the syntagmatic structures of syntax are organised into a small number of ‘metafunctions’ – ideational, interpersonal and textual – each of which is responsible for a different set of functional pressures. This means that a clause has three different syntactic structures: an ideational structure for the basic referential meaning, an interpersonal structure showing how the speaker and addressee relate to this meaning, and a textual structure showing how it relates to what has been said already (Butler 1985, Halliday 1994). My objection in this case is that the analysis misrepresents the relation between functions and structures by concealing the tensions and conflicts. In my opinion, it would be much nearer to the truth to say that we try to use a single structure to perform a number of very different jobs at the same time, so there is no sense in which a single clause can dedicate one entire structure to each job. For example, the clause Does she love me? uses she love me to describe a situation (ideational), uses me and does she to relate it to the speaker and the hearer (interpersonal), and uses she to relate it to the previous discourse (textual); but these words are all closely integrated in a single structure, where the redundant does is the price we pay for this particular ‘engineering solution’ to the problem of satisfying these conflicting pressures.

But even if some attempts to relate structures to functions have been unsuccessful, we can all celebrate the twentieth century’s strong movement towards functionalism. Whatever we may think of specific theories, they are all trying to go beyond the mere analysis and description of language structures by looking for explanations. More recently, we have a separate movement towards cognitive analyses of language structures which explain how these structures relate to the rest of cognition. If we can marry the two strands, functional and conceptual, into a single cognitive-functional linguistics, then we have some hope of really understanding how language works.

2. Syntactic structure: Word order and dependency geometry

One area of language structure which has generated some particularly promising functional explanations is word order. Why are some basic orders so much more common than others? And why do languages provide so many alternative orders? Cognitive explanations have always been prominent in the sense that terms such as ‘given’ and ‘new’ have been used to capture some kind of mental reality, but it is only recently that these analyses have been able to build on work in cognitive science. One especially promising link relates word order to limitations on working memory; perhaps the best-known exponent of this link is Hawkins, who argues that basic word orders evolve so as to minimize demands on working memory (Hawkins 1994, Hawkins 1999, Hawkins 2001). I find his evidence and arguments compelling, and agree with his general conclusions.

However, any discussion of the effects of functional pressures on syntactic structure presupposes some general theory of syntactic structure, and I believe Hawkins’s case would be even stronger under a different set of assumptions. For him, syntactic structure is phrase structure, so words are related to each other only via shared ‘mother’ nodes; so even if two words are adjacent, there is no direct syntactic relation between them. This analysis is not a helpful basis for explaining why syntax favours adjacency; nor is it promising as a basis for a cognitive theory of syntax because it raises the obvious question: why can’t we link words directly to one another, using the same mental apparatus that we use in relating events or objects in other areas of life? For example, if we can represent the members of our family as individuals with direct relations to other individuals, why can’t we do the same with the words in a sentence?

A much better basis for syntax, in my opinion, is dependency structure, in which the relations between individual words are paramount. Like phrase structure, dependency structure has many different interpretations in different theoretical packages, so I shall select the package that I prefer, which (unsurprisingly) is the one I created: Word Grammar (Hudson 2007b, Hudson 2010, Butler 2013). Figure 1 shows the syntactic structure for a very simple example, Cows eat grass, in a typical, but simplified, phrase-structure representation compared with a Word-Grammar dependency structure. In these diagrams we are only concerned with the basic geometry, so labels are unnecessary; but in a complete analysis the labels (or more accurately, the classification that they imply) are essential. The main point for present purposes is that the phrase-structure analysis puts two links between eat and grass and three between eat and cows, whereas the dependency analysis has a single link in both cases.

[Figure 1: Phrase structure and dependency structure]

Seen from the perspective of cognitive science, these two structures yield very different predictions. Most pertinently, the first structure predicts that the order of verb and object is irrelevant to processing difficulty, because the geometry would be exactly the same for Cows grass eat as for Cows eat grass. In contrast, the second structure predicts the opposite, as can be seen in Figure 2. In the dependency analysis, Cows grass eat ought to be harder to process because working memory has to hold the cows – eat dependency for longer than in the canonical order Cows eat grass.

[Figure 2: A contrast between phrase and dependency structures]
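
The prediction can be made concrete with a small sketch (my own simplification, using 0-based word positions and the dependency analysis in which both nouns depend on the verb): we simply count how many words each dependency has to span in the two orders.

    def spans(words, heads):
        """For each dependent, the number of words separating it from its head.
        `heads` maps a dependent's position to its head's position (0-based)."""
        return {words[dep]: abs(head - dep) - 1 for dep, head in heads.items()}

    svo = ["Cows", "eat", "grass"]
    osv = ["Cows", "grass", "eat"]

    print(spans(svo, {0: 1, 2: 1}))  # {'Cows': 0, 'grass': 0}: both links adjacent
    print(spans(osv, {0: 2, 1: 2}))  # {'Cows': 1, 'grass': 0}: cows-eat held across 'grass'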

In such simple examples, and for adult speakers, the differences are trivial; but child-language research has shown that such differences do matter for novices with small working memories. For example, small children use adjective-noun combinations more frequently before the verb (e.g. big book fall) than after it (e.g. see big book). This is easy to explain in terms of functional pressures from working memory, because the separating word adds its mental demands to the existing dependency, so the processing demands of Cows grass eat are less evenly distributed than those of Cows eat grass. In contrast, determiner-noun combinations show the reverse pattern, being more common after the verb (e.g. see that book) than before it (e.g. that book fall) (Ninio 1994). Once again, this pattern is easy to explain if nouns depend on determiners (as they do in Word Grammar - Hudson 1990:268-76). Figure 3 summarises the patterns, showing how frequency follows predicted difficulty due to dependency patterns.

[Figure 3: Dependency density in child language]

One of the attractions of dependency analysis is the possibility of measuring the relative processing difficulty of different structures. Various measures are available. One, which I have called ‘dependency distance’, is a simple count of the number of words that separate a word from the word on which it depends (Hudson 2007b:124-9); it is very similar to the distance metric developed by Ted Gibson, whose experimental work clearly confirms the importance of dependency distance in adult language (Gibson 2002). However, I now believe that a more appropriate measure for some patterns would be ‘dependency density’: the number of dependencies being held in memory at any given moment. This measure is most easily illustrated with so-called ‘centre-embedded’ structures.
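
Before turning to those structures, here is a minimal sketch of how the two measures could be computed (my own simplification, again using 0-based positions and a map from each dependent to its head), applied to the simple word-order contrast above:

    def dependency_distance(heads):
        """Dependency distance: total words separating dependents from their heads."""
        return sum(abs(h - d) - 1 for d, h in heads.items())

    def dependency_density(n_words, heads):
        """Open dependencies after each word: a dependency counts as open
        once its earlier end has been read but its later end has not."""
        return [sum(1 for d, h in heads.items() if min(d, h) <= i < max(d, h))
                for i in range(n_words)]

    # 'Cows eat grass' vs 'Cows grass eat', both nouns depending on the verb.
    print(dependency_distance({0: 1, 2: 1}), dependency_density(3, {0: 1, 2: 1}))  # 0 [1, 1, 0]
    print(dependency_distance({0: 2, 1: 2}), dependency_density(3, {0: 2, 1: 2}))  # 1 [1, 2, 0]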

Centre-embedded sentences are so hard to process that ordinary adult experimental subjects simply give up on syntax. Consider, for example, sentence (4).

4) The patient who the nurse who the clinic had hired met Jack.

When presented with a list of sentences to be judged as either grammatical or ungrammatical, many people accept this sentence (and others like it), although it actually doesn’t make sense either syntactically or semantically. The problem can be seen in the simplified Word-Grammar analysis in Figure 4, which shows how the first who introduces a relative clause which should have the nurse as its subject but whose verb simply isn’t there.

[Figure 4: An incomplete centre-embedded sentence]

The main point of this example is to show how easily our working memory can run out of resources, and how well this can be predicted by the extreme dependency density at the point indicated by the dotted line. To see this, imagine yourself reading this sentence a word at a time, and in slow motion. Figure 5 shows the state of play in your mind just after reading the word clinic. (The dependency between the and clinic can be ignored because it is so easily and quickly completed.)

[Figure 5: Part of the way through an incomplete sentence]

At this point, you have to hold five incomplete dependencies in your mind, each looking for a word (labelled in the diagram) which you haven’t yet read:

• The top dependency is looking for verb a, a non-dependent finite verb, for the patient to depend on.

• The next dependency down was set up by the first who, and is looking for a dependent finite verb b.

• Word c is needed for the nurse to depend on.

• Similarly, words d and e are needed by the second who and the clinic.

The problem for you, as the reader, is that you’re looking for five finite verbs, each of which is represented in your working memory simply as ‘some finite verb’. Why this should be a problem isn’t completely clear, but the following explanation strikes me as plausible.

One of the most important activities in your mental life is to recognise that two concepts which are represented separately are in fact the same – that the person on the phone is your friend, or that the next street on the right is the one you’re looking for. To achieve this, you merge concepts (by binding them to one another) when they are the same, so whenever you have two concepts with similar specifications (e.g. ‘finite verb’) and similar activity levels at the same time, you tend to merge them unless you have reasons for keeping them separate (Hudson 2010:91-102) – that is, merging is the default, which can only be prevented by extra mental effort. In the case of this sentence, you can safely merge verbs b and c as bc, because it’s almost certain that the nurse will turn out to be the subject of the verb expected by who; and similarly for d and e. The trouble is that by this time your working memory is having to hold a lot of information (five remembered words plus five dependencies plus from three to five anticipated words) and hasn’t got the resources needed to keep these very similar nodes separate. It therefore simply merges bc with de into a single dependent finite verb bcde; and once had hired appears, the finite verb had is accepted as the merged bcde, even though this means that it has to double as the expected complement of two who’s at the same time.
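
As a toy illustration of this default merging (not Word Grammar’s actual mechanism; the capacity figure and the data structures are invented for the example), anticipated nodes with the same specification can be collapsed whenever the total load exceeds what working memory can hold:

    CAPACITY = 7  # assumed ceiling on items held simultaneously

    def maybe_merge(remembered_words, anticipated):
        """Collapse anticipated nodes that share a specification whenever the
        total number of items exceeds the assumed capacity."""
        while len(remembered_words) + len(anticipated) > CAPACITY:
            specs = [node["spec"] for node in anticipated]
            dup = next((s for s in specs if specs.count(s) > 1), None)
            if dup is None:
                break  # nothing similar enough left to merge
            same = [n for n in anticipated if n["spec"] == dup]
            merged = {"spec": dup, "ids": sum((n["ids"] for n in same), [])}
            anticipated = [n for n in anticipated if n["spec"] != dup] + [merged]
        return anticipated

    words_read = "The patient who the nurse who the clinic".split()
    anticipated = [{"spec": "non-dependent finite verb", "ids": ["a"]},
                   {"spec": "dependent finite verb", "ids": ["b"]},
                   {"spec": "dependent finite verb", "ids": ["c"]},
                   {"spec": "dependent finite verb", "ids": ["d"]},
                   {"spec": "dependent finite verb", "ids": ["e"]}]

    print(maybe_merge(words_read, anticipated))
    # The b/c/d/e nodes collapse into one, so a single later verb ('had hired')
    # is accepted as satisfying them all.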

Of course, the sentence fragment in Figure 5 could have been completed grammatically, as in Figure 6. This is grammatical because each of the two who’s has a separate verb to head its relative clause; but it is virtually impossible to process.

[Figure 6: A complete centre-embedded sentence]

This example explains the difficulty of the famous ‘centre-embedding’ or ‘self-embedding’ pattern, but of course such sentences are vanishingly rare in actual performance because we all know how hard they are to process, whether as speaker or as hearer. Fortunately, English offers an alternative way to express the same ideas. If we want to attach a relative clause to a noun, we have the option of ‘extraposing’ it by pretending that it actually depends on the next verb up. For example, in (5), the relative clause that I bought last week is attached directly to goldfish, but in (6), which means the same, it is extraposed so that it takes its position as a dependent of died.

5) The goldfish that I bought last week has died.

6) The goldfish has died that I bought last week.

This extraposition can be thought of as a mental operation that converts the basic default structure (5) into one that is easier to process; but it is different from a classic Chomskyan transformation because it can short-circuit the planning process so that it applies while the planned words are still only partly specified. The point is that if we see a complicated structure developing in our minds, we have ways to avoid it such as the use of extraposition – an ‘engineering solution’ to the problem of syntactic complexity.

As can be seen from the partial structure in Figure 7, the extraposed version delays the dependency between goldfish and that so that it doesn’t have to be processed at the same time as the subject link from has to the goldfish.

[Figure 7: A simple example of extraposition]

Applying extraposition to the unprocessable (7) produces (8), which is easy to understand.

7) The patient who the nurse who the clinic hired treated died.

8) The patient died who the nurse treated who the clinic hired.

Moreover, extraposition reveals the ungrammaticality of the first version of the sentence, (4), repeated below as (9): once the relative clause is extraposed, as in (10), the incompleteness is obvious.

9) *The patient who the nurse who the clinic had hired met Jack.

10) *The patient met Jack who the nurse who the clinic had hired.

Perhaps the most interesting characteristic of extraposed sentences is that, although they are much easier to process than their unextraposed equivalents, they are structurally more complex because of the extra dependency between the extraposed word and the higher verb (in Figure 7, the dependency between has and that). This extra dependency coexists with all the dependencies found in the unextraposed sentence (which, for simplicity, I omitted from Figure 7). The general conclusion is that ease of processing is not a general matter of ‘complexity’, but of the distribution of processing load: evenly distributed processing load is easy, but a high concentration of load in one area is much harder.
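
This point can be checked with a rough sketch (my own, much simplified, dependency analyses of (7) and (8); for simplicity the extraposed version omits the coexisting noun-to-relative links mentioned above): the per-word count of open dependencies peaks at five in the centre-embedded version, but never rises above two in the extraposed one.

    def open_profile(words, deps):
        """Open dependencies after each word: a dependency (a, b) is open
        when one end has been read and the other has not (0-based positions)."""
        return [sum(1 for a, b in deps if min(a, b) <= i < max(a, b))
                for i in range(len(words))]

    embedded = "The patient who the nurse who the clinic hired treated died".split()
    embedded_deps = [(0, 1), (1, 10), (1, 2), (2, 9), (3, 4), (4, 9),
                     (4, 5), (5, 8), (6, 7), (7, 8)]

    extraposed = "The patient died who the nurse treated who the clinic hired".split()
    extraposed_deps = [(0, 1), (1, 2), (2, 3), (3, 6), (4, 5), (5, 6),
                       (6, 7), (7, 10), (8, 9), (9, 10)]

    print(open_profile(embedded, embedded_deps))      # [1, 2, 2, 3, 4, 4, 5, 5, 3, 1, 0]
    print(open_profile(extraposed, extraposed_deps))  # [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 0]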

Extraposition of a relative clause is not the only way to redistribute processing load. In fact, English has a rich supply of grammatical solutions, each geared to a different set of problematic sentences. The list below hints at the main ways in which we can tweak a sentence’s syntax to suit our particular communicative purposes and processing needs. The little formulae are meant as a guide to the particular examples rather than as a correct generalisation of the process concerned.

11) It-extraposition:

From: That you were able to help her so easily | is good. [1 | 2]

To: It | is good | that you were able to help her so easily. [it | 2 | 1]

12) Heavy-NP shift:

From: Put | all the food that we’re going to need for the party and that we can’t freeze | on this shelf. [1 | 2 | 3]

To: Put | on this shelf | all the food that we’re going to need for the party and that we can’t freeze. [1 | 3 | 2]

13) Dative shift:

From: Let’s give | something to remind her of all the good times she had with us | to Mary. [1 | 2 | to 3]

To: Let’s give | Mary | something to remind her of all the good times she had with us. [1 | 3 | 2]

14) Subject delay:

From: A wonderful old oak tree with a tree-house in its branches | stands | in the corner. [1 | 2 | 3]

To: In the corner | stands | a wonderful old oak tree with a tree-house in its branches. [3 | 2 | 1]

15) There-insertion:

From: A dog | is | in the garden. [1 | 2 | 3]

To: There | is | a dog | in the garden. [there | 2 | 1 | 3]

16) Front-shifting:

From: I bumped into someone I met at a party given by our neighbours | last night. [1 | 2]

To: Last night | I bumped into someone I met at a party given by our neighbours. [2 | 1]

17) It-clefting:

From: I bumped into someone I met at a party given by our neighbours | last night. [1 | 2]

To: It was last night that I bumped into someone I met at a party given by our neighbours. [it was 2 | that 1]

18) Wh-clefting:

From: I bumped into someone I met at a party given by our neighbours | last night. [1 | 2]

To: Last night was when I bumped into someone I met at a party given by our neighbours. [2 | was when | 1]

19) Passivization:

From: All the books that I’ve read by him | have | impressed | me. [1 | 2 | 3 | 4]

To: I | have | been impressed | by all the books that I’ve read by him. [4 | 2 | been 3 | by 1]

Each of these patterns is firmly embedded in the grammar of English, with its own rules and its own effects; and in each case it makes good sense to see it as an ‘engineering solution’ to some kind of functional demand on the speaker or hearer – in other words, as an important tool that any mature user of English can apply effectively. Which brings us to the language education which is needed in order to turn us all into ‘mature users’.

3. Language education

Education is, by definition, an interference with a child’s ‘natural’ development, an attempt to direct that development in particular ways chosen by the adult world. For some people, the notion of ‘language education’ is a contradiction because language develops naturally under its own logic, so all it needs is raw data to trigger the built-in grammatical system and to provide a vocabulary. In this view, second-language teaching should just ‘expose’ children to comprehensible input (Krashen 1982), and much the same philosophy dominated first-language English teaching for some decades (Kolln and Hancock 2005). However, there is now a well-articulated and influential body of opinion which sees language development in a very different way, with education playing a major role (Hudson 2004).

The cognitive functionalism with which we started implies that each language evolves to support the tasks that its users have to perform, so we expect, and find, as much diversity among languages as among language users. This amounts to a rejection of the romantic notion of ‘natural language’, which is language ‘as nature intended’, unspoilt by human institutions such as schools (Chomsky 1987, Chomsky 2011, Olson and others 1991). There is very little ‘natural’ about the language that you and I know, and that allows me to write these words, and you to read them. You and I both spent years of our childhood not only learning the skills of reading and writing, but also ‘academic language’ – the language of school, of universities and of a great deal of adult life. This academic language has been shaped over the centuries by the need to talk about mathematics, geography and literature, and by the need to argue, hypothesise, reason and explain. It has also been standardized, but this is a relatively minor element in its history compared with the enormous developments triggered by complex communicative demands. The fact is that two generations of theoretical linguists have used modern English as an example of a ‘natural language’ without worrying about the many ways in which we interfere with our language, or even noticing them. If diversity and cultural adaptation are normal for languages, then a ‘natural language’ is simply one that is (more or less) adapted to its culture; and from this point of view, modern English, with all its richness and complexity, is just as natural as a very simple language such as Pirahã, which has evolved to fit a very simple culture (Everett 2008).

Let’s assume, therefore, that a complex society such as ours needs a complex language, and that complex language requires education so that children can move beyond childish and casual language development. What kinds of language do children need to be taught in school? Part of the answer is obvious and uncontroversial: they need to be taught ‘relevant’ language which they won’t learn outside school. What makes language experience relevant is, of course, a social and even a political decision, according to what ‘society’ deems necessary for adult functioning. Our society generally agrees that school leavers should be able to cope with more formal and academic styles, both in spoken and written modes, and though these notions are inherently vague, there is enough agreement for examination boards to design public tests of competence in these areas.

But what does this mean, in concrete terms, for first-language teaching? What does ‘formal academic language’ contain that children won’t learn anyway from ordinary linguistic interaction outside school? This is a research question for linguistics, but remarkably little research addresses it. There has been a great deal of work by psychologists on the global statistics of vocabulary growth; for example, one estimate (Bloom and Markson 1998) suggests that, starting at 30 months, children typically learn 3.6 new words per day in the pre-school years, rising to 6.6 words per day up to age 8 (when they typically become independent readers), and then to 12.1 words per day. This particular estimate stops at age 10, but another research report (Nagy and Herman 1987) estimates that the typical school-leaver (year 11) knows about 40,000 words, which implies a rate of about 3,000 words per year, or just under 10 per day – roughly the same figure as for primary children. However, very few linguists seem to have done research in this area (Hatch and Brown 1995).

Even more striking is the lack of research on grammatical development during the school years – what develops, and how it can be encouraged. The most important source of information on what develops is still the work on syntax done in the 1980s by Katharine Perera (Perera 1984, Perera 1990, Perera 1994), with a few rather minor recent additions such as my own (Hudson 2009). One of the conclusions that emerges very clearly from this research is that children’s grammatical repertoire – the range of constructions that they know with sufficient confidence to actually use in their own writing – is still growing right through the school years. For example, they are learning new conjunctions and prepositions (such as although, unless and in spite of), new ways of using non-finite verbs (such as after when, or on their own, as adverbial clauses), and new details that are on the borderline between grammar and vocabulary (such as the prepositions selected by particular words, e.g. tired of but bored with). As to how it can be encouraged, we now have solid research evidence that sensibly planned grammatical instruction can have a considerable effect on children’s writing and reading skills (Hancock 2009, Myhill and others 2010, Myhill 2011, Chipere 2003), so the way forward is clear: English teachers can help children to develop grammatically by judicious use of direct instruction.

I should like to finish by returning to the list of grammatical tools that English provides for communicating complex ideas. These tools illustrate the potential for direct instruction in grammar. The linguistic demands of adult life go well beyond mere details of style such as formal and informal vocabulary (as in contrasts such as TRY versus ATTEMPT). Such details will not help young people to deal with complex communication – reading other people’s attempts to put complex ideas into words, or writing their own attempts. The fact is that adult life often depends on this ability, and there are significant benefits not only for those who succeed, but also for those who are trying to communicate with them. Unlike Pirahã culture, ours is full of complex messages. Our language has adapted over the centuries to these functional demands, and now contains a large number of tools for effective communication, namely extraposition and all the other structures listed in (11) to (19). School leavers could, and arguably should, be consciously aware of these tools – that they exist, how they affect syntax, how they can help, and maybe even their technical names. Grammarians know and understand all the linguistic details, psychologists know how the tools help with processing, cognitive-functional theorists know how to integrate the linguistic details with functional demands and culture, and educationalists know how to teach such things. The only missing element is collaboration.

References

Barlow, Michael and Kemmer, Suzanne. 2000. Usage-Based Models of Language. Stanford: CSLI.

Bloom, Paul and Markson, Lori. 1998. “Capacities underlying word learning”. Trends in Cognitive Sciences 2: 67-73.

Butler, Christopher. 1985. Systemic Linguistics: Theory and Applications. London: Batsford.

Butler, Christopher. 2006. “Functionalist Theories of Language”, in Encyclopedia of Language & Linguistics, Keith Brown (ed.), 696-704. Oxford: Elsevier.

Butler, Christopher. 2013. “Word grammar”, in Theories and Methods in Linguistics (Wörterbücher der Sprach- und Kommunikationswissenschaft), Johannes Kabatek & Bernd Kortmann (eds). Berlin: Mouton de Gruyter.

Butler, Christopher and Taverniers, Miriam. 2008. “Layering in structural-functional grammars”. Linguistics 46: 689-956.

Bybee, Joan. 2010. Language, Usage and Cognition. Cambridge: Cambridge University Press.

Chipere, Ngoni. 2003. Understanding Complex Sentences: Native Speaker Variation in Syntactic Competence. London: Palgrave Macmillan.

Chomsky, Noam. 1987. “Chomsky on grammar teaching. Noam Chomsky interviewed by Lillian R. Putnam”. Reading Instruction Journal 1987.

Chomsky, Noam. 2011. “Language and Other Cognitive Systems. What Is Special About Language?”. Language Learning and Development 7: 263-278.

Evans, Nicholas and Levinson, Stephen. 2009. “The Myth of Language Universals: Language diversity and its importance for cognitive science”. Behavioral and Brain Sciences 32: 429-492.

Everett, Daniel. 2008. Don't Sleep, There Are Snakes: Life and Language in the Amazonian Jungle. New York: Pantheon.

Gibson, Edward. 2002. “The influence of referential processing on sentence complexity”. Cognition 85: 79-112.

Gonzálvez-García, Francisco and Butler, Christopher. 2006. “Mapping functional-cognitive space”. Annual Review of Cognitive Linguistics 4: 39-96.

Halliday, Michael. 1994. An Introduction to Functional Grammar (2nd Edition). London: Arnold

Hancock, Craig. 2009. “How linguistics can inform the teaching of writing”, in The Sage Handbook of Writing, Roger Beard, Debra Myhill, Jeni Riley, & Martin Nystrand (eds), 194-207. London: Sage.

Hatch, Evelyn M. and Brown, Cheryl. 1995. Vocabulary, Semantics, and Language Education. Cambridge: Cambridge University Press.

Hawkins, John. 1994. A Performance Theory of Order and Constituency. Cambridge: Cambridge University Press.

Hawkins, John. 1999. “Processing complexity and filler-gap dependencies across grammars”. Language 75: 244-285.

Hawkins, John. 2001. “Why are categories adjacent?”. Journal of Linguistics 37: 1-34.

Hudson, Richard. 1990. English Word Grammar. Oxford: Blackwell.

Hudson, Richard. 1996. Sociolinguistics (Second edition). Cambridge: Cambridge University Press.

Hudson, Richard. 2004. “Why education needs linguistics (and vice versa)”. Journal of Linguistics 40: 105-130.

Hudson, Richard. 2007a. “English dialect syntax in Word Grammar”. English Language and Linguistics 11: 383-405.

Hudson, Richard. 2007b. Language Networks: The New Word Grammar. Oxford: Oxford University Press.

Hudson, Richard. 2009. “Measuring maturity”, in SAGE Handbook of Writing Development, Roger Beard, Debra Myhill, Martin Nystrand, & Jeni Riley (eds), 349-362. London: Sage.

Hudson, Richard. 2010. An Introduction to Word Grammar. Cambridge: Cambridge University Press.

Kolln, Martha and Hancock, Craig. 2005. “The story of English grammar in United States schools”. English Teaching: Practice and Critique 4: 11-31.

Krashen, Stephen. 1982. Principles and Practice in Second Language Acquisition. Michigan: Pergamon.

Myhill, Debra. 2011. “Grammar for designers: how grammar supports the development of writing”, in Applied Linguistics and Primary School Teaching, Sue Ellis & Elspeth McCartney (eds), 81-92. Cambridge: Cambridge University Press.

Myhill, Debra, Lines, Helen, and Watson, Annabel. 2010. “Making meaning with grammar: A repertoire of possibilities”. METAphor 2: 1-10.

Nagy, William and Herman, Patricia. 1987. “Breadth and depth of vocabulary knowledge: Implications for acquisition and instruction”, in The nature of vocabulary acquisition, Margaret McKeown & Mary Curtis (eds.), 19-35. Hillsdale NJ: Lawrence Erlbaum.

Newmeyer, Frederick. 2010. “History and Philosophy of Linguistics: an interview with Frederick J. Newmeyer”. ReVel 8.

Ninio, Anat. 1994. “Predicting the order of acquisition of three-word constructions by the complexity of their dependency structure”. First Language 14: 119-152.

Olson, Gary, Faigley, Lester, and Chomsky, Noam. 1991. “Language, politics and composition: a conversation with Noam Chomsky”. Journal of Advanced Composition 11: 1-35.

Perera, Katharine. 1984. Children's Writing and Reading. Analysing Classroom Language. Oxford: B. Blackwell in association with A. Deutsch.

Perera, Katharine. 1990. “Grammatical differentiation between speech and writing in children aged 8 to 12.” in Knowledge About Language and the Curriculum, Ronald Carter (ed.), 216-233. London: Hodder and Stoughton.

Perera, Katharine. 1994. “Child Language Research: Building on the Past, Looking to the Future”. Journal of Child Language 21: 1-7.

Tomasello, Michael. 2003. Constructing a Language: A Usage-based Theory of Language Acquisition. Cambridge, MA: Harvard University Press.
