A Corpus Study of Strong and Powerful

Dominic Castello

Master of Arts in Applied Linguistics
Module 4 Assignment
July 2014

ELAL
College of Arts & Law
University of Birmingham
Edgbaston
Birmingham B15 2TT
United Kingdom

CL/14/01 Take a small number of words or phrases (between 2 and 5) and carry out a corpus-based study to show how they are used in similar or different ways. Choose words/phrases which are interesting in some way - e.g. your students often confuse them; they cause problems for translators working with a specific language; you yourself have difficulty deciding when to use one or the other. Examples of words/phrases which have been studied in the past include: between and through; immense, enormous and massive; reason to and reason for; on the other hand and on the contrary. Be sure to reflect critically on your methodology. (Note that you should not repeat these studies, which are mentioned as examples, but should choose different sets of words.)

TABLE OF CONTENTS

1.0 Introduction

2.0 Literature Review
2.1 An introduction to corpus linguistics
2.2 Types of corpora
2.3 Corpus studies and intuition
2.4 Capability and limitations of corpus data
2.5 Collocation and semantic preference

3.0 Analysis
3.1 Methodology
3.2 Definitions of strong and powerful
3.3 Token frequency
3.4 Collocation and patterns
3.5 Lexical patterning

4.0 Conclusion

5.0 References

6.0 Appendix

1.0 Introduction

This paper presents an exploratory corpus study of the terms strong and powerful, comparing their usage and distribution to identify patterns, similarities and differences in the ways they are used. If language is viewed first and foremost as a communication tool, near-synonyms - lexical items whose senses are identical in respect of 'central' semantic traits but not in 'peripheral' traits (Cruse 1986, cited in Chung 2011) - can present challenges to the language learner, who must know the appropriate contexts in which to use each one.

The decision to analyse these particular words came from personal experience working as a language instructor at secondary level. Presently, students in my school are taught English vocabulary mainly as a discrete activity where, in some cases, similar words are presented to them as being perfectly synonymous.

At certain stages of learning, the advantages of this approach are self-evident: it is seldom beneficial to present every sense of a word from the outset. However, as students become more confident with the vocabulary through writing and conversation, they naturally apply the words in a variety of contexts on the assumption that they are freely interchangeable.

Returning to my experience, as the extent of the lexical overlap (or indeed the lack thereof) between the words became more apparent to students, questions arose about how to decide which of them was appropriate in a given context. Which one would best describe a compelling argument, an economically successful or influential nation, or athletes such as bodybuilders? Are certain words or types of words more likely to appear with strong than with powerful, and vice versa?

The current study aims to use empirical evidence to investigate the patterns of language relating to these words. In doing so it is hoped that the similarities and differences between the terms and the ways in which they are used (that is to say, in context or as part of a semantic group) will be revealed.

2.0 Literature review

2.1 An introduction to corpus linguistics

Corpus linguistics is a methodology of linguistic analysis that views 'naturally-occurring' language as a credible source for the investigation and classification of linguistic structures (Nesselhauf 2011). According to Hanks (2012), corpus linguistics is primarily concerned with interpreting observed language in order to arrive at statements on patterns in word meaning or syntactic composition.

Within this field, a corpus is defined as 'a large collection of authentic texts that have been selected and organised following precise linguistic criteria' (Sinclair 1991, 1996; Leech 1991:8; Williams 2003, amongst others). Corpus data is systematic in that its structure and contents are governed by a number of sampling principles, such as the mode of discourse, subject and variety of language (Nesselhauf 2011), while its authenticity stems from the fact that a corpus typically brings together thousands if not millions of written and/or spoken texts sampled directly from 'maximally representative' examples of language in use (McEnery and Wilson 1996:87; Dobric 2009:360; Bowker and Pearson 2002:9).

Corpus-driven lexicographers arrive at statements about word meaning or syntactic structures by studying usage, and evaluating the constraints and preferences associated with each word 'for what they really are' (Hanks 2012).
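As a rough illustration of what studying usage can involve in practice, the sketch below counts the words that occur immediately beside a chosen node word. It is a minimal example under stated assumptions, not the procedure used in this study: the function name `collocates`, the one-word window and the sample sentences are all invented for illustration, and the sentences are not drawn from any corpus.

```python
import re
from collections import Counter


def collocates(text, node, window=1):
    """Count words occurring within `window` tokens either side of `node`."""
    # Crude tokenisation: lower-case the text and keep runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Invented sample sentences, for illustration only (not corpus data).
sample = (
    "She made a strong argument. He drinks strong coffee. "
    "It is a powerful engine. They built a powerful computer. "
    "A strong wind blew. The powerful nation signed the treaty."
)

for node in ("strong", "powerful"):
    print(node, collocates(sample, node).most_common(5))
```

The window here ignores sentence boundaries and the tokenisation is deliberately simple; dedicated corpus tools handle both more carefully and add statistical measures of association on top of raw counts.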

2.2 Types of corpora

The earliest electronic corpora were compiled in the 1960s, but it was the technological advances of the 1970s - most significantly the introduction of digital computers - that meant that for the first time there was sufficient power to collate electronic language databases (McEnery and Hardie 2013). These could be used as a resource with which to better understand the characteristics of an unprecedented number of source texts. In the 1990s, further developments in corpus design allowed for the capture of discourse in much greater volume. The resulting corpora were considered a rich and varied enough
