Narrative Style and the Frequencies of Very Common Words: A Corpus-Based Approach to Dickens's First Person and Third Person Narratives*

Tomoji Tabata

Abstract

The present article is devoted to statistical analysis of the language of Dickens's novels. The particular problem is to examine structural and stylistic features of the first person and third person narratives. In the following analysis, I apply Principal Component Analysis (PCA) to the examination of the frequency-patterns of very common word-types of the text-samples. What emerges from this approach is a remarkable contrast between the two narrative modes. The differentiae between Dickens's first person and third person narratives suggest a broad opposition between a more oral, subjective, verbal style and a more literate, descriptive, nominal style.

0. Introduction

The choice of a particular point of view is arguably one of the most crucial decisions in beginning a fictional discourse. Decisions may include whether to employ a first person narrator or a third person narrator, whether to narrate in the present tense or the past tense, and so forth. While a first person narrative and a third person narrative differ obviously from each other in terms of the presence, or the total absence, of the first person pronouns, I, me, and my, the two narrative modes are likely to differ in less obvious ways. They share, no doubt, many characteristics of the language of narrative distinct from other genres of English prose.

This study examines, from a quantitative viewpoint, linguistic and stylistic attributes of the two narrative modes to demonstrate how Dickens differentiates one mode of narrative from another. The approach I adopt in this study has three characteristics. First of all, it is corpus-based: the word frequency profiles that will appear a little later and other frequency counts are derived from a computerised text-corpus. Second, it focuses on very common word-types, most of which are function words, rather than rare words or so-called "key-words" that are usually focused on in studies of literary texts. Third, it is based on multivariate statistics to illustrate relationships among very common word-types, relationships among text-samples, and relationships between the very common word-types and the text-samples. This method has produced justifiable results in studies of disputed authorship (Burrows: 1989, Craig: 1992); literary idiolects (Burrows: 1987a, Tabata: 1991); stylistic changes that occur over an author's career (Tabata 1993 & 1994 forthcoming); and in other areas as well.

English Corpus Studies, No. 2, 1995, pp. 91-109.
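As a rough illustration of the multivariate step, the sketch below extracts a first principal component from a small table of word frequencies. It is a minimal sketch under stated assumptions, not the procedure used in the study: the five word-types are picked only for illustration, the input is one row per narrative (figures copied from Table 2) rather than the 37 text-samples analysed here, and the component is obtained by power iteration on a correlation matrix. The names FREQ, standardise and first_component are hypothetical.

```python
from math import sqrt

# Text-percentage frequencies copied from Table 2, one value per narrative,
# in the column order of that table. The choice of words is illustrative only.
NARRATIVES = ["SB", "PP", "OT", "NN", "David", "Esther",
              "BH", "TTC", "Pip", "OMF", "ED"]
WORDS = ["his", "she", "that(c)", "which(r)", "their"]
FREQ = {
    "his":      [1.090, 2.265, 2.027, 1.998, 0.521, 1.005,
                 0.947, 2.188, 1.117, 2.295, 2.038],
    "she":      [0.127, 0.009, 0.126, 0.155, 0.963, 0.902,
                 0.568, 0.445, 0.616, 0.267, 0.292],
    "that(c)":  [0.549, 0.388, 0.660, 0.599, 0.933, 0.989,
                 0.595, 0.484, 1.002, 0.602, 0.501],
    "which(r)": [0.812, 0.641, 0.762, 0.669, 0.417, 0.424,
                 0.420, 0.398, 0.381, 0.435, 0.309],
    "their":    [0.549, 0.316, 0.216, 0.381, 0.104, 0.082,
                 0.338, 0.273, 0.180, 0.191, 0.242],
}


def standardise(column):
    """Return z-scores: mean 0 and unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def first_component(z_columns, iterations=500):
    """Approximate the leading eigenvector of the correlation matrix
    by power iteration; return (loadings, eigenvalue)."""
    p = len(z_columns)
    n = len(z_columns[0])
    corr = [[sum(a * b for a, b in zip(z_columns[i], z_columns[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    vec = [1.0] * p
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return vec, eigenvalue


z = [standardise(FREQ[w]) for w in WORDS]
loadings, eigenvalue = first_component(z)

# The sign of a component is arbitrary; orient it so "she" loads positively.
if loadings[WORDS.index("she")] < 0:
    loadings = [-x for x in loadings]

# Score of each narrative on the first component.
scores = [sum(loadings[k] * z[k][i] for k in range(len(WORDS)))
          for i in range(len(NARRATIVES))]

for name, score in zip(NARRATIVES, scores):
    print(f"{name:8s}{score:8.3f}")
```

With this toy input, David, Esther and Pip should all fall on the positive side of the component, which gives a small-scale impression of the kind of separation the full analysis looks for. A real replication would use the segment-level frequencies and an established statistics package.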

Table 1. The Set of Eleven Narratives

Label        Narrator [TEXT] & Date                Word-tokens [Pure-Narrative]   Segments

First Person Narratives
David#1-5    David [David Copperfield] (1849-50)                          20145          5
Esther#1-4   Esther [Bleak House] (1852-3)                                18399          4
Pip#1-4      Pip [Great Expectations] (1860-1)                            18359          4
Group Total                                                               56903         13

Third Person Narratives
SB#1-3       Sketches by Boz (1836)                                       12569          3
PP#1-3       The Pickwick Papers (1836-7)                                 11081          3
OT#1-4       Oliver Twist (1837-8)                                        16677          4
NN#1-3       Nicholas Nickleby (1838-9)                                   12863          3
BH#1-2       Bleak House (1852-3)                                          7389          2
TTC#1-3      A Tale of Two Cities (1859)                                  12798          3
OMF#1-3      Our Mutual Friend (1864-5)                                   13117          3
ED#1-3       The Mystery of Edwin Drood (1870)                            11973          3
Group Total                                                               98467         24

1. Data

1.1 Corpus

The corpus draws on ten novels from Dickens's oeuvre (see Table 1).1 Each text is represented by approximately twenty thousand words from the beginning of the novel, and the language of "pure-narrative" is extracted as a basis of comparison.2 The current corpus consists of three first person narratives--David Copperfield, Esther's narrative, and Great Expectations--and eight third person narratives. Bleak House provides two narratives: one is the first person narrative by the character narrator Esther Summerson, the other is the anonymous third person narrative. The two narratives of Bleak House are also contrasted in the use of verb tenses. While Esther uses the past tense, the anonymous narrator employs the "dramatic present." The dramatic present is also used in The Mystery of Edwin Drood and in some parts of Sketches by Boz and David Copperfield.

Each text is then divided into successive 4000-word segments. Segmentation of text has two objectives. First, to give each variable (i.e., word) as appropriate a number of samples as possible in order to reduce the possibility of chance effect. Second, to help observe internal variation (or consistency) in each text. In all, the present study analyses 37 segments (or text-samples), of which 13 are first person narratives and 24 are third person narratives.3
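The segmentation step can be sketched as follows. This is a minimal illustration, not the routine used for the study: the function name segment and its min_tail parameter are hypothetical, and the treatment of a short remainder is an assumption. Table 1 reports five segments for David's 20145 word-tokens, so a leftover of 145 words was evidently not kept as a separate sample, but how remainders were handled is not stated in this section.

```python
def segment(tokens, size=4000, min_tail=0):
    """Split a list of word-tokens into successive segments of `size` tokens.

    A final segment shorter than `min_tail` is folded into the segment
    before it (an assumed policy, not one taken from the article).
    """
    segments = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    if len(segments) > 1 and len(segments[-1]) < min_tail:
        tail = segments.pop()
        segments[-1] = segments[-1] + tail
    return segments


# Synthetic tokens standing in for a 20145-word narrative.
tokens = [f"w{i}" for i in range(20145)]

print([len(s) for s in segment(tokens)])
# [4000, 4000, 4000, 4000, 4000, 145]

print([len(s) for s in segment(tokens, min_tail=2000)])
# [4000, 4000, 4000, 4000, 4145]
```

Keeping the segments in their original order, as this sketch does, is what allows internal variation within a text to be followed from one sample to the next.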

1.2 Some preliminary treatments of data

In the present case, the discrepancy between the first person and the third person narratives in the incidence of first person pronouns is too obvious to require a statistical analysis. It is desirable, therefore, to exclude those pronouns from the following statistical analysis so as to diminish the overshadowing effect of what is already evident. Otherwise the difference due to the incidence of first person pronouns will become so inflated through statistical treatments that other subtler differences may be submerged. This exclusion of first person pronouns deprives my data of some interesting subjects for computational stylistics, but in return it makes them sensitive to evidence of subtler stylistic differences.
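A minimal sketch of this exclusion is given below. It assumes that the excluded set is I, me and my, the three forms named in the Introduction; the exact list used in the study is not given in this section. The tokeniser, the function name word_counts and the sample sentence are all invented for illustration.

```python
import re
from collections import Counter

# Assumed exclusion list: the first person pronouns named in the Introduction.
FIRST_PERSON = {"i", "me", "my"}


def word_counts(text, excluded=FIRST_PERSON):
    """Count lower-cased word-types, leaving out the excluded pronouns."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in excluded)


# Invented sentence, not a quotation from the corpus.
sample = "I looked at my aunt, and she looked at me."
counts = word_counts(sample)

print(counts.most_common(3))
```

Filtering before the counts are ranked means the excluded pronouns never occupy places among the most common word-types, so those places pass to the next most frequent words.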

Another problem is concerned with verb forms. My earlier studies have shown that the top 100 words include only a small number of verbs--mostly preterite forms of common verbs, such as was, had, and said.4 The size of the present corpus, in addition, is not large enough to process verbs of lower frequency. If words of low frequency are subjected to a statistical analysis, the dearth of numbers may cause an aberrant result. The recognised solution is lemmatisation. For example, take, takes, took, taken, and taking are lemmatised as take. Lemmatisation enables a number of verbs to rank higher than in my

Table 2. Eleven Narrators in Dickens's Novels: Standardised (text-percentage) frequencies for the 100 most common word-types in the "pure-narrative."

Rank  Word-type      SB     PP     OT     NN  David Esther     BH    TTC    Pip    OMF     ED Total (raw) Total (%)
   1  the         7.606  9.097  7.327  6.320  4.433  4.723  6.834  7.462  5.817  6.602  6.515        9935     6.394
   2  and         3.914  4.088  3.598  4.019  3.927  4.310  3.424  4.329  4.210  3.690  3.792        6164     3.967
   3  be          3.477  2.969  3.352  2.946  3.783  3.565  3.816  3.110  3.470  2.851  2.923        5163     3.323
   4  of          4.225  3.592  3.190  3.281  2.636  2.462  3.424  3.469  2.511  3.255  3.566        4879     3.140
   5  a           2.912  2.346  2.908  3.001  2.442  2.571  2.774  2.508  2.495  3.171  2.773        4194     2.699
   6  in(p)       2.164  1.660  1.847  1.788  1.812  1.853  2.463  2.016  1.672  2.173  2.096        2983     1.920
   7  his         1.090  2.265  2.027  1.998  0.521  1.005  0.947  2.188  1.117  2.295  2.038        2373     1.527
   8  have        1.567  1.516  1.619  1.174  1.762  1.631  1.719  1.469  1.759  1.243  0.969        2358     1.518
   9  to(i)       1.201  1.101  1.325  1.314  1.524  1.549  1.340  1.188  1.416  1.189  1.111        2055     1.323
  10  he          1.034  1.354  1.961  1.454  0.789  1.223  1.177  1.453  1.073  1.479  1.178        1983     1.276
  11  with        1.154  1.263  1.091  1.104  1.052  1.277  0.920  1.274  1.149  1.395  1.336        1841     1.185
  12  to(p)       1.034  1.119  1.091  1.026  1.176  1.163  0.988  1.203  1.024  1.235  1.169        1736     1.117
  13  say         0.151  1.724  1.133  1.508  1.142  1.614  0.650  0.445  0.980  0.991  0.793        1630     1.049
  14  it          0.676  0.605  0.768  0.700  1.365  1.076  1.272  1.399  1.285  1.113  1.052        1624     1.045
  15  as          0.692  0.957  0.851  1.011  1.082  0.848  0.826  0.938  1.008  1.037  1.128        1476     0.950
  16  at          0.756  0.496  0.738  0.910  1.023  0.962  0.758  0.836  1.100  0.953  0.618        1337     0.861
  17  that(c)     0.549  0.388  0.660  0.599  0.933  0.989  0.595  0.484  1.002  0.602  0.501        1098     0.707
  18  on(p)       0.835  0.713  0.522  0.575  0.660  0.554  0.555  0.766  0.757  0.724  0.727        1040     0.669
  19  by(p)       0.525  0.578  0.672  0.536  0.457  0.435  0.528  0.524  0.452  0.640  0.585         826     0.532
  20  her(a)      0.271  0.108  0.168  0.288  0.988  0.598  0.839  0.656  0.376  0.793  0.685         821     0.528
  21  which(r)    0.812  0.641  0.762  0.669  0.417  0.424  0.420  0.398  0.381  0.435  0.309         794     0.511
  22  him         0.342  0.298  0.899  0.474  0.392  0.478  0.338  0.641  0.507  0.496  0.443         772     0.497
  23  for(p)      0.732  0.415  0.570  0.459  0.491  0.484  0.568  0.328  0.479  0.343  0.543         762     0.490
  24  but         0.422  0.289  0.414  0.342  0.660  0.582  0.555  0.445  0.523  0.450  0.317         729     0.469
  25  she         0.127  0.009  0.126  0.155  0.963  0.902  0.568  0.445  0.616  0.267  0.292         700     0.451
  26  not         0.501  0.262  0.336  0.327  0.551  0.554  0.447  0.391  0.523  0.381  0.267         664     0.427
  27  from        0.485  0.478  0.402  0.443  0.308  0.326  0.406  0.484  0.376  0.267  0.326         595     0.383
  28  when        0.159  0.343  0.342  0.350  0.536  0.462  0.392  0.344  0.485  0.252  0.309         585     0.377
  29  this        0.294  0.307  0.546  0.498  0.382  0.217  0.298  0.313  0.338  0.442  0.451         579     0.373
  30  all         0.326  0.171  0.300  0.420  0.367  0.451  0.352  0.336  0.479  0.328  0.334         561     0.361
  31  an          0.493  0.433  0.348  0.233  0.308  0.342  0.392  0.336  0.289  0.450  0.443         560     0.360
  32  they        0.509  0.325  0.444  0.389  0.268  0.217  0.284  0.539  0.207  0.175  0.409         518     0.333
  33  look        0.127  0.244  0.216  0.334  0.432  0.321  0.298  0.453  0.468  0.358  0.292         516     0.332

*(a) = adjective, (adv) = adverbials, (a.d.) = adverb of degree, (c) = conjunction, (d) = demonstrative, (i) = infinitive, (r) = relative, (p) = preposition, (pron) = pronoun

Table 2. (continued)

Rank  Word-type      SB     PP     OT     NN  David Esther     BH    TTC    Pip    OMF     ED Total (raw) Total (%)
  34  or          0.398  0.153  0.372  0.365  0.357  0.255  0.379  0.305  0.283  0.381  0.342         505     0.325
  35  out         0.080  0.199  0.222  0.194  0.506  0.364  0.487  0.391  0.414  0.282  0.342         503     0.324
  36  there       0.294  0.217  0.288  0.319  0.387  0.375  0.365  0.328  0.327  0.198  0.234         480     0.309
  37  into        0.382  0.208  0.408  0.365  0.268  0.217  0.392  0.352  0.283  0.160  0.393         474     0.305
  38  one         0.446  0.235  0.282  0.350  0.323  0.239  0.217  0.367  0.234  0.236  0.309         457     0.294
  38  who(r)      0.358  0.253  0.384  0.404  0.218  0.223  0.514  0.211  0.245  0.435  0.134         457     0.294
  40  that(d)     0.239  0.280  0.222  0.334  0.472  0.239  0.203  0.273  0.272  0.274  0.259         447     0.288
  41  very        0.223  0.244  0.462  0.365  0.338  0.413  0.271  0.211  0.185  0.175  0.150         445     0.286
  42  if          0.207  0.208  0.186  0.272  0.377  0.288  0.352  0.211  0.468  0.198  0.284         443     0.285
  43  little      0.199  0.190  0.222  0.404  0.338  0.413  0.298  0.266  0.169  0.229  0.384         442     0.284
  44  up(adv)     0.151  0.280  0.288  0.327  0.333  0.228  0.257  0.289  0.376  0.282  0.200         435     0.280
  45  go          0.080  0.099  0.096  0.210  0.442  0.408  0.217  0.219  0.370  0.320  0.217         408     0.263
  46  so(a.d.)    0.151  0.126  0.174  0.187  0.283  0.554  0.230  0.227  0.278  0.175  0.242         394     0.254
  47  do          0.095  0.162  0.198  0.233  0.501  0.261  0.176  0.242  0.289  0.168  0.184         383     0.247
  48  upon(p)     0.191  0.190  0.228  0.334  0.268  0.212  0.284  0.336  0.240  0.206  0.234         382     0.246
  49  take        0.183  0.208  0.246  0.264  0.253  0.174  0.135  0.250  0.338  0.252  0.284         375     0.241
  50  their       0.549  0.316  0.216  0.381  0.104  0.082  0.338  0.273  0.180  0.191  0.242         372     0.239
  51  make        0.088  0.171  0.228  0.194  0.357  0.250  0.284  0.219  0.332  0.183  0.192         368     0.237
  52  no(a)       0.326  0.153  0.258  0.210  0.223  0.163  0.244  0.242  0.267  0.244  0.192         356     0.229
  53  come        0.111  0.072  0.144  0.117  0.377  0.315  0.325  0.234  0.289  0.236  0.184         355     0.228
  54  them        0.278  0.099  0.138  0.288  0.194  0.207  0.244  0.367  0.267  0.130  0.242         343     0.221
  55  would       0.334  0.135  0.330  0.179  0.199  0.234  0.203  0.164  0.245  0.099  0.109         325     0.209
  56  see         0.183  0.027  0.180  0.124  0.278  0.288  0.189  0.148  0.430  0.061  0.150         319     0.205
  57  down        0.088  0.117  0.198  0.109  0.228  0.207  0.203  0.313  0.272  0.252  0.209         318     0.205
  58  some        0.263  0.135  0.210  0.179  0.243  0.158  0.203  0.211  0.218  0.168  0.234         316     0.203
  59  could       0.127  0.126  0.168  0.171  0.298  0.326  0.054  0.164  0.256  0.137  0.042         295     0.190
  60  more        0.239  0.180  0.210  0.187  0.194  0.207  0.135  0.180  0.196  0.107  0.192         292     0.188
  61  old         0.255  0.162  0.294  0.117  0.114  0.304  0.284  0.094  0.065  0.114  0.284         287     0.185
  62  man         0.294  0.343  0.126  0.155  0.055  0.147  0.176  0.133  0.283  0.259  0.125         285     0.183
  63  then        0.175  0.262  0.126  0.124  0.164  0.136  0.041  0.164  0.278  0.198  0.284         281     0.181
  64  before      0.223  0.108  0.168  0.163  0.179  0.207  0.149  0.211  0.174  0.229  0.117         277     0.178
  65  her(pron)   0.095  0.018  0.012  0.054  0.412  0.402  0.257  0.195  0.114  0.076  0.159         274     0.176
  66  other       0.271  0.190  0.174  0.264  0.129  0.114  0.108  0.211  0.153  0.206  0.117         269     0.173
  67  over        0.167  0.153  0.192  0.093  0.129  0.125  0.108  0.250  0.174  0.267  0.200         262     0.169
  68  again       0.127  0.171  0.108  0.086  0.238  0.168  0.149  0.203  0.153  0.221  0.192         260     0.167

*(a) = adjective, (adv) = adverbials, (a.d.) = adverb of degree, (c) = conjunction, (d) = demonstrative, (i) = infinitive, (r) = relative, (p) = preposition, (pron) = pronoun
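The standardised (text-percentage) figures can be checked with a few lines of arithmetic. The sketch below assumes that a text-percentage is the raw count divided by the number of word-tokens, times 100, and that the Total (%) column uses the combined word-tokens of Table 1 as its base. That reading is inferred from the figures rather than stated in the text, and the function name text_percentage is hypothetical.

```python
# Combined word-tokens of the two groups in Table 1 (56903 + 98467).
TOTAL_TOKENS = 56903 + 98467

# Raw totals for the four most common word-types, copied from Table 2.
RAW_TOTALS = {"the": 9935, "and": 6164, "be": 5163, "of": 4879}


def text_percentage(count, total_tokens):
    """Express a raw count as a percentage of all word-tokens."""
    return 100.0 * count / total_tokens


for word, raw in RAW_TOTALS.items():
    print(f"{word:4s}{text_percentage(raw, TOTAL_TOKENS):7.3f}")
# the   6.394
# and   3.967
# be    3.323
# of    3.140
```

The results agree with the Total (%) column for these four rows, which supports the assumed base of 155370 word-tokens. Expressing counts this way keeps narratives of different lengths comparable.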
