Narrative Style and the Frequencies of Very Common Words: A Corpus-Based Approach to
Dickens's First Person and Third Person Narratives*
Tomoji Tabata
Abstract
The present article is devoted to a statistical analysis of the language of Dickens's novels. The particular problem is to examine structural and stylistic features of the first person and third person narratives. In the following analysis, I apply Principal Component Analysis (PCA) to the examination of the frequency-patterns of very common word-types of the text-samples. What emerges from this approach is a remarkable contrast between the two narrative modes. The differentiae between Dickens's first person and third person narratives suggest a broad opposition between a more oral, subjective, verbal style and a more literate, descriptive, nominal style.
0. Introduction
The choice of a particular point of view is arguably one of the most crucial decisions in beginning a fictional discourse. Decisions may include whether to employ a first person narrator or a third person narrator, whether to narrate in the present tense or the past tense, and so forth. While a first person narrative and a third person narrative differ obviously from each other in terms of the presence, or the total absence, of the first person pronouns, I, me, and my, the two narrative modes are likely to differ in less obvious ways. They share, no doubt, many characteristics of the language of narrative distinct from other genres of English prose.
This study examines, from a quantitative viewpoint, linguistic and stylistic attributes of the two narrative modes to demonstrate how Dickens differentiates one mode of narrative from another. The approach I adopt in this study has three characteristics. First of all, it is corpus-based: the word frequency profiles that will appear a little later and other frequency counts are derived from a computerised text-corpus. Second, it focuses on very common word-types, most of which are function words, rather than rare words or so-called "key-words" that are usually focused on in studies of literary texts. Third, it is based on multivariate statistics to illustrate relationships among very common word-types; relationships among text-samples; and relationships between the very common word-types and the text-samples. This method has produced justifiable results in studies of disputed authorship (Burrows: 1989, Craig: 1992); literary idiolects (Burrows: 1987a, Tabata: 1991); stylistic changes that occur over an author's career (Tabata 1993 & 1994 forthcoming); and in other areas as well.
English Corpus Studies, No. 2, 1995, pp. 91-109.
Table 1. The Set of Eleven Narratives

Label        Narrator [TEXT] & Date                   Word-tokens [Pure-Narrative]   Segments

First Person Narratives
David#1-5    David [David Copperfield] (1849-50)      20145                          5
Esther#1-4   Esther [Bleak House] (1852-3)            18399                          4
Pip#1-4      Pip [Great Expectations] (1860-1)        18359                          4
Group Total                                           56903                          13

Third Person Narratives
SB#1-3       Sketches by Boz (1836)                   12569                          3
PP#1-3       The Pickwick Papers (1836-7)             11081                          3
OT#1-4       Oliver Twist (1837-8)                    16677                          4
NN#1-3       Nicholas Nickleby (1838-9)               12863                          3
BH#1-2       Bleak House (1852-3)                     7389                           2
TTC#1-3      A Tale of Two Cities (1859)              12798                          3
OMF#1-3      Our Mutual Friend (1864-5)               13117                          3
ED#1-3       The Mystery of Edwin Drood (1870)        11973                          3
Group Total                                           98467                          24
1. Data
1.1 Corpus
The corpus draws on ten novels from Dickens's oeuvre (see Table 1).1 Each text is represented by approximately twenty thousand words from the beginning of the novel, and the language of "pure-narrative" is extracted as a basis of comparison.2 The current corpus consists of three first person narratives--David Copperfield, Esther's narrative, and Great Expectations--and eight third person narratives. Bleak House provides two narratives: one is the first person narrative by the character narrator Esther Summerson, the other is the anonymous third person narrative. The two narratives of Bleak House are also contrasted in the use of verb tenses. While Esther uses the past tense, the anonymous narrator employs the "dramatic present." The dramatic present is also used in The Mystery of Edwin Drood and in some parts of Sketches by Boz and David Copperfield.
Each text is then divided into successive 4000-word segments. Segmentation of text has two objectives: first, to give each variable (i.e., word) an adequate number of samples, so as to reduce the possibility of chance effects; second, to help observe internal variation (or consistency) within each text. In all, the present study analyses 37 segments (or text-samples), of which 13 are first person narratives and 24 are third person narratives.3
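The segmentation step can be sketched as follows. This is only an illustration under stated assumptions: the function name `segment_tokens` and the `min_tail` parameter are hypothetical, and the text does not say how a short remainder at the end of a narrative was handled. Folding any remainder shorter than 3000 words into the preceding segment happens to reproduce the segment counts listed in Table 1, but that cut-off is an assumption rather than a rule stated in the article.

```python
def segment_tokens(tokens, segment_size=4000, min_tail=3000):
    """Split a list of word-tokens into successive segments.

    Assumption (not stated in the article): a trailing remainder shorter
    than `min_tail` tokens is merged into the preceding segment.
    """
    segments = [tokens[i:i + segment_size]
                for i in range(0, len(tokens), segment_size)]
    if len(segments) > 1 and len(segments[-1]) < min_tail:
        tail = segments.pop()
        segments[-1] = segments[-1] + tail
    return segments


# Word-token totals taken from Table 1; the tokens themselves are dummies.
table1_tokens = {"David": 20145, "Esther": 18399, "Pip": 18359, "BH": 7389}
for label, n_tokens in table1_tokens.items():
    dummy_text = ["w"] * n_tokens
    print(label, len(segment_tokens(dummy_text)))
```

Under this assumed rule some final segments are considerably longer than 4000 words, so a study that needs equal-sized samples would have to choose a different treatment of the remainder.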
1.2 Some preliminary treatments of data
In the present case, the discrepancy between the first person and the third person narratives in the incidence of first person pronouns is too obvious to require statistical analysis. It is desirable, therefore, to exclude those pronouns from the following statistical analysis so as to diminish the overshadowing effect of what is already evident. Otherwise the difference due to the incidence of first person pronouns will become so inflated through statistical treatments that other, subtler differences may be submerged. This exclusion of first person pronouns deprives my data of some interesting subjects for computational stylistics, but in return it makes them sensitive to evidence of subtler stylistic differences.
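A minimal sketch of this exclusion, combined with the text-percentage standardisation used in Table 2, might look like the following. The function name `text_percentages`, the lower-casing of tokens, and the restriction of the exclusion list to the three pronouns named in the introduction (I, me, my) are assumptions made for illustration. The denominator is taken to be all word-tokens in the sample, which appears consistent with the Total column of Table 2: 9935 occurrences of "the" out of 56903 + 98467 = 155370 word-tokens gives roughly 6.394 per cent.

```python
from collections import Counter

# Assumption: only the pronouns named in the introduction are excluded.
FIRST_PERSON_PRONOUNS = {"i", "me", "my"}


def text_percentages(tokens, excluded=FIRST_PERSON_PRONOUNS):
    """Return {word-type: frequency as a percentage of all tokens},
    leaving the excluded word-types out of the resulting variable set."""
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(token.lower() for token in tokens)
    return {word: 100.0 * count / total
            for word, count in counts.items()
            if word not in excluded}


sample = "I saw the man and the man saw me".split()
profile = text_percentages(sample)
print(sorted(profile))           # the pronouns no longer appear as variables
print(round(profile["the"], 3))  # 2 of 9 tokens

# Check against Table 2: raw count of "the" over the corpus total.
print(round(100.0 * 9935 / (56903 + 98467), 3))
```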
Another problem concerns verb forms. My earlier studies have shown that the top 100 words include only a small number of verbs--mostly preterite forms of common verbs, such as was, had, and said.4 In addition, the present corpus is not large enough to process verbs of lower frequency: if words of low frequency are subjected to a statistical analysis, the dearth of numbers may cause an aberrant result. The recognised solution is lemmatisation. For example, take, takes, took, taken, and taking are lemmatised as take. Lemmatisation enables a number of verbs to rank higher than in my
Table 2. Eleven Narrators in Dickens's Novels: Standardised (text-percentage) frequencies for the 100 most common word-types in the "pure-narrative."

Rank Word-type      SB     PP     OT     NN  David Esther     BH    TTC    Pip    OMF     ED Total(raw) Total(%)
1    the         7.606  9.097  7.327  6.320  4.433  4.723  6.834  7.462  5.817  6.602  6.515       9935    6.394
2    and         3.914  4.088  3.598  4.019  3.927  4.310  3.424  4.329  4.210  3.690  3.792       6164    3.967
3    be          3.477  2.969  3.352  2.946  3.783  3.565  3.816  3.110  3.470  2.851  2.923       5163    3.323
4    of          4.225  3.592  3.190  3.281  2.636  2.462  3.424  3.469  2.511  3.255  3.566       4879    3.140
5    a           2.912  2.346  2.908  3.001  2.442  2.571  2.774  2.508  2.495  3.171  2.773       4194    2.699
6    in(p)       2.164  1.660  1.847  1.788  1.812  1.853  2.463  2.016  1.672  2.173  2.096       2983    1.920
7    his         1.090  2.265  2.027  1.998  0.521  1.005  0.947  2.188  1.117  2.295  2.038       2373    1.527
8    have        1.567  1.516  1.619  1.174  1.762  1.631  1.719  1.469  1.759  1.243  0.969       2358    1.518
9    to(i)       1.201  1.101  1.325  1.314  1.524  1.549  1.340  1.188  1.416  1.189  1.111       2055    1.323
10   he          1.034  1.354  1.961  1.454  0.789  1.223  1.177  1.453  1.073  1.479  1.178       1983    1.276
11   with        1.154  1.263  1.091  1.104  1.052  1.277  0.920  1.274  1.149  1.395  1.336       1841    1.185
12   to(p)       1.034  1.119  1.091  1.026  1.176  1.163  0.988  1.203  1.024  1.235  1.169       1736    1.117
13   say         0.151  1.724  1.133  1.508  1.142  1.614  0.650  0.445  0.980  0.991  0.793       1630    1.049
14   it          0.676  0.605  0.768  0.700  1.365  1.076  1.272  1.399  1.285  1.113  1.052       1624    1.045
15   as          0.692  0.957  0.851  1.011  1.082  0.848  0.826  0.938  1.008  1.037  1.128       1476    0.950
16   at          0.756  0.496  0.738  0.910  1.023  0.962  0.758  0.836  1.100  0.953  0.618       1337    0.861
17   that(c)     0.549  0.388  0.660  0.599  0.933  0.989  0.595  0.484  1.002  0.602  0.501       1098    0.707
18   on(p)       0.835  0.713  0.522  0.575  0.660  0.554  0.555  0.766  0.757  0.724  0.727       1040    0.669
19   by(p)       0.525  0.578  0.672  0.536  0.457  0.435  0.528  0.524  0.452  0.640  0.585        826    0.532
20   her(a)      0.271  0.108  0.168  0.288  0.988  0.598  0.839  0.656  0.376  0.793  0.685        821    0.528
21   which(r)    0.812  0.641  0.762  0.669  0.417  0.424  0.420  0.398  0.381  0.435  0.309        794    0.511
22   him         0.342  0.298  0.899  0.474  0.392  0.478  0.338  0.641  0.507  0.496  0.443        772    0.497
23   for(p)      0.732  0.415  0.570  0.459  0.491  0.484  0.568  0.328  0.479  0.343  0.543        762    0.490
24   but         0.422  0.289  0.414  0.342  0.660  0.582  0.555  0.445  0.523  0.450  0.317        729    0.469
25   she         0.127  0.009  0.126  0.155  0.963  0.902  0.568  0.445  0.616  0.267  0.292        700    0.451
26   not         0.501  0.262  0.336  0.327  0.551  0.554  0.447  0.391  0.523  0.381  0.267        664    0.427
27   from        0.485  0.478  0.402  0.443  0.308  0.326  0.406  0.484  0.376  0.267  0.326        595    0.383
28   when        0.159  0.343  0.342  0.350  0.536  0.462  0.392  0.344  0.485  0.252  0.309        585    0.377
29   this        0.294  0.307  0.546  0.498  0.382  0.217  0.298  0.313  0.338  0.442  0.451        579    0.373
30   all         0.326  0.171  0.300  0.420  0.367  0.451  0.352  0.336  0.479  0.328  0.334        561    0.361
31   an          0.493  0.433  0.348  0.233  0.308  0.342  0.392  0.336  0.289  0.450  0.443        560    0.360
32   they        0.509  0.325  0.444  0.389  0.268  0.217  0.284  0.539  0.207  0.175  0.409        518    0.333
33   look        0.127  0.244  0.216  0.334  0.432  0.321  0.298  0.453  0.468  0.358  0.292        516    0.332
34   or          0.398  0.153  0.372  0.365  0.357  0.255  0.379  0.305  0.283  0.381  0.342        505    0.325
35   out         0.080  0.199  0.222  0.194  0.506  0.364  0.487  0.391  0.414  0.282  0.342        503    0.324
36   there       0.294  0.217  0.288  0.319  0.387  0.375  0.365  0.328  0.327  0.198  0.234        480    0.309
37   into        0.382  0.208  0.408  0.365  0.268  0.217  0.392  0.352  0.283  0.160  0.393        474    0.305
38   one         0.446  0.235  0.282  0.350  0.323  0.239  0.217  0.367  0.234  0.236  0.309        457    0.294
38   who(r)      0.358  0.253  0.384  0.404  0.218  0.223  0.514  0.211  0.245  0.435  0.134        457    0.294
40   that(d)     0.239  0.280  0.222  0.334  0.472  0.239  0.203  0.273  0.272  0.274  0.259        447    0.288
41   very        0.223  0.244  0.462  0.365  0.338  0.413  0.271  0.211  0.185  0.175  0.150        445    0.286
42   if          0.207  0.208  0.186  0.272  0.377  0.288  0.352  0.211  0.468  0.198  0.284        443    0.285
43   little      0.199  0.190  0.222  0.404  0.338  0.413  0.298  0.266  0.169  0.229  0.384        442    0.284
44   up(adv)     0.151  0.280  0.288  0.327  0.333  0.228  0.257  0.289  0.376  0.282  0.200        435    0.280
45   go          0.080  0.099  0.096  0.210  0.442  0.408  0.217  0.219  0.370  0.320  0.217        408    0.263
46   so(a.d.)    0.151  0.126  0.174  0.187  0.283  0.554  0.230  0.227  0.278  0.175  0.242        394    0.254
47   do          0.095  0.162  0.198  0.233  0.501  0.261  0.176  0.242  0.289  0.168  0.184        383    0.247
48   upon(p)     0.191  0.190  0.228  0.334  0.268  0.212  0.284  0.336  0.240  0.206  0.234        382    0.246
49   take        0.183  0.208  0.246  0.264  0.253  0.174  0.135  0.250  0.338  0.252  0.284        375    0.241
50   their       0.549  0.316  0.216  0.381  0.104  0.082  0.338  0.273  0.180  0.191  0.242        372    0.239
51   make        0.088  0.171  0.228  0.194  0.357  0.250  0.284  0.219  0.332  0.183  0.192        368    0.237
52   no(a)       0.326  0.153  0.258  0.210  0.223  0.163  0.244  0.242  0.267  0.244  0.192        356    0.229
53   come        0.111  0.072  0.144  0.117  0.377  0.315  0.325  0.234  0.289  0.236  0.184        355    0.228
54   them        0.278  0.099  0.138  0.288  0.194  0.207  0.244  0.367  0.267  0.130  0.242        343    0.221
55   would       0.334  0.135  0.330  0.179  0.199  0.234  0.203  0.164  0.245  0.099  0.109        325    0.209
56   see         0.183  0.027  0.180  0.124  0.278  0.288  0.189  0.148  0.430  0.061  0.150        319    0.205
57   down        0.088  0.117  0.198  0.109  0.228  0.207  0.203  0.313  0.272  0.252  0.209        318    0.205
58   some        0.263  0.135  0.210  0.179  0.243  0.158  0.203  0.211  0.218  0.168  0.234        316    0.203
59   could       0.127  0.126  0.168  0.171  0.298  0.326  0.054  0.164  0.256  0.137  0.042        295    0.190
60   more        0.239  0.180  0.210  0.187  0.194  0.207  0.135  0.180  0.196  0.107  0.192        292    0.188
61   old         0.255  0.162  0.294  0.117  0.114  0.304  0.284  0.094  0.065  0.114  0.284        287    0.185
62   man         0.294  0.343  0.126  0.155  0.055  0.147  0.176  0.133  0.283  0.259  0.125        285    0.183
63   then        0.175  0.262  0.126  0.124  0.164  0.136  0.041  0.164  0.278  0.198  0.284        281    0.181
64   before      0.223  0.108  0.168  0.163  0.179  0.207  0.149  0.211  0.174  0.229  0.117        277    0.178
65   her(pron)   0.095  0.018  0.012  0.054  0.412  0.402  0.257  0.195  0.114  0.076  0.159        274    0.176
66   other       0.271  0.190  0.174  0.264  0.129  0.114  0.108  0.211  0.153  0.206  0.117        269    0.173
67   over        0.167  0.153  0.192  0.093  0.129  0.125  0.108  0.250  0.174  0.267  0.200        262    0.169
68   again       0.127  0.171  0.108  0.086  0.238  0.168  0.149  0.203  0.153  0.221  0.192        260    0.167

*(a) = adjective, (adv) = adverbials, (a.d.) = adverb of degree, (c) = conjunction, (d) = demonstrative, (i) = infinitive, (r) = relative, (p) = preposition, (pron) = pronoun
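To make the mechanics of Principal Component Analysis concrete, the sketch below extracts a first component from four rows of Table 2 (the, of, it, she). It is an illustration only and not a reproduction of the analysis in this article, which works on the 37 text-samples and a much larger set of word-types. The choice of these four rows, the use of a correlation matrix (variables standardised to z-scores), the power-iteration method, and all function names are assumptions.

```python
import math

NARRATORS = ["SB", "PP", "OT", "NN", "David", "Esther",
             "BH", "TTC", "Pip", "OMF", "ED"]

# Text-percentage frequencies copied from four rows of Table 2.
FREQS = {
    "the": [7.606, 9.097, 7.327, 6.320, 4.433, 4.723, 6.834, 7.462, 5.817, 6.602, 6.515],
    "of":  [4.225, 3.592, 3.190, 3.281, 2.636, 2.462, 3.424, 3.469, 2.511, 3.255, 3.566],
    "it":  [0.676, 0.605, 0.768, 0.700, 1.365, 1.076, 1.272, 1.399, 1.285, 1.113, 1.052],
    "she": [0.127, 0.009, 0.126, 0.155, 0.963, 0.902, 0.568, 0.445, 0.616, 0.267, 0.292],
}


def standardise(values):
    """Convert one variable to z-scores (mean 0, sample s.d. 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def first_component(z_columns, iterations=200):
    """Loadings of the first principal component, found by power
    iteration on the correlation matrix of the standardised variables."""
    n = len(z_columns[0])
    p = len(z_columns)
    corr = [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
            for ci in z_columns]
    vec = [1.0] + [0.0] * (p - 1)  # start along the first variable's axis
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec


def component_scores(z_columns, loadings):
    """Project each sample (narrator) onto the component."""
    n = len(z_columns[0])
    return [sum(w * col[k] for w, col in zip(loadings, z_columns))
            for k in range(n)]


words = list(FREQS)
z_columns = [standardise(FREQS[w]) for w in words]
loadings = first_component(z_columns)
scores = component_scores(z_columns, loadings)

for word, weight in zip(words, loadings):
    print(f"{word:>4} loading {weight:+.3f}")
for name, score in zip(NARRATORS, scores):
    print(f"{name:>6} score {score:+.3f}")
```

The loadings show how strongly each word-type contributes to the component, and the scores place each narrator along it. With only four variables and narrator-level averages, the output should be read as a demonstration of the procedure rather than as evidence about the texts.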