This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Journal of Educational Psychology 2012, Vol. 104, No. 4, 879–896

© 2012 American Psychological Association 0022-0663/12/$12.00 DOI: 10.1037/a0029185

A Meta-Analysis of Writing Instruction for Students in the Elementary Grades

Steve Graham

Arizona State University

Debra McKeown

Georgia State University

Sharlene Kiuhara

Westminster College

Karen R. Harris

Arizona State University

In an effort to identify effective instructional practices for teaching writing to elementary grade students, we conducted a meta-analysis of the writing intervention literature, focusing our efforts on true and quasi-experiments. We located 115 documents that included the statistics for computing an effect size (ES). We calculated an average weighted ES for 13 writing interventions. To be included in the analysis, a writing intervention had to be tested in at least 4 studies. Six writing interventions involved explicitly teaching writing processes, skills, or knowledge. All but 1 of these interventions (grammar instruction) produced a statistically significant effect: strategy instruction (ES = 1.02), adding self-regulation to strategy instruction (ES = 0.50), text structure instruction (ES = 0.59), creativity/imagery instruction (ES = 0.70), and teaching transcription skills (ES = 0.55). Four writing interventions involved procedures for scaffolding or supporting students' writing. Each of these interventions produced statistically significant effects: prewriting activities (ES = 0.54), peer assistance when writing (ES = 0.89), product goals (ES = 0.76), and assessing writing (ES = 0.42). We also found that word processing (ES = 0.47), extra writing (ES = 0.30), and comprehensive writing programs (ES = 0.42) resulted in a statistically significant improvement in the quality of students' writing. Moderator analyses revealed that the self-regulated strategy development model (ES = 1.17) and process approach to writing instruction (ES = 0.40) improved how well students wrote.

Keywords: writing, composition, meta-analysis, instruction, elementary grades

Supplemental materials:

The development of the Common Core State Standards (CCSS; National Governors Association & Council of Chief School Officers, 2010) has made writing and the teaching of writing an integral part of the school reform movement in the United States (Graham, in press). Learning how to write and using writing as a tool for learning received considerable emphasis in CCSS. This document provided benchmarks for a variety of writing skills and applications students are expected to master at each grade and across grades. In the elementary grades, this includes spelling, handwriting, typing, sentence construction (including grammar skills), and strategies for planning and revising. It also includes writing different types of text (persuasive, narrative, and informative), writing for different purposes (facilitate text comprehension and content learning), and using technology to support writing. If elementary grade teachers are to meet CCSS for writing, they need effective instructional tools.

This article was published Online First July 9, 2012. Steve Graham, Mary Lou Fulton Teachers College, Arizona State University; Debra McKeown, School of Education, Georgia State University; Sharlene Kiuhara, School of Education, Westminster College; Karen R. Harris, Mary Lou Fulton Teachers College, Arizona State University. Steve Graham and Karen R. Harris are authors of some of the studies reviewed in this meta-analysis. Harris developed the self-regulated strategy development (SRSD) model tested in 14 studies included in the review, and Harris and Graham developed a number of the strategies used in the SRSD studies. The lesson plans and instructional procedures used in SRSD studies are published in two books for teachers. Graham and Harris are authors of these two books. Correspondence concerning this article should be addressed to Steve Graham. E-mail: steve.graham@asu.edu

Purpose of the Current Review

A useful approach for identifying instructional practices that have the power to transform students' writing is to conduct systematic reviews of writing intervention research. The systematic approach we applied in this review is meta-analysis. This method of review is used to summarize the magnitude and directions of the effects obtained in a set of empirical research studies (Lipsey & Wilson, 2001). In this article, we present a comprehensive meta-analysis of experimental and quasi-experimental writing studies conducted with elementary grade students. The purpose of this review was to identify effective practices for teaching writing to these children. Meta-analysis is well suited to this purpose, as it provides an estimate of a "treatment's effect under conditions that typify studies in the literature" (Bangert-Drowns, Hurley, & Wilkinson, 2004, p. 34).

A review identifying effective writing practices at the elementary level is needed for three reasons. First, studies of teachers'
practices have raised serious concerns about the quality of writing instruction received by students in the elementary grades (e.g., Fisher & Hebert, 1990; Gilbert & Graham, 2010). Thus, it is important to identify writing treatments with evidence of effectiveness, as this provides elementary teachers with instructional practices that can potentially improve the quality of their instruction and their students' writing. Second, there is a growing consensus that waiting until later grades to address literacy problems that have their origin in earlier grades is not successful (Slavin, Madden, & Karweit, 1989). Applying evidence-based writing practices with elementary grade students should reduce the number of youths who reach middle school and do not write well enough to meet grade-level demands (Harris, Graham, & Mason, 2006). Third, there is no comprehensive meta-analysis of writing treatments conducted just with elementary grade students.

Previous Meta-Analyses in Writing

During the last 30 years, researchers have undertaken a number of meta-analyses of true and quasi-experiments to identify effective practices for writing instruction. Some of these reviews focused on a single writing treatment, finding that teaching strategies for planning or revising (Graham, 2006a; Graham & Harris, 2003), word processing (Bangert-Drowns, 1993; Goldberg, Russell, & Cook, 2003; Morphy & Graham, 2012), and the process approach to writing (Graham & Sandmel, 2011) improved the overall quality of text produced by typical and, in most cases, struggling writers. Other reviews focused more broadly, examining the effectiveness of multiple writing treatments at specific grades. Hillocks (1986) conducted a review of writing interventions with students in Grade 3 through college, whereas Graham and Perin (2007a, 2007c) limited their review to writing treatments applied in Grades 4–12.

Although the meta-analyses conducted by Hillocks (1986) and Graham and Perin (2007a, 2007c) were conducted almost 20 years apart and differed somewhat in terms of grade level, there was some overlap in their findings. In both reviews grammar instruction was ineffective in improving writing, but sentence-combining instruction, study and emulation of good models of writing, and inquiry activities improved the quality of students' writing. Hillocks also found that students' writing improved when they evaluated writing using a writing guide or scale, whereas Graham and Perin reported that the process approach to writing instruction, strategy instruction, summarization, prewriting activities, peer assistance, setting product goals, and word processing positively enhanced the quality of students' writing.

The current meta-analysis has the greatest overlap with Graham and Perin's (2007a, 2007c) review. They conducted a meta-analysis of writing treatments tested with true and quasi-experiments with students in Grades 4–12. Their outcome measure was writing quality, and studies were only included in the analysis if quality was reliably measured. They excluded studies conducted in special schools for students with disabilities (e.g., schools for the deaf). Finally, they only calculated an average weighted effect size (ES) for a writing treatment if it had been tested in four investigations. This meta-analysis applied these same principles, except it did not include studies conducted with middle and high school students. Despite these similarities, there was only modest overlap in the studies included in this review and the one by Graham and Perin (35 of 115 articles, or 30%).

A second difference between this and the Graham and Perin (2007a, 2007c) review was that quasi-experiments had to assess writing quality at pretest to be included in this meta-analysis, since students were not randomly assigned to conditions (allowing us to adjust for any pretest differences). A third difference was that effects from all quasi-experiments in this review were adjusted for possible data clustering due to hierarchical nesting of data (i.e., researchers assigned classes to treatment or control conditions but then examined student-level effects).

In summary, the primary research question guiding this review was, What writing treatments improve the quality of writing produced by students in the elementary grades? The findings from this meta-analysis have important theoretical implications for writing development. Drawing on a general model of development proposed by Alexander (1997), Graham (2006b) argued that writing strategies, knowledge, skills, and motivation play an important role in students' growth as writers. This meta-analysis provides evidence on the veracity of this claim, at least in part, as some of the treatments evaluated are specifically designed to improve writing strategies, knowledge, or skills. If a treatment designed to enhance knowledge of text structure, for example, improves writing quality, then the theoretical role of knowledge in writing development is supported.

Method

Study Inclusion and Exclusion Criteria

A study had to meet the following criteria to be included in this meta-analysis: (a) was a true experiment (random assignment to conditions) or a quasi-experiment, (b) involved students who were attending an elementary school (in some studies elementary schools included students in Grades 1–5, whereas in other studies elementary schools included Grade 6), (c) contained a treatment group that received a writing intervention, (d) included a measure of writing quality at posttest (quasi-experiments had to include a comparable pretest quality measure, and studies were excluded if interrater reliability of quality was not established or was less than .60), (e) was presented in English, and (f) contained the statistics necessary to compute a weighted ES (or statistics were obtained from the authors). Studies were excluded if the writing treatment took place in a special school for students with disabilities (e.g., school for the deaf), as the purpose of the review was to draw conclusions for more typical school settings.

Search Strategies Used to Locate Studies

Four search strategies were applied. First, 95 electronic searches (ending in October 2010) were conducted (ERIC, PsycINFO, Education Abstracts, ProQuest, and Dissertation Abstracts). These involved the following keywords combined with writing and composition: peer collaboration, peer revising, peer planning, peers, summary, summary instruction, summary strategies, motivation, motivation and instruction, technology, speech synthesis, spell checkers, strategy instruction, sentence combining, dictation, goal setting, genre, free writing, writer's workshop, process writing, process approach, self-monitoring, self-evaluation, national writing project, assessment, evaluative scales, usage, imagery, creativity, mechanics, grammar, inquiry, models, collaborative learning, spelling instruction, handwriting instruction, word processing, word processor. Over 12,000 abstracts and titles were identified. Each was read by the first author, and if the item looked promising, it was obtained.

Second, the following 18 journals were hand searched: American Educational Research Journal, Assessing Writing, Contemporary Educational Psychology, Elementary School Journal, Exceptional Children, Journal of Educational Psychology, Journal of Educational Research, Journal of Experimental Education, Journal of Learning Disabilities, Journal of Literacy, Journal of Special Education, Learning Disability Quarterly, Learning Disabilities Research and Practice, Reading and Writing, Reading and Writing Quarterly, Reading Research Quarterly, Research in the Teaching of Writing, and Written Communication. Third, pertinent references from previous writing meta-analyses (i.e., Bangert-Drowns, 1993; Goldberg et al., 2003; Graham, 2006a; Graham & Harris, 2003; Graham & Hebert, 2010; Graham & Perin, 2007c; Graham & Sandmel, 2011; Hillocks, 1986; Morphy & Graham, 2012) were examined. Fourth, reference lists of obtained articles were searched.

Of 424 documents collected, 115 articles were found that met inclusion and exclusion criteria. The interrater reliabilities of the writing quality measures in the studies included in this meta-analysis were generally strong. Correlations between two or more raters' scores were used to calculate reliability in 68% of studies, with a median reliability of .86 and a range of .62–.97 (reliability was .80 or greater in 88% of studies, and reliability was in the .60s in only two studies). Percent of exact agreement was applied in 22% of studies, with a median of 90% agreement and a range of 70%–97% (reliability was 80% or greater in 89% of studies). Seven percent of studies used percent of agreement within a single point to calculate reliability, with a median of 96.5% and a range of 80%–100% (all but one study was above 90%). Finally, three studies calculated coefficient alphas, with a median coefficient of .92 and a range of .76–.93.

Categorizing Studies Into Treatment Conditions

Step 1. First, each study was read by the first author and placed (if possible) into one of the 14 writing treatment categories identified by Graham and Perin (2007a). This included the process approach to writing instruction defined as involving extended opportunities for writing; writing for real audiences; engaging in cycles of planning, translating, and reviewing; personal responsibility and ownership of writing projects; high levels of student interactions; creation of a supportive writing environment; self-reflection and evaluation; personalized individual assistance and instruction; and in some instances more systematic instruction. Categorization also included four treatments where explicit teaching of skills, process, or knowledge occurred. These were grammar instruction (e.g., students systematically studied parts of speech, diagrammed sentences, and so forth), sentence combining (students were taught to construct more complex sentences through exercises where two or more basic sentences are combined into a single sentence), strategy instruction (the teacher modeled how to use specific strategies for planning, revising, and/or editing text; students practiced applying the target strategies in at least three sessions, with the goal of using these strategies independently), and text structure instruction (students were taught knowledge about the structure of specific types of text, such as stories or persuasive essays).

There were seven categories that studies were placed in that involved procedures for scaffolding students' writing: prewriting activities (students engaged in activities, like using a semantic web, to generate or organize ideas for their papers), inquiry (students engaged in activities to develop ideas for a particular writing task by analyzing immediate and concrete data), procedural facilitation (students were provided with external supports, such as prompts or hints, to facilitate one or more processes such as planning or revising), peer assistance (students worked together to plan, draft, and/or revise their papers), study of models (students examined examples of specific types of text and attempted to emulate the forms in these examples in their own writing), product goals (students were assigned specific goals for writing), and feedback (students received input from others about their written product).

The final two placement categories were word processing (students used word processing programs to write their compositions) and extra writing time (students spent additional time writing). Studies that did not fit neatly within one of these 14 categories were held apart. These studies were grouped together in what we referred to as an unspecified category.

Step 2. The studies placed in the 14 treatments were reread by the first author to determine whether the intervention in each investigation represented the same general writing treatment. For any study in which this was not the case, it was placed in the unspecified category.

Step 3. Studies placed in the unspecified category were reexamined by the first author, and five new treatment categories were created. They were teaching transcription skills (students were taught handwriting, spelling, or keyboarding), adding self-regulation instruction to strategy instruction (students were taught to apply goal setting and self-assessment as part of strategy instruction), imagery/creativity instruction (students were taught how to form images or how to be more creative), assessing writing (students received feedback from peers, the teacher, or other adults about their writing, and students were taught to assess their own writing), and comprehensive writing programs (writing treatments designed to serve as a complete writing program). Studies in the feedback category and the process approach to writing instruction category (see Step 1) were included in assessing writing and comprehensive writing programs, respectively. At this point, there were 17 writing treatments.

Step 4. The final step involved eliminating any treatment category where we were unable to calculate at least four effects testing its effectiveness (this was identical to the procedures applied by Graham & Perin, 2007c). This resulted in the elimination of four treatments: sentence combining, inquiry, procedural facilitation, and study of models. This left us with 13 writing treatments with four or more effects testing their effectiveness.

Reliability of this categorization process was established by having the second and third authors read and categorize all studies. There were only two disagreements with the first author. It should be noted that we decided to use a monothetic (mutually exclusive) rather than a polythetic classification scheme for two reasons: (a) most of the studies involved specific, well-defined interventions, and (b) previous attempts to use a polythetic approach to classifying writing interventions (e.g., Hillocks, 1986) have been criticized for trying to force broader schemes, such as natural or environmental teaching approaches, on a literature that is difficult to classify in this way (Stotsky, 1988).

Coding of Study Features

Each study was coded for grade, participant type (e.g., struggling writers, English language learners), genre of the posttest measure, description of treatment and control conditions, and publication type. Nine quality indicators were also coded: (a) design (random assignment with the appropriate unit of analysis; i.e., true experiment); (b) treatment fidelity was established through direct observation; (c) teacher effects controlled (e.g., random assignment of teachers); (d) more than a single teacher carried out each condition; (e) total attrition was less than 10% of total sample; (f) total attrition was less than 10%, and equal attrition across conditions was evident (i.e., conditions did not differ by more than 5%); (g) pretest equivalence of writing quality was evident in quasi-experiments (i.e., conditions did not differ by more than 1 standard deviation for the condition with the least variance); (h) pretest ceiling or floor effects were not evident for writing quality in quasi-experiments (more than 1 standard deviation from floor and ceiling); and (i) posttest ceiling or floor effects for writing quality were not evident (more than 1 standard deviation from floor and ceiling). Each quality indicator was scored as 1 (met) or 0 (not met). A total score was calculated for each study (7 possible points for true experiments and 9 possible points for quasi-experiments). This was converted to a percentage by dividing the obtained score by the total possible points and multiplying by 100%. Coding for study descriptors and quality indicators was independently completed by the second and third authors (96.2% agreement). Disagreements were resolved by reexamining the study.
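The percentage conversion described above can be sketched in a few lines. This is only an illustration: the function name `quality_score_percent` and the list-of-booleans input are hypothetical, and the text does not say which two indicators are dropped for true experiments, so the sketch simply checks that the number of coded indicators matches the stated maximum (7 or 9).

```python
def quality_score_percent(indicators_met, is_true_experiment):
    """Convert coded quality indicators (met / not met) to a percentage.

    indicators_met: list of booleans, one per applicable indicator
    (hypothetical input format, not taken from the article).
    """
    # Possible points as stated in the text: 7 for true experiments,
    # 9 for quasi-experiments.
    possible = 7 if is_true_experiment else 9
    if len(indicators_met) != possible:
        raise ValueError(f"expected {possible} indicators, got {len(indicators_met)}")
    # Each indicator is scored 1 (met) or 0 (not met).
    obtained = sum(1 for met in indicators_met if met)
    return 100.0 * obtained / possible


# A quasi-experiment meeting 6 of its 9 indicators.
coded = [True, False, True, True, False, True, True, True, False]
print(round(quality_score_percent(coded, is_true_experiment=False), 1))
```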

Calculation of ESs and Statistical Analysis

Basic procedures. ESs were calculated just for writing quality. If a holistic quality measure (a single score that measures general overall quality) was available, then the ES was calculated with this score. If only an analytic quality measure (separate scores for specific aspects of writing, such as content, organization, vocabulary, mechanics, and so forth) was available, a separate ES was computed for each aspect of writing assessed and averaged to produce a single ES. We converted analytic quality measures to a single score because halo effects (the separate scores are moderately to highly related and are best captured through a single, general factor) are evident in studies examining the reliability and validity of analytic measures (see Graham, Hebert, & Harris, 2011). We computed an ES for norm-referenced outcome measures only if they assessed quality or structure of a sample of students' writing.

An ES was calculated for true experiments by subtracting the mean score of the control group at posttest from the mean score of the treatment group at posttest and dividing by the pooled standard deviation of the two groups, so that a positive ES favors the treatment. The same procedure was used with quasi-experiments, except the mean pretest score for each group was first subtracted from its mean posttest score.
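As a rough illustration of this calculation, the sketch below computes a standardized mean difference for both designs. It rests on several assumptions not stated in the text: the sign is taken so that a positive ES favors the treatment group (consistent with the positive ESs reported in the abstract), the pooled standard deviation uses the usual degrees-of-freedom weighting, and the quasi-experimental version pools the posttest standard deviations. All function names are hypothetical.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of two groups (standard df-weighted form, assumed)."""
    return math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2))


def es_true_experiment(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Posttest difference (treatment minus control) in pooled-SD units."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


def es_quasi_experiment(pre_t, post_t, pre_c, post_c, sd_t, n_t, sd_c, n_c):
    """Difference in pretest-to-posttest gains in pooled-SD units.

    sd_t and sd_c are assumed here to be posttest standard deviations.
    """
    gain_t = post_t - pre_t
    gain_c = post_c - pre_c
    return (gain_t - gain_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


# Made-up numbers for illustration only.
print(es_true_experiment(5.0, 1.0, 20, 4.0, 1.0, 20))
print(es_quasi_experiment(3.0, 5.0, 3.0, 4.0, 1.0, 20, 1.0, 20))
```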

In some cases, ESs had to be calculated by estimating missing means and standard deviations. For a few quasi-experiments, ESs had to be calculated separately for both pretest and posttest (the quality measures were not identical). An adjusted ES was then obtained by subtracting pretest ES from posttest ES. Moreover, before calculating some ESs, it was necessary to average the performance of two or more groups in each condition (e.g., statistics were reported separately by grade) using the Nouri and Greenberg procedure (Cortina & Nouri, 2000).

All quasi-experiments where classes were assigned to treatment conditions, but student-level effects were examined, were adjusted for clustering effects with imputed intraclass correlation (ICC) estimates for reading comprehension from national studies (Hedges & Hedberg, 2007) that were adjusted to writing quality ICCs, with data from a large study of writing that involved a single grade level (Rock, 2007). In addition, it was necessary to adjust the effects for three true experiments (Glaser, Buddle, & Brunstein, 2011; Jones, 2004; Norris, Reichard, & Mokhtari, 1997) that involved cluster randomized assignment (classes were randomly assigned to treatments, and summary statistics were based on class-level data) with the imputed ICCs described above. All computed effects were adjusted for small sample size bias.
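The text does not name the small-sample correction that was applied. One widely used choice is Hedges' approximate correction factor, and the sketch below assumes that choice purely for illustration; the clustering adjustment based on imputed ICCs is not reproduced here. Function names are hypothetical.

```python
def hedges_correction(n_t, n_c):
    """Approximate small-sample correction factor J = 1 - 3 / (4 * df - 1).

    df = n_t + n_c - 2. Assumed here; the article does not specify its formula.
    """
    df = n_t + n_c - 2
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def adjust_for_small_sample(es, n_t, n_c):
    """Shrink an uncorrected standardized mean difference toward zero."""
    return es * hedges_correction(n_t, n_c)


# With 20 students per condition the correction is about 2%.
print(round(hedges_correction(20, 20), 4))
print(round(adjust_for_small_sample(1.0, 20, 20), 4))
```

The factor approaches 1 as samples grow, so the adjustment matters mainly for the smallest studies.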

Statistical analysis. This meta-analysis employed a weighted random-effects model. For each writing treatment, we calculated an average weighted ES (weighted by multiplying each ES by its inverse variance) as well as the confidence interval and statistical significance of the obtained ES. Two measures of homogeneity (Q and I²) were also calculated, allowing us to determine whether variability in the ESs for a specific writing treatment was larger than expected based on sampling error alone. When variability in ESs for a specific writing treatment exceeded sampling error alone, there were at least eight ESs, and each treatment subcategory tested involved at least four effects, we conducted moderator analysis to determine whether this excess variability could be accounted for by identifiable differences between studies (e.g., participant type).
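A minimal sketch of an inverse-variance weighted random-effects summary with Q and I² follows. The text does not state which between-study variance estimator was used; the DerSimonian-Laird estimator is assumed here only because it is a common default, and the 1.96 multiplier assumes a 95% confidence interval. The function name and return format are hypothetical.

```python
import math


def random_effects_summary(effects, variances):
    """Weighted average ES under a random-effects model (DerSimonian-Laird, assumed)."""
    k = len(effects)
    # Fixed-effect weights: inverse of each ES's sampling variance.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * es for wi, es in zip(w, effects)) / sum_w

    # Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (es - fixed_mean) ** 2 for wi, es in zip(w, effects))
    df = k - 1

    # Between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each sampling variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * es for wi, es in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2: percentage of variability beyond what sampling error predicts.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "mean": mean,
        "se": se,
        "ci": (mean - 1.96 * se, mean + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Made-up effects and variances for illustration only.
print(random_effects_summary([0.0, 1.0], [0.1, 0.1]))
```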

Finally, the ESs for each writing treatment were examined to see whether any specific ES was exerting undue influence in terms of sample size or magnitude of ES. Outliers were defined with Tukey's (1977) definition of an extreme observation as falling 3 times the interquartile range above the 75th percentile or below the 25th percentile of the distribution of all related scores. Three effects (Kozlow & Bellamy, 2004; Pritchard & Marshall, 1994; A. L. Thibodeau, 1964) exerted undue influence due to sample size and were Winsorized so that they did not exceed Tukey's definition.
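The outlier rule can be sketched as follows. This is an illustration under assumptions: quartiles are computed with Python's `statistics.quantiles` using the inclusive method (quartile definitions differ slightly across software, and the article does not say which was used), and Winsorizing is shown as pulling extreme values back to the nearest fence. In the article the three Winsorized effects were extreme in sample size, so the input values here would be sample sizes; the function names are hypothetical.

```python
import statistics


def tukey_extreme_fences(values, k=3.0):
    """Return (lower, upper) fences k * IQR beyond the 25th and 75th percentiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def winsorize(values, k=3.0):
    """Pull any value beyond the fences back to the nearest fence."""
    low, high = tukey_extreme_fences(values, k)
    return [min(max(v, low), high) for v in values]


# Made-up values for illustration only; 100 lies beyond the upper fence.
sizes = [1, 2, 3, 4, 5, 100]
print(tukey_extreme_fences(sizes))
print(winsorize(sizes))
```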

Results

Table 1 contains information on the studies testing each writing treatment. A more detailed version of Table 1 that includes additional information on the treatment and control condition in each study, genre tested at posttest, sample size of the study, publication type, and study quality score is contained in the supplemental materials. Table 2 includes the number of studies, average weighted ES, confidence interval, standard error, and statistical significance for each writing treatment as well as the two heterogeneity measures (Q and I²).

Table 1
Information on Individual Studies for Writing Treatments That Included Four or More Effect Sizes

Study                            Grade   Participant type   Effect size

Strategy instruction
Harris et al. (2006)a SRSD       2       FR                 1.89
Harris et al. (2011) SRSD        2–3     SW                 1.11
Harris & Graham (2004)a SRSD     2–3     SW                 0.67
Lane et al. (in press)a SRSD     2–3     SW                 0.68
Graham et al. (2005)a SRSD       3       SW                 1.78
Tracy et al. (2009) SRSD         3       FR                 0.25
Curry (1997) SRSD                4       SW                 0.57
Glaser et al. (2011)a SRSD       4       FR                 1.31
Glaser & Brunstein (2007) SRSD   4       FR                 1.19
Walser (2000)a                   4       FR                 0.67
Warrington (1999)                4       FR                 0.52
Englert et al. (1991)            4–5     FR, SW             0.51
Troia & Graham (2002)a           4–5     SW                 0.83
MacArthur et al. (1991) SRSD     4–6     SW                 1.26
Anderson (1997)a SRSD            5       FR, SW             1.49
Sawyer et al. (1992) SRSD        5–6     SW                 0.63
Torrance et al. (2007) SRSD      6       FR                 3.19
Fitzgerald & Markham (1987)a     6       FR                 0.31
Welch (1992)                     6       SW                 1.72
Wong et al. (2008) SRSD          6       AVG                0.64

Adding self-regulation to strategy instruction
Harris et al. (2006)a            2       SW                 0.32
Graham et al. (2005)a            3       SW                 0.13
Kurtz (1987)                     3–6     SW                 1.09
Brunstein & Glaser (2011)a       4       FR                 0.86
Glaser & Brunstein (2007)        4       FR                 0.87
Sawyer et al. (1992)             5–6     SW                 0.02

Text structure instruction
Carr et al. (1991)               2       FR                 0.94
Sinclair (2005)                  3       FR                 0.33
Riley (1997)                     3–5     FR                 0.32
Fitzgerald & Teasley (1986)a     4       SW                 0.17
Kaminski (1994)                  4       FR                 0.13
Gambrell & Chasen (1991)a        4–5     SW                 0.90
Gordon & Braun (1986)a           5       FR                 0.71
Raphael et al. (1986)            5–6     FR                 0.34
Crowhurst (1991)a                6       FR                 0.74

Creativity/imagery instruction
Jampole et al. (1991)a           3–4     HA                 0.82
Fortner (1986)                   3–6     SW                 0.83
Jampole et al. (1994)a           4–5     HA                 0.84
Stoddard (1982)                  5–6     HA                 0.23

Teaching transcription skills
Graham et al. (2000)a            1       SW                 0.54
Graham & Harris (2005)a          1       SW                 0.21
Jones (2004)                     1       FR                 1.00
Jones & Christensen (1999)       1       SW                 2.40
Rutberg (1998)                   1       SW                 0.12
Graham et al. (2002)a            2       SW                 0.12
Berninger et al. (2002)a         3       SW                 0.35
Shorter (2001)                   3       FR                 0.38

Grammar instruction
Green (1991)                     3       BLL                0.47
Anderson (1997)a                 5       FR, SW             1.49
Pantier (1999)                   5       FR                 0.21
A. E. Thibodeau (1964)           6       FR                 0.38

(table continues)