


Oklahoma School Testing Program
Oklahoma Core Curriculum Tests
Grades 3-8
Technical Report
Spring 2010 Administration
November 2010

Copyright © 2010 by the State of Oklahoma Department of Education. All rights reserved.

Table of Contents

INTRODUCTION
CHAPTER I. OVERVIEW OF THE OCCT
    1.1 Skills Assessed by the OCCT
    1.2 Test Development Procedures
    1.3 Configuration of the Tests
CHAPTER II. STATISTICAL ANALYSIS
    2.1 Data Files for Statistical Analysis
    2.2 Analysis of the Multiple-Choice Tests
        2.2.1 Classical Item Analyses
        2.2.2 Differential Item Functioning Analyses
        2.2.3 Item Calibration and Equating
        2.2.4 Raw Score to Scaled Score Conversion
        2.2.5 Test Score Reliability
            Coefficient Alpha
            Standard Error of Measurement
            Conditional Standard Error of Measurement
            Reliability of Performance-Level Classification Decisions
        2.2.6 Validity
    2.3 Analysis of the Writing Tests
        2.3.1 Prompt Scoring Formula
        2.3.2 Statistical Adjustments to Scale the Writing Scores
            Adjustment for Prompt Difficulty and Rater-Year Effects
            Adjustment for Rater-Year Effects
            A Compound Adjustment
        2.3.3 Rater Agreement for Operational Writing Prompts
CHAPTER III. STATE RESULTS
CHAPTER IV. PERFORMANCE STANDARDS
REFERENCES
APPENDIX A. DATA REVIEW RESULTS
APPENDIX B. RAW-TO-SCALED SCORE CONVERSION TABLES AND FREQUENCY DISTRIBUTIONS

List of Tables

Table 1.3.A Number of Operational and Field-test Items by Content Area and Grade
Table 1.3.B 2010 PASS Blueprint and Actual Item Counts: Grade 3 Reading
Table 1.3.C 2010 PASS Blueprint and Actual Item Counts: Grade 4 Reading
Table 1.3.D 2010 PASS Blueprint and Actual Item Counts: Grade 5 Reading
Table 1.3.E 2010 PASS Blueprint and Actual Item Counts: Grade 6 Reading
Table 1.3.F 2010 PASS Blueprint and Actual Item Counts: Grade 7 Reading
Table 1.3.G 2010 PASS Blueprint and Actual Item Counts: Grade 8 Reading
Table 1.3.H 2010 PASS Blueprint and Actual Item Counts: Grade 3 Mathematics
Table 1.3.I 2010 PASS Blueprint and Actual Item Counts: Grade 4 Mathematics
Table 1.3.J 2010 PASS Blueprint and Actual Item Counts: Grade 5 Mathematics
Table 1.3.K 2010 PASS Blueprint and Actual Item Counts: Grade 6 Mathematics
Table 1.3.L 2010 PASS Blueprint and Actual Item Counts: Grade 7 Mathematics
Table 1.3.M 2010 PASS Blueprint and Actual Item Counts: Grade 8 Mathematics
Table 1.3.N.1 2010 PASS Blueprint and Actual Item Counts: Grade 5 Science Process Standards
Table 1.3.N.2 2010 PASS Blueprint and Actual Item Counts: Grade 5 Science Content Standards
Table 1.3.O.1 2010 PASS Blueprint and Actual Item Counts: Grade 8 Science Process Standards
Table 1.3.O.2 2010 PASS Blueprint and Actual Item Counts: Grade 8 Science Content Standards
Table 1.3.P 2010 PASS Blueprint and Actual Item Counts: Grade 5 Social Studies
Table 1.3.Q 2010 PASS Blueprint and Actual Item Counts: Grade 7 Geography
Table 1.3.R 2010 PASS Blueprint and Actual Item Counts: Grade 8 U.S. History
Table 2.2.A Classical Item and Test Analyses Summary for the Operational Forms
Table 2.2.B Item Analyses Summary for Field-test Items
Table 2.2.C Number of Anchor Items by Subject and Grade
Table 2.2.D Conditional Standard Errors of Measurement for Each Achievement Level Cut Score
Table 2.2.E Estimates of the Reliability of Decisions for Specified Cut Scores
Table 2.2.F Standards Intercorrelation: Grade 3 Reading
Table 2.2.G Standards Intercorrelation: Grade 4 Reading
Table 2.2.H Standards Intercorrelation: Grade 5 Reading
Table 2.2.I Standards Intercorrelation: Grade 6 Reading
Table 2.2.J Standards Intercorrelation: Grade 7 Reading
Table 2.2.K Standards Intercorrelation: Grade 8 Reading
Table 2.2.L Standards Intercorrelation: Grade 3 Mathematics
Table 2.2.M Standards Intercorrelation: Grade 4 Mathematics
Table 2.2.N Standards Intercorrelation: Grade 5 Mathematics
Table 2.2.O Standards Intercorrelation: Grade 6 Mathematics
Table 2.2.P Standards Intercorrelation: Grade 7 Mathematics
Table 2.2.Q Standards Intercorrelation: Grade 8 Mathematics
Table 2.2.R Standards Intercorrelation: Grade 5 Science
Table 2.2.S Standards Intercorrelation: Grade 8 Science
Table 2.2.T Standards Intercorrelation: Grade 5 Social Studies
Table 2.2.U Standards Intercorrelation: Grade 7 Geography
Table 2.2.V Standards Intercorrelation: Grade 8 U.S. History
Table 2.3.A Weights Assigned to Writing Analytic Traits
Table 2.3.B Scaled Score Ranges for Each Achievement Level
Table 2.3.C Sample Means and Standard Deviations Used for Calculating Constants
Table 2.3.D Grade 5 Writing Results
Table 2.3.E Grade 8 Writing Results
Table 2.3.F Inter-rater Agreement for Operational Writing Prompts
Table 3.1 Means and Standard Deviations of Students' Raw Scores and Scaled Scores
Table 3.2 Percentage of Students Performing within Each Achievement Category in 2008 to 2010
Table 3.3 Subgroup Results: Grade 3 Reading
Table 3.4 Subgroup Results: Grade 4 Reading
Table 3.5 Subgroup Results: Grade 5 Reading
Table 3.6 Subgroup Results: Grade 6 Reading
Table 3.7 Subgroup Results: Grade 7 Reading
Table 3.8 Subgroup Results: Grade 8 Reading
Table 3.9 Subgroup Results: Grade 3 Mathematics
Table 3.10 Subgroup Results: Grade 4 Mathematics
Table 3.11 Subgroup Results: Grade 5 Mathematics
Table 3.12 Subgroup Results: Grade 6 Mathematics
Table 3.13 Subgroup Results: Grade 7 Mathematics
Table 3.14 Subgroup Results: Grade 8 Mathematics
Table 3.15 Subgroup Results: Grade 5 Science
Table 3.16 Subgroup Results: Grade 8 Science
Table 3.17 Subgroup Results: Grade 5 Social Studies
Table 3.18 Subgroup Results: Grade 7 Geography
Table 3.19 Subgroup Results: Grade 8 History
Table 3.20 Subgroup Results: Grade 5 Writing
Table 3.21 Subgroup Results: Grade 8 Writing
Table 4.1 Final Scaled Score Ranges for Reading and Mathematics
Table A Data Review Results
Table B.1 Raw-to-Scaled Score Table and Frequency Distribution: Grade 3 Reading
Table B.2 Raw-to-Scaled Score Table and Frequency Distribution: Grade 4 Reading
Table B.3 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 Reading
Table B.4 Raw-to-Scaled Score Table and Frequency Distribution: Grade 6 Reading
Table B.5 Raw-to-Scaled Score Table and Frequency Distribution: Grade 7 Reading
Table B.6 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 Reading
Table B.7 Raw-to-Scaled Score Table and Frequency Distribution: Grade 3 Mathematics
Table B.8 Raw-to-Scaled Score Table and Frequency Distribution: Grade 4 Mathematics
Table B.9 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 Mathematics
Table B.10 Raw-to-Scaled Score Table and Frequency Distribution: Grade 6 Mathematics
Table B.11 Raw-to-Scaled Score Table and Frequency Distribution: Grade 7 Mathematics
Table B.12 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 Mathematics
Table B.13 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 Science
Table B.14 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 Science
Table B.15 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 Social Studies
Table B.16 Raw-to-Scaled Score Table and Frequency Distribution: Grade 7 Geography
Table B.17 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 U.S. History
Table B.18 Composite Score Frequency Distribution: Grade 5 Writing
Table B.19 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 Writing

INTRODUCTION

The Oklahoma Core Curriculum Test (OCCT) is a component of the Oklahoma School Testing Program (OSTP) administered in Grades 3 through 8. It is a transparent, standards-based, criterion-referenced assessment system designed to monitor student achievement of the Oklahoma Priority Academic Student Skills (PASS) adopted by the Oklahoma State Board of Education. Currently, the OCCT includes direct Writing assessments in Grades 5 and 8 and Multiple-Choice (MC) assessments of Reading and Mathematics in Grades 3 through 8; Science in Grades 5 and 8; Social Studies in Grade 5; Geography in Grade 7; and U.S. History, Constitution, and Government in Grade 8.

In 2010, the OCCT was administered during the spring. The Writing assessments were administered on February 24th and March 3rd. The Grade 7 Geography and Grade 8 Reading and Mathematics tests were administered online during an April 12th to May 14th window. The remaining tests were administered during the MC testing window of April 12th to May 7th. The Writing and Multiple-Choice administration windows were extended in 2010 as a result of inclement weather in the state.

This technical report outlines the statistical analyses that were carried out in support of the 2010 OCCT. Chapter I provides an overview of the test content and design. Chapter II details the statistical procedures that were carried out in support of the OCCT, including preliminary item analyses, differential item functioning analyses, calibration and equating, and several additional analyses. Chapter III presents statewide test results. Chapter IV describes the performance standard setting process and results. Two appendices are provided: Appendix A presents the data review results, and Appendix B presents the raw score to scaled score (RS-SS) conversion tables and frequency distributions by grade.

The technical information provided in this report is intended for use by all who are interested in how the test is evaluated, how the scores are interpreted, and the subsequent educational decisions based on the test results. It is assumed that the reader has technical knowledge of test construction and measurement procedures, as described in the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999).

CHAPTER I. OVERVIEW OF THE OCCT

The purpose of the Oklahoma Core Curriculum Test (OCCT) is to fulfill accountability requirements and to provide feedback about student mastery of the knowledge and skills delineated in the Oklahoma Priority Academic Student Skills (PASS) standards.
In the spring of 2010, the OCCT assessments were administered to all eligible public school students in Grades 3 through 8. The OCCT includes assessments of Reading and Mathematics in Grades 3 through 8; Writing and Science in Grades 5 and 8; Social Studies in Grade 5; Geography in Grade 7; and U.S. History, Constitution, and Government in Grade 8. All tests were designed to measure the Oklahoma Priority Academic Student Skills (PASS) adopted by the Oklahoma State Board of Education. The 2010 administration of the OCCT was the sixteenth for students in Grades 5 and 8 and the sixth for students in Grades 3, 4, and 7 (Geography only). This was the fifth operational administration of the Reading and Mathematics tests in Grades 6 and 7.

Data Recognition Corporation (DRC) worked with the Oklahoma State Department of Education (SDE) to construct OCCT test forms aligned to the PASS standards. Each test form included a set of operational items used to produce student test scores and a set of embedded field-test items. The Writing assessments included one extended constructed-response (CR) item. The Reading, Mathematics, Science, Social Studies, Geography, and History assessments were composed of Multiple-Choice (MC) items only. For each content area and grade, there were eight forms consisting of a common set of operational items and a unique set of 10 field-test items. Responses to the operational items were used to produce student scores. Responses to the field-test items were used to evaluate the psychometric properties of these newly developed items for possible inclusion on future forms.

The OCCT is an untimed test. The MC tests in Grades 3-5 were administered in two sessions. The Writing tests in Grades 5 and 8 and the MC tests in Grades 6-8 were administered in one session. With the exception of Grade 7 Geography and Grade 8 Reading and Mathematics, all assessments were administered as paper-and-pencil tests. The Grade 7 Geography and Grade 8 Reading and Mathematics assessments were delivered primarily online, with paper forms available for accommodated administrations and for make-ups. In the following sections, more information is provided on the skills assessed by the OCCT, test development procedures, and the configuration of the tests.

1.1 Skills Assessed by the OCCT

The standards assessed by each test can be found on the Oklahoma State Department of Education website. The OCCT measures all PASS standards except for objectives that cannot be appropriately measured within the limitations of a large-scale, Multiple-Choice test. For example, the majority of PASS standards involving listening, reviewing, and similar skills are not measured in the ELA assessment. Standards not measured by the OCCT must be assessed by local school districts.

1.2 Test Development Procedures

The items used in the operational 2010 OCCT were selected from the SDE-owned pool of items. All items selected had previously been reviewed and approved by Oklahoma content, bias, and sensitivity review committees. These operational items had been field-tested during previous administrations, and the field-test statistics for these items indicated that they were of acceptable quality.

For the field-test items embedded in the OCCT, DRC assessment specialists selected field-test-ready items from the SDE's item bank, as well as items newly developed for 2010. A total of 80 items per content area and grade were selected for embedded field testing.
1.3 Configuration of the Tests

Table 1.3.A shows the number of operational and field-test items by content area and grade used in the 2010 operational tests. Also shown is the number of operational items included in the anchor set used for equating the 2010 forms to the previously established reporting scale. For all Multiple-Choice tests, each form contained a core set of operational items common across forms and a unique set of field-test items.

Table 1.3.A Number of Operational and Field-test Items by Content Area and Grade

Content Area | Grade | Number of Forms | Operational Items per Form(a) | Operational Items in Anchor Set(a) | Field-test Items per Form | Total Items per Form | Total Field-test Items per Grade
Reading | 3-8 | 8 | 50 | 14-18(b) | 10 | 60 | 80
Writing | 5, 8 | 1 | 1 | 0 | 0 | 1 | 0
Mathematics | 3-8 | 8 | 45 | 17-20(c) | 10 | 55 | 80
Science | 5, 8 | 8 | 45 | 16, 14 | 10 | 55 | 80
Social Studies | 5 | 8 | 60 | 20 | 10 | 70 | 80
Geography | 7 | 8 | 45 | 15 | 10 | 55 | 80
U.S. History | 8 | 8 | 45 | 15 | 10 | 55 | 80

a Operational item counts include anchor items.
b Anchor counts for Reading tests were 17, 18, 16, 14, 17, and 15 in Grades 3 through 8, respectively.
c Anchor counts for Mathematics tests were 20, 19, 18, 18, 17, and 17 in Grades 3 through 8, respectively.

Tables 1.3.B through 1.3.R provide information drawn from the official 2010 test PASS blueprints. These tables show the number of items by content standard specified in the blueprints and the number of items that appeared on the 2010 operational assessments.

Table 1.3.B 2010 PASS Blueprint and Actual Item Counts: Grade 3 Reading

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Vocabulary | 12 | 12 | 24%
    Words in Context (2.1) | 2-4 | 3
    Affixes, Roots, and Stems (2.2) | 2-4 | 3
    Synonyms, Antonyms, and Homonyms (2.3) | 2-4 | 2
    Using Resource Materials (2.4) | 2-4 | 4
Comprehension/Critical Literacy | 24 | 24 | 48%
    Literal Understanding (4.1) | 5 | 5
    Inferences and Interpretation (4.2) | 7 | 7
    Summary and Generalization (4.3) | 6 | 6
    Analysis and Evaluation (4.4) | 6 | 6
Literature | 8 | 8 | 16%
    Literary Elements (5.2) | 4 | 2
    Figurative Language/Sound Devices (5.3) | 4 | 6
Research and Information | 6 | 6 | 12%
    Accessing Information (6.1) | 6 | 6
Total Test | 50 | 50 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.C 2010 PASS Blueprint and Actual Item Counts: Grade 4 Reading

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Vocabulary | 12 | 12 | 24%
    Words in Context (1.1) | 4 | 3
    Affixes, Roots, and Stems (1.2) | 4 | 5
    Synonyms, Antonyms, and Homonyms (1.3) | 4 | 4
Comprehension/Critical Literacy | 23 | 23 | 46%
    Literal Understanding (3.1) | 4 | 4
    Inferences and Interpretation (3.2) | 6 | 6
    Summary and Generalization (3.3) | 7 | 7
    Analysis and Evaluation (3.4) | 6 | 6
Literature | 9 | 9 | 18%
    Literary Elements (4.2) | 5 | 5
    Figurative Language/Sound Devices (4.3) | 4 | 4
Research and Information | 6 | 6 | 12%
    Accessing Information (5.1) | 6 | 6
Total Test | 50 | 50 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.D 2010 PASS Blueprint and Actual Item Counts: Grade 5 Reading

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Vocabulary | 12 | 12 | 24%
    Words in Context (1.1) | 4 | 4
    Affixes, Roots, and Stems (1.2) | 4 | 4
    Synonyms, Antonyms, and Homonyms (1.3) | 4 | 4
Comprehension/Critical Literacy | 20 | 20 | 40%
    Literal Understanding (3.1) | 4 | 4
    Inferences and Interpretation (3.2) | 4-6 | 6
    Summary and Generalization (3.3) | 4-6 | 5
    Analysis and Evaluation (3.4) | 4-6 | 5
Literature | 12 | 12 | 24%
    Literary Genre (4.1) | 4 | 4
    Literary Elements (4.2) | 4 | 4
    Figurative Language/Sound Devices (4.3) | 4 | 4
Research and Information | 6 | 6 | 12%
    Accessing Information (5.1) | 2-4 | 4
    Interpreting Information (5.2) | 2-4 | 2
Total Test | 50 | 50 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.E 2010 PASS Blueprint and Actual Item Counts: Grade 6 Reading

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Vocabulary | 8 | 8 | 16%
    Words in Context (1.1) | 4 | 4
    Word Origins (1.2) | 4 | 4
Comprehension/Critical Literacy | 20 | 20 | 40%
    Literal Understanding (3.1) | 4 | 4
    Inferences and Interpretation (3.2) | 4-6 | 6
    Summary and Generalization (3.3) | 4-6 | 6
    Analysis and Evaluation (3.4) | 4-6 | 4
Literature | 14 | 14 | 28%
    Literary Genres (4.1) | 4 | 4
    Literary Elements (4.2) | 4-6 | 5
    Figurative Language/Sound Devices (4.3) | 4-6 | 5
Research and Information | 8 | 8 | 16%
    Accessing Information (5.1) | 4 | 4
    Interpreting Information (5.2) | 4 | 4
Total Test | 50 | 50 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.F 2010 PASS Blueprint and Actual Item Counts: Grade 7 Reading

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Vocabulary | 10 | 10 | 20%
    Words in Context (1.1) | 3-4 | 4
    Word Origins (1.2) | 3-4 | 3
    Idioms and Comparisons (1.3) | 3-4 | 3
Comprehension/Critical Literacy | 20 | 20 | 40%
    Literal Understanding (3.1) | 4 | 4
    Inferences and Interpretation (3.2) | 4-6 | 6
    Summary and Generalization (3.3) | 4-6 | 6
    Analysis and Evaluation (3.4) | 4-6 | 4
Literature | 12 | 12 | 24%
    Literary Genres (4.1) | 4 | 4
    Literary Elements (4.2) | 4 | 4
    Figurative Language/Sound Devices (4.3) | 4 | 4
Research and Information | 8 | 8 | 16%
    Accessing Information (5.1) | 4 | 4
    Interpreting Information (5.2) | 4 | 4
Total Test | 50 | 50 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.
Table 1.3.G 2010 PASS Blueprint and Actual Item Counts: Grade 8 Reading

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Vocabulary | 6 | 6 | 12%
    Words in Context (1.1) | 2 | 3
    Word Origins (1.2) | 2 | 1
    Idioms and Comparisons (1.3) | 2 | 2
Comprehension/Critical Literacy | 21 | 21 | 42%
    Literal Understanding (3.1) | 4 | 4
    Inferences and Interpretation (3.2) | 4-6 | 6
    Summary and Generalization (3.3) | 5-7 | 5
    Analysis and Evaluation (3.4) | 6-8 | 6
Literature | 15 | 15 | 30%
    Literary Genre (4.1) | 4 | 4
    Literary Elements (4.2) | 5-7 | 5
    Figurative Language/Sound Devices (4.3) | 4-6 | 6
Research and Information | 8 | 8 | 16%
    Accessing Information (5.1) | 4 | 4
    Interpreting Information (5.2) | 4 | 4
Total Test | 50 | 50 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.H 2010 PASS Blueprint and Actual Item Counts: Grade 3 Mathematics

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Patterns and Algebraic Reasoning | 8 | 8 | 18%
    Algebra Patterns (1.1) | 4 | 4
    Problem Solving (1.2) | 4 | 4
Number Sense | 7 | 7 | 16%
    Place Value (2.1) | 3-4 | 4
    Whole Numbers and Fractions (2.2) | 3-4 | 3
Number Operations and Computation | 12 | 12 | 27%
    Estimation (3.1) | 4 | 4
    Multiplication (3.2) | 4 | 4
    Money Problems (3.3) | 4 | 4
Geometry and Measurement | 12 | 12 | 27%
    Spatial Reasoning (4.1) | 4 | 4
    Measurement (4.2) | 4 | 4
    Time and Temperature (4.4) | 4 | 4
Data Analysis and Probability | 6 | 6 | 13%
    Data Analysis (5.1) | 2-4 | 4
    Probability (5.2) | 2-4 | 2
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.I 2010 PASS Blueprint and Actual Item Counts: Grade 4 Mathematics

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Patterns and Algebraic Reasoning | 8 | 8 | 18%
    Algebra Patterns (1.1) | 4 | 4
    Functions (1.2) | 4 | 4
Number Sense | 10 | 10 | 22%
    Place Value (2.1) | 4 | 4
    Whole Numbers and Decimals (2.2) | 2-4 | 3
    Fractions (2.3) | 2-4 | 3
Number Operations and Computation | 11 | 11 | 24%
    Multiplication (3.1) | 2-4 | 3
    Division (3.2) | 2-4 | 4
    Estimation (3.3) | 4-5 | 4
Geometry and Measurement | 10 | 10 | 22%
    Lines and Angles (4.1) | 2-4 | 4
    Spatial Reasoning (4.3) | 2-4 | 2
    Measurement (4.4) | 4 | 4
Data Analysis and Probability | 6 | 6 | 13%
    Data Analysis (5.1) | 6 | 6
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.J 2010 PASS Blueprint and Actual Item Counts: Grade 5 Mathematics

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Patterns and Algebraic Reasoning | 8 | 8 | 18%
    Algebra Patterns (1.1) | 4 | 4
    Problem Solving (1.2) | 4 | 4
Number Sense | 8 | 8 | 18%
    Fractions/Decimals/Percents (2.1) | 4 | 4
    Number Theory (2.2) | 4 | 4
Number Operations and Computation | 8 | 8 | 18%
    Estimation (3.1) | 4 | 4
    Whole Numbers/Decimals/Fractions (3.2) | 4 | 4
Geometry and Measurement | 12 | 12 | 27%
    Geometric Figure Properties (4.1) | 4 | 4
    Perimeter/Area (4.2) | 4 | 4
    Convert Measurements (4.5) | 4 | 4
Data Analysis and Probability | 9 | 9 | 20%
    Data Analysis (5.1) | 5 | 5
    Probability (5.2) | 4 | 4
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.K 2010 PASS Blueprint and Actual Item Counts: Grade 6 Mathematics

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Algebraic Reasoning | 10 | 10 | 22%
    Patterns (1.1) | 5 | 5
    Order of Operations (1.2) | 5 | 5
Number Sense | 13 | 13 | 29%
    Multiply/Divide Fractions (2.1) | 2-3 | 2
    Decimals (2.2) | 2-3 | 3
    Estimation (2.3) | 4 | 4
    Expressions (2.5) | 4 | 4
Geometry | 6 | 6 | 13%
    Angles (3.1) | 2-4 | 4
    Congruent and Similar Figures (3.2) | 2-4 | 2
Measurement | 7 | 7 | 16%
    Compare/Convert Units (4.2) | 3-4 | 3
    Estimate Measurements (4.3) | 3-4 | 4
Data Analysis and Statistics | 9 | 9 | 20%
    Collect/Organize/Interpret Data (5.1) | 2-3 | 2
    Construct/Interpret Graphs (5.2) | 2-3 | 3
    Median/Mode (5.3) | 4 | 4
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.L 2010 PASS Blueprint and Actual Item Counts: Grade 7 Mathematics

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Algebraic Reasoning | 8 | 8 | 18%
    Properties (1.1) | 4 | 4
    Linear Equations (1.2) | 4 | 4
Number Sense | 12 | 12 | 27%
    Integers (2.1) | 4 | 4
    Ratio/Proportion/Percent (2.2) | 4 | 4
    Exponents (2.3) | 4 | 4
Geometry | 9 | 9 | 20%
    Geometric Figures (3.1) | 2-3 | 3
    Angles (3.2) | 2-3 | 2
    Coordinate System (3.3) | 4 | 4
Measurement | 7 | 7 | 16%
    Area and Perimeter (4.1) | 2-4 | 3
    Customary/Metric Measurements (4.2) | 2-4 | 4
Data Analysis and Probability | 9 | 9 | 20%
    Outcomes/Simple Probability (5.1) | 4 | 4
    Probability with Or, And, or Not (5.2) | 2-3 | 3
    Combinations/Permutations (5.3) | 2-3 | 2
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.
Table 1.3.M 2010 PASS Blueprint and Actual Item Counts: Grade 8 Mathematics

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Algebraic Reasoning | 9 | 9 | 20%
    Equations (1.1) | 5 | 5
    Inequalities (1.2) | 4 | 4
Number Sense | 8 | 8 | 18%
    Rational Numbers/Proportions (2.1) | 4 | 4
    Exponents (2.2) | 4 | 4
Geometry | 8 | 8 | 18%
    Classify Solids (3.1) | 4 | 4
    Pythagorean Theorem (3.2) | 4 | 4
Measurement | 12 | 12 | 27%
    Estimate Surface Area/Volume (4.1) | 4 | 4
    Similar Figures (4.2) | 4 | 4
    Formulas (4.3) | 4 | 4
Data Analysis and Statistics | 8 | 8 | 18%
    Data Representation (5.1) | 4 | 4
    Central Tendency (5.2) | 4 | 4
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.N.1 2010 PASS Blueprint and Actual Item Counts: Grade 5 Science Process Standards

PASS Process Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Observe and Measure | 10 | 10 | 22%
    SI Metric (P1.1) | 5 | 5
    Similar/different characteristics (P1.2) | 5 | 5
Classify | 10 | 10 | 22%
    Observable properties (P2.1) | 5 | 5
    Serial order (P2.2) | 5 | 5
Experiment | 11 | 11 | 24%
    Experimental design (P3.2) | 7 | 7
    Hazards/practice safety (P3.4) | 4 | 4
Interpret and Communicate | 14 | 14 | 31%
    Data tables/line/bar/trend and circle graphs (P4.2) | 6 | 6
    Prediction based on data (P4.3) | 4 | 4
    Explanations based on data (P4.4) | 4 | 4
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.N.2 2010 PASS Blueprint and Actual Item Counts: Grade 5 Science Content Standards

PASS Content Standards & Objectives | Ideal Number of Items for Alignment to PASS | Actual Number of Items on 2010 Test | Ideal Percentage of Items
Properties of Matter and Energy | 18 | 18 | 44%
    Matter has physical properties (1.1) | 6 | 6
    Physical properties can be measured (1.2) | 6 | 6
    Energy can be transferred (1.3) | 6 | 6
Organisms and Environments | 12 | 12 | 29%
    Dependence upon community (2.1) | 6 | 6
    Individual organism and species survival (2.2) | 6 | 6
Structures of the Earth and the Solar System | 11 | 11 | 27%
    Weather patterns (3.2) | 6 | 6
    Earth as a planet (3.3) | 5 | 5
Total Test | 41* | 41* | 100%**

* Safety items are not included within the content blueprint.
** The ideal percents are based on the total number of items on a test that are matched to the content standards and do not include items added for safety.

Table 1.3.O.1 2010 PASS Blueprint and Actual Item Counts: Grade 8 Science Process Standards

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Observe and Measure | 8 | 8 | 18%
    Qualitative/quantitative observations/changes (P1.1) | 4 | 4
    SI (metrics) units/appropriate tools (P1.2 and P1.3) | 4 | 4
Classify | 8 | 8 | 18%
    Classification system (P2.1) | 4 | 4
    Properties ordered (P2.2) | 4 | 4
Experiment | 16 | 16 | 36%
    Experimental design (P3.2) | 6 | 6
    Identify variables (P3.3) | 6 | 6
    Hazards/practice safety (P3.6) | 4 | 4
Interpret and Communicate | 13 | 13 | 29%
    Data tables/line/bar/trend and circle graphs (P4.2) | 7 | 7
    Explanations/prediction (P4.3) | 6 | 6
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.O.2 2010 PASS Blueprint and Actual Item Counts: Grade 8 Science Content Standards

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS | Actual Number of Items on 2010 Test | Ideal Percentage of Items
Properties and Chemical Changes in Matter | 7-8 | 8 | 19%
    Chemical reactions (1.1) | 3-4 | 4
    Conservation of matter (1.2) | 3-4 | 4
Motion and Forces | 8 | 8 | 20%
    Motion of an object (2.1) | 4 | 4
    Object subjected to a force (2.2) | 4 | 4
Diversity and Adaptations of Organisms | 9 | 9 | 22%
    Classification (3.1) | 5 | 5
    Internal and external structures (3.2) | 4 | 4
Structures/Forces of the Earth/Solar System | 8 | 8 | 20%
    Landforms result from constructive and destructive forces (4.1) | 4 | 4
    Rock cycle (4.2) | 4 | 4
Earth's History | 7-8 | 8 | 19%
    Catastrophic events (5.1) | 3-4 | 4
    Fossil evidence (5.2) | 3-4 | 4
Total Test | 41* | 41 | 100%**

* Safety items are not included within the content blueprint.
** The ideal percents are based on the total number of items on a test that are matched to the content standards and do not include items added for safety.

Table 1.3.P 2010 PASS Blueprint and Actual Item Counts: Grade 5 Social Studies

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Early Exploration | 8 | 8 | 13%
    Expeditions (2.1) | 4 | 4
    Native American Reaction (2.2) | 4 | 4
Colonial America | 12 | 12 | 20%
    Settlements and Migration (3.1) | 4 | 4
    Colonial Life (3.2) | 4 | 4
    Individuals and Groups (3.3) | 4 | 4
American Revolution | 12 | 12 | 20%
    Causes and Results (4.1) | 4 | 4
    Declaration of Independence (4.3) | 4 | 4
    Individuals (4.4) | 4 | 4
Early Federal Period | 8 | 8 | 13%
    Constitutional Provisions (5.2) | 4 | 4
    Ratification and Rights (5.3) | 4 | 4
Geographic Skills | 20 | 20 | 33%
    Maps/Charts/Graphs Usage (7.1) | 7 | 7
    Human/Environment Interaction (7.2) | 5 | 5
    Historical Places (7.3) | 4 | 4
    Westward Movement (7.4) | 4 | 4
Total Test | 60 | 60 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.Q 2010 PASS Blueprint and Actual Item Counts: Grade 7 Geography

PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Geographic Tools | 4 | 4 | 9%
    Map Concepts (1.2) | 4 | 4
Regions | 12 | 12 | 27%
    Regional Characteristics (2.1) | 4 | 4
    Conflict/Cooperation (2.2) | 4 | 4
    Locations (2.4) | 4 | 4
Physical Systems | 8 | 8 | 18%
    Climate/Weather (3.2) | 4 | 4
    Natural Disasters (3.3) | 4 | 4
Human Systems | 8 | 8 | 18%
    World Cultures (4.1) | 4 | 4
    Population Issues (4.5) | 4 | 4
Human/Environment Interaction | 8 | 8 | 18%
    Natural Resources (5.1) | 4 | 4
    Human Modification (5.2) | 4 | 4
Geography Skills | 5 | 5 | 11%
    Maps/Charts/Graphs (6.1) | 5 | 5
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

Table 1.3.R 2010 PASS Blueprint and Actual Item Counts: Grade 8 U.S. History
PASS Standards & Objectives | Ideal Number of Items for Alignment to PASS* | Actual Number of Items on 2010 Test | Ideal Percentage of Items**
Social Studies Process Skills (1.0) | 6 | 6 | 13%
Causes of the American Revolution (3.0) | 5 | 5 | 11%
Results of the American Revolution (4.0) | 5 | 5 | 11%
Governing Documents/Early Federal Period (5.0) | 6 | 6 | 13%
Northern/Southern Economic Growth (6.0) | 4 | 4 | 9%
Jacksonian Era (7.0) | 4 | 4 | 9%
Cultural Growth and Reform (8.0) | 4 | 4 | 9%
Westward Movement (9.0) | 6 | 6 | 13%
Eve of War (10.0) | 5 | 5 | 11%
Total Test | 45 | 45 | 100%

* A minimum of 4 items is required to report results for an objective, and a minimum of 6 items is required to report a standard. While the actual numbers of items on the test may not match the blueprint exactly, each future test will move toward closer alignment with the ideal blueprint.
** Percents are approximations and may result in a sum other than 100 due to rounding.

CHAPTER II. STATISTICAL ANALYSIS

This chapter provides an overview of the research and statistical analyses carried out for the 2010 administration of the OCCT. Following the administration of the OCCT, student demographic and item response data were transmitted to DRC's Psychometric Services (PS) department. PS staff are responsible for analyzing the OCCT test data and producing the scoring tables used for reporting.

The analyses of the test data can be broken down into several components: 1) classical item analyses; 2) differential item functioning (DIF) analyses; 3) reliability analyses; 4) calibration and equating; 5) production of scoring tables; and 6) validity analyses. In the following sections, the analysis procedures for each component are described. Separate sections are provided for the Multiple-Choice tests and the Writing tests.

2.1 Data Files for Statistical Analysis

Preliminary item and DIF analyses and the final calibration/equating for the Multiple-Choice tests were conducted using early-return sample data consisting of approximately 50% of the examinee population. The final item and DIF analyses were conducted using data files that contained 100% of the student data. For the analysis of the Writing tests, the rater-year effect analysis was conducted on a sample (n = 510) of responses randomly drawn from the 2007 administration and rescored in 2010. All other Writing analyses were conducted using the final population data file.

2.2 Analysis of the Multiple-Choice Tests

2.2.1 Classical Item Analyses

Classical item analyses were conducted using DRC's software, iTEMs (DRC, 2010). The analyses involved computing a set of statistics based on classical test theory for every item in each form. Each statistic is designed to provide empirical information about the characteristics of each item. The statistics estimated for OCCT items are described below.

Classical Item Difficulty ("p-value"): This statistic indicates the proportion of examinees in the sample that answered the item correctly. Desired p-values generally fall within the range of 0.30 to 0.90. Occasionally, items that fall outside this range can be justified for inclusion in an assessment based upon other quality indicators (e.g., an adequate point-biserial), the educational importance of the item's content, or the need to better measure students with very high or low achievement.

Item Discrimination ("point-biserial"): This statistic describes the relationship between performance on the specific item and performance on the entire form.
Estimated as the correlation between the item score and the total test score, it indicates the extent to which test takers with high test scores tend to answer the item correctly and those with low scores tend to answer incorrectly. The point-biserial correlation for item i is given by

r_{pb_i} = \frac{\bar{X}_{+} - \bar{X}}{s_X} \sqrt{\frac{p_i}{1 - p_i}},

where \bar{X}_{+} is the mean score for those students who answered item i correctly, \bar{X} and s_X are the mean score and standard deviation for the test form, and p_i is the item difficulty for item i. Items with negative or low correlations can indicate problems with the item (e.g., an incorrect key, multiple correct answers, or unusually complex content), or can indicate that students have not been taught the content.

Examination of Empirical Item Response Curves (EIRC): iTEMs provides graphical displays of student performance on each item. In the MC item plots, the x-axis represents the criterion score level (the total number-correct score) and the y-axis represents the percentage of examinees choosing each response option. Each response option is plotted, showing the percentage of examinees that chose that particular option at each ability level. One would expect the curve for the correct option to increase as ability level increases. These graphs were reviewed by DRC psychometricians.

Percentage of Students Omitting or Not Reaching an Item: This statistic is useful for identifying issues related to testing time and item/test layout. Testing time issues do not exist for the OCCT because the tests are untimed. However, if the omit percentage is greater than 5% for a single item, this could be an indication of an item/test layout problem. For example, students might accidentally skip an item that follows a lengthy stem.

For the OCCT operational and field-test analyses, a series of flags was created to automatically identify items with performance characteristics that are at times considered unusual. The following flagging criteria were applied to all items tested in spring 2010:
- p-value less than .30 or greater than .90;
- the percentage choosing an incorrect option is equal to or greater than the percentage choosing the correct option;
- the percentage of students selecting any one of the incorrect options is larger than 30%;
- the point-biserial correlation for the correct answer is less than .30;
- any of the incorrect answer options (distractors) has a positive point-biserial;
- the percentage of test takers omitting the item is greater than 5%.

After the operational administration, the early-return data were used to conduct preliminary analyses to verify the accuracy of the scoring keys and to obtain an early indication of how items were functioning. Content specialists examined all flagged items to ensure that the items were correctly keyed. Upon receipt of the complete student data, the items were further scrutinized during a final round of classical item analysis. After content specialists' review and verification, item statistics were prepared for uploading to the item bank.

Summary statistics describing the difficulty and discrimination of items comprising the operational forms are given in Table 2.2.A. Results are combined across test forms for a given grade and content area because the operational items were the same for all test forms. Differential item functioning (DIF) flags and reliability indices (alpha, SEM, and stratified alpha) are also provided. These statistics are described in the sections that follow.
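The item statistics and flagging rules above can be expressed compactly in code. The following is a minimal sketch, not DRC's iTEMs software, assuming `responses` is a students-by-items matrix of 0/1 scored answers and `omits` is a parallel 0/1 matrix marking omitted items; distractor-level flags would additionally require option-level response data.

```python
import numpy as np

def classical_item_stats(responses, omits):
    """p-values, point-biserials, and 2010-style flags for a scored 0/1 matrix."""
    n_students, n_items = responses.shape
    total = responses.sum(axis=1)                 # criterion (number-correct) score
    mean_all, sd_all = total.mean(), total.std()  # form mean and SD
    p = responses.mean(axis=0)                    # classical difficulty (p-value)
    pbis = np.zeros(n_items)
    flags = {}
    for i in range(n_items):
        mean_plus = total[responses[:, i] == 1].mean()   # mean score, item i correct
        pbis[i] = (mean_plus - mean_all) / sd_all * np.sqrt(p[i] / (1 - p[i]))
        item_flags = []
        if p[i] < 0.30 or p[i] > 0.90:
            item_flags.append("p-value outside .30-.90")
        if pbis[i] < 0.30:
            item_flags.append("point-biserial below .30")
        if omits[:, i].mean() > 0.05:
            item_flags.append("omit rate above 5%")
        # Distractor-level flags (a popular or positively discriminating
        # incorrect option) would require option-level response data.
        if item_flags:
            flags[i] = item_flags
    return p, pbis, flags
```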
Table 2.2.A Classical Item and Test Analyses Summary for the Operational Forms

Content Area | Grade | N Items | Alpha | SEM | Stratified Alpha | p-value Mean | p-value SD | Pt-Biserial Mean | Pt-Biserial SD | Flagged: Stats(a) | Flagged: DIF
Reading | 3 | 50 | 0.89 | 2.90 | 0.89 | 0.69 | 0.14 | 0.40 | 0.07 | 10 | 0
Reading | 4 | 50 | 0.89 | 2.83 | 0.89 | 0.71 | 0.14 | 0.40 | 0.07 | 7 | 0
Reading | 5 | 50 | 0.89 | 2.70 | 0.89 | 0.74 | 0.14 | 0.41 | 0.07 | 10 | 2
Reading | 6 | 50 | 0.90 | 2.99 | 0.90 | 0.66 | 0.12 | 0.41 | 0.08 | 8 | 2
Reading | 7 | 50 | 0.87 | 2.88 | 0.87 | 0.70 | 0.15 | 0.37 | 0.07 | 15 | 2
Reading | 8 | 50 | 0.88 | 2.83 | 0.88 | 0.72 | 0.15 | 0.38 | 0.08 | 15 | 5
Mathematics | 3 | 45 | 0.90 | 2.58 | 0.91 | 0.74 | 0.13 | 0.43 | 0.08 | 5 | 0
Mathematics | 4 | 45 | 0.89 | 2.62 | 0.89 | 0.72 | 0.15 | 0.42 | 0.07 | 8 | 0
Mathematics | 5 | 45 | 0.89 | 2.77 | 0.89 | 0.67 | 0.14 | 0.42 | 0.08 | 8 | 2
Mathematics | 6 | 45 | 0.89 | 2.81 | 0.90 | 0.67 | 0.11 | 0.42 | 0.08 | 5 | 0
Mathematics | 7 | 45 | 0.87 | 2.82 | 0.87 | 0.65 | 0.16 | 0.39 | 0.06 | 9 | 0
Mathematics | 8 | 45 | 0.88 | 2.82 | 0.88 | 0.68 | 0.11 | 0.40 | 0.07 | 9 | 1
Science | 5 | 45 | 0.88 | 2.72 | 0.88 | 0.69 | 0.15 | 0.40 | 0.06 | 11 | 0
Science | 8 | 45 | 0.86 | 2.84 | 0.86 | 0.65 | 0.16 | 0.38 | 0.07 | 14 | 2
Social Studies | 5 | 60 | 0.89 | 3.41 | 0.89 | 0.60 | 0.14 | 0.37 | 0.08 | 20 | 0
Geography | 7 | 45 | 0.86 | 2.90 | 0.86 | 0.64 | 0.13 | 0.38 | 0.08 | 16 | 4
U.S. History | 8 | 45 | 0.89 | 2.88 | 0.89 | 0.62 | 0.13 | 0.41 | 0.07 | 9 | 0

a Classical item statistics flagged using the criteria from the bulleted list above.

As shown in Table 2.2.A, the mean item difficulties of the tests across content areas and grades ranged from 0.60 to 0.74, and the mean point-biserial correlations ranged from 0.37 to 0.43. The internal consistency reliability estimates (coefficient alpha) of all tests were high, ranging from 0.86 to 0.91. The stratified alpha coefficients were almost identical to the alpha coefficients (after rounding). The SEMs ranged from 2.58 to 3.41.

Table 2.2.A also shows that a small number of operational test items were flagged for out-of-range statistics and/or C-category DIF. Items flagged for out-of-range statistics were scrutinized by content experts to verify the accuracy of the items in the test books, to verify keys, and to judge whether the items were performing as expected. All items flagged for out-of-range statistics were found to be accurate, correctly keyed, and performing in a satisfactory manner with respect to content. DIF results for the operational items are discussed in the next section.

The results of the classical item analysis for the field-test items are presented in Table 2.2.B. Field-test items with extreme difficulty values, low point-biserials or poorly functioning distractors, and/or DIF were flagged for review by content experts. The number of items flagged for poor statistics ranged from 34 to 59 per content/grade. A very small number of items were flagged for DIF. All flagged items were evaluated at a Data Review Meeting, and individual decisions were made about each item. Items rejected during Data Review will not be eligible as operational items in future test administrations. Items accepted with revisions will be returned to the item bank and will be revised and re-field-tested as necessary. The results of the Data Review are presented in Appendix A.
Table 2.2.B Item Analyses Summary for Field-test Items

Content Area | Grade | Sample Size Range | N Items | p-value Mean | p-value SD | Pt-Biserial Mean | Pt-Biserial SD | Flagged: Stats(a) | Flagged: DIF
Reading | 3 | 5323-5535 | 80 | 0.61 | 0.17 | 0.36 | 0.10 | 30 | 2
Reading | 4 | 4800-5452 | 80 | 0.62 | 0.16 | 0.34 | 0.10 | 28 | 3
Reading | 5 | 4259-5493 | 80 | 0.62 | 0.17 | 0.34 | 0.12 | 27 | 0
Reading | 6 | 4692-5398 | 80 | 0.57 | 0.17 | 0.32 | 0.12 | 29 | 1
Reading | 7 | 4176-5255 | 80 | 0.65 | 0.16 | 0.34 | 0.11 | 28 | 1
Reading | 8 | 4939-5526 | 80 | 0.65 | 0.16 | 0.30 | 0.11 | 39 | 5
Mathematics | 3 | 5435-5603 | 80 | 0.75 | 0.16 | 0.36 | 0.10 | 31 | 3
Mathematics | 4 | 4953-5511 | 80 | 0.64 | 0.16 | 0.36 | 0.10 | 32 | 1
Mathematics | 5 | 4407-5515 | 80 | 0.58 | 0.23 | 0.33 | 0.13 | 45 | 2
Mathematics | 6 | 4766-5410 | 80 | 0.55 | 0.20 | 0.31 | 0.13 | 36 | 1
Mathematics | 7 | 4198-5267 | 80 | 0.54 | 0.20 | 0.30 | 0.10 | 45 | 2
Mathematics | 8 | 4844-6058 | 80 | 0.59 | 0.16 | 0.33 | 0.11 | 32 | 1
Science | 5 | 4453-5574 | 80 | 0.56 | 0.18 | 0.30 | 0.11 | 36 | 1
Science | 8 | 4375-5220 | 80 | 0.55 | 0.19 | 0.30 | 0.11 | 40 | 3
Social Studies | 5 | 4985-5952 | 80 | 0.47 | 0.14 | 0.29 | 0.12 | 45 | 0
Geography | 7 | 5157-8292 | 80 | 0.53 | 0.17 | 0.27 | 0.13 | 47 | 1
U.S. History | 8 | 4798-5515 | 80 | 0.50 | 0.15 | 0.33 | 0.13 | 30 | 2

a Out-of-range classical item statistics.

2.2.2 Differential Item Functioning Analyses

One of the goals of test development is to assemble a set of items that provides an estimate of a student's ability that is as fair and accurate as possible for all groups within the population. DIF statistics are used to identify items for which groups of students with the same underlying level of ability have different probabilities of answering correctly. If an item is differentially more difficult for an identifiable subgroup when conditioned on ability, the item may be measuring something different from the intended construct. However, it is important to recognize that DIF-flagged items might reflect actual differences in relevant knowledge or skills (item impact) or a statistical Type I error (a "false positive"). As a result, DIF statistics are used to identify potential sources of item bias; subsequent review by content experts and bias/sensitivity committees is required to determine the source and meaning of performance differences.

For the OCCT, the Mantel-Haenszel (MH) procedure (Mantel & Haenszel, 1959; Holland & Thayer, 1988) was used to estimate DIF statistics for subgroups of interest defined by the SDE for NCLB accountability. Comparison groups were based on gender (female versus male), ethnicity (Hispanic versus White, American Indian versus White, African American versus White, Asian versus White, Pacific Islander versus White), and economic status (students who are economically disadvantaged, as indicated by participation in a free and reduced-price school lunch program, versus students who are not economically disadvantaged). Items with statistically significant differences in performance were flagged for possibly biased or unfair content that was undetected in earlier fairness and bias reviews. DIF analysis results were not considered valid when the sample size for either the reference group (i.e., male, White, not economically disadvantaged) or the focal group (i.e., female, Hispanic, American Indian, African American, Asian, Pacific Islander, economically disadvantaged) was less than 300, or when the sample size for the two groups combined was less than 700.

The MH procedure is one of the more commonly used methods to detect DIF.
This method uses contingency tables to compare the probability of success on each item for the studied groups of interest after matching on overall ability (i.e., total test score). The common odds ratio is estimated across all categories of matched examinee ability. The resulting estimate is interpreted as the relative likelihood of success on a particular item for members of two groups when matched on ability. As such, the common odds ratio provides an estimated effect size where a value of unity indicates equal odds, and thus no DIF (Dorans & Holland, 1993). The common odds ratio (\alpha_{MH}) is estimated as

\hat{\alpha}_{MH} = \frac{\sum_{s} R_{rs} W_{fs} / N_{ts}}{\sum_{s} R_{fs} W_{rs} / N_{ts}},

where
R_{rs} = the number of examinees in the reference group who answer the item correctly,
W_{fs} = the number of examinees in the focal group who answer the item incorrectly,
R_{fs} = the number of examinees in the focal group who answer the item correctly,
W_{rs} = the number of examinees in the reference group who answer the item incorrectly, and
N_{ts} = the total number of examinees at matched score level s.

The odds ratio takes on values from 0 to infinity and is interpreted as the average factor by which the odds that an examinee of the reference group will answer an item correctly exceed those of a member of the comparable focal group. The statistical test is H_0: \alpha_{MH} = 1, where \alpha_{MH} is a common odds ratio assumed equal for all matched score categories s = 1 to S. Values less than unity indicate DIF in favor of the focal group, a value of unity indicates the null condition, and a value greater than one indicates DIF in favor of the reference group. The associated MH \chi^2 is distributed as a chi-square random variable with 1 degree of freedom. As an index of magnitude, the odds ratio is frequently transformed to a delta scale given by

MH D-DIF = -2.35 \ln(\hat{\alpha}_{MH}),

where negative values indicate DIF in favor of the reference group and positive values favor the focal group. A classification scheme puts items into three DIF categories on the basis of a combination of statistical significance and the magnitude (absolute value) of MH D-DIF (Zwick & Ercikan, 1989):

A-items (negligible DIF): MH D-DIF is not statistically different from 0 (at the .05 level), or its absolute value is less than 1 delta unit;
B-items (intermediate DIF): MH D-DIF is statistically different from 0 (at the .05 level) and its absolute value is at least 1 but less than 1.5, or its absolute value is at least 1 but not significantly greater than 1 (at the .05 level);
C-items (large DIF): MH D-DIF is significantly greater than 1 in absolute value (at the .05 level) and its absolute value is at least 1.5.

Items classified as B+ or C+ tend to be easier for members of the focal group than for members of the reference group whose total scores on the test are like those of the focal group. Items classified as B- or C- tend to be harder for members of the focal group than for members of the reference group whose total scores on the test are like those of the focal group.
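The sketch below illustrates the Mantel-Haenszel computation and the delta-scale transformation described above. It assumes, for a single item, aligned arrays of 0/1 item scores, total (matching) scores, and "ref"/"focal" group labels; the significance test behind the A/B/C rules is represented only by a boolean placeholder, so the classification helper is a rough approximation of the full rules rather than the operational implementation.

```python
import numpy as np

def mantel_haenszel_dif(item, total, group):
    """MH common odds ratio and MH D-DIF for one item.

    item  : 0/1 scores on the studied item
    total : raw total scores (the matching variable)
    group : array of "ref" / "focal" labels
    """
    item, total, group = np.asarray(item), np.asarray(total), np.asarray(group)
    ref, foc = group == "ref", group == "focal"
    num = den = 0.0
    for s in np.unique(total):                       # stratify on matched score
        in_s = total == s
        n_ts = in_s.sum()
        r_rs = np.sum(in_s & ref & (item == 1))      # reference, correct
        w_rs = np.sum(in_s & ref & (item == 0))      # reference, incorrect
        r_fs = np.sum(in_s & foc & (item == 1))      # focal, correct
        w_fs = np.sum(in_s & foc & (item == 0))      # focal, incorrect
        num += r_rs * w_fs / n_ts
        den += r_fs * w_rs / n_ts
    alpha_mh = num / den
    mh_d_dif = -2.35 * np.log(alpha_mh)              # delta-scale effect size
    return alpha_mh, mh_d_dif

def dif_category(mh_d_dif, significant):
    """Rough A/B/C label; `significant` stands in for the formal .05-level test."""
    size = abs(mh_d_dif)
    if not significant or size < 1.0:
        return "A"                                   # negligible DIF
    return "C" if size >= 1.5 else "B"               # large vs. intermediate DIF
```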
Items classified in category C were sent to test development staff for review. Staff were asked to consider any identifiable characteristics that may have contributed to the differential item functioning. The items were then submitted to the SDE for further review. Table 2.2.A shows that a small number of operational items were flagged for C DIF. These items were reviewed by DRC's content experts, and DRC made recommendations on whether each item with C DIF should be removed from scoring. SDE content experts further reviewed these items and made the final decisions. As a result, no items were dropped in the 2010 administration.

DIF analysis was also conducted on the field-test items. Items with C DIF were flagged and reviewed by SDE and DRC content experts at the data review meeting. Appendix A reports the items rejected due to DIF and/or other poor statistics.

2.2.3 Item Calibration and Equating

The purpose of item calibration and equating is to create a common scale for expressing the difficulty estimates of all the items across forms within a test. The scale is initially defined so that the examinees used in the calibration have a mean score of 0 and a standard deviation of 1. The metric of this scale is often referred to as the "theta" metric. This scale is not used for reporting purposes because its values typically range from -3.0 to +3.0, a range that is not easily understood. Therefore, following calibration and equating, the scale is usually transformed to a reporting scale that can be understood more easily by students, teachers, and other stakeholders.

The three-parameter logistic (3PL) model was used to calibrate the OCCT test items. The 3PL model expresses the probability that a person with ability \theta will respond correctly to item j as a function of item and ability parameters:

P(U_j = 1 \mid \theta) = c_j + \frac{1 - c_j}{1 + \exp[-a_j(\theta - b_j)]},

where:
U_j is the response to item j, 1 if correct and 0 if incorrect;
a_j is the slope parameter of item j, characterizing its discriminating power;
b_j is the threshold parameter of item j, characterizing its difficulty; and
c_j is the lower asymptote parameter of item j, reflecting the chance that students with very low proficiency will select the correct answer, sometimes called the "pseudo-guessing" level.

The parameters estimated for the 3PL model were discrimination (a), difficulty (b), and the pseudo-guessing level (c). All item response theory (IRT) based analyses were conducted using PARSCALE (Muraki & Bock, 2003).

For each operational test, items were calibrated separately by content area and grade. The calibrations were examined to assess the quality of the parameter estimates and model-data fit. Items were flagged for:
- a-parameters less than 0.3 or greater than 2.3;
- b-parameters less than -3.5 or greater than 3.5;
- c-parameters greater than 0.35 for 4-option items;
- failure to calibrate due to biserial correlations less than 0.10;
- "bad" model-data fit (this criterion varies depending on the response n-count).

Flagged items were reviewed to determine whether they should be excluded from scoring. No items in the 2010 OCCT were excluded from scoring because of IRT results.
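The following sketch evaluates the 3PL item response function and applies the parameter screens listed above. The operational calibration was performed in PARSCALE; this code only illustrates the model in the logistic metric (whether a 1.7 scaling constant is folded into a_j is a convention of the calibration setup and is not shown here).

```python
import numpy as np

def p_correct_3pl(theta, a, b, c):
    """3PL probability of a correct response at ability theta (logistic metric)."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def calibration_flags(a, b, c):
    """Parameter screens from Section 2.2.3 for a single calibrated item."""
    flags = []
    if a < 0.3 or a > 2.3:
        flags.append("a-parameter out of range")
    if b < -3.5 or b > 3.5:
        flags.append("b-parameter out of range")
    if c > 0.35:
        flags.append("c-parameter above 0.35 (4-option item)")
    return flags

# Example: probability that a student of average ability (theta = 0) answers a
# moderately difficult, moderately discriminating item correctly.
print(p_correct_3pl(theta=0.0, a=1.0, b=0.5, c=0.20))
```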
After the final set of item parameter estimates was established, the scales for the 2010 operational tests were linked to the reporting scale using the test characteristic curve (TCC) method described by Stocking and Lord (1983). The Stocking and Lord procedure involves finding a linear transformation that minimizes the sum of squared differences between two TCCs generated from two sets of anchor item parameters.

Embedded in the 2010 OCCT were sets of anchor items that had served as operational items in the 2009 OCCT. These items were positioned so that their sequence was very similar to that of the prior year. The anchor sets were chosen to be representative of the entire test, both in content and statistically, to ensure an accurate equating result. The anchor set is largely unique for each testing cycle, although some items may be used as anchors for multiple administrations; repeated use of an item creates a risk of overexposure and is avoided in practice. Table 2.2.C summarizes the number of anchor items per test.

The parameters for the 2009 items were expressed on the reporting scale. These 2009 item parameters served as the reference item set and were used with their 2010 counterparts in the Stocking and Lord procedure to find transformation constants. These constants were used to transform the 2010 item parameters so that they, too, were expressed on the reporting scale. Once this was done, the transformed parameters were used to generate raw score to scaled score conversion tables.

Table 2.2.C Number of Anchor Items by Subject and Grade

Subject           Grade   Number of Operational Items per Form a   Number of Operational Items in Anchor Set a
Reading           3       50                                       17
Mathematics       3       45                                       20
Reading           4       50                                       18
Mathematics       4       50                                       19
Reading           5       50                                       16
Mathematics       5       50                                       18
Science           5       45                                       16
Social Studies    5       60                                       20
Reading           6       50                                       14
Mathematics       6       50                                       18
Reading           7       50                                       17
Mathematics       7       50                                       17
Geography         7       45                                       15
Reading           8       50                                       15
Mathematics       8       50                                       17
Science           8       45                                       14
U.S. History      8       45                                       15
Writing           5, 8    1                                        0
a Operational item counts include anchor items.

Field-test items were placed on the operational scale in a similar fashion. For each content area and grade, field-test items were calibrated together with the operational items. The resulting field-test parameters were placed on the OCCT reporting scale using the operational items as the anchor set in the Stocking and Lord procedure.
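A compact sketch of the Stocking and Lord criterion is given below (Python with NumPy and SciPy; the quadrature grid, the 1.7 scaling constant, the optimizer settings, and the toy anchor parameters are illustrative assumptions rather than the operational configuration). Under this parameterization the new-form anchor difficulties are mapped as b* = A·b + B and the discriminations as a* = a/A, matching the linear transformation of the theta scale.

```python
import numpy as np
from scipy.optimize import minimize

def tcc(theta, a, b, c, D=1.7):
    """Test characteristic curve: expected number-correct score at each theta."""
    z = D * a[None, :] * (theta[:, None] - b[None, :])
    p = c[None, :] + (1.0 - c[None, :]) / (1.0 + np.exp(-z))
    return p.sum(axis=1)

def stocking_lord(a_new, b_new, c_new, a_ref, b_ref, c_ref):
    """Slope and intercept (A, B) placing new-form anchor parameters on the reference scale."""
    theta = np.linspace(-4.0, 4.0, 41)            # evaluation points for the TCCs
    target = tcc(theta, a_ref, b_ref, c_ref)
    def loss(x):
        A, B = x
        # Transformed new-form anchors: a* = a/A, b* = A*b + B, c* = c
        return np.sum((target - tcc(theta, a_new / A, A * b_new + B, c_new)) ** 2)
    return minimize(loss, x0=[1.0, 0.0], method="Nelder-Mead").x

# Hypothetical anchor parameters for three items on the reference and new forms
A, B = stocking_lord(np.array([1.0, 1.2, 0.8]), np.array([-0.5, 0.2, 1.0]), np.array([0.20, 0.18, 0.22]),
                     np.array([1.0, 1.2, 0.8]), np.array([-0.3, 0.4, 1.2]), np.array([0.20, 0.18, 0.22]))
print(A, B)
```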
2.2.4 Raw Score to Scaled Score Conversion

Since 2005, the OCCT scaled scores have been produced using a number-correct scoring procedure based on IRT. This procedure produces maximum-likelihood trait estimates for each obtainable raw score, except for raw scores at or below the chance level and the perfect raw score. It is conventional to assign scaled scores to chance-level and below-chance-level raw scores and to perfect raw scores using a rational, but not necessarily maximum likelihood, procedure. These values are called the lowest obtainable scaled score (LOSS) and the highest obtainable scaled score (HOSS). The LOSS and HOSS values assigned to all OCCT operational tests were 400 and 990, respectively. For all multiple-choice tests, the OCCT score scale uses a three-digit integer that spans a range from 400 (LOSS) to 990 (HOSS). The Proficient cut score for reading and mathematics and the Satisfactory cut score for science and social studies is 700 for all tests. The raw-score to scaled-score conversion tables are provided in Appendix B.

2.2.5 Test Score Reliability

Test score reliability focuses on the extent to which differences in test scores reflect true differences in the knowledge, ability, or skills being tested rather than random fluctuations. The variance in the distributions of test scores, essentially the differences among individuals, is partly due to real differences in the knowledge, skills, or ability being tested (called true score variance) and partly due to random factors that cause variability in examinee performance (called error variance). The number used to describe reliability is an estimate of the proportion of true score variance to total variance. Several different ways of estimating this proportion exist.

Coefficient Alpha

When the goal is to estimate the precision of a set of test scores from a single administration, a measure of internal consistency (sensitive to random errors associated with item content sampling) is frequently used to estimate reliability. For the OCCT, the measure of internal consistency called coefficient alpha (Cronbach, 1951) was used to estimate the reliability of the test scores. The formula for coefficient alpha is

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right)$$

where k is the number of items on the test, the summation term is the item score variance summed over all items, and sigma_X^2 is the observed-score variance.

Internal consistency measures apply only to the test form being analyzed. They do not take into account form-to-form variation due to equating limitations, nor are they sensitive to day-to-day variation due, for example, to state of health or testing environment. Reliability coefficients may range from 0 to 1. The higher the reliability coefficient for a set of scores, the more likely it is that individuals would obtain very similar scores over replicated testings (e.g., using the same number of items and sampling the same content domains). The internal consistency of the multiple-choice test scores is reported in Table 2.2.A for all examinees and in Tables 3.3 through 3.21 by demographic subgroup.

When a test contains different components (e.g., content standards), the stratified alpha coefficient can provide a more accurate estimate of the overall test reliability (Qualls, 1995). The stratified alpha coefficient is calculated as

$$\alpha_{strat} = 1 - \frac{\sum_{j}\sigma_j^2\,(1 - \alpha_j)}{\sigma_X^2}$$

where sigma_X^2 is the variance of the total test scores, sigma_j^2 is the variance of scores for each test component (i.e., content standards in this case), and alpha_j is the coefficient alpha reliability for scores from content standard j. The stratified alpha coefficients for the multiple-choice test scores are reported in Table 2.2.A for all examinees and in Tables 3.3 through 3.21 by demographic subgroup.

Standard Error of Measurement

The standard error of measurement (SEM) is the standard deviation of the errors of measurement associated with the test scores of a specific group of test takers. In classical test theory (CTT), an overall SEM can be estimated as a function of the standard deviation of observed scores and the test reliability coefficient:

$$SEM = s_x\sqrt{1 - r_{xx}}$$

where SEM is the standard error of measurement, s_x is the standard deviation of observed scores, and r_xx is a coefficient of reliability.

The SEM is particularly useful in determining the confidence interval (CI) that captures an examinee's true score. Assuming that measurement error is normally distributed, it can be said that upon infinite testing replications, approximately 95 percent of the CIs of plus or minus 1.96 SEM around the observed score would contain an examinee's true score (Crocker & Algina, 1986). For example, if an examinee's observed score on a given test equals 15 points and the SEM equals 1.92, one can be 95% confident that the examinee's true score lies between 11 and 19 points (15 ± 3.76, rounded to the nearest integer).
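These reliability quantities are straightforward to compute from an examinee-by-item score matrix; a minimal sketch follows (Python with NumPy; the data and the grouping of items into standards are hypothetical):

```python
import numpy as np

def coefficient_alpha(scores):
    """Cronbach's alpha from an examinee-by-item matrix of item scores (0/1 for MC items)."""
    X = np.asarray(scores, dtype=float)
    k = X.shape[1]
    return k / (k - 1) * (1.0 - X.var(axis=0, ddof=1).sum() / X.sum(axis=1).var(ddof=1))

def stratified_alpha(scores, standard_of_item):
    """Stratified alpha; standard_of_item maps each item column to its content standard."""
    X = np.asarray(scores, dtype=float)
    labels = np.asarray(standard_of_item)
    error = sum(X[:, labels == s].sum(axis=1).var(ddof=1)
                * (1.0 - coefficient_alpha(X[:, labels == s]))
                for s in np.unique(labels))
    return 1.0 - error / X.sum(axis=1).var(ddof=1)

# Hypothetical data: 6 examinees, 4 dichotomous items, two content standards
X = np.array([[1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 1],
              [0, 0, 0, 1], [1, 0, 1, 1], [1, 1, 1, 0]])
alpha = coefficient_alpha(X)
sem = X.sum(axis=1).std(ddof=1) * np.sqrt(1.0 - alpha)   # overall SEM in raw-score units
print(alpha, stratified_alpha(X, ["A", "A", "B", "B"]), sem)
```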
Table 2.2.A provides the SEM for each multiple-choice test.

Conditional Standard Error of Measurement

Within the IRT framework, a standard error of measurement can be estimated at each measured ability; it is therefore referred to as a conditional standard error of measurement (CSEM). The expected a posteriori estimation of the CSEM proposed by Kolen, Zeng, and Hanson (1996) was used for the OCCT. The CSEM can be expressed as

$$CSEM(\theta) = \sqrt{\sum_{X}\left[S_X - S(\theta)\right]^2 P(X \mid \theta)}$$

where S_X is the scaled score for a particular number-correct score X, S(theta) is the scaled value corresponding to the IRT ability theta conditioned on, and P(X | theta) is the conditional probability of number-correct score X, computed using the recursive algorithm given by Thissen, Pommerich, Billeaud, and Williams (1995). For the operational OCCT, CSEMs were provided for each obtainable scaled score (see Appendix B).
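The conditional number-correct distribution P(X | theta) can be obtained with a recursive algorithm of the kind referenced above; a small sketch is shown below (Python; the per-item probabilities, the scaled-score lookup, and the scaled value of theta are hypothetical placeholders):

```python
import numpy as np

def score_distribution(p):
    """P(X | theta) over number-correct scores X = 0..n, given per-item correct probabilities p."""
    dist = np.array([1.0])                       # start with the zero-item "test"
    for p_j in p:
        new = np.zeros(dist.size + 1)
        new[:-1] += dist * (1.0 - p_j)           # item j answered incorrectly
        new[1:] += dist * p_j                    # item j answered correctly
        dist = new
    return dist

def csem(scaled_scores, scaled_theta, p):
    """CSEM of scale scores at one theta, following the formula above."""
    probs = score_distribution(p)
    return np.sqrt(np.sum(probs * (np.asarray(scaled_scores) - scaled_theta) ** 2))

# Hypothetical 4-item test: per-item correct probabilities at some theta,
# scaled scores for raw scores 0..4, and the scaled value corresponding to that theta
p_items = np.array([0.55, 0.70, 0.80, 0.60])
print(csem(scaled_scores=[400, 520, 640, 760, 880], scaled_theta=690.0, p=p_items))
```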
Reliability of Performance-Level Classification Decisions

Student performance on the OCCT is classified into one of four achievement levels using cut scores adopted by the SDE. Table 2.2.D provides the cut score for each achievement level and the CSEM associated with each cut score in 2010.

Table 2.2.D Conditional Standard Errors of Measurement for Each Achievement Level Cut Score

                                 Limited Knowledge        Satisfactory             Advanced
Content Area     Grade           Scale Score    CSEM      Scale Score    CSEM      Scale Score    CSEM
Reading          3               649            26        700            23        891            50
Reading          4               658            21        700            21        845            49
Reading          5               641            23        700            23        830            47
Reading          6               647            26        700            21        828            39
Reading          7               668            24        700            22        802            32
Reading          8               655            26        700            25        833            41
Mathematics      3               636            28        700            23        798            36
Mathematics      4               639            29        700            25        816            38
Mathematics      5               642            32        700            23        767            25
Mathematics      6               660            25        700            20        754            22
Mathematics      7               667            28        700            26        766            24
Mathematics      8               662            27        700            24        771            27
Science          5               638            49        700            26        814            24
Science          8               647            62        700            28        829            24
Social Studies   5               645            37        700            24        786            20
Geography        7               595            70        700            33        847            33
History          8               622            49        700            25        821            32
Writing a        5               26             na        36             na        54             na
Writing a        8               25             na        36             na        54             na
a Writing cut scores are in the composite score metric.

The reliability of the 2010 achievement-level classification decisions was assessed using the computer program BB-CLASS (Brennan, 2004), which provides two statistics that describe the reliability of classifications based on test scores (Livingston & Lewis, 1995). More specifically, information from an administration of one form is used to estimate the following:

Decision Accuracy describes the extent to which performance-level classification decisions based on the administered test form would agree with the decisions that would be made on the basis of a perfectly reliable test (i.e., if it were possible to know each examinee's true score). Decision accuracy answers the question: How well does the actual classification of test takers, based on their single-form scores, agree with the classification that would be made on the basis of their true scores, if their true scores were somehow known?

Decision Consistency describes the extent to which classification decisions based on the administered test form would agree with the decisions made if a parallel alternate form had been administered. Decision consistency answers the question: What is the agreement between the classifications based on two non-overlapping, equally difficult forms of the test?

For each performance level and test, true scores and single-form scores on forms parallel to the one actually given are estimated following the Livingston and Lewis (1995) method. Decision accuracy is estimated using an estimated joint distribution of reported performance-level classifications on the current form of the exam and the performance-level classifications based on the true score. Decision consistency is estimated using an estimated joint distribution of reported performance-level classifications on the current form of the exam and performance-level classifications on the parallel alternate form. In each case, the proportion of performance-level classifications in exact agreement is the sum of the entries in the diagonal of the contingency table representing the joint distribution. Reliability of classification at each performance-level cut score is estimated by collapsing the joint distribution at that cut score into a 2-by-2 table and summing the two diagonal entries.

Table 2.2.E provides the results of the decision accuracy and consistency analyses conducted at the Limited Knowledge, Proficient/Satisfactory, and Advanced cut scores and for the four performance levels considered together (Total). Note that the decision accuracy and consistency indices for the four performance levels are lower than those for any single cut, as shown in Table 2.2.E. This is not surprising, because classification into four levels allows more opportunities to change achievement levels; hence there are more classification errors across the four achievement levels, resulting in lower accuracy and consistency indices.

For the OCCT, a passing score is one that meets or exceeds the Proficient/Satisfactory cut score. Across all tests, the decision accuracy at the Proficient/Satisfactory cut score ranged from 0.89 to 0.94, and decision consistency ranged from 0.85 to 0.92. These results indicate that at least 89% of students would receive the same pass/fail classification if their true scores were known, and that at least 85% of students would be classified in the same way if a parallel test form were administered.

Table 2.2.E Estimates of the Reliability of Decisions for Specified Cut Scores a

                         Decision Accuracy                                        Decision Consistency
Content Area     Grade   Limited      Proficient/    Advanced   Total             Limited      Proficient/    Advanced   Total
                         Knowledge    Satisfactory                                Knowledge    Satisfactory
Reading          3       0.95         0.91           0.98       0.84              0.93         0.88           0.97       0.78
Reading          4       0.94         0.91           0.97       0.82              0.91         0.87           0.96       0.76
Reading          5       0.95         0.91           0.91       0.77              0.93         0.87           0.90       0.71
Reading          6       0.94         0.91           0.94       0.80              0.92         0.88           0.93       0.73
Reading          7       0.93         0.90           0.91       0.75              0.90         0.86           0.88       0.67
Reading          8       0.94         0.91           0.91       0.77              0.92         0.87           0.88       0.69
Mathematics      3       0.96         0.92           0.91       0.79              0.94         0.89           0.88       0.71
Mathematics      4       0.95         0.91           0.92       0.78              0.92         0.87           0.89       0.70
Mathematics      5       0.95         0.91           0.91       0.77              0.92         0.87           0.87       0.68
Mathematics      6       0.93         0.91           0.91       0.76              0.90         0.87           0.87       0.67
Mathematics      7       0.92         0.89           0.90       0.72              0.88         0.85           0.87       0.64
Mathematics      8       0.93         0.90           0.91       0.75              0.90         0.86           0.87       0.66
Science          5       0.98         0.94           0.90       0.82              0.97         0.92           0.86       0.76
Science          8       0.98         0.94           0.92       0.83              0.96         0.91           0.88       0.77
Social Studies   5       0.94         0.91           0.92       0.78              0.92         0.88           0.89       0.70
Geography        7       0.98         0.92           0.91       0.81              0.97         0.89           0.87       0.74
History          8       0.95         0.91           0.93       0.79              0.93         0.88           0.91       0.72
a The analysis was based on the final data files for students who took the standard OCCT.
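The collapsing of the four-level joint distribution into a two-by-two table at a cut can be illustrated with a short sketch (Python; the joint-distribution values are made up for illustration and are not OCCT results):

```python
import numpy as np

def agreement_rates(joint, cut_index):
    """Exact agreement for four levels and for the pass/fail split at one cut.

    `joint` is a 4x4 joint distribution of classifications (rows: observed form;
    columns: true-score or parallel-form), summing to 1.0.
    `cut_index` is the first level counted as passing (e.g., 2 for the third of four levels).
    """
    J = np.asarray(joint, dtype=float)
    four_level = np.trace(J)                        # sum of diagonal entries
    fail = J[:cut_index, :cut_index].sum()          # both classifications below the cut
    passed = J[cut_index:, cut_index:].sum()        # both classifications at or above the cut
    return four_level, fail + passed                # 2x2 diagonal after collapsing

# Hypothetical joint distribution over Unsatisfactory/Limited/Satisfactory/Advanced
joint = np.array([[0.05, 0.02, 0.00, 0.00],
                  [0.02, 0.12, 0.04, 0.00],
                  [0.00, 0.04, 0.48, 0.05],
                  [0.00, 0.00, 0.05, 0.13]])
print(agreement_rates(joint, cut_index=2))
```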
2.2.6 Validity

As noted in the Standards for Educational and Psychological Testing, "validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of the tests" (AERA, APA, & NCME, 1999, p. 9). Content representativeness considerations, item bias (i.e., DIF) analyses, and correlations among content standards are often used as sources of validity evidence.

Each test's blueprint specifies the proportion of items that should be devoted to any given content unit. The blueprint is used as a guide by test developers when assembling a test from a pool of candidate items that are classified by content unit. Validity evidence related to test content is bolstered to the extent that the numbers of items allocated to each PASS standard and objective reflect what is specified by the test blueprints. Tables 1.2 to 1.18 in Chapter 1 provide content validity evidence by standard for the 2010 OCCT.

Differential item functioning with respect to gender, ethnicity, and economic status helps address construct-irrelevant variance, which represents an important threat to the validity of achievement tests. As noted in the Differential Item Functioning Analyses section, field-test items are screened and reviewed for DIF by SDE content specialists, and only items approved by the SDE are eligible for operational use. DIF analyses were also conducted on the operational items; after review by SDE and DRC content specialists, no item was dropped from the operational tests. The numbers of operational and field-test items with C DIF are reported in Tables 2.2.A and 2.2.B.

Intercorrelations among standards provide evidence of convergent validity. The analyses were performed by summing the obtained raw score points for each standard and then correlating the subtotals associated with each standard. Standards with low point totals (e.g., fewer than five points) usually have markedly attenuated coefficients, meaning that they will be spuriously low in magnitude. Tables 1.3.B to 1.3.R list the numbers of items associated with each standard. The observed correlations among standards are reported in the lower-left portion of Tables 2.2.F through 2.2.V, and the correlations corrected for attenuation are reported in the upper-right portion of the same tables. Correcting for attenuation adjusts the correlation between two measures to account for the unreliability of both. Although the theoretical upper bound for a correlation is 1.0, disattenuated correlations can exceed it; this is often seen in practice when the correlations are relatively high and the reliabilities relatively low. Two underlying factors should be noted. The first is that sample statistics are being used to estimate population parameters. The second, and likely more prevalent, issue is that something akin to a "design misspecification" occurs: the internal consistency reliability indices used for the OCCT likely do not capture all the sources of random error in the test scores and, as such, might overestimate reliability. One might also postulate potential negative biases (e.g., lack of item homogeneity due to multidimensional content standards). Thus, any given tabled disattenuated correlation may be too high or too low, depending on which bias prevails. Note also that the correlations between individual standards and the total test are spuriously inflated because they have items in common.

Given that none of these tests have perfect reliabilities (equal to one), the disattenuated correlations are somewhat higher than the observed correlations. Disattenuated correlations less than 1.0 suggest that the different strands are measuring slightly different aspects of the constructs (a brief illustration of the correction is given below).
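A minimal sketch of the correction for attenuation (Python; the correlation and reliability values are hypothetical):

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Correlation corrected for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical: observed correlation 0.84 between two standards with alphas 0.72 and 0.80
print(disattenuate(0.84, 0.72, 0.80))  # about 1.11, i.e., the corrected value can exceed 1.0
```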
Values around 1.0 suggest that the same or very similar constructs are being measured.Table 2.2.F Standards Intercorrelation: Grade 3 Reading?ReadingVocabularyComprehension/Critical LiteracyLiteratureResearch andInformationReading--?1.091.121.101.06Vocabulary0.84?--0.940.960.96Comprehension/Critical Literacy0.940.68?--0.990.90Literature0.830.630.70--?0.90Research and Information0.720.560.570.52--?Table 2.2.G Standards Intercorrelation: Grade 4 Reading?ReadingVocabularyComprehension/Critical LiteracyLiteratureResearch andInformationReading?--1.111.101.081.09Vocabulary0.82--?0.950.880.97Comprehension/Critical Literacy0.940.67--?0.920.94Literature0.800.540.65--?0.94Research and Information0.740.550.610.53--?Table 2.2.H Standards Intercorrelation: Grade 5 Reading?ReadingVocabularyComprehension/Critical LiteracyLiteratureResearch andInformationReading?--1.091.131.131.11Vocabulary0.87?--0.970.980.99Comprehension/Critical Literacy0.920.71?--1.020.97Literature0.870.670.71?--0.99Research and Information0.740.590.590.56--?Table 2.2.I Standards Intercorrelation: Grade 6 Reading?ReadingVocabularyComprehension/Critical LiteracyLiteratureResearch andInformationReading--?1.111.091.121.14Vocabulary0.79--?0.970.991.01Comprehension/Critical Literacy0.930.66?--0.990.98Literature0.880.620.74--?1.02Research and Information0.770.540.640.60--Table 2.2.J Standards Intercorrelation: Grade 7 Reading?ReadingVocabularyComprehension/Critical LiteracyLiteratureResearch andInformationReading--?1.121.131.161.15Vocabulary0.76--?0.950.950.95Comprehension/Critical Literacy0.930.60--?0.981.01Literature0.820.520.65--?0.97Research and Information0.780.500.650.54--?Table 2.2.K Standards Intercorrelation: Grade 8 Reading?ReadingVocabularyComprehension/Critical LiteracyLiteratureResearch andInformationReading?--1.101.111.151.16Vocabulary0.67--?0.970.920.98Comprehension/Critical Literacy0.930.56--?0.971.01Literature0.870.480.70?--1.00Research and Information0.770.450.630.57--?Table 2.2.L Standards Intercorrelation: Grade 3 Mathematics?MathematicsPatterns andAlgebraicReasoning NumberSenseNumber Operationsand ComputationGeometry andMeasurementData Analysis and ProbabilityMathematics?--1.051.071.041.111.03Patterns andAlgebraicReasoning 0.82?--0.940.840.930.90Number Sense0.780.59--?0.850.980.94Number Operationsand Computation0.860.600.57--?0.890.84Geometry andMeasurement0.850.610.600.62--?0.94Data Analysisand Probability0.810.610.590.600.62--?Table 2.2.M Standards Intercorrelation: Grade 4 Mathematics?MathematicsPatterns andAlgebraicReasoningNumberSenseNumber Operations and ComputationGeometry andMeasurementData Analysis and ProbabilityMathematics?--1.071.111.051.111.07Patterns andAlgebraicReasoning0.79--?0.920.890.900.96Number Sense0.830.57--?0.890.980.94Number Operationsand Computation0.870.620.62--?0.860.87Geometry andMeasurement0.790.530.590.57--?0.93Data Analysis and Probability0.760.560.560.570.52--?Table 2.2.N Standards Intercorrelation: Grade 5 Mathematics?MathematicsPatterns andAlgebraicReasoningNumberSenseNumber Operations and ComputationGeometry andMeasurementData Analysis and ProbabilityMathematics--?1.111.141.091.101.03Patterns andAlgebraicReasoning0.80?--1.010.940.950.89Number Sense0.800.58--?0.990.970.93Number Operationsand Computation0.820.580.59--?0.940.84Geometry andMeasurement0.870.610.610.63--?0.88Data Analysis and Probability0.790.550.570.540.60--?Table 2.2.O Standards Intercorrelation: Grade 6 Mathematics?MathematicsAlgebraicReasoningNumberSenseGeometryMeasurementData Analysis and 
StatisticsMathematics?--1.081.111.011.041.09Algebraic Reasoning0.84--?0.960.850.860.91Number Sense0.890.67--?0.840.930.94Geometry0.680.490.51--?0.830.87Measurement0.810.580.650.48--?0.88Data Analysis and Statistics0.800.580.620.470.56?--Table 2.2.P Standards Intercorrelation: Grade 7 Mathematics?MathematicsAlgebraicReasoningNumberSenseGeometryMeasurementData Analysis and ProbabilityMathematics?--1.131.131.111.081.10Algebraic Reasoning0.75--?0.940.920.900.91Number Sense0.850.54?--0.930.970.92Geometry0.790.500.57?--0.880.89Measurement0.770.500.600.52--?0.86Data Analysis and Probability0.800.500.570.530.51--?Table 2.2.Q Standards Intercorrelation: Grade 8 Mathematics?MathematicsAlgebraicReasoningNumberSenseGeometryMeasurementData Analysis and StatisticsMathematics?--1.071.111.111.111.02Algebraic Reasoning0.79?--0.890.850.900.82Number Sense0.790.53--?0.920.980.87Geometry0.730.470.49--?0.950.84Measurement0.880.590.620.55--?0.87Data Analysisand Statistics0.790.530.540.480.60?--Table 2.2.R Standards Intercorrelation: Grade 5 Science?ScienceObserve and MeasureClassifyExperimentInterpret and CommunicateScience?--1.141.141.141.12Observe and Measure0.82--?1.001.000.99Classify0.830.59?--1.010.99Experiment0.850.610.62--?0.99Interpret and Communicate0.900.640.650.68--?Table 2.2.S Standards Intercorrelation: Grade 8 Science?ScienceObserve and MeasureClassifyExperimentInterpret and CommunicateScience?--1.171.191.151.13Observe and Measure0.79--?1.030.991.00Classify0.760.51?--0.991.00Experiment0.880.590.56--?0.97Interpret and Communicate0.870.600.570.66--?Table 2.2.T Standards Intercorrelation: Grade 5 Social Studies?SocialStudiesEarlyExplorationColonialAmericaAmericanRevolutionEarly Federal PeriodGeographicSkillsSocial Studies--?1.151.111.101.091.07Early Exploration0.74--?1.031.010.990.95Colonial America0.840.56--?0.980.970.93American Revolution0.850.560.64--?0.980.93Early Federal Period0.760.500.580.59--?0.89Geographic Skills0.880.560.650.660.58--?Table 2.2.U Standards Intercorrelation: Grade 7 Geography?GeographyGeographicToolsRegionsPhysicalSystemsHumanSystemsHuman/EnvironmentInteractionGeographySkillsGeography?--1.131.171.181.191.131.13Geographic Tools0.65--?0.971.000.980.961.02Regions0.840.47?--1.011.010.961.01Physical Systems0.750.420.53--?0.990.981.02Human Systems0.770.420.550.47--?0.991.01Human/Environment Interaction0.800.450.560.510.53--?0.97Geography Skills0.740.450.560.500.500.52?--Table 2.2.V Standards Intercorrelation: Grade 8 U. S. History?HistoryHAHBHCHDHEHFHGHHHIHistory--?1.131.071.121.141.141.081.221.131.05HA0.76--?0.981.021.031.020.971.070.980.93HB0.770.54--?0.930.960.980.950.960.950.90HC0.730.500.49--?1.021.030.931.051.000.93HD0.740.500.510.48--?0.990.971.030.980.91HE0.670.450.460.450.42--?1.011.071.030.95HF0.720.490.520.460.470.45--?1.020.980.94HG0.570.380.370.360.350.330.36--?1.030.96HH0.730.480.500.480.470.440.480.36?--0.92HI0.740.490.510.480.470.440.500.360.48--?HA – Social Studies Process SkillsHB – Causes of American RevolutionHC – Results of American RevolutionHD – Governing Documents/Early Federal PeriodHE – Northern/Southern Economic GrowthHF – Jacksonian EraHG – Cultural Growth and ReformHH – Westward MovementHI – Eve of War2.3 Analysis of the Writing TestsThe administration of the spring 2010 Writing assessment took place on February 24th and March 3rd. Students at Grades 5 and 8 were given one operational writing prompt. The Grade 5 operational prompt was field-test prompt #7 in 2007; the Grade 8 operational prompt was field-test prompt #9 in 2007. 
The following sections describe the statistical analyses conducted to place the 2010 operational writing prompts on the scale established in 2006.

2.3.1 Prompt Scoring Formula

The writing score is a weighted composite of five analytic scores that focus on specific domains of writing skill; these domains are listed in Table 2.3.A. Each student's response to a prompt is read by two independent raters, and the raters' scores for each domain are averaged. The domain scores range from 1 (the lowest score) to 4 (the highest score). The raw writing score is calculated as a weighted composite of the averaged ratings for the five analytic traits:

Raw Composite Score (RCS) = 15 × (0.30 × ID + 0.25 × OUC + 0.15 × WC + 0.15 × SP + 0.15 × GUM)

Table 2.3.A Weights Assigned to Writing Analytic Traits

Writing Analytic Trait                       Weight
Ideas and Development (ID)                   30%
Organization, Unity, and Coherence (OUC)     25%
Word Choice (WC)                             15%
Sentences and Paragraphs (SP)                15%
Grammar, Usage, and Mechanics (GUM)          15%

2.3.2 Statistical Adjustments to Scale the Writing Scores

The baseline for each grade's operational writing scale was established in 2006. To place the 2010 operational prompts on the 2006 scale, transformation constants were obtained to adjust RCS values for prompt difficulty and for rater-year effects relative to a target distribution. All calculations were performed on the RCS prior to rounding. For reporting, the scaled composite scores (SCS) were then rounded to the nearest integer between 15 and 60.

Adjustment for Prompt Difficulty and Rater-Year Effects

For each of the 2007 field-test prompts, ETS provided a set of unique transformation constants to adjust for both prompt difficulty and rater-year effects. Based on ETS' report, OCCT Writing: Scaling the 2007 Field-Test Prompts (ETS, 2007), the following equation was used to adjust the 2010 raw composite scores (RCS_2010):

$$SCS_{ETS} = B \times RCS_{2010} + A$$

where SCS_ETS represents the scaled composite score after adjusting for the 2007 prompt difficulty and rater-year effects, and A and B are the additive and multiplicative constants (Grade 5: A = -0.647451, B = 1.023409; Grade 8: A = -1.572272, B = 1.043021).

Adjustment for Rater-Year Effects

In 2010, DRC performed a rater drift study similar to the one conducted by ETS in 2007 to adjust for rater-year effects. DRC's Performance Assessment Services (PAS) staff randomly pulled 510 student responses from 2007 for each grade's prompt and distributed these into the current administration's scoring throughout the entire scoring timeframe. The student responses were pulled by lithocode and included only valid scored responses (i.e., responses with no condition codes, such as off-topic, present). The 2010 scorers then rescored these papers. The lithocodes randomly pulled by PAS were provided to EIS to generate the data files for the Psychometric Services (PS) department. The 2010 rater-year effect constants, C and D, were determined from the mean and standard deviation of the 2007 raw composite scores (M_2007, SD_2007) and of the 2010 rescored raw composite scores (M_2010, SD_2010), calculated as follows for each grade:

$$D = \frac{SD_{2007}}{SD_{2010}}, \qquad C = M_{2007} - D \times M_{2010}$$

The formula for the 2010 rater-year-effects-adjusted score is

$$SCS_{DRC} = D \times RCS_{2010} + C$$

Once the transformation constants are applied to the 2010 rescored raw composite scores, the mean and standard deviation of the adjusted 2010 scores should be the same as the 2007 mean and standard deviation.
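A small sketch of this linear adjustment is shown below (Python; the function names are illustrative, and the example uses the rounded Grade 5 means and standard deviations reported later in Table 2.3.C, whereas the operational constants were computed from unrounded values):

```python
import numpy as np

def rater_year_constants(mean_2007, sd_2007, mean_2010, sd_2010):
    """Multiplicative (D) and additive (C) constants matching 2010 rescores to the 2007 distribution."""
    D = sd_2007 / sd_2010
    C = mean_2007 - D * mean_2010
    return C, D

def adjust(raw_composite, C, D):
    """Apply the rater-year adjustment to raw composite scores."""
    return D * np.asarray(raw_composite) + C

# Rounded Grade 5 values from Table 2.3.C (illustrative only)
C, D = rater_year_constants(mean_2007=42.87, sd_2007=9.02, mean_2010=41.07, sd_2010=8.41)
print(C, D, adjust([30.0, 45.0], C, D))
```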
A Compound Adjustment

Following the calculation of the 2010 transformation constants, a compound adjustment was applied to the 2010 operational raw composite scores. The generic formula for producing the final 2010 scaled composite score (SCS_2010) is

$$SCS_{2010} = B \times (D \times RCS_{2010} + C) + A$$

To simplify the calculation, combined transformation constants for each of Grades 5 and 8 were calculated as

$$F = B \times D, \qquad E = B \times C + A$$

and the following formula was used to calculate the final scaled composite scores:

$$SCS_{2010} = F \times RCS_{2010} + E$$

The calculated values are rounded to the nearest whole integer, and resulting values outside the 15-60 range are set to the nearest bound. The scaled composite score is converted to a performance level using Table 2.3.B.

Table 2.3.B Scaled Score Ranges for Each Achievement Level

Grade 5 Scaled Composite Score   Grade 8 Scaled Composite Score   Performance Level
54 - 60                          54 - 60                          Advanced
36 - 53                          36 - 53                          Satisfactory
26 - 35                          25 - 35                          Limited Knowledge
15 - 25                          15 - 24                          Unsatisfactory
Unscorable                       Unscorable                       Unsatisfactory

Summary statistics for the scaling analysis of the operational writing prompts are provided in Tables 2.3.C to 2.3.E. Table 2.3.C provides the sample means and standard deviations used to calculate the transformation constants for each grade. The results indicate that the sampled students in both grades received lower prompt scores in 2010. Because the responses scored were the same across the two years, this indicates that the raters were stricter in 2010.

Table 2.3.C Sample Means and Standard Deviations Used for Calculating Constants

Grade   Statistic   2007 Raters   2010 Raters
5       N           508           508
        MIN         15            15
        MAX         60            60
        MEAN*       42.87         41.07
        STD*        9.02          8.41
8       N           509           509
        MIN         15            15
        MAX         60            60
        MEAN*       44.36         41.26
        STD*        7.64          7.29
* Tabled values are rounded for display purposes. Transformations were performed without rounding.

Tables 2.3.D and 2.3.E provide the resulting score distribution statistics with no adjustment, with only the ETS adjustment, and with the compound DRC and ETS adjustment; the transformation constants are provided at the bottom of the tables. The 2008 and 2009 score distributions are also provided for comparison. Relative to no adjustment and the ETS-only adjustment, the compound DRC and ETS adjustment led to higher mean scores at both grades.

Table 2.3.D Grade 5 Writing Results

Statistic      2010 No Adjustment   2010 ETS Only   2010 DRC & ETS   2009 Scores   2008 Scores
N              44994                44994           44994            43665         41988
MIN            15                   15              15               19            17
MAX            60                   60              60               60            60
MEAN           41.58                41.79           43.67            44.57         44.01
STD            7.69                 7.82            8.25             8.54          8.91
Perf Level %
PL 1 U         2.9                  3.1             2.9              3.5           4.6
PL 2 L         19.8                 19.6            15.0             13.7          15.0
PL 3 S         71.4                 71.1            71.5             69.0          67.4
PL 4 A         6.0                  6.3             10.7             13.8          13.0
Constants
DRC       C Additive        -1.147132
          D Multiplicative   1.071898
ETS       A Additive        -0.647451
          B Multiplicative   1.023409
Combined  E Additive        -1.821436
          F Multiplicative   1.096990

Table 2.3.E Grade 8 Writing Results

Statistic      2010 No Adjustment   2010 ETS Only   2010 DRC & ETS   2009 Scores   2008 Scores
N              42153                42153           42153            40962         42271
MIN            15                   15              16               19            18
MAX            60                   60              60               60            60
MEAN           42.78                42.89           46.19            45.73         45.50
STD            7.03                 7.22            7.39             7.42          7.04
Perf Level %
PL 1 U         2.1                  2.3             1.9              2.0           1.8
PL 2 L         14.4                 14.2            9.6              8.8           8.7
PL 3 S         77.7                 77.4            76.4             75.2          78.0
PL 4 A         5.9                  6.1             12.2             14.1          11.6
Constants
DRC       C Additive         1.113081
          D Multiplicative   1.048035
ETS       A Additive        -1.572272
          B Multiplicative   1.043021
Combined  E Additive        -0.411305
          F Multiplicative   1.093122

2.3.3 Rater Agreement for Operational Writing Prompts

As stated earlier, student responses were rated by two independent raters, and the score for each domain was the average of the two ratings; this average was used in the calculation of the final composite score. Consistency between the two ratings was evaluated with the following statistics (a small illustrative sketch follows the list):
- percentage of exact agreement between raters;
- percentage of adjacent agreement between raters; and
- correlation between ratings 1 and 2.
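A minimal sketch of these agreement statistics (Python with NumPy; the two rating vectors are hypothetical):

```python
import numpy as np

def rater_agreement(r1, r2):
    """Exact agreement, adjacent agreement, and Pearson correlation for two sets of ratings."""
    r1, r2 = np.asarray(r1, dtype=float), np.asarray(r2, dtype=float)
    diff = np.abs(r1 - r2)
    exact = np.mean(diff == 0) * 100        # percent identical ratings
    adjacent = np.mean(diff == 1) * 100     # percent differing by exactly one point
    corr = np.corrcoef(r1, r2)[0, 1]        # Pearson correlation between ratings 1 and 2
    return exact, adjacent, corr

# Hypothetical 1-4 domain ratings from two raters for ten responses
print(rater_agreement([3, 2, 4, 3, 1, 2, 3, 4, 2, 3],
                      [3, 2, 3, 3, 2, 2, 3, 4, 2, 4]))
```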
Table 2.3.F provides a summary of the rater-agreement analysis for the Grade 5 and Grade 8 operational prompts. Included are the mean and standard deviation of the assigned ratings, the percentages of exact and adjacent ratings, and the correlation between ratings. In Grade 5 writing, the exact agreement rate ranged from 69% to 75%, and the sum of the exact and adjacent agreement rates ranged from 99% to 100%. In Grade 8 writing, the exact agreement rate ranged from 73% to 79%, and the sum of the exact and adjacent agreement rates was 100%. The correlations between ratings ranged from 0.63 to 0.67 in Grade 5 and from 0.60 to 0.66 in Grade 8. In general, the raters were fairly consistent in each domain.

Table 2.3.F Inter-rater Agreement for Operational Writing Prompts

                              Rating 1          Rating 2          Percent Agreement
Grade   Domain a   N          Mean    SD        Mean    SD        Exact   Adjacent   Exact+Adjacent   Corr b
5       ID         44878      2.79    0.61      2.79    0.61      75      25         100              0.64
5       OUC        44878      2.66    0.64      2.66    0.64      73      26         99               0.65
5       WC         44878      2.80    0.62      2.79    0.62      73      26         100              0.63
5       SP         44878      2.78    0.71      2.77    0.70      70      29         99               0.66
5       GUM        44878      2.88    0.71      2.87    0.71      69      31         100              0.67
8       ID         42015      2.85    0.58      2.84    0.58      75      25         100              0.63
8       OUC        42015      2.83    0.58      2.82    0.58      76      24         100              0.63
8       WC         42005      2.93    0.52      2.91    0.52      79      21         100              0.60
8       SP         42015      2.84    0.61      2.83    0.61      75      25         100              0.66
8       GUM        42015      2.83    0.61      2.83    0.61      73      27         100              0.63
a ID = Ideas and Development; OUC = Organization, Unity, and Coherence; WC = Word Choice; SP = Sentences and Paragraphs; GUM = Grammar, Usage, and Mechanics
b Pearson correlation between first and second ratings

CHAPTER III. STATE RESULTS

In this section, performance on the OCCT is summarized for the participating Oklahoma student population and for demographic subgroups. All reported results are based on valid scores on the 2010 forms in the final student data received by June 24, 2010. These data differ from the analysis data in several ways: corrections were made to student and school information, invalidations and missing-data issues were resolved, and all students who took the standard OCCT (but not the equivalent or Braille forms) were included. Thus, the final counts of examinees by test differ somewhat from the samples used for item and test analysis.

As described in Chapter II, prior to the release of student reports, raw scores were converted to a reporting scale metric. Raw scores on the multiple-choice tests were converted to scaled scores using the conversion tables provided in Appendix B. For the writing tests, analytic scores were converted to composite scores using the formulas provided in the previous section. Achievement levels were also assigned using the SDE-established OCCT cut scores. The means and standard deviations of students' raw scores and scaled or composite scores are shown in Table 3.1. Table 3.2 provides the percentage of students in each achievement category in 2008, 2009, and 2010. Tables 3.3 to 3.21 provide test results by demographic subgroups.
Tables B.1 through B.19 provide the raw score, scaled score, CSEM, achievement level, and frequency distributions for each OCCT test.Table 3.1 Means and Standard Deviations of Students' Raw Scores and Scaled ScoresContent AreaGradeValid NRaw ScoresScaled ScoresMeanSDMinMaxMeanSDMinMaxReading34373734.48.7150734.681.840099044302135.68.6250720.173.740099054251037.28.2050726.678.940099064234333.29.3150720.978.040099074088735.18.0350728.170.840099084026435.88.1050740.081.9400990Mathematics34441433.18.3145736.889.440099044370832.48.0045734.490.040099054293830.48.4045729.484.240099064260230.38.6045719.776.140099074103129.37.7045723.779.140099084021430.68.2145728.183.8400990Science54345731.27.8045775.372.240099084110929.17.6045772.068.5400990Social Studies54660435.910.5060731.676.2400990Geography74463429.07.8245775.387.8400990History84388528.18.6145733.181.6400990Writing54487041.57.7156043.78.2156084200542.77.0156046.27.41660a Mean writing composite scores are reported.Table 3.2 presents the percentage of students scoring in each of the four achievement levels for all students for the current year and the past two years. It shows that for most of the grades and subject areas, the percentage of students scoring at or above the Satisfactory/Proficient achievement level increased slightly from 2009 to 2010.Table 3.2 Percentage of Students Performing within Each Achievement Category in 2008 to 2010Content AreaGradeUnsatisfactoryLimited KnowledgeSatisfactory/ ProficientAdvancedSatisfactory or Advanced200820092010200820092010200820092010200820092010200820092010Reading?32141311191883636744287676941171862017885762463926365541313122222745656109984656567151612202172565798781646475171817131564545313161478706784151414181673575791013826770Mathematics?321211202220624548162121786670421313152020645051191716836767521312112220593536273032876568661818131918483131323233806364772220151515513537262828776365842118141816573639242527816166Science?534312121058585827262984848784339119756969121719878688Social Studies?5121412201918464647222224686871Geography7332171616676364131818808182History8999252322565857101012666869Tables 3.3 to 3.21 present the scaled score and achievement level results for each test by population subgroups. Ethnic category membership is based on identifying one ethnicity; those identifying more than one or none are classified as “Other”. Economically disadvantaged is based on participation in Free and Reduced-Price Lunch. 
Table 3.3 Subgroup Results: Grade 3 ReadingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4373734.48.70.890.89734.681.812.718.266.72.4??GenderFemale2189733.88.80.890.89729.582.414.318.964.72.1?Male2179734.98.50.880.89739.980.811.017.568.82.7??EthnicityAfrican American460630.88.90.880.88702.281.022.123.953.10.9?American Indian832533.88.50.880.88728.877.812.919.765.61.7?Hispanic527131.18.70.880.88704.878.520.025.254.00.8?Asian89736.48.40.890.89755.485.58.716.270.05.1?White9432.58.30.870.87718.881.416.022.356.45.3?Pacific Islander2425535.98.30.880.88748.680.09.315.072.53.2?Other28933.19.00.890.89725.086.713.823.259.93.1??IEPNo4236234.78.50.880.88737.480.111.518.168.02.5?Yes137524.89.10.880.88648.985.548.922.029.10.1??ELLNo4292934.58.60.890.89736.081.112.117.967.52.5?Yes80825.88.80.870.87658.483.440.132.127.40.5??FLSNo1734737.37.80.880.88762.478.46.312.576.94.3?Yes2639032.48.70.880.88716.378.816.921.960.01.2Table 3.4 Subgroup Results: Grade 4 ReadingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4302135.68.60.890.89720.173.717.817.461.53.4??GenderFemale2137335.28.80.900.90717.675.419.217.559.93.4?Male2160135.98.30.890.89722.771.816.417.363.03.4??EthnicityAfrican American456932.29.00.890.89692.072.729.320.948.51.3?American Indian831435.08.40.880.89714.370.318.919.060.02.2?Hispanic493832.49.00.890.89693.672.327.921.449.41.3?Asian85237.48.20.890.89736.975.213.314.666.75.5?Pacific Islander7733.89.80.920.92706.381.124.715.655.83.9?White2385637.08.10.880.89732.672.013.215.466.94.6?Other41534.58.50.880.88711.872.420.519.856.92.9??IEPNo4146135.98.40.890.89722.672.216.617.362.73.5?Yes156027.39.60.890.89653.079.650.019.929.50.6??ELLNo4241735.78.50.890.89721.273.217.317.362.03.4?Yes60426.08.60.860.86644.170.455.522.521.90.2??FLSNo1744338.37.50.880.88744.470.69.412.971.95.9?Yes2557833.78.70.890.89703.671.123.520.454.31.7Table 3.5 Subgroup Results: Grade 5 ReadingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4251037.28.20.890.89726.678.913.021.756.48.9??GenderFemale2122336.88.60.900.90723.081.414.821.455.38.5?Male2123837.67.80.880.88730.376.111.222.057.59.3??EthnicityAfrican American450933.98.70.890.89695.776.822.329.644.04.0?American Indian819936.48.20.890.89718.376.514.324.055.16.5?Hispanic466734.28.60.890.89698.075.921.629.045.24.2?Asian80939.08.00.900.90747.484.89.017.857.715.5?Pacific Islander8736.08.70.900.90716.177.419.525.346.09.2?White2378738.67.60.880.88740.476.79.118.261.311.4?Other45236.18.80.900.90716.881.218.421.753.16.9??IEPNo4080137.57.90.890.89729.877.111.721.457.69.2?Yes170928.69.70.900.90650.783.743.729.225.61.5??ELLNo4205137.38.10.890.89727.678.412.621.656.89.0?Yes45926.78.60.870.87635.772.052.531.215.70.7??FLSNo1791739.97.00.870.87753.274.96.115.463.914.6?Yes2459335.28.50.890.89707.276.018.026.350.94.8Table 3.6 Subgroup Results: Grade 6 ReadingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4234333.29.30.900.90720.978.015.520.657.16.8??GenderFemale2119332.69.60.900.90715.580.418.220.854.76.3?Male2111133.98.90.890.89726.475.012.920.359.47.4??EthnicityAfrican American454829.39.40.890.89688.777.026.827.343.22.7?American 
Indian802832.69.10.890.89715.374.816.422.156.25.3?Hispanic448529.79.50.890.89692.177.925.925.545.63.1?Asian82336.58.70.900.90751.579.67.817.458.416.4?Pacific Islander6729.410.90.920.92691.292.631.322.441.84.5?White2373434.88.90.890.89733.575.811.417.862.18.6?Other65832.88.60.870.87716.268.213.423.658.74.4??IEPNo4074533.69.10.890.89723.976.414.220.358.47.1?Yes159823.88.80.860.86643.978.250.326.022.90.8??ELLNo4201933.39.20.900.90721.677.415.220.657.46.9?Yes32421.79.00.870.87625.087.262.721.314.21.9??FLSNo1878536.28.50.890.89745.774.08.614.765.611.2?Yes2355830.89.30.890.89701.175.321.125.250.33.4Table 3.7 Subgroup Results: Grade 7 ReadingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4088735.18.00.870.87728.170.817.515.453.313.7??GenderFemale2009134.48.40.880.88722.373.920.615.551.112.8?Male2076235.87.50.860.86733.867.114.515.355.514.7??EthnicityAfrican American412231.78.30.870.87698.470.629.619.545.25.7?American Indian776534.28.00.860.86720.068.620.217.152.210.6?Hispanic423732.18.40.870.87701.871.828.219.146.06.7?Asian83137.37.40.860.86749.068.411.911.954.022.1?Pacific Islander7332.39.50.900.90705.184.431.516.438.413.7?White2331136.57.50.860.86740.267.912.613.656.617.2?Other54834.88.10.870.87726.472.719.216.651.612.6??IEPNo3946835.47.80.870.87730.569.516.315.354.314.2?Yes141927.38.60.870.87662.674.050.520.127.32.0??ELLNo4058435.27.90.870.87728.870.217.115.453.613.8?Yes30324.68.80.870.87638.380.265.313.219.52.0??FLSNo1908537.47.10.850.85748.466.39.712.058.220.1?Yes2180233.18.20.870.87710.369.724.318.449.18.2Table 3.8 Subgroup Results: Grade 8 ReadingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4026435.88.10.880.88740.081.914.116.456.612.9??GenderFemale2021235.28.30.880.88733.282.416.017.255.611.1?Male1998636.57.80.870.87747.180.812.115.557.714.7??EthnicityAfrican American405232.18.50.880.88703.680.726.022.945.95.3?American Indian751335.37.90.870.87733.577.514.618.257.110.0?Hispanic405032.08.90.890.89703.384.827.320.047.35.4?Asian82838.58.00.890.89770.789.28.110.655.625.7?Pacific Islander8232.99.80.910.91709.599.125.612.253.78.5?White2325437.27.50.860.86754.178.39.614.260.116.1?Other48534.88.20.870.88728.879.917.518.154.49.9??IEPNo3934136.08.00.880.88741.681.113.416.257.213.2?Yes92328.59.00.880.88670.185.641.623.033.02.4??ELLNo4000835.98.00.880.88740.881.413.716.356.913.0?Yes25623.07.90.830.83620.478.467.219.112.90.8??FLSNo1964038.27.10.860.86764.577.37.112.161.819.0?Yes2062433.58.30.870.87716.679.420.720.451.77.1Table 3.9 Subgroup Results: Grade 3 MathematicsGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4441433.18.30.900.91736.889.411.019.548.221.3??GenderFemale2237033.68.20.900.91742.489.89.918.448.223.4?Male2200032.68.30.900.90731.288.512.120.648.119.2??EthnicityAfrican American466729.29.10.910.91696.991.922.625.540.311.5?American Indian849132.68.20.900.90731.586.911.720.748.619.0?Hispanic536531.18.50.900.90715.587.915.824.145.914.2?Asian93436.17.70.910.91773.394.66.712.445.135.8?Pacific 
Islander9731.09.00.910.91717.595.314.427.840.217.5?White2456734.37.70.890.90749.786.27.717.150.225.1?Other29331.58.40.900.90721.488.814.324.643.317.7??IEPNo4245933.48.10.900.90739.688.510.319.048.722.1?Yes195527.28.70.890.89675.786.427.130.236.95.8??ELLNo4317533.28.20.900.90738.089.010.619.348.421.7?Yes123928.79.10.900.91693.392.224.526.838.510.3??FLSNo1750335.57.30.890.89762.785.25.714.449.730.1?Yes2691131.58.50.900.90719.988.014.522.847.115.6Table 3.10 Subgroup Results: Grade 4 MathematicsGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4370832.48.00.890.89734.490.013.120.050.516.4??GenderFemale2189232.98.00.890.90740.091.412.218.650.818.4?Male2176731.97.90.890.89728.988.114.021.450.214.4??EthnicityAfrican American464328.78.30.890.89693.587.924.426.242.66.8?American Indian843431.87.90.890.89727.387.114.221.850.513.5?Hispanic508030.48.20.890.89711.387.718.424.147.410.1?Asian87936.37.10.890.89784.995.25.211.149.434.2?Pacific Islander8229.99.00.910.91707.796.826.818.341.513.4?White2417333.67.50.890.89748.087.69.717.652.720.0?Other41731.87.70.880.88726.984.514.421.853.010.8??IEPNo4153932.77.80.890.89737.789.012.119.551.317.1?Yes216926.58.10.870.87670.884.232.429.235.33.1??ELLNo4273032.57.90.890.89735.789.512.719.850.916.6?Yes97827.18.70.890.89678.092.231.626.934.67.0??FLSNo1758234.97.10.880.88764.187.46.814.653.425.3?Yes2612630.78.00.880.89714.486.017.423.648.610.4Table 3.11 Subgroup Results: Grade 5 MathematicsGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4293830.48.40.890.89729.484.212.319.536.431.8??GenderFemale2156530.78.60.900.90732.587.012.618.035.434.0?Male2132430.18.20.890.89726.481.212.021.037.429.6??EthnicityAfrican American454827.18.50.880.89696.783.721.525.434.518.6?American Indian827929.28.30.880.88717.880.714.522.237.525.7?Hispanic477728.38.50.890.89709.483.717.323.236.722.8?Asian82734.58.20.910.91775.796.26.310.628.754.4?Pacific Islander9028.99.00.900.90713.293.215.623.337.823.3?White2397031.78.00.890.89742.481.68.917.036.537.6?Other44728.68.70.890.90711.887.819.218.337.425.1??IEPNo4080430.78.20.890.89732.882.611.219.036.733.0?Yes213423.88.50.880.88664.789.532.928.729.09.4??ELLNo4227230.58.30.890.89730.583.711.919.436.532.2?Yes66623.58.60.880.88661.189.436.327.924.910.8??FLSNo1803533.07.70.880.89756.181.16.513.935.244.5?Yes2490328.48.30.880.88710.081.116.623.637.222.7Table 3.12 Subgroup Results: Grade 6 MathematicsGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4260230.38.60.890.90719.776.118.118.230.832.8??GenderFemale2142830.68.70.900.90722.278.618.017.529.634.9?Male2113630.08.40.890.89717.373.418.219.032.030.7??EthnicityAfrican American457126.78.80.890.89689.377.031.721.127.219.9?American Indian807529.48.40.880.89712.472.620.019.732.328.0?Hispanic456328.18.80.890.89700.777.525.421.129.324.2?Asian84335.47.80.900.90769.580.88.29.820.361.7?Pacific Islander7128.39.50.910.91706.485.125.428.219.726.8?White2382231.58.20.890.89730.273.513.816.931.637.7?Other65729.38.60.890.89709.677.120.418.933.627.1??IEPNo4065630.68.40.890.89722.974.616.718.031.334.0?Yes194622.78.00.860.86654.377.748.423.919.87.9??ELLNo4223330.48.60.890.89720.375.717.818.230.933.0?Yes36922.59.20.890.89652.491.751.521.715.211.7??FLSNo1883832.97.90.890.89742.173.210.214.431.044.4?Yes2376428.28.50.880.89702.073.624.421.330.723.6Table 3.13 Subgroup 
Results: Grade 7 MathematicsGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4103129.37.70.870.87723.779.120.115.237.127.6??GenderFemale2027629.47.80.870.87725.280.519.614.737.428.3?Male2072029.17.70.860.87722.277.720.615.736.727.0??EthnicityAfrican American410926.08.00.870.87691.483.135.116.831.916.2?American Indian780228.27.50.850.86712.776.023.017.437.821.7?Hispanic430226.87.80.860.86698.880.030.117.234.418.3?Asian84533.77.20.870.87772.780.97.110.232.849.9?Pacific Islander7726.88.60.890.89699.887.936.414.323.426.0?White2334930.57.40.860.86736.075.515.113.938.332.7?Other54728.87.20.840.84719.470.919.218.341.021.6??IEPNo3934729.57.60.860.87726.477.718.815.137.628.5?Yes168422.97.50.840.84659.483.149.818.125.17.1??ELLNo4066829.37.70.870.87724.378.719.815.237.227.8?Yes36322.58.00.860.86653.388.953.716.819.89.6??FLSNo1912131.57.20.860.86746.574.912.012.237.937.9?Yes2191027.37.60.850.86703.877.227.217.836.318.7Table 3.14 Subgroup Results: Grade 8 MathematicsGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4021430.68.20.880.88728.183.818.116.038.627.3??GenderFemale2025830.78.40.890.89729.786.518.515.537.228.8?Male1988830.58.00.870.88726.880.717.616.539.925.9??EthnicityAfrican American404327.08.30.870.87692.783.330.718.537.413.4?American Indian749029.68.10.870.88718.880.820.917.838.023.4?Hispanic414128.08.40.880.88702.184.527.417.937.517.1?Asian85036.27.20.890.89793.892.36.16.630.257.1?Pacific Islander8529.09.50.910.91711.6100.624.716.531.827.1?White2311931.77.80.870.88740.080.213.714.939.531.9?Other48629.38.10.870.88715.081.222.017.939.320.8??IEPNo3907730.78.10.880.88730.083.017.415.938.827.9?Yes113724.38.10.860.86665.684.643.420.329.46.9??ELLNo3997030.68.20.880.88728.683.517.916.038.727.4?Yes24422.88.20.860.86651.289.650.820.921.76.6??FLSNo1956732.87.60.870.88750.780.411.012.839.636.6?Yes2064728.48.20.870.87706.881.224.919.037.618.5Table 3.15 Subgroup Results: Grade 5 ScienceGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4345731.27.80.880.88775.372.22.710.457.529.3??GenderFemale2183131.58.10.890.89778.576.13.210.354.232.3?Male2157630.97.60.870.87772.267.92.310.660.926.3??EthnicityAfrican American464326.97.90.870.87737.571.56.619.861.212.4?American Indian839330.57.70.870.87769.269.32.611.660.625.2?Hispanic480728.17.70.860.86747.969.14.715.864.615.0?Asian82933.67.60.890.89798.873.01.46.848.543.3?Pacific Islander8929.88.40.890.89761.180.65.613.559.621.3?White2426232.87.40.870.87789.469.41.77.254.736.3?Other43430.67.70.870.87770.666.22.312.758.326.7??IEPNo4095131.67.70.880.88778.870.82.49.457.630.6?Yes250624.87.80.850.85719.272.69.326.856.27.8??ELLNo4285231.37.80.880.88776.371.72.610.157.629.6?Yes60522.97.10.820.82703.368.310.933.252.43.5??FLSNo1816733.97.00.870.87800.067.91.15.451.142.5?Yes2529029.27.80.870.87757.670.03.914.162.219.8Table 3.16 Subgroup Results: Grade 8 ScienceGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4110929.17.60.860.86772.068.53.49.168.918.6??GenderFemale2076529.37.90.870.87774.071.83.89.266.020.9?Male2029528.97.30.850.85770.164.83.09.071.816.3??EthnicityAfrican American417725.27.30.830.83738.366.77.216.969.86.1?American 
Indian769828.27.30.840.84764.864.03.410.072.713.9?Hispanic418525.97.70.850.85743.472.87.414.668.89.1?Asian85532.47.70.880.88801.072.41.86.056.435.9?Pacific Islander8427.17.90.860.86752.773.87.113.166.713.1?White2360130.67.30.850.86784.665.12.16.667.823.6?Other50928.67.30.840.85766.370.13.38.474.713.6??IEPNo3923629.57.50.860.86775.066.83.08.369.319.4?Yes187322.06.90.810.81709.373.012.526.558.52.4??ELLNo4072429.27.60.860.86772.967.83.28.969.118.8?Yes38519.16.40.760.76680.179.123.433.242.11.3??FLSNo1984631.47.10.850.85792.064.21.65.366.027.1?Yes2126327.07.50.850.85753.467.15.112.771.510.7Table 3.17 Subgroup Results: Grade 5 Social StudiesGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4660435.910.50.890.89731.676.211.917.646.723.8??GenderFemale2382536.610.90.900.90735.780.312.315.744.527.4?Male2272235.210.00.880.88727.471.411.519.548.920.1??EthnicityAfrican American513630.410.10.880.88691.781.124.326.638.810.4?American Indian910435.110.10.880.88726.772.412.119.348.520.1?Hispanic516232.39.90.870.87706.874.417.524.345.412.8?Asian84539.810.70.910.91758.777.07.711.841.239.3?Pacific Islander9335.210.90.900.90724.386.215.115.149.520.4?White2578737.910.20.890.89745.472.48.414.048.029.5?Other47735.510.30.890.89729.373.213.616.447.822.2??IEPNo4125537.010.10.890.89740.071.99.015.948.726.4?Yes534927.18.90.840.84667.177.634.530.430.94.2??ELLNo4593836.010.50.890.89732.675.811.617.346.924.1?Yes66626.58.30.820.82664.772.435.433.527.53.6??FLSNo1884240.09.80.890.89759.769.05.310.747.236.7?Yes2776233.110.00.880.88712.574.916.422.246.315.1Table 3.18 Subgroup Results: Grade 7 GeographyGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4463429.07.80.860.86775.387.82.416.063.618.0??GenderFemale2253929.38.00.870.87779.292.32.715.860.920.7?Male2202528.67.40.850.85771.682.62.116.366.315.3??EthnicityAfrican American464425.07.80.850.85731.489.75.628.858.37.2?American Indian865928.27.50.840.85766.582.92.217.866.213.8?Hispanic463926.37.80.850.85745.588.34.324.961.19.8?Asian85732.87.00.850.85819.181.10.57.056.935.6?Pacific Islander8326.48.80.890.89743.6111.88.418.160.213.3?White2518730.37.40.850.85790.684.21.611.864.222.4?Other56528.97.70.860.86773.887.83.014.566.515.9??IEPNo4182029.57.50.850.85781.684.11.814.065.119.1?Yes281420.67.10.810.81681.888.412.246.439.91.5??ELLNo4440929.07.70.860.86775.987.42.315.963.718.1?Yes22519.87.30.820.82670.593.914.750.233.81.3??FLSNo2010931.77.00.840.84805.280.21.08.363.327.4?Yes2452526.87.70.850.85750.986.13.622.463.810.3Table 3.19 Subgroup Results: Grade 8 HistoryGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled ScoresPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv?????????????Overall?4388528.18.60.890.89733.181.69.122.156.612.2??GenderFemale2253828.68.90.900.90737.685.69.520.056.014.4?Male2128927.58.30.880.88728.476.98.724.257.39.8??EthnicityAfrican American459024.28.30.870.87696.681.216.832.646.34.2?American Indian833027.18.40.880.88724.579.19.925.155.79.3?Hispanic444725.28.40.870.87707.179.714.328.651.25.9?Asian86732.18.40.900.90771.984.05.011.656.526.9?Pacific 
Islander8927.18.60.890.88723.076.911.224.753.910.1?White2501129.58.30.880.89746.279.06.618.459.715.3?Other55127.48.30.880.88725.780.810.722.157.99.3??IEPNo3976229.08.30.880.88741.477.26.720.259.813.3?Yes412319.67.20.820.82653.280.132.840.325.71.2??ELLNo4346528.28.60.890.89734.081.18.821.957.012.3?Yes42018.36.40.770.77640.576.338.840.720.20.2??FLSNo2059530.88.10.880.88757.676.94.815.161.618.5?Yes2329025.78.40.870.88711.479.512.928.352.26.6Table 3.20 Subgroup Results: Grade 5 WritingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled Scores aPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv???????????Overall?4487041.57.7--43.78.21.815.272.310.8??GenderFemale2287140.07.6--42.08.22.619.570.87.1?Male2195043.17.4--45.47.90.910.673.814.6??EthnicityAfrican American492739.47.7--41.38.43.321.668.76.4?American Indian846240.87.5--42.98.11.817.272.48.7?Hispanic467739.77.4--41.88.02.519.472.06.0?Asian80344.47.4--46.77.70.28.872.518.4?Pacific Islander9740.87.9--42.88.53.115.570.111.3?White2514042.47.6--44.68.11.412.673.113.0?Other76441.87.7--44.08.31.316.269.812.7??IEPNo4100642.37.2--44.57.70.812.475.211.6?Yes386433.47.9--34.88.712.544.141.61.9??ELLNo4451841.67.7--43.78.21.715.072.410.9?Yes35235.47.1--37.07.77.733.858.00.6??FLSNo1863043.97.4--46.27.80.79.073.217.1?Yes2624039.97.4--41.98.12.619.571.66.3a Weighted composite scoresTable 3.21 Subgroup Results: Grade 8 WritingGroupSubgroupValid NRaw ScoresAlphaStratified AlphaScaled Scores aPercent in Achievement LevelMeanSDMeanSDUnsatLimSatAdv???????????Overall?4200542.77.0--46.27.40.89.777.212.3??GenderFemale2154141.27.1--44.67.61.313.576.98.3?Male2043144.36.5--47.96.70.45.777.416.5??EthnicityAfrican American432640.47.0--43.77.61.316.176.36.3?American Indian730042.17.0--45.57.51.110.778.010.1?Hispanic395441.06.8--44.47.31.213.678.66.7?Asian83445.77.0--49.26.90.44.769.225.8?Pacific Islander7742.67.1--46.17.51.311.777.99.1?White2402543.56.9--47.07.20.67.877.014.5?Other148942.86.9--46.27.20.88.479.011.8??IEPNo3884643.46.5--47.06.80.47.179.313.2?Yes315934.17.2--36.87.96.241.751.21.0??ELLNo4164942.87.0--46.37.30.89.477.412.4?Yes35635.17.1--38.07.84.838.255.31.7??FLSNo2019944.56.6--48.16.70.45.676.517.6?Yes2180641.17.0--44.47.51.313.577.87.4 a Weighted composite scoresCHAPTER IV. PERFORMANCE STANDARDSPerformance standards represent the criteria which specify a minimum score a student must achieve on the statewide assessment to be placed into a given performance level. In Oklahoma, four performance levels (i.e., unsatisfactory, limited knowledge, satisfactory, and advanced) were previously established for grades 5 and 8 in 2001, for grades 3 and 4 in 2005, and for grades 6 and 7 in 2006. However, to increase rigor by raising standards for grades 3 through 8 student’s achievement on the OCCT as a means to be more competitive at the national and international levels, to vertically align proficiency expectations for students on the OCCT tests across grades 3 through 8, and to align student expectations on the OCCT more closely with student expectations for the National Assessment of Educational Progress (NAEP), revised performance standards (unsatisfactory, limited knowledge, proficient, and advanced) for reading and mathematics were established in 2009. The workshop to set new academic achievement level cutpoints for grades 3 through 8 in reading and mathematics was held June 15-18, 2009 in Oklahoma City. Thirty seven educational stakeholders from Oklahoma participated in recommending cut scores for the OCCT. 
Committee members were primarily selected to span grades 3 through 8, although a small number of higher-education faculty and members of the business community who are knowledgeable about education in Oklahoma were also selected. The standard-setting method employed was the Bookmark procedure (Lewis, Mitzel, & Green, 1996), the same procedure used in the previous setting of performance-level cut scores in reading and mathematics for grades 3 through 8. The details of the standard-setting materials, procedures, methods, and results are reported in the OCCT Standard Setting: Technical Report (SDE, 2009). Table 4.1 summarizes the final scaled score ranges for the achievement levels.

Table 4.1 Final Scaled Score Ranges for Reading and Mathematics

Subject       Grade   Unsatisfactory   Limited Knowledge   Proficient   Advanced
Reading       3       400-648          649-699             700-890      891-990
Reading       4       400-657          658-699             700-844      845-990
Reading       5       400-640          641-699             700-829      830-990
Reading       6       400-646          647-699             700-827      828-990
Reading       7       400-667          668-699             700-801      802-990
Reading       8       400-654          655-699             700-832      833-990
Mathematics   3       400-635          636-699             700-797      798-990
Mathematics   4       400-638          639-699             700-815      816-990
Mathematics   5       400-641          642-699             700-766      767-990
Mathematics   6       400-659          660-699             700-753      754-990
Mathematics   7       400-666          667-699             700-765      766-990
Mathematics   8       400-661          662-699             700-770      771-990

REFERENCES

Brennan, R. L. (2004). BB-CLASS [Computer software]. Iowa City, IA.
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Harcourt Brace Jovanovich College Publishers.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
Data Recognition Corporation (2008). iTEMs [Computer software]. Maple Grove, MN.
Dorans, N. J., & Holland, P. W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 35-66). Hillsdale, NJ: Lawrence Erlbaum.
Hanson, B. A., & Brennan, R. L. (1990). An investigation of classification consistency indexes estimated under alternative strong true score models. Journal of Educational Measurement, 27, 345-359.
Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Lawrence Erlbaum.
Kolen, M. J., Zeng, L., & Hanson, B. A. (1996). Conditional standard errors of measurement for scale scores using IRT. Journal of Educational Measurement, 33(2), 129-140.
Lewis, D. M., Mitzel, H. C., & Green, D. R. (1996). Standard setting: A Bookmark approach. In D. R. Green (Chair), IRT-based standard-setting procedures utilizing behavioral anchoring. Symposium conducted at the Council of Chief State School Officers National Conference on Large Scale Assessment, Phoenix, AZ.
Livingston, S. A., & Lewis, C. (1995). Estimating the consistency and accuracy of classifications based on test scores. Journal of Educational Measurement, 32(2), 179-197.
Lord, F. M. (1965). A strong true-score theory with applications. Psychometrika, 30, 239-297.
Mantel, N., & Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.
Muraki, E., & Bock, R. D. (2003). PARSCALE 4 [Computer software]. Chicago: Scientific Software International.
Qualls, A. L. (1995). Estimating the reliability of a test containing multiple item formats. Applied Measurement in Education, 8, 111-120.
SDE (2009). OCCT standard setting: Technical report. Oklahoma City, OK.
Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory.
Applied Psychological Measurement, 7(2), 201–210.Thissen, D., Pommerich, M., Billeaud, K., & Williams, V. S. L. (1995). Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19, 39-49.Zwick, R., & Ercikan, K. (1989). Analysis of differential item functioning in the NAPE history assessment. Journal of Educational Measurement, 26, 55-66.APPENDIX A. DATA REVIEW RESULTSThe items with poor statistics were flagged for SDE’s review. The results of the item data review are shown below in Table A.Table A Data Review ResultsSubjectGradeAcceptAcceptW/R*AcceptTotalPercentAcceptRejectTotalMathematics380480580680780880Reading380480580680780880Social Studies580780880Science580880* Items may be edited and returned to the pool for future field testing. APPENDIX B. RAW-TO-SCALED SCORE CONVERSION TABLES AND FREQUENCY DISTRIBUTIONSTable B.1 Raw-to-Scaled Score Table and Frequency Distribution: Grade 3 ReadingRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040055U00.0000.00140055U10.0010.00240055U10.0020.00340055U10.0030.01440055U20.0050.01540055U40.0190.02640055U90.02180.04740055U130.03310.07840055U280.06590.13940055U570.131160.271044662U800.181960.451149166U1210.283170.721251966U1540.354711.081354063U1960.456671.531455658U2620.609292.121556952U2960.6812252.801658146U3200.7315453.531759241U3940.9019394.431860237U4611.0524005.491961133U4821.1028826.592061931U5491.2634317.842162829U6491.4840809.332263628U6881.57476810.902364326U7671.75553512.662465126L8721.99640714.652565825L10092.30741616.952666625L10792.47849519.422767324L11062.53960121.952868024L11782.691077924.642968724L12892.951206827.593069423L14203.251348830.843170223S14323.271492034.113270923S15703.591649037.703371623S16653.811815541.513472424S17473.991990245.503573124S18914.322179349.833673924S18264.182361954.003774725S19034.352552258.353875525S19544.472747662.823976426S20184.612949467.434077427S20064.593150072.024178429S19134.373341376.394279531S18774.293529080.694380733S19084.363719885.054482137S17644.033896289.084583741S15123.464047492.544685646S12912.954176595.494788150S9092.084267497.574891649A6201.424329498.994997836A3330.764362799.755099036A1100.2543737100.00Table B.2 Raw-to-Scaled Score Table and Frequency Distribution: Grade 4 ReadingRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040056U00.0000.00140056U00.0000.00240056U20.0020.00340056U10.0030.01440056U10.0040.01540056U30.0170.02640056U70.02140.03740056U120.03260.06840056U260.06520.12940056U490.111010.231046662U810.191820.421149964U1220.283040.711252263U1230.294270.991353959U2060.486331.471455353U2210.518541.991556547U2380.5510922.541657642U2720.6313643.171758537U3570.8317214.001859433U3680.8620894.861960230U3860.9024755.752061028U4511.0529266.802161726U4821.1234087.922262424U5701.3239789.252363123U5791.35455710.592463723U6621.54521912.132564422U7331.71595213.842665021U8101.88676215.722765621U8972.08765917.802866221L10322.40869120.202966821L10612.47975222.673067420L11252.611087725.283168120L12832.981216028.273268720L14203.301358031.573369320L15463.591512635.163470021S16543.841678039.013570621S16983.951847842.953671321S18764.362035447.313772122S19114.442226551.753872822S20464.762431156.513973623S21424.982645361.494074424S22045.122865766.614175325S21535.003081071.624276327S22045.123301476.744377429S20214.703503581.444478632S20094.673704486.114580036S17744.123881890.234681742S15393.584035793.814783848S12082.814156596.624886954A8602.004242598.614992351A4521.054287799.675099051A1440.3343021100.00Table B.3 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 ReadingRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040046U10.0010.00140046U40.0150.01240046U60.01110.03340046U20.00130.03440046U30.01160.04540046U50.01210.05640046U70.02280.07740046U150.04430.10840046U130.03560.13940046U270.06830.201044153U500.121330.311147656U700.162030.481250056U1090.263120.731351854U1260.304381.031453350U1660.396041.421554546U1850.447891.861655641U2060.489952.341756637U2280.5412232.881857533U2840.6715073.551958331U2990.7018064.252059128U3140.7421204.992159927U3380.8024585.782260626U4160.9828746.762361325U3800.8932547.652462024U4571.0837118.732562723U5271.2442389.972663423U5961.40483411.372764023U6961.64553013.012864723L7331.72626314.732965422L8732.05713616.793066122L10172.39815319.183166822L10632.50921621.683267522L12502.941046624.623368222L13313.131179727.753468923L14343.371323131.123569723L15393.621477034.743670423S17564.131652638.883771224S19804.661850643.533872124S20094.732051548.263972925S21184.982263353.244073926S22725.342490558.594174827S23365.502724164.084275928S24155.682965669.764377030S24205.693207675.464478333S24655.803454181.254579837S21985.173673986.424681542S19884.683872791.104783749A16963.994042395.094886854A11902.804161397.894991953A6681.574228199.465099053A2290.5442510100.00Table B.4 Raw-to-Scaled Score Table and Frequency Distribution: Grade 6 ReadingRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040058U00.0000.00140058U30.0130.01240058U40.0170.02340058U30.01100.02440058U60.01160.04540058U30.01190.04640058U80.02270.06740058U230.05500.12840058U450.11950.22940058U600.141550.371046165U1080.262630.621149968U1490.354120.971252567U2270.546391.511354664U3150.749542.251456259U3600.8513143.101557753U3930.9317074.031658947U5401.2822475.311760042U5531.3128006.611861037U6221.4734228.081962034U7171.6941399.772062831U7411.75488011.522163629U8281.96570813.482264427U8742.06658215.542365125L9262.19750817.732465824L9372.21844519.942566423L9782.31942322.252667122L10802.551050324.802767722L11062.611160927.422868321L11712.771278030.182968921L12352.921401533.103069521L12723.001528736.103170121S13513.191663839.293270721S14033.311804142.613371421S14413.401948246.013472021S15063.562098849.573572721S16773.962266553.533673322S17144.052437957.583774022S16663.932604561.513874823S17404.112778565.623975624S18074.272959269.894076425S17884.223138074.114177326S17624.163314278.274278328S17244.073486682.344379330S16303.853649686.194480633S15953.773809189.964582037S13543.203944593.164683742A10902.574053595.734785848A8552.024139097.754888951A5711.354196199.104994444A2950.704225699.795099044A870.2142343100.00Table B.5 Raw-to-Scaled Score Table and Frequency Distribution: Grade 7 ReadingRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040049U00.0000.00140049U00.0000.00240049U00.0000.00340049U20.0020.00440049U20.0040.01540049U20.0060.01640049U50.01110.03740049U90.02200.05840049U180.04380.09940049U300.07680.171043957U410.101090.271147961U670.161760.431250662U930.232690.661352760U1230.303920.961454456U1450.365371.321555951U1760.437131.751657246U2410.599542.341758341U3110.7612653.101859337U3270.8015923.901960334U3860.9419784.842061231U4050.9923835.832162029U4811.1828647.012262828U5701.3934348.402363627U6211.5240559.922464326U6131.50466811.422565125U6901.69535813.112665824U8562.09621415.202766524U9382.29715217.492867223L10762.63822820.132967923L11372.78936522.913068523L12583.081062325.983169222L13663.341198929.323269922L14713.601346032.923370623S15973.911505736.833471323S17014.161675840.993572123S19214.701867945.693672823S20154.932069450.613773624S20254.952271955.573874424S21695.302488860.873975325S21675.302705566.174076226S21465.252920171.424177127S21765.323137776.744278128S20364.983341381.724379330S18564.543526986.264480533A17064.173697590.434581937A13443.293831993.724683743A10642.603938396.324785848A7971.954018098.274888950A4311.054061199.324994444A2190.544083099.865099044A570.1440887100.00Table B.6 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 ReadingRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040053U10.0010.00140053U10.0020.00240053U10.0030.01340053U20.0050.01440053U20.0070.02540053U10.0080.02640053U20.00100.02740053U30.01130.03840053U150.04280.07940053U210.05490.121040053U380.09870.221145359U530.131400.351248863U670.172070.511351362U1160.293230.801453259U1600.394831.201554855U1800.456631.641656150U2150.538782.181757345U2400.6011182.771858441U3150.7814333.561959437U3530.8817864.432060434U3930.9821795.412161332U4291.0726086.472262230U4781.1930867.662363029U5351.3336218.992463828U6711.67429210.662564627U6831.70497512.352665426U6971.73567214.082766126L8452.10651716.182866925L9392.33745618.522967625L10332.57848921.083068425L11932.96968224.043169125L12273.051090927.093269925L13493.351225830.443370725S13923.461365033.903471425S15143.761516437.663572225S18214.521698542.183673126S18284.541881346.723773926S18384.562065151.293874827S20475.082269856.373975828S21315.292482961.664076829S21825.422701167.084177931S21555.352916672.444279132S20345.053120077.494380435S19864.933318682.424481938S18784.663506487.084583642A15953.963665991.054685647A14953.713815494.764788251A10762.673923097.434891850A6281.563985898.994998236A3290.824018799.815099036A770.1940264100.00Table B.7 Raw-to-Scaled Score Table and Frequency Distribution: Grade 3 MathematicsRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040045U00.0000.00140045U10.0010.00240045U10.0020.00340045U10.0030.01440045U20.0050.01540045U20.0070.02640045U230.05300.07740045U280.06580.13840045U450.101030.23944153U690.161720.391047557U1090.252810.631150057U1510.344320.971252055U1780.406101.371353752U2500.568601.941455248U3280.7411882.671556543U3180.7215063.391657740U4150.9319214.331758937U4641.0423855.371859934U4801.0828656.451960932U6141.3834797.832061930U6631.4941429.332162829U7521.69489411.022263728L7741.74566812.762364627L8761.97654414.732465426L10092.27755317.012566225L10252.31857819.312667024L11012.48967921.792767824L12222.751090124.542868623L13072.941220827.492969323L13403.021354830.503070123S14213.201496933.703170823S16043.611657337.313271622S16443.701821741.023372423S17854.022000245.043473223S18174.092181949.133574023S19124.302373153.433674924S20354.582576658.013775825S21424.822790862.843876827S22665.103017467.943977930S23405.273251473.214079234S24235.463493778.664180840A25025.633743984.304282747A23185.223975789.514385555A21574.864191494.374490258A16213.654353598.024599058A8791.9844414100.00Table B.8 Raw-to-Scaled Score Table and Frequency Distribution: Grade 4 MathematicsRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040049U20.0020.00140049U10.0030.01240049U10.0040.01340049U10.0050.01440049U50.01100.02540049U40.01140.03640049U80.02220.05740049U110.03330.08840049U260.06590.13940049U470.111060.241044257U760.171820.421148161U1210.283030.691250761U1950.454981.141352759U2090.487071.621454455U3170.7310242.341555850U3590.8213833.161657145U4361.0018194.161758341U4581.0522775.211859438U5041.1527816.361960535U5901.3533717.712061533U7281.6740999.382162531U7741.77487311.152263430U8551.96572813.112364329L9712.22669915.332465228L10772.46777617.792566127L11182.56889420.352666926L12132.781010723.122767826L13333.051144026.172868626L14593.341289929.512969525L15633.581446233.093070325S17073.911616936.993171225S17313.961790040.953272025S18084.141970845.093372925S18964.342160449.433473826S19844.542358853.973574826S20964.802568458.763675827S21134.832779763.603776929S22245.093002168.693878030S21414.903216273.583979333S21644.953432678.534080836S22205.083654683.614182541A20844.773863088.384284746A18824.314051292.694387651A15503.554206296.234492549A10982.514316098.754599049A5481.2543708100.00Table B.9 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 MathematicsRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040051U10.0010.00140051U10.0020.00240051U30.0150.01340051U20.0070.02440051U80.02150.03540051U120.03270.06640051U190.04460.11740051U430.10890.21840051U580.141470.34941762U980.232450.571047768U1770.414220.981151270U2060.486281.461253868U2950.699232.151355863U3580.8312812.981457557U4421.0317234.011559051U5411.2622645.271660245U6431.5029076.771761440U7141.6636218.431862536U7931.85441410.281963533U8772.04529112.322064430L9092.12620014.442165328L10322.40723216.842266127L11042.57833619.412366926L11952.78953122.202467725L13203.071085125.272568524L13443.131219528.402669224L14683.421366331.822770023S15283.561519135.382870723S15253.551671638.932971423S16483.841836442.773072222S17424.062010646.833172922S17003.962180650.783273722S18714.362367755.143374423S18094.212548659.363475223S18524.312733863.673576124S19354.512927368.183676925A18864.393115972.573777926A17614.103292076.673878928A17454.063466580.733980131A17474.073641284.804081435A16843.923809688.724183040A14893.473958592.194285046A12822.994086795.184387951A10292.404189697.574493148A7011.634259799.214599048A3410.7942938100.00Table B.10 Raw-to-Scaled Score Table and Frequency Distribution: Grade 6 MathematicsRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040067U10.0010.00140067U30.0140.01240067U00.0040.01340067U20.0060.01440067U20.0080.02540067U30.01110.03640067U170.04280.07740067U370.09650.15840067U580.141230.29940067U900.212130.501047774U1510.353640.851152176U2380.566021.411254974U3200.759222.161357068U4230.9913453.161458660U5251.2318704.391559952U5541.3024245.691661145U6411.5030657.191762139U8021.8838679.081863134U8792.06474611.141963930U9172.15566313.292064728U9672.27663015.562165526U10972.57772718.142266224L12122.84893920.982366923L11672.741010623.722467622L12162.851132226.582568221L13293.121265129.702668921L14153.321406633.022769520L14303.361549636.372870120S15293.591702539.962970720S15653.671859043.643071420S16253.812021547.453172020S15783.702179351.153272620S16073.772340054.933373320S17684.152516859.083474020S16673.912683562.993574721S17804.182861567.173675522A18114.253042671.423776323A18204.273224675.693877225A17514.113399779.803978127A17414.093573883.894079331A16623.903740087.794180637A16013.763900191.554282345A13893.264039094.814384756A11272.654151797.454488762A7651.804228299.254599062A3200.7542602100.00Table B.11 Raw-to-Scaled Score Table and Frequency Distribution: Grade 7 MathematicsRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040061U20.0020.00140061U20.0040.01240061U20.0060.01340061U10.0070.02440061U10.0080.02540061U80.02160.04640061U120.03280.07740061U270.07550.13840061U600.151150.28940761U880.212030.491047568U1290.313320.811151170U1890.465211.271253668U2600.637811.901355564U3200.7811012.681457158U4030.9815043.671558652U4891.1919934.861659846U5671.3825606.241761041U7011.7132617.951862137U7391.8040009.751963234U8962.18489611.932064232U9882.41588414.342165131U11272.75701117.092266029U12373.01824820.102366928L14143.45966223.552467828L14313.491109327.042568727L16033.911269630.942669626L17894.361448535.302770426S18654.551635039.852871225S18424.491819244.342972125S19424.732013449.073072924S19914.852212553.923173724S19884.852411358.773274624S18864.602599963.363375424S19204.682791968.043476324S17704.312968972.363577225A17084.163139776.523678126A16293.973302680.493779127A15053.673453184.163880228A14113.443594287.603981330A13073.193724990.784082634A10742.623832393.404184238A9392.293926295.694286143A7281.773999097.464388847A5391.314052998.784493444A3630.884089299.664599044A1390.3441031100.00Table B.12 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 MathematicsRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040068U00.0000.00140068U20.0020.00240068U20.0040.01340068U20.0060.01440068U20.0080.02540068U60.01140.03640068U70.02210.05740068U250.06460.11840068U440.11900.22940068U760.191660.411044268U1420.353080.771149772U1850.464931.231253072U2210.557141.781355369U3170.7910312.561457163U3890.9714203.531558656U4461.1118664.641659950U5401.3424065.981761144U5871.4629937.441862239U6811.6936749.141963135U7341.83440810.962064032U8852.20529313.162164930U9472.35624015.522265728U10522.62729218.132366527L11062.75839820.882467326L12293.06962723.942568025L13443.341097127.282668824L13393.331231030.612769524L14103.511372034.122870224S15893.951530938.072971024S15923.961690142.033071723S16254.041852646.073172524S17424.332026850.403273324S17474.342201554.743374124S18184.522383359.273474924S19094.752574264.013575825S17314.302747368.323676726S17554.362922872.683777728A17804.433100877.113878830A16714.163267981.263980032A16194.033429885.294081436A14293.553572788.844183040A13543.373708192.214285146A11752.923825695.134387950A9122.273916897.404492748A6781.693984699.084599048A3680.9240214100.00Table B.13 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 ScienceRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040086U10.0010.00140086U50.0160.01240086U10.0070.02340086U00.0070.02440086U40.01110.03540086U70.02180.04640086U160.04340.08740086U210.05550.13840086U410.09960.22949886U690.161650.381054788U1140.262790.641157684U1360.314150.951259776U1970.456121.411361466U2580.598702.001462856U3250.7511952.751564048L3590.8315543.581665141L4711.0820254.661766136L5471.2625725.921867132L6601.5232327.441968029L7591.7539919.182068827L7991.84479011.022169626L9382.16572813.182270425S10562.43678415.612371124S10552.43783918.042471823S11952.75903420.792572623S12852.961031923.752673322S14153.261173427.002774022S14513.341318530.342874722S16033.691478834.032975421S17414.011652938.043076121S18544.271838342.303176821S20034.612038646.913277622S20054.612239151.523378322S20024.612439356.133479122S20604.742645360.873580023S21885.032864165.913680823S20964.823073770.733781725A21104.863284775.593882726A20934.823494080.403983828A19854.573692584.974085131A17954.133872089.104186535A15713.624029192.714288338A13603.134165195.844390841A10042.314265598.154495036A5751.324323099.484599036A2270.5243457100.00Table B.14 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 ScienceRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040093U30.0130.01140093U70.02100.02240093U60.01160.04340093U30.01190.05440093U60.01250.06540093U80.02330.08640093U160.04490.12740093U150.04640.16840093U420.101060.26942193U670.161730.421054499U1080.262810.681158598U1390.344201.021261090U2380.586581.601362879U3380.829962.421464267U4070.9914033.411565456L4911.1918944.611666546L6051.4724996.081767539L7111.7332107.811868534L9192.24412910.041969430L10142.47514312.512070228S10172.47616014.982171126S11982.91735817.902271925S12913.14864921.042372625S13243.22997324.262473424S15353.731150827.992574223S15703.821307831.812674923S16944.121477235.932775622S16704.061644240.002876422S18024.381824444.382977122S19144.662015849.043077821S18924.602205053.643178521S19904.842404058.483279221S19364.712597663.193380022S19914.842796768.033480722S18434.482981072.513581522S18694.553167977.063682423S17694.303344881.363783325A16454.003509385.373884226A14693.573656288.943985329A13253.223788792.164086632A10782.623896594.784188135A8712.123983696.904290038A6401.564047698.464392538A3700.904084699.364496930A2160.534106299.894599030A470.1141109100.00Table B.15 Raw-to-Scaled Score Table and Frequency Distribution: Grade 5 Social StudiesRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040070U10.0010.00140070U00.0010.00240070U00.0010.00340070U10.0020.00440070U10.0030.01540070U20.0050.01640070U20.0070.02740070U60.01130.03840070U100.02230.05940070U160.03390.081040070U310.07700.151140070U580.121280.271240070U1090.232370.511347577U1540.333910.841451780U2230.486141.321554479U3140.679281.991656575U4020.8613302.851758369U4941.0618243.911859762U5421.1623665.081961055U6221.3329886.412062149U7761.6737648.082163244U8501.8246149.902264139U9522.04556611.942365035L10382.23660414.172465933L11092.38771316.552566730L11402.45885319.002667429L11562.481000921.482768127L11792.531118824.012868826L12662.721245426.722969525L13022.791375629.523070124S13262.851508232.363170823S13542.911643635.273271422S14043.011784038.283372022S14313.071927141.353472521S14403.092071144.443573121S15103.242222147.683673720S14973.212371850.893774220S15353.292525354.193874820S15333.292678657.483975320S14723.162825860.634075919S14563.122971463.764176419S15143.253122867.014277019S14933.203272170.214377620S13862.973410773.184478220S13942.993550176.184578820A13762.953687779.134679420A12662.723814381.844780021A12542.693939784.544880721A12202.624061787.154981422A10042.154162189.31Table Continues5082223A10082.164262991.475183025A9702.084359993.555283926A8061.734440595.285384929A6561.414506196.695486032A5021.084556397.775587435A3970.854596098.625689039A3350.724629599.345791141A1620.354645799.685894239A980.214655599.895999027A420.094659799.986099027A70.0246604100.00Table B.16 Raw-to-Scaled Score Table and Frequency Distribution: Grade 7 GeographyRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative 
Percent040065U00.0000.00140065U00.0000.00240065U10.0010.00340065U30.0140.01440065U20.0060.01540065U60.01120.03640065U150.03270.06740065U380.09650.15840065U660.151310.29945775U1180.262490.561051180U1960.444450.991154680U2570.587021.571257377U3730.8410752.411359570L4631.0415383.441461363L5471.2320854.671562956L6411.4427266.111664350L7091.5934357.691765644L8031.8042389.491866740L8311.86506911.351967837L9552.14602413.492068935L10712.40709515.892169933L11432.56823818.452270932S13172.95955521.412371831S13112.941086624.342472830S15813.541244727.892573730S15983.581404531.472674629S17613.951580635.412775529S17974.031760339.442876429S20004.481960343.922977328S20324.552163548.473078229S20784.662371353.133179229S21764.882588958.003280129S22274.992811662.993381130S21394.793025567.783482231S22044.943245972.723583332S20604.623451977.343684433S20854.673660482.013785735A18074.053841186.063887137A17513.924016289.983988639A13793.094154193.074090441A11342.544267595.614192641A8741.964354997.574295337A5481.234409798.804399027A3490.784444699.584499012A1560.354460299.934599012A320.0744634100.00Table B.17 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 U.S. HistoryRaw ScoreScaled ScoreCSEMLevelFrequencyPercentCumulative FrequencyCumulative Percent040076U00.0000.00140076U10.0010.00240076U00.0010.00340076U30.0140.01440076U60.01100.02540076U140.03240.05640076U310.07550.13740076U440.10990.23840076U1090.252080.47946076U1980.454060.931051980U2950.677011.601155279U3690.8410702.441257573U5301.2116003.651359366U6811.5522815.201460857U7851.7930666.991562150U9412.1440079.131663343L10032.29501011.421764438L10882.48609813.901865434L11102.53720816.421966331L11962.73840419.152067229L12122.76961621.912168127L13082.981092424.892268826L13253.021224927.912369625L14533.311370231.222470424S14763.361517834.592571123S15103.441668838.032671823S15343.501822241.522772523S16203.691984245.212873222S16513.762149348.982973922S17113.902320452.873074722S17033.882490756.763175422S16993.872660660.633276123S17654.022837164.653376923S17013.883007268.523477724S17263.933179872.463578525S18024.113360076.563679426S17644.023536480.583780427S16623.793702684.373881530S15153.453854187.823982733A14133.223995491.044084137A13052.974125994.024185841A10612.424232096.434288046A7751.774309598.204391146A5011.144359699.344496736A2180.504381499.844599036A710.1643885100.00Table B.18 Composite Score Frequency Distribution: Grade 5 WritingComposite ScoreLevelFrequencyPercentCumulative FrequencyCumulative Percent15U6470.366470.3616U120.036590.3917U680.157270.5418U270.067540.6019U120.037660.6320U1110.258770.8821U260.069030.9322U740.169771.1023U810.1810581.2824U740.1611321.4425U1530.3412851.7926L800.1813651.9627L1730.3915382.3528L590.1315972.4829L2950.6618923.1430L2190.4921113.6331L25535.6946649.3132L7641.70542811.0233L1480.33557611.3534L15523.46712814.8035L9602.14808816.9436S17363.87982420.8137S11832.641100723.4538S19494.341295627.7939S9742.171393029.9640S10122.251494232.2241S20874.651702936.8742S12242.731825339.6043S29116.492116446.0844S16853.762284949.8445S29866.652583556.4946S15623.482739759.9747S3800.842777760.8248S618013.773395774.5949S14523.233540977.8250S22605.043766982.8651S9492.113861884.9752S15333.424015188.3953S3680.824051989.2154A3920.874091190.0855A8471.894175891.9756A7621.704252093.6757A6311.414315195.0758A2720.614342395.6859A3980.894382196.5760A15413.4345362100.00 Table B.19 Raw-to-Scaled Score Table and Frequency Distribution: Grade 8 WritingComposite ScoreLevelFrequencyPercentCumulative FrequencyCumulative 
Percent15U4210.004210.0016U1040.255250.2517U110.035360.2718U430.105790.3819U00.005790.3820U180.045970.4221U500.126470.5422U380.096850.6323U410.107260.7324U450.117710.8325L470.118180.9526L360.098541.0327L1140.279681.3028L920.2210601.5229L850.2011451.7230L1750.4213202.1431L1460.3514662.4932L17904.2632566.7533L180.0432746.7934L7681.8340428.6235L8011.90484310.5236S7401.76558312.2937S8271.97641014.2638S5511.31696115.5739S10762.56803718.1340S6051.44864219.5741S14093.351005122.9242S10772.561112825.4943S10772.561220528.0544S20554.891426032.9445S14463.441570636.3846S24165.751812242.1447S6391.521876143.6648S19004.522066148.1849S1263030.063329178.2350S12102.883450181.1151S16533.943615485.0552S6191.473677386.5253S4921.173726587.6954A4791.143774488.8355A5611.343830590.1756A5681.353887391.5257A9832.343985693.8658A3380.804019494.6659A3540.844054895.5160A18884.4942436100.00 ................