Fundamental Principles

of Item Writing

A Guide to Writing

Multiple-Choice Test Items

Copyright © 2005 by ACT, Inc. All rights reserved.

TABLE OF CONTENTS

Introduction

Overview of the Guide

Confidentiality

Section 1: Basic Concepts of Testing

Purpose of a Test

Characteristics of Measurement: Reliability and Validity

Section 2: Multiple-Choice Tests

The Rationale for a Multiple-Choice Test

An Examination's Content Outline

The Four-Option, Multiple-Choice Item

Section 3: The Item-Writing Process

Overview of Writing Multiple-Choice Test Items

Step 1. Select the Item Topic

Step 2. Determine the Cognitive Skill Level

Step 3. Construct the Item Stem

Step 4. Develop the Item Options

Step 5. Classify the Item

Section 4: Guidelines for Development of Sound Test Items

Focusing the Topic

Considering the Cognitive Task

Developing the Item Stem

Constructing Item Options

Classifying the Item

Submitting Visuals

Section 5: Avoiding Bias and Stereotyping

Definition of Bias and Language Sensitivity

Approaches for Eliminating Bias and Insensitive Language

Language Sensitivity

Avoiding Stereotypes

Culture-Bound Assumptions

Racial Bias

Gender and Age Bias

Bias Against People with Disabilities

Appendix A: Item-Review Checklist

Appendix B: Checklist to Avoid Bias and Stereotyping

INTRODUCTION

Overview of Fundamental Principles of Item Writing

This Guide outlines some of the fundamentals of test-item development. Its purpose is to provide information that will help item writers and reviewers develop technically sound test items for examinations. Section 1 defines some basic concepts of testing. In this section, item writers are familiarized with the common test terms of reliability and validity.

Section 2 describes some important features of multiple-choice test items. This section discusses the advantages of a multiple-choice test and defines the use of a test blueprint. Examples of acceptable styles of four-option, multiple-choice items also are provided.

Section 3 describes the item-writing process. In this section, item writers learn five basic steps to develop items:

1. select the topic,

2. determine the cognitive skill level of an item,

3. construct the item stem,

4. develop the item options, and

5. classify the item.

Section 4 provides item writers with specific guidelines for developing sound test items, along with examples of acceptable and unacceptable items. Section 5 describes how to avoid bias and stereotyping in test items, including insensitive language and culture-bound assumptions.

Examinations are designed to measure the knowledge, skills, and abilities essential to competent practice. Good items are the foundation of a good examination. Careful construction of items is necessary if competence in specific content areas is to be tested. Individuals with content expertise who write items that meet the standards put forth in this Guide can contribute to the development of high-quality professional certification examinations. Item writers can have the greatest success if they:

• select important topics for items that relate directly to the test blueprint;

• try to measure the examinees’ ability to apply and understand information, rather than the ability merely to recall facts;

• ensure that a best answer exists for each item; and

• formulate a series of plausible, but incorrect, alternative responses for each item.

Confidentiality

It is essential that all items and item-development materials remain secure and confidential. Item writers should not make copies of items for personal files, nor retain the items on a personal computer. Any item-writing training materials or draft items should be shredded. Completed items must be submitted via secure courier.

SECTION 1

BASIC CONCEPTS OF TESTING

Purpose of a Test

The purpose of a test is to determine whether an examinee exhibits mastery of a specified body of knowledge. The purpose of an item is to determine whether an examinee demonstrates mastery of a specific fact or skill. Tests are designed to discriminate between examinees who possess, and examinees who do not possess, the desired knowledge or skills.

Characteristics of Measurement: Reliability and Validity

Two important measures of test performance are reliability and validity.

A reliable test yields the same results consistently for examinees who have the same knowledge and skills required on the exam. Test scores are said to be reliable when they are free from measurement error; that is, unaffected by inconsistencies irrelevant to the purpose of the test.

The reliability of scores is directly affected by the quality of the item writing. Flawed items, such as those with ambiguities or inaccuracies, increase measurement error and thus diminish reliability. The information in this Guide is intended to help item writers and reviewers develop items that will yield highly reliable test scores.

A valid test measures the specific knowledge and skills that were intended to be measured. The validity of a test is also affected by item quality. Test items must be representative of skill and knowledge important to the specified subject area; therefore, the processes and procedures of test development, especially item writing and reviewing, are critical in determining the appropriateness of exam content.

SECTION 2

MULTIPLE-CHOICE TESTS

The Rationale for a Multiple-Choice Test

The multiple-choice format has gained enormous influence and acceptance in standardized tests because of several factors:

1. An examinee can read and respond to a greater amount of material in a multiple-choice examination than, for example, in an essay examination. Thus, the examinee can exhibit a greater number of behaviors across more topics.

2. The use of a multiple-choice format increases the objectivity of the examination. This greater objectivity leads to greater reliability (i.e., consistency) of test scores.

3. The multiple-choice format permits the use of high-speed scanners for cost-effective rapid scoring and interpretation of enormous numbers of examinee responses.

An Examination's Content Outline

The design of a well-constructed multiple-choice test is based on a content outline, also known as a test blueprint. This blueprint outlines the content domains of a particular discipline so that all areas of pertinent skills and knowledge can be tested. The test blueprint also indicates the relative emphasis, in number of items, that should be given to each content area. Constructing a test conforming to specified guidelines contributes to the validity of the interpretation of scores derived from it.
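As a rough illustration of how a blueprint's relative emphasis translates into item counts, the sketch below splits a fixed number of items across content areas in proportion to their weights. The domain names, weights, and the function name `allocate_items` are hypothetical and are not taken from any actual test blueprint; the largest-remainder rounding is one reasonable choice, not a method prescribed by this Guide.

```python
from math import floor


def allocate_items(weights: dict[str, float], total_items: int) -> dict[str, int]:
    """Split total_items across content areas in proportion to their weights.

    Uses largest-remainder rounding so the counts always sum to total_items.
    """
    weight_sum = sum(weights.values())
    exact = {area: total_items * w / weight_sum for area, w in weights.items()}
    counts = {area: floor(x) for area, x in exact.items()}
    leftover = total_items - sum(counts.values())
    # Give the remaining items to the areas with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda a: exact[a] - counts[a], reverse=True)
    for area in by_remainder[:leftover]:
        counts[area] += 1
    return counts


# Hypothetical blueprint: names and percentage weights are invented for illustration.
blueprint = {"Domain A": 40, "Domain B": 35, "Domain C": 25}
print(allocate_items(blueprint, 60))
# -> {'Domain A': 24, 'Domain B': 21, 'Domain C': 15}
```

Item-writing assignments could then be sized from these counts, with extra items requested per area to allow for items rejected in review.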

The content outline serves as the basis for item-writing assignments. Item writers must ensure that the topic chosen for each item they write clearly relates to a content classification detailed in the test blueprint.

The Four-Option, Multiple-Choice Item

The most widely recommended multiple-choice test item has four options, and features:

• a stem, which is a fact pattern presented as a question (closed) or as an unfinished statement that will be completed by the appropriate response (open-ended);

• the keyed option, which is the correct and best answer of those listed; and

• the plausible, yet incorrect, options.

|Open-Ended Stem |Closed Stem |
|Cincinnati is home to the National Football League (NFL) team called the: |What National Football League (NFL) team is headquartered in Cincinnati? |
|A. Bearcats. |A. Bearcats |
|B. Broncos. |B. Broncos |
|C. Reds. |C. Reds |
|> D. Bengals. |> D. Bengals |
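For writers who keep draft items in electronic form, the three parts of a four-option item (stem, keyed option, and incorrect options) can be pictured as a small record. The following is only a sketch under that assumption; the class name `MultipleChoiceItem` and its fields are hypothetical and do not describe any actual item-banking system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultipleChoiceItem:
    stem: str                # question (closed) or unfinished statement (open-ended)
    options: dict[str, str]  # option letter -> option text
    key: str                 # letter of the keyed (correct and best) option

    def __post_init__(self) -> None:
        if sorted(self.options) != ["A", "B", "C", "D"]:
            raise ValueError("an item needs exactly four options labeled A-D")
        if self.key not in self.options:
            raise ValueError("the key must be one of the option letters")

    @property
    def distractors(self) -> dict[str, str]:
        """The plausible, yet incorrect, options."""
        return {k: v for k, v in self.options.items() if k != self.key}


# The closed-stem sample item shown above.
item = MultipleChoiceItem(
    stem="What National Football League (NFL) team is headquartered in Cincinnati?",
    options={"A": "Bearcats", "B": "Broncos", "C": "Reds", "D": "Bengals"},
    key="D",
)
print(item.distractors)
# -> {'A': 'Bearcats', 'B': 'Broncos', 'C': 'Reds'}
```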

SECTION 3

THE ITEM-WRITING PROCESS

Overview of Writing Multiple-Choice Test Items

This section provides guidelines for constructing acceptable multiple-choice test items. Writers should review the examination's content outline, in conjunction with their assigned content areas, before beginning the item-writing process. Before submitting completed items, writers should refer to Appendix A, a handy Item-Review Checklist.

Step 1. Select the Item Topic

By studying the assigned content domains of the test blueprint, writers can get ideas for item topics. A writer must next consider whether a chosen topic is important in professional practice and relevant to the purpose of the test. Two dimensions to measure importance are frequency and criticality. Writers would do well to ask themselves, "How often is this knowledge used in practice?" and "What would be the consequences if an examinee lacked this knowledge?" Trivia that will stump even the most knowledgeable examinee is not the basis for a good item. An item should not ask the examinee to decide among several opinions, all of which might have equal support, nor should it ask examinees to express their own value judgments.

Topics must reflect important concerns that have a single answer with which other experts agree. Writers should use several well-recognized, professionally approved, content-specific textbooks or professional publications when searching for ideas. For legal purposes, item writers should be prepared to provide written, approved reference citations from verifiable sources to document item-content accuracy. Use of copyrighted, preprinted text or graphics is prohibited without specific written permission from the creator/publisher; permission for use is the responsibility of the item writer.

Step 2. Determine the Cognitive Skill Level

Some testing programs require writers, as part of the final classification step, to indicate the level of cognitive processing inherent in a test question. Even in the absence of such a requirement, writers should construct items that require candidates to use different levels of cognitive processing, or thinking skills, to correctly answer the item.

Cognitive skills can be conceived of as a hierarchy, with relatively simple mental processes, such as remembering, at the lowest level. Complex mental processing, such as judgment and decision making, builds on lower-level skills: one must first remember a concept before being able to apply knowledge of the concept in a unique situation. While some topics are so fundamental that examinees just need to recall basic facts, simple knowledge is rarely enough. A professional who wishes to demonstrate advanced competence must also have the necessary mental skills to put the particular knowledge to work in actual practice.

Understanding the continuum of cognitive skills can help a writer focus an item stem. When formulating the stem, a writer might consider whether the content being tested by the item will require the examinee to use mental skills at one of three basic cognitive levels:

• recall,

• interpretation, or

• problem solving.

Recall

Recall items ask the examinee to remember the facts, concepts, principles, and procedures relevant to their subject area. Such items should not, however, ask for a memorized textbook definition. Recall items often test whether an examinee can make fine distinctions between closely related concepts, recognize errors in the protocol of routine procedures, or identify abstract principles from concrete applications. An examinee might also have to reproduce, translate, or understand some concept or information within a recall item.

Interpretation

Interpretation items ask an examinee to make judgments about the meaning of a phenomenon or situation: for example, to evaluate an incomplete work order, or to determine the cost of different items based on information in an accompanying table or chart. Chances are that the examinee has never seen the precise information presented before, and thus cannot use recall skills to determine the correct response. Interpretation items implicitly ask the examinee to calculate, infer, demonstrate, analyze, contrast, or transform.

Problem Solving

Problem-solving items may require recall and interpretation skills, but then ask the examinee to go further to make a decision or determine appropriate action. An examinee must build on basic knowledge, consider similar situations or problems, and synthesize a response from the information available. Problem-solving items might require an examinee to prioritize, judge, predict, assess, estimate, or counsel. Scenarios that contain a problem situation lend themselves to higher-order thinking.

Step 3. Construct the Item Stem

After selecting the item topic and determining the cognitive task, item writers are ready to construct the item stem. The stem should present all the information needed to respond to the item. Examinees should not have to sift through the various options to determine the intent of the item, and they should not have to sift through extraneous information unrelated to the purpose of the item. Unnecessary reading reduces the time that the examinees have to respond to other items. Although the item may be complex or demand high-level cognitive processing, it should be clear, unambiguous, and limited in scope.

Step 4. Develop the Item Options

Once the stem has been written, the next step is to provide the most correct option, along with other plausible, yet incorrect, options. The options are designed to differentiate between less knowledgeable examinees and those who are more knowledgeable. Therefore, incorrect options should be believable and attractive to examinees who are trying to guess the correct answer. Statements that are true in their own right, but are only peripherally related to the question, make excellent incorrect options.

Initially an item writer may have in mind only a question and its answer, or a statement and its correct completion. Developing three incorrect, yet plausible, options or distractors is challenging and critical. Common misconceptions about the topic—or options that would be correct for another condition or in another situation—are quite good distractors. Incorrect options should conform to the correct option in content, structure, and length so that the correct option will not “stand out” in any way from the others. The correct option should be clearly defensible as best.

Step 5. Classify the Item

In the last step, item writers must classify the item completely by using the appropriate test blueprint and classification schema associated with the specific testing program. Some blueprints are limited to two or three levels of content classification—the primary domain of knowledge, the secondary topic of specific content, and a relevant subtopic. Other blueprints may have extended subtopics. Item writers should confirm that they've noted the complete classification scheme before submitting their items.

SECTION 4

GUIDELINES FOR DEVELOPMENT OF SOUND TEST ITEMS

This section provides item writers with specific guidelines for developing technically sound test items and with examples of acceptable and unacceptable items.

Focusing the Topic

Writing a good multiple-choice item is a creative process that requires a great deal of ingenuity and a high level of expertise in the content area to be tested. Items should conform to the test blueprint and should test knowledge critical to practice at the specified experience level. In some programs, items should test examinees' entry-level knowledge; in others, items would appropriately test knowledge after a practitioner, specialist, or technician has been in the field for 2 years or more.

Each item should ask a single focused question. The question may be complex, but should focus on one idea. Consider the following items:

|Unfocused Stem |Focused Stem |
|Seattle is: |Pike Place Market in Seattle is a popular tourist attraction primarily because it: |
|A. composed of seven neighborhoods established on seven hills. |A. is the historic gateway to the Klondike gold rush. |
|B. home to the oldest continuously operating farmers’ market. |B. is the site of the Space Needle tower and revolving restaurant. |
|C. the birthplace of grunge music. |> C. features a fishmonger who tosses "flying fish" to customers. |
|D. nicknamed Java City because its primary industry is coffee brewing. |D. houses the original Starbucks coffeehouse. |

The item on the left addresses the topography, tourism, and commerce of Seattle. Each of these aspects could be the basis for a different item with a single focus. The item on the right uses option B from the unfocused item as the basis for a new stem that features parallel, homogeneous options.

Considering the Cognitive Task

The following examples illustrate how one topic can serve as the basis for items at the cognitive levels of recall, interpretation, and problem solving.

Recall

Houston is named after the general who won the battle for Texas independence at:

A. Gilead.

> B. San Jacinto.

C. the Alamo.

D. the Rio Grande.

Interpretation

A tourist wants to view artifacts related to the historic battle during which Mexican General Santa Anna was captured. The person would most likely visit the area surrounding which Texas city?

A. Ft. Worth

> B. Houston

C. El Paso

D. San Antonio

Problem Solving

A tourist lodging at Austin, Texas, is interested in visiting the historic San Jacinto battleground. The tourist asks a hotel staff member for driving directions based on the map shown above. Which of the following travel directions would be most appropriate?

A. Travel north on Highway 35, then proceed west on Highway 20

> B. Travel south on Highway 35, then proceed east on Highway 10

C. Travel south on Highway 35, then proceed north and west on Highway 10

D. Travel south on Highway 35 to its culmination

Developing the Item Stem

The item stem, which can be written as a complete question or an incomplete statement as illustrated in Section 2, should provide all information necessary to select the correct or best response. In most cases, knowledgeable examinees should be able to formulate an answer without looking at the options. A good stem directs the examinee’s thinking, whereas a poor stem has no focus, and the options are often heterogeneous, testing a wide variety of topics. Consider the following items:

|Draft Stem (Open) |Improved Stem (Closed) |
|Las Vegas is: |What is the English translation of the Spanish-named city Las Vegas? |
|A. the only Nevada city where prostitution is legal. |A. The city of angels |
|B. Spanish for "the meadows." |> B. The meadows |
|C. located over an earthquake fault line. |C. Green spaces |
|D. home to the largest bronze statue in the United States, the MGM Grand horse. |D. Sin in the sand |

A closed-item-stem format that asks a broad, general question about a topic is not necessarily any more focused than the first open-stem draft at the left. Would an examinee have any sense of cognitive direction after reading the following draft stem?

|Draft Stem (Closed) |Improved Stem (Open) |
|Which of the following statements about Las Vegas is true? |Mobster Benjamin "Bugsy" Siegel opened what famous Las Vegas hotel and casino? |
|A. Las Vegas was a Mormon settlement in the mid-1800s. |> A. The Flamingo |
|B. Its primary power source is Hoover Dam, where seven people were accidentally entombed during construction. |B. The Stardust |
|C. Howard Hughes was general contractor for many of the landmark hotel casinos built in the 1940s and 1950s. |C. The Desert Inn |
|D. Prostitution is legal in Las Vegas under Clark County laws. |D. The Sands |

The draft stem still fails to alert the examinee to the concept or knowledge being tested. What is actually being asked in this general format is, “What do you know about Las Vegas?” Heterogeneous options like those on the left can be used to create new, well-focused item stems to address the different issues, as long as the issues are important and are relevant to the test blueprint. Remember, the new item’s options should be homogeneous, reflecting similar grammatical structure, length, and content, like those shown on the right.

While a well-focused, concise stem should contain enough information to enable the examinee to respond correctly, it should not be cluttered with irrelevant information. Compare the following examples:

|Irrelevant Information |Improved Stem |
|William Penn, the founder of Pennsylvania, was a cousin to Captain William Markham, who settled Philadelphia, the City of Brotherly Love, in 1681. A likeness of Penn stands tall above which Philadelphia landmark? |A statue of William Penn tops which Philadelphia landmark? |
|A. The Philadelphia Museum of Art |A. The Philadelphia Museum of Art |
|B. Liberty Bell Center |B. Liberty Bell Center |
|> C. City Hall |> C. City Hall |
|D. Independence Hall |D. Independence Hall |

The essence of both questions is the same. The relationship of William Penn to the founder of Philadelphia, the city's establishment, and its slogan are irrelevant. Irrelevant information in the stem is referred to as “window dressing.”

Negatively Phrased Items. While research has been inconclusive, there is concern that negatively phrased items might confuse test takers, especially those whose first language is not English. Examples of negatively worded stems include:

All of the following cities share the same time zone EXCEPT:

Which of the following states does NOT observe daylight saving time?

An amateur geologist searching for semiprecious gemstones is LEAST likely to vacation in what state?

Examinees must change normal thought processes to respond to negative items and may overlook the negative aspect and answer the item incorrectly, even though they know the correct response. Whenever an examinee overlooks the negative word because of carelessness, the validity and reliability of the test are lowered. Thus, writers should attempt to frame items positively. Items that contain negatives in both the stem and options are not acceptable.
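A reviewer screening a large batch of draft stems could flag negatively phrased items automatically before reading them closely. The sketch below is a simple heuristic that assumes the negative words of interest are the three used in the sample stems above (EXCEPT, NOT, LEAST); the function name `find_negative_words` is hypothetical, and a match only signals that a stem deserves a second look (for instance, "at least" would also be flagged).

```python
import re

# Negative words taken from the sample stems above; extend the list as needed.
NEGATIVE_WORDS = re.compile(r"\b(except|not|least)\b", re.IGNORECASE)


def find_negative_words(stem: str) -> list[str]:
    """Return the negative words found in a stem, in order of appearance."""
    return NEGATIVE_WORDS.findall(stem)


print(find_negative_words("All of the following cities share the same time zone EXCEPT:"))
# -> ['EXCEPT']
print(find_negative_words("A statue of William Penn tops which Philadelphia landmark?"))
# -> []
```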

Constructing Item Options

Items should include four options that are homogeneous in content, grammatical structure, and length. Options should be mutually exclusive (i.e., one option should not be the equivalent of, nor contained within, any other option). Only one option should be keyed as correct; the three alternative options must be incorrect, yet plausible enough to appeal to examinees who are simply guessing. Incorrect options can often be constructed using common misconceptions associated with a topic, or can be answers that would be obtained if incorrect procedures or principles were used. A “best” answer is appropriate when there is no single, universally known correct answer and the keyed option represents the response that experts would agree is best among the choices listed.

Homogeneity of Options. All options should be similar in content, grammatical structure, and length. The following draft item violates this principle because the options deal with four vastly different issues:

|Unfocused Stem/Heterogeneous Options |Focused Stem/Parallel Options |
|Which of the following statements about Chicago is true? |What is the official nickname of Chicago? |
|A. It is named after a rock music group. |A. Home of the Blues |
|B. Carl Sandburg once worked there as a hog butcher. |B. Hog Butcher to the World |
|C. The International Pie-Eating Championship is held there annually in July. |> C. The Windy City |
|D. It is the home of the Cubs baseball team. |D. Cubby Town |

Mutual Exclusiveness of Options. Options should be mutually exclusive and distinctive. Options should not overlap, nor be the equivalent of, nor contained within, any other option. Can you identify the overlap in the following options?

|Overlapping Options |Mutually Exclusive Options |
|Pennsylvania is located in which region of the United States? |Pennsylvania is bordered on the north by which of the following states? |
|A. Mid-Atlantic |> A. New York |
|B. Northeastern |B. Virginia |
|C. New England |C. Ohio |
|D. Eastern seaboard |D. Connecticut |

In the item at the left, depending on which geographic map is consulted, options A, B, and D overlap, and B overlaps with C.

Disallowed Option Phrasings. “None of the above” and “All of the above” should not be used as options. “None of the above” opens up for consideration all possible answers. Unless one of the possible answers is unequivocally correct, examinees may challenge the item's correctness. The use of “all of the above” also is not recommended. An examinee need only know that two of the options are true to select “all of the above,” and need only know that one option is wrong to eliminate “all of the above” as the correct answer.

Options That Clue the Answer. Items should not contain clues that could unfairly help examinees who are test-wise but not well prepared. Can you guess the correct response of, or eliminate any options in, the following sample item?

Item with Clueing

Iowa, once known as "The Tall Corn State," ranks first in the nation in the export of feed crops and grain byproducts, primarily:

A. potato starch and wheat.

B. soybeans and corn fructose.

C. rice and sunflower seeds.

D. oat straw and chaff.

The correct response, B, is clued partially by the irrelevant slogan in the stem, which includes the word “corn.” Further, B is the only option that presents both a feed crop (soybeans) and a grain byproduct (fructose is the sugar extracted during corn milling).

Other clues that can help test-wise examinees select or eliminate an option include:

• words such as “always,” “never,” “all,” or “none,” or phrases that are so broad or all-inclusive that they are likely to be false;

• words that closely parallel the wording of the stem;

• an option that is longer or more carefully qualified than the others; and

• implausibility that belies common sense.
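Some of these option-level problems are mechanical enough to screen for before expert review. The sketch below flags disallowed phrasings, absolute words, and a keyed option that is noticeably longer than the incorrect options. The function name `flag_options` and the 1.5 length ratio are assumptions chosen for illustration, not thresholds recommended by this Guide, and a clean result does not replace a reviewer's judgment about plausibility or parallel wording.

```python
import re

DISALLOWED = ("none of the above", "all of the above")
ABSOLUTES = re.compile(r"\b(always|never|all|none)\b", re.IGNORECASE)


def flag_options(options: dict[str, str], key: str, length_ratio: float = 1.5) -> list[str]:
    """Return human-readable warnings about the options of one item."""
    flags = []
    for letter, text in options.items():
        if any(phrase in text.lower() for phrase in DISALLOWED):
            flags.append(f"{letter}: disallowed phrasing")
        elif ABSOLUTES.search(text):
            flags.append(f"{letter}: absolute wording")
    # Compare the keyed option's length with the average distractor length.
    distractor_lengths = [len(t) for letter, t in options.items() if letter != key]
    if distractor_lengths:
        average = sum(distractor_lengths) / len(distractor_lengths)
        if len(options[key]) > length_ratio * average:
            flags.append(f"{key}: keyed option is much longer than the distractors")
    return flags


# The Iowa sample item: its clueing comes from content, so no mechanical flag is raised.
iowa = {
    "A": "potato starch and wheat.",
    "B": "soybeans and corn fructose.",
    "C": "rice and sunflower seeds.",
    "D": "oat straw and chaff.",
}
print(flag_options(iowa, key="B"))
# -> []
```

The empty result for the Iowa item is a useful reminder that content-based clues, such as the word "corn" echoed from the stem, still have to be caught by a human reviewer.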

Avoid Abbreviations. Writers should use abbreviations or acronyms only when specifically directed to do so. Instead, writers should spell out a term and include its common abbreviation or acronym in parentheses after the term. This practice removes all doubt about the intended meaning of the information. The important underlying principle is the avoidance of ambiguity. Reducing ambiguity in items contributes to higher test reliability.

Classifying the Item

Items must be classified according to the content outline of the exam, or test blueprint, before they can be electronically stored for test-form selection. Test forms are compiled based on the content classifications and in proportions reflecting the relative emphasis on various knowledge, skills, and abilities. When classifying items, writers must be certain to provide the:

✓ extended alpha/numeric classification from the blueprint [i.e., domain, topic, subtopic(s)];

✓ letter of the correct, or keyed, response (i.e., A, B, C, or D);

✓ item author's printed name;

✓ cognitive level, if applicable;

✓ indication of accompanying visual or graphic material; and

✓ complete reference citation [author(s), text/journal title, edition/volume, publication year, publisher, page numbers] as required.
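The required entries in the checklist above can be verified mechanically before items are submitted. The following sketch assumes each item's classification is kept as a simple dictionary; the field names (`classification`, `key`, `author`, `reference`) and the function name `classification_problems` are hypothetical and do not correspond to any actual submission form. The cognitive level and visual indicator are left out of the required set because they apply only to some items.

```python
REQUIRED_FIELDS = ("classification", "key", "author", "reference")
VALID_KEYS = ("A", "B", "C", "D")


def classification_problems(record: dict[str, str]) -> list[str]:
    """List what is missing or invalid in one item's classification record."""
    problems = [
        f"missing {name}"
        for name in REQUIRED_FIELDS
        if not record.get(name, "").strip()
    ]
    key = record.get("key", "").strip()
    if key and key not in VALID_KEYS:
        problems.append("key must be A, B, C, or D")
    return problems


# Hypothetical record; every value is a placeholder, not real item data.
draft = {"classification": "2.1.3", "key": "E", "author": ""}
print(classification_problems(draft))
# -> ['missing author', 'missing reference', 'key must be A, B, C, or D']
```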

Submitting Visuals

Writers may choose to develop an item that contains a chart, illustration, table, or line drawing in an effort to tap higher cognitive skills. Graphic materials should be camera-ready quality, or clearly and proportionally drawn and labeled to facilitate production through the use of standard graphics software. Two copies of the graphic should be provided and clearly labeled with the:

• item writer's name;

• date;

• specific number of items the visual material is to accompany; and

• production orientation—the location of the top or bottom, and left or right.

SECTION 5

AVOIDING BIAS AND STEREOTYPING

Our pluralistic society is composed of individuals who possess an immeasurable variety of life experiences, any of which could influence an examinee's response on a cognitive test. Ideally, no test should require knowledge or skills irrelevant to the purpose of the test, including knowledge and experiences confined to a particular demographic group. When the language or content of an item is considered to favor or penalize a particular group, then that item is biased.

A summary list to aid writers in checking items for bias and stereotypical representations appears in Appendix B.

Definition of Bias and Language Sensitivity

Bias refers to a particular group's unfair advantage over another group in responding correctly to a test item. Biased language includes obscure terminology, slurs, slang, idioms, and outdated referents. Biased content is that which is prejudicial, unfair, or offensive, especially to groups of individuals who have experienced social discrimination. Examples include stereotypes, negative preconceptions, and unjust or distasteful portrayals. A related area of concern is language sensitivity. While biased content actually penalizes various subgroups, insensitive content and language can offend or demean subgroups, as well as promote pejorative or misleading images and stereotypes.

Approaches for Eliminating Bias and Insensitive Language

Increasingly, test authors have focused their attention on developing procedures to eliminate biased and insensitive content from test items. One method is to use neutral terms, such as gender-neutral occupational titles (sales agent rather than salesman), and to avoid unnecessary references to attributes such as race, ethnicity, gender, and age. For example, the item writer may, instead of describing a person as an "elderly Hispanic woman," simply use the term "client."

However, neutralizing language or omitting these references is not always possible, nor desirable. If the test blueprint allows for questions that involve the depiction of individuals from different groups, then item writers need to consider writing items that depict these groups in items across the entire test. Moreover, some test questions—particularly medical or health questions—are designed to require the examinees to sort through a number of factors (including race, gender, etc.) to determine which factors are pertinent. The test-development goal is to strive for a balanced representation in items of the various groups in diverse roles.

Language Sensitivity

Item writers should use standard, formal English in every test item. Idioms and colloquialisms should be avoided because not all examinees may be familiar with their meaning. Regional bias can result when a writer uses content or terminology that is not known universally or has different meanings in different regions of the nation.

Obscure language should also be avoided, including professional jargon, potentially ambiguous acronyms, unfamiliar abbreviations, and nonstandard units of measure. All language used should promote the primary purpose of the examination: to measure as accurately as possible the relevant knowledge and skills of the examinees.

Item writers also need to be wary of creating a condescending tone, especially when referring to people with differing abilities, diseases, and low socioeconomic status. For example, writers of health-related items should avoid describing patients as "complaining" about symptoms. Instead, patients might be described as "having symptoms" or "reporting symptoms." Items also should not label people as diseased: health writers can avoid allowing the disability or disease to become a person's one distinguishing characteristic by stating "a man with diabetes" instead of "a diabetic man."

Avoiding Stereotypes

Item writers also are responsible for the way in which individuals and groups are depicted in the test. This depiction has three facets. First, writers should avoid the portrayal of individuals through positive or negative stereotypes. Second, they should include affirmative depictions of individuals in nontraditional roles, such as men as competent caregivers to children, or elderly individuals as competent workers. Third, item writers should review their items collectively to ensure a neutral or balanced representation of references to various subgroups.

Item writers should be especially aware of inaccurate historical representations of different minority groups, many of whom have traditionally been portrayed in pejorative positions—passive, subordinate, poor, etc.—while the majority group has been shown holding positions of power and prestige. Some of the demographic characteristics affecting bias include sexual preference, geographic region (the South, urban and rural areas), ethnic/cultural background (Asian Americans, Polish Americans, etc.), religious background (Jewish, Muslim, Roman Catholic, etc.), occupation, and physical appearance (height and weight, manner of dress). Additionally, item writers may want to pay special attention to the areas of culture-bound assumptions, race, gender, age, disabilities, and income level.

Culture-Bound Assumptions

Culture-bound assumptions are of particular concern for item writers. Questions should not presume that examinees or people appearing in the items belong to the so-called "majority culture" (that is, that they are white, Christian, or from any one cultural background). Consider the following item:

Item with a Culture-Bound Assumption

Which of the following is a symptom of carbon monoxide poisoning?

> A. Warm, pink skin

B. Fever

C. Hives

D. Flatulence

Response A assumes that the patient has white or light skin. This problem can be resolved by replacing response A with another correct symptom that parallels options B–D (e.g., "headache" or "nausea").

Racial Bias

Item writers can usually eliminate racial bias and insensitivity with just a little care, including avoidance of common racial minority stereotyping (e.g., that members of minority races are poor, uneducated, delinquent, in subservient occupations, etc.). Some of the racial or cultural groups at particular risk for bias include African Americans, Asian Americans, Hispanic Americans, and Native Americans. Consider the following item stem:

Item with Racial Stereotyping

A 17-year-old African-American girl is referred to the school counseling service after completing a drug rehabilitation program. Her history includes an alcoholic mother, two children of her own, and consistent absences from school classes. Which of the following approaches should the counselor initially consider to establish a relationship?

This item stereotypes African Americans as drug users who come from broken homes and are irresponsible and sexually active. The client's race is not necessary to answer the question and should be omitted.

Gender and Age Bias

It is important to avoid stereotyping individuals according to previously traditional occupations, roles, or interests relative to gender and age. Neither sex should be portrayed as submissive or as having inferior status to the other. Elderly persons must not be demeaned by repeated portrayal of them as feeble, retired, lonely, or dependent. To help avoid gender and age bias, use a plural pronoun (they) rather than a singular, gender-specific pronoun (he, she). Alternatively, refer to the person by a gender- and age-neutral identifier: client rather than teenager, sales agent rather than saleswoman, firefighter instead of fireman.

Bias Against People with Disabilities

Item writers will also want to show special care in referring to people with disabilities and chronic diseases. The passage of the Americans with Disabilities Act marks a growing societal concern to guarantee civil rights and equal treatment for people with disabilities. Language should reflect this emerging attitude. For example, stereotypes that depict persons as deserving pity or as dependent on others should be avoided, along with condescending or demeaning language. Also, item writers should avoid portraying people with disabilities as amazingly brave or courageous. It is important to present this group of individuals in active, capable, and independent positions.

While all levels of socioeconomic status can be the subject of bias and stereotyping, economically disadvantaged people are perhaps most commonly depicted through the use of biased and insensitive language. Item writers should avoid assuming that the poor are also uneducated, delinquent, or members of a minority race, or that they come from particular living environments (rural or urban areas).

APPENDIX A: ITEM-REVIEW CHECKLIST

An examination item is well written and appropriate if the:

❑ item does not ask the examinee to make value judgments or arbitrary decisions;

❑ item asks a single question;

❑ item is clear, complete, and well-focused, and addresses a topic that experts in the field would agree is significant;

❑ item stem and options do not contain jargon, slang, or nonstandard abbreviations;

❑ item has four options;

❑ stem is direct, concise, and unambiguous;

❑ stem includes all necessary, but no extraneous, information;

❑ stem and options do not contain confusing double negatives or logical inconsistencies;

❑ options logically complete the stem;

❑ options are similar in focus, phrasing, and length;

❑ options do not overlap with each other;

❑ options do not clue unprepared, but test-wise, examinees;

❑ keyed option is clearly the best answer choice;

❑ incorrect options are plausible but clearly not the best answer;

❑ item is classified completely, in accordance with the appropriate test blueprint, and indicates the keyed response, author’s name, and a verifiable reference (if required).

APPENDIX B: CHECKLIST TO AVOID BIAS AND STEREOTYPING

A final review for bias and insensitivity is essential before submitting your items. Look over the items collectively and ask the following questions:

❑ Are unnecessary references to race, gender, age, class, and sexual preference removed?

❑ Are there any colloquialisms or unfamiliar terms that might confuse examinees, including those whose first language is not English?

❑ Does any item favor examinees from a specific state or region?

❑ Is any one group—in relationship to race, gender, age, socioeconomic status, etc.—represented in limited or stereotyped roles or positions?

❑ Do any items contain dominant-culture assumptions? Do items assume that Western, Judeo-Christian values and mores are somehow correct or ubiquitous?

❑ Is each subgroup that is represented shown in a variety of settings (for example, different occupations, living environments, and educational backgrounds)?

❑ Is there balanced representation of subgroups across the items?
