Biostatistics Module I: Instructor version



BIOSTAT Case Study 1: Exploratory Data Analysis Techniques

Time to Complete Exercise: 30 minutes

LEARNING OBJECTIVES

At the completion of this Case Study, participants should be able to:

➢ Access TB surveillance data from the CDC Web site

➢ Generate box-and-whiskers plots, stem and leaf diagrams, and histograms

➢ Generate percentile values and measures of central tendency and dispersion for skewed distributions

➢ Describe the magnitude of the TB incidence (new case) rates in the United States

➢ Describe the differences in TB incidence rates by sex/gender and state across the United States

ASPH BIOSTATISTICS COMPETENCIES ADDRESSED

A.5. Apply descriptive techniques commonly used to summarize public health data

A.8. Apply basic informatics techniques with vital statistics and public health records in the description of public health characteristics and in public health research and evaluation

ASPH INTERDISCIPLINARY/CROSS-CUTTING COMPETENCIES ADDRESSED

F.8. [Communication and Informatics] Use information technology to access, evaluate, and interpret public health data

Suggested Citation: New Jersey Medical School Global Tuberculosis Institute. /Incorporating Tuberculosis into Public Health Core Curriculum./ 2009: Biostatistics Case Study 1: Exploratory Data Analysis Techniques STUDENT Version 1.0.

Introduction

Control of tuberculosis (TB) in the United States is an important public health responsibility. Effective TB control requires a complex system that merges elements of laboratory science, investigative work, public health, surveillance, and clinical care.

The Tuberculosis Information Management System (TIMS) is one example of a public health surveillance system and one of the main sources of descriptive data regarding TB in the United States. TIMS includes information on all cases of TB that have been reported to the Division of TB Elimination (DTBE) at the Centers for Disease Control and Prevention (CDC). This information is reported to CDC by the 50 states, the District of Columbia, the city of New York, Puerto Rico, and other jurisdictions in the Pacific and Caribbean.

Data on person, place, and time relating to TB in the United States are gathered using TIMS. The CDC analyzes and publishes these data annually. They may be accessed through the CDC Web site in the form of TB Surveillance Reports and through the Online Tuberculosis Information System (OTIS). If you were to access OTIS and request TB case reports by sex and state for the period 2001-2005, you would obtain the data below. The data are the TB new case rates per 100,000 population for males and females (person) in the 50 states and the District of Columbia (DC) (place) during the years 2001 to 2005 (time).

TB Case Rates per 100,000 Population

Place Females Males

Alabama 3.4 7.2

Alaska 6.6 9.4

Arizona 3.5 6.6

Arkansas 3.4 6.5

California 7.1 10.6

Colorado 2.1 3.0

Connecticut 2.5 3.6

Delaware 2.7 4.6

DC 8.2 19.0

Florida 4.3 8.5

Georgia 4.5 7.8

Hawaii 7.9 12.6

Idaho 0.9 1.1

Illinois 4.1 6.0

Indiana 1.6 2.7

Iowa 1.2 1.8

Kansas 2.1 3.2

Kentucky 2.1 4.7

Louisiana 3.7 7.9

Maine 1.2 2.0

Maryland 4.5 6.0

Massachusetts 3.5 5.0

Michigan 2.4 3.2

Minnesota 3.9 4.7

Mississippi 2.9 6.1

Missouri 1.5 3.1

Montana 0.7 2.1

Nebraska 1.5 2.5

Nevada 3.6 5.1

New Hampshire 1.2 1.3

New Jersey 5.0 6.8

New Mexico 2.3 2.8

New York 5.6 9.5

North Carolina 3.2 5.9

North Dakota 0.8 1.0

Ohio 1.6 2.9

Oklahoma 3.5 6.4

Oregon 2.3 3.9

Pennsylvania 2.2 3.3

Rhode Island 4.0 5.5

South Carolina 4.2 8.2

South Dakota 1.7 2.1

Tennessee 3.4 6.9

Texas 4.9 9.5

Utah 1.2 1.7

Vermont 1.5 0.9

Virginia 3.9 4.9

Washington 3.3 5.0

West Virginia 1.0 2.0

Wisconsin 1.2 1.7

Wyoming 0.5 0.7

Exploratory data analysis techniques are often used to organize, summarize, and describe clinical and epidemiologic data. These techniques include stem-and-leaf plots and box plots. To make these plots easier to construct, the case rates are listed below in ascending order, separately for females and males.

Female TB Case Rates per 100,000 Population

1. Wyoming 0.5

2. Montana 0.7

3. North Dakota 0.8

4. Idaho 0.9

5. West Virginia 1.0

6. Iowa 1.2

7. Maine 1.2

8. New Hampshire 1.2

9. Utah 1.2

10. Wisconsin 1.2

11. Missouri 1.5

12. Nebraska 1.5

13. Vermont 1.5

14. Indiana 1.6

15. Ohio 1.6

16. South Dakota 1.7

17. Colorado 2.1

18. Kansas 2.1

19. Kentucky 2.1

20. Pennsylvania 2.2

21. New Mexico 2.3

22. Oregon 2.3

23. Michigan 2.4

24. Connecticut 2.5

25. Delaware 2.7

26. Mississippi 2.9

27. North Carolina 3.2

28. Washington 3.3

29. Alabama 3.4

30. Arkansas 3.4

31. Tennessee 3.4

32. Arizona 3.5

33. Massachusetts 3.5

34. Oklahoma 3.5

35. Nevada 3.6

36. Louisiana 3.7

37. Minnesota 3.9

38. Virginia 3.9

39. Rhode Island 4.0

40. Illinois 4.1

41. South Carolina 4.2

42. Florida 4.3

43. Georgia 4.5

44. Maryland 4.5

45. Texas 4.9

46. New Jersey 5.0

47. New York 5.6

48. Alaska 6.6

49. California 7.1

50. Hawaii 7.9

51. District of Columbia 8.2

Male TB Case Rates per 100,000 Population

1. Wyoming 0.7

2. Vermont 0.9

3. North Dakota 1.0

4. Idaho 1.1

5. New Hampshire 1.3

6. Utah 1.7

7. Wisconsin 1.7

8. Iowa 1.8

9. Maine 2.0

10. West Virginia 2.0

11. Montana 2.1

12. South Dakota 2.1

13. Nebraska 2.5

14. Indiana 2.7

15. New Mexico 2.8

16. Ohio 2.9

17. Colorado 3.0

18. Missouri 3.1

19. Kansas 3.2

20. Michigan 3.2

21. Pennsylvania 3.3

22. Connecticut 3.6

23. Oregon 3.9

24. Delaware 4.6

25. Kentucky 4.7

26. Minnesota 4.7

27. Virginia 4.9

28. Massachusetts 5.0

29. Washington 5.0

30. Nevada 5.1

31. Rhode Island 5.5

32. North Carolina 5.9

33. Illinois 6.0

34. Maryland 6.0

35. Mississippi 6.1

36. Oklahoma 6.4

37. Arkansas 6.5

38. Arizona 6.6

39. New Jersey 6.8

40. Tennessee 6.9

41. Alabama 7.2

42. Georgia 7.8

43. Louisiana 7.9

44. South Carolina 8.2

45. Florida 8.5

46. Alaska 9.4

47. New York 9.5

48. Texas 9.5

49. California 10.6

50. Hawaii 12.6

51. District of Columbia 19.0

Question 1

Generate separate stem-and-leaf diagrams of these case rates for males and females and describe the distribution of these data. (Hint: use the whole number as the stem and the first decimal digit as the leaf.)

Female TB Case Rates per 100,000 Male TB Case Rates per 100,000

19 19

18 18

17 17

16 16

15 15

14 14

13 13

12 12

11 11

10 10

9 9

8 8

7 7

6 6

5 5

4 4

3 3

2 2

1 1

0 0
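
For instructors who prefer to generate the diagrams by computer, the following is a minimal Python sketch, not part of the original case study. It assumes the rates are typed in from the sorted lists above. The names FEMALE, MALE, stem_and_leaf, and print_plot are illustrative only.

```python
from collections import defaultdict

# Rates per 100,000 population, copied from the sorted lists above.
FEMALE = [
    0.5, 0.7, 0.8, 0.9, 1.0, 1.2, 1.2, 1.2, 1.2, 1.2, 1.5, 1.5, 1.5,
    1.6, 1.6, 1.7, 2.1, 2.1, 2.1, 2.2, 2.3, 2.3, 2.4, 2.5, 2.7, 2.9,
    3.2, 3.3, 3.4, 3.4, 3.4, 3.5, 3.5, 3.5, 3.6, 3.7, 3.9, 3.9, 4.0,
    4.1, 4.2, 4.3, 4.5, 4.5, 4.9, 5.0, 5.6, 6.6, 7.1, 7.9, 8.2,
]
MALE = [
    0.7, 0.9, 1.0, 1.1, 1.3, 1.7, 1.7, 1.8, 2.0, 2.0, 2.1, 2.1, 2.5,
    2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.2, 3.3, 3.6, 3.9, 4.6, 4.7, 4.7,
    4.9, 5.0, 5.0, 5.1, 5.5, 5.9, 6.0, 6.0, 6.1, 6.4, 6.5, 6.6, 6.8,
    6.9, 7.2, 7.8, 7.9, 8.2, 8.5, 9.4, 9.5, 9.5, 10.6, 12.6, 19.0,
]


def stem_and_leaf(rates):
    """Map each whole-number stem to a string of tenths-digit leaves."""
    leaves = defaultdict(str)
    for rate in sorted(rates):
        tenths = round(rate * 10)        # 12.6 -> 126; rounding avoids float truncation
        stem, leaf = divmod(tenths, 10)  # 126 -> stem 12, leaf 6
        leaves[stem] += str(leaf)
    return leaves


def print_plot(title, rates, top_stem=19):
    """Print stems from top_stem down to 0, matching the template layout."""
    print(title)
    plot = stem_and_leaf(rates)
    for stem in range(top_stem, -1, -1):
        print(f"{stem:>2} | {plot.get(stem, '')}")


print_plot("Female TB Case Rates per 100,000", FEMALE)
print_plot("Male TB Case Rates per 100,000", MALE)
```

Printing every stem from 19 down to 0, including the empty ones, keeps the gaps in the upper tail visible, which is the point of the template above.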

Question 2

Describe the distributions. Are they approximately symmetric (normal), skewed to the right, or skewed to the left?
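
One informal numeric check on the direction of skew is to compare the mean with the median: a mean above the median suggests a longer right tail. The sketch below is an illustration only and is not part of the original case study. It uses Pearson's second skewness coefficient, 3 x (mean - median) / standard deviation, as one possible summary. The names FEMALE, MALE, and skew_summary are hypothetical.

```python
import statistics

# Rates per 100,000 population, copied from the sorted lists above.
FEMALE = [
    0.5, 0.7, 0.8, 0.9, 1.0, 1.2, 1.2, 1.2, 1.2, 1.2, 1.5, 1.5, 1.5,
    1.6, 1.6, 1.7, 2.1, 2.1, 2.1, 2.2, 2.3, 2.3, 2.4, 2.5, 2.7, 2.9,
    3.2, 3.3, 3.4, 3.4, 3.4, 3.5, 3.5, 3.5, 3.6, 3.7, 3.9, 3.9, 4.0,
    4.1, 4.2, 4.3, 4.5, 4.5, 4.9, 5.0, 5.6, 6.6, 7.1, 7.9, 8.2,
]
MALE = [
    0.7, 0.9, 1.0, 1.1, 1.3, 1.7, 1.7, 1.8, 2.0, 2.0, 2.1, 2.1, 2.5,
    2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.2, 3.3, 3.6, 3.9, 4.6, 4.7, 4.7,
    4.9, 5.0, 5.0, 5.1, 5.5, 5.9, 6.0, 6.0, 6.1, 6.4, 6.5, 6.6, 6.8,
    6.9, 7.2, 7.8, 7.9, 8.2, 8.5, 9.4, 9.5, 9.5, 10.6, 12.6, 19.0,
]


def skew_summary(rates):
    """Return (mean, median, Pearson's second skewness coefficient)."""
    mean = statistics.mean(rates)
    median = statistics.median(rates)
    sd = statistics.stdev(rates)          # sample standard deviation
    pearson = 3 * (mean - median) / sd    # > 0 suggests right skew, < 0 left skew
    return mean, median, pearson


for label, rates in (("Females", FEMALE), ("Males", MALE)):
    mean, median, pearson = skew_summary(rates)
    print(f"{label}: mean={mean:.2f} median={median:.1f} skewness={pearson:.2f}")
```

A numeric coefficient supplements the stem-and-leaf diagram but does not replace it. The plot also shows where the tail sits and which places form it.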

Question 3

What is the median TB case rate among females and among males? The 75th and 25th percentile values? The interquartile range (IQR)? The range?
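
These summary measures can also be computed in software. The sketch below is an illustration, not part of the original case study, and assumes Python 3.8 or later for statistics.quantiles. Quartile conventions differ between textbooks and packages. This sketch uses the function's default "exclusive" method, which with 51 observations picks the 13th, 26th, and 39th sorted values. Other conventions may give slightly different quartiles. The names FEMALE, MALE, and summarize are hypothetical.

```python
import statistics

# Rates per 100,000 population, copied from the sorted lists above.
FEMALE = [
    0.5, 0.7, 0.8, 0.9, 1.0, 1.2, 1.2, 1.2, 1.2, 1.2, 1.5, 1.5, 1.5,
    1.6, 1.6, 1.7, 2.1, 2.1, 2.1, 2.2, 2.3, 2.3, 2.4, 2.5, 2.7, 2.9,
    3.2, 3.3, 3.4, 3.4, 3.4, 3.5, 3.5, 3.5, 3.6, 3.7, 3.9, 3.9, 4.0,
    4.1, 4.2, 4.3, 4.5, 4.5, 4.9, 5.0, 5.6, 6.6, 7.1, 7.9, 8.2,
]
MALE = [
    0.7, 0.9, 1.0, 1.1, 1.3, 1.7, 1.7, 1.8, 2.0, 2.0, 2.1, 2.1, 2.5,
    2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.2, 3.3, 3.6, 3.9, 4.6, 4.7, 4.7,
    4.9, 5.0, 5.0, 5.1, 5.5, 5.9, 6.0, 6.0, 6.1, 6.4, 6.5, 6.6, 6.8,
    6.9, 7.2, 7.8, 7.9, 8.2, 8.5, 9.4, 9.5, 9.5, 10.6, 12.6, 19.0,
]


def summarize(rates):
    """Return the quartiles, IQR, and range of a list of rates."""
    # n=4 yields the three cut points: 25th, 50th, and 75th percentiles.
    q1, median, q3 = statistics.quantiles(rates, n=4)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,
        "range": max(rates) - min(rates),
    }


for label, rates in (("Females", FEMALE), ("Males", MALE)):
    summary = summarize(rates)
    print(label, {name: round(value, 2) for name, value in summary.items()})
```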

Question 4

Draw/generate a histogram and a box-and-whiskers plot describing the rates for males and females. Which states/locations have unusually high or low (outlier) rates?
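
The sketch below shows one way to tabulate a histogram and flag candidate outliers. It is an illustration, not part of the original case study. It assumes the common box-plot rule that points more than 1.5 x IQR beyond the quartiles are outliers, a bin width of 2 cases per 100,000, and the default quartile method of statistics.quantiles (Python 3.8 or later). All three are choices made for this sketch. The names RATES, histogram, and outliers are hypothetical.

```python
import statistics
from collections import Counter

# place: (female rate, male rate) per 100,000, copied from the table above.
RATES = {
    "Alabama": (3.4, 7.2), "Alaska": (6.6, 9.4), "Arizona": (3.5, 6.6),
    "Arkansas": (3.4, 6.5), "California": (7.1, 10.6), "Colorado": (2.1, 3.0),
    "Connecticut": (2.5, 3.6), "Delaware": (2.7, 4.6), "DC": (8.2, 19.0),
    "Florida": (4.3, 8.5), "Georgia": (4.5, 7.8), "Hawaii": (7.9, 12.6),
    "Idaho": (0.9, 1.1), "Illinois": (4.1, 6.0), "Indiana": (1.6, 2.7),
    "Iowa": (1.2, 1.8), "Kansas": (2.1, 3.2), "Kentucky": (2.1, 4.7),
    "Louisiana": (3.7, 7.9), "Maine": (1.2, 2.0), "Maryland": (4.5, 6.0),
    "Massachusetts": (3.5, 5.0), "Michigan": (2.4, 3.2), "Minnesota": (3.9, 4.7),
    "Mississippi": (2.9, 6.1), "Missouri": (1.5, 3.1), "Montana": (0.7, 2.1),
    "Nebraska": (1.5, 2.5), "Nevada": (3.6, 5.1), "New Hampshire": (1.2, 1.3),
    "New Jersey": (5.0, 6.8), "New Mexico": (2.3, 2.8), "New York": (5.6, 9.5),
    "North Carolina": (3.2, 5.9), "North Dakota": (0.8, 1.0), "Ohio": (1.6, 2.9),
    "Oklahoma": (3.5, 6.4), "Oregon": (2.3, 3.9), "Pennsylvania": (2.2, 3.3),
    "Rhode Island": (4.0, 5.5), "South Carolina": (4.2, 8.2), "South Dakota": (1.7, 2.1),
    "Tennessee": (3.4, 6.9), "Texas": (4.9, 9.5), "Utah": (1.2, 1.7),
    "Vermont": (1.5, 0.9), "Virginia": (3.9, 4.9), "Washington": (3.3, 5.0),
    "West Virginia": (1.0, 2.0), "Wisconsin": (1.2, 1.7), "Wyoming": (0.5, 0.7),
}
FEMALE = {place: female for place, (female, male) in RATES.items()}
MALE = {place: male for place, (female, male) in RATES.items()}


def histogram(rates, width=2):
    """Count rates in bins [0, width), [width, 2*width), ... keyed by bin start."""
    counts = Counter(int(rate // width) for rate in rates)
    return {b * width: counts[b] for b in range(max(counts) + 1)}


def outliers(by_place):
    """Return the places whose rate lies beyond the 1.5 x IQR fences."""
    q1, _, q3 = statistics.quantiles(list(by_place.values()), n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {place: rate for place, rate in by_place.items()
            if rate < low or rate > high}


for label, by_place in (("Females", FEMALE), ("Males", MALE)):
    print(label)
    for start, count in histogram(by_place.values()).items():
        print(f"  {start:>2}-{start + 2:<2} {'*' * count}")  # text histogram bar
    print("  Candidate outliers:", outliers(by_place))
```

Because the fences depend on the quartile convention and the 1.5 multiplier, places near a fence deserve a second look rather than an automatic label.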

Question 5

Describe the differences in the TB case rates for males and females.
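
Because each place contributes both a female and a male rate, one way to describe the differences is to compare the rates place by place. The sketch below is an illustration and not part of the original case study. It computes the male-minus-female difference and the male-to-female rate ratio for every place. The names RATES, DIFFS, RATIOS, female_higher, and largest_gap are hypothetical.

```python
import statistics

# place: (female rate, male rate) per 100,000, copied from the table above.
RATES = {
    "Alabama": (3.4, 7.2), "Alaska": (6.6, 9.4), "Arizona": (3.5, 6.6),
    "Arkansas": (3.4, 6.5), "California": (7.1, 10.6), "Colorado": (2.1, 3.0),
    "Connecticut": (2.5, 3.6), "Delaware": (2.7, 4.6), "DC": (8.2, 19.0),
    "Florida": (4.3, 8.5), "Georgia": (4.5, 7.8), "Hawaii": (7.9, 12.6),
    "Idaho": (0.9, 1.1), "Illinois": (4.1, 6.0), "Indiana": (1.6, 2.7),
    "Iowa": (1.2, 1.8), "Kansas": (2.1, 3.2), "Kentucky": (2.1, 4.7),
    "Louisiana": (3.7, 7.9), "Maine": (1.2, 2.0), "Maryland": (4.5, 6.0),
    "Massachusetts": (3.5, 5.0), "Michigan": (2.4, 3.2), "Minnesota": (3.9, 4.7),
    "Mississippi": (2.9, 6.1), "Missouri": (1.5, 3.1), "Montana": (0.7, 2.1),
    "Nebraska": (1.5, 2.5), "Nevada": (3.6, 5.1), "New Hampshire": (1.2, 1.3),
    "New Jersey": (5.0, 6.8), "New Mexico": (2.3, 2.8), "New York": (5.6, 9.5),
    "North Carolina": (3.2, 5.9), "North Dakota": (0.8, 1.0), "Ohio": (1.6, 2.9),
    "Oklahoma": (3.5, 6.4), "Oregon": (2.3, 3.9), "Pennsylvania": (2.2, 3.3),
    "Rhode Island": (4.0, 5.5), "South Carolina": (4.2, 8.2), "South Dakota": (1.7, 2.1),
    "Tennessee": (3.4, 6.9), "Texas": (4.9, 9.5), "Utah": (1.2, 1.7),
    "Vermont": (1.5, 0.9), "Virginia": (3.9, 4.9), "Washington": (3.3, 5.0),
    "West Virginia": (1.0, 2.0), "Wisconsin": (1.2, 1.7), "Wyoming": (0.5, 0.7),
}

# Paired difference for each place: male rate minus female rate,
# rounded to one decimal to match the precision of the source data.
DIFFS = {place: round(male - female, 1) for place, (female, male) in RATES.items()}
# Male-to-female rate ratio for each place.
RATIOS = {place: male / female for place, (female, male) in RATES.items()}

# Places where the female rate exceeds the male rate.
female_higher = [place for place, diff in DIFFS.items() if diff < 0]
# Place with the largest male-minus-female gap.
largest_gap = max(DIFFS, key=DIFFS.get)

print("Median male-minus-female difference:", statistics.median(DIFFS.values()))
print(f"Median male-to-female ratio: {statistics.median(RATIOS.values()):.2f}")
print("Female rate higher than male rate in:", female_higher)
print("Largest gap:", largest_gap, DIFFS[largest_gap])
```

Differences describe the absolute gap and ratios describe the relative gap. Reporting both guards against a small relative gap hiding a large absolute one, or the reverse.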

-----------------------

This material was developed by the staff at the Global Tuberculosis Institute (GTBI), one of four Regional Training and Medical Consultation Centers funded by the Centers for Disease Control and Prevention. It is published for learning purposes only.

Case study author(s) name and position:

Marian R. Passannante, PhD

Associate Professor, University of Medicine & Dentistry of New Jersey, New Jersey Medical School and School of Public Health

Epidemiologist, NJMS, GTBI

For further information please contact:

New Jersey Medical School Global Tuberculosis Institute (GTBI)

225 Warren Street P.O. Box 1709

Newark, NJ 07101-1709

or by phone at 973-972-0979
