Types of Data Descriptive Statistics
Statistical Methods I

Tamekia L. Jones, Ph.D.
(tjones@cog.ufl.edu)
Research Assistant Professor
Children's Oncology Group Statistics & Data Center
Department of Biostatistics
Colleges of Medicine and Public Health & Health Professions

Outline of Topics
I. Descriptive Statistics
II. Hypothesis Testing
III. Parametric Statistical Tests
IV. Nonparametric Statistical Tests
V. Correlation and Regression

Types of Data
• Nominal Data
  - Gender: Male, Female
• Ordinal Data
  - Strongly disagree, Disagree, Slightly disagree, Neutral, Slightly agree, Agree, Strongly agree
• Interval Data
  - Numeric data: Birth weight

Descriptive Statistics
• Descriptive statistical measurements are used in medical literature to summarize data or describe the attributes of a set of data
• Nominal data: summarize using rates/proportions
  - e.g. % males, % females on a clinical study
  - Can also be used for Ordinal data
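
Summarizing nominal data as proportions can be sketched in a few lines of Python. The `genders` list is made-up illustrative data, not taken from any study, and the variable names are assumptions.

```python
from collections import Counter

# Hypothetical nominal data: gender of 8 study participants (illustrative only).
genders = ["Male", "Female", "Female", "Male", "Female", "Female", "Male", "Female"]

counts = Counter(genders)
n = len(genders)

# Percentage of observations falling in each category.
percentages = {category: 100 * count / n for category, count in counts.items()}

for category, pct in percentages.items():
    print(f"{category}: {pct:.1f}%")
```

The same counting approach works for ordinal categories such as the agreement scale, since only category membership is used.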

Descriptive Statistics (contd)
• Two parameters used most frequently in clinical medicine
  - Measures of Central Tendency
  - Measures of Dispersion

Measures of Central Tendency
• Summary statistics that describe the location of the center of a distribution of numerical or ordinal measurements, where
  - A distribution consists of values of a characteristic and the frequency of their occurrence
  - Example: Serum Cholesterol levels (mmol/L)
    6.8  5.1  6.1  4.4  5.0  7.1  5.5  3.8  4.4
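
As a rough illustration of "values of a characteristic and the frequency of their occurrence", the serum cholesterol example can be tabulated as follows. This is a minimal sketch; the variable names are assumptions.

```python
from collections import Counter

# Serum cholesterol levels (mmol/L) from the example.
cholesterol = [6.8, 5.1, 6.1, 4.4, 5.0, 7.1, 5.5, 3.8, 4.4]

# A distribution pairs each observed value with how often it occurs.
distribution = Counter(cholesterol)

for value in sorted(distribution):
    print(f"{value:.1f} mmol/L: {distribution[value]}")
```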

Measures of Central Tendency (contd)
Mean: used for numerical data and for symmetric distributions
Median: used for ordinal data or for numerical data where the distribution is skewed
Mode: used primarily for multimodal distributions

Measures of Central Tendency (contd)
Mean (Arithmetic Average)
• Sensitive to extreme observations
• Replace 5.5 with, say, 12.0
  The new mean = 54.7 / 9 = 6.08
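
The sensitivity of the mean to an extreme observation can be checked directly. The sketch below uses the nine serum cholesterol values and Python's standard `statistics` module; the variable names are assumptions.

```python
from statistics import mean

# Serum cholesterol levels (mmol/L) from the example.
cholesterol = [6.8, 5.1, 6.1, 4.4, 5.0, 7.1, 5.5, 3.8, 4.4]
original_mean = mean(cholesterol)

# Replace the observation 5.5 with an extreme value, 12.0.
with_outlier = [12.0 if x == 5.5 else x for x in cholesterol]
new_mean = mean(with_outlier)

print(f"original mean = {original_mean:.2f}")
print(f"new mean      = {new_mean:.2f}")
```

The original sum is 48.2, so the original mean is 48.2 / 9, roughly 5.36; a single extreme value moves it to about 6.08.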

Measures of Central Tendency (contd)
Median (Positional Average)
• Middle observation: half the values are less than and half the values are greater than this observation
• Order the observations from smallest to largest
  3.8  4.4  4.4  5.0  5.1  5.5  6.1  6.8  7.1
• Median = middle observation = 5.1
• Less sensitive to extreme observations
• Replace 5.5 with, say, 12.0
• New Median = 5.1

Measures of Central Tendency (contd)
Mode
• The observation that occurs most frequently in the data
• Example: 3.8  4.4  4.4  5.0  5.1  5.5  6.1  6.8  7.1
  Mode = 4.4
• Example: 3.8  4.4  4.4  5.0  5.1  5.5  6.1  6.1  7.1
  Mode = 4.4; 6.1
• Two modes: Bimodal distribution
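
The median and mode calculations can be sketched with the standard `statistics` module. The `bimodal` list assumes the second mode example is the nine-value data set with 6.8 replaced by 6.1; that reading, like the variable names, is an assumption.

```python
from statistics import median, multimode

# Serum cholesterol levels (mmol/L) from the example.
cholesterol = [6.8, 5.1, 6.1, 4.4, 5.0, 7.1, 5.5, 3.8, 4.4]

# Median: middle value of the sorted observations.
original_median = median(cholesterol)

# Replacing 5.5 with the extreme value 12.0 leaves the median unchanged.
with_outlier = [12.0 if x == 5.5 else x for x in cholesterol]
new_median = median(with_outlier)

# Mode: the most frequent value(s); multimode returns every tied value.
modes = multimode(cholesterol)

# Assumed second example: 6.8 replaced by 6.1, so two values occur twice.
bimodal = [3.8, 4.4, 4.4, 5.0, 5.1, 5.5, 6.1, 6.1, 7.1]
bimodal_modes = multimode(bimodal)

print(f"median = {original_median}, median with outlier = {new_median}")
print(f"mode(s) = {modes}, bimodal mode(s) = {bimodal_modes}")
```

`multimode` requires Python 3.8 or later and is used here instead of `mode` so that ties are reported rather than hidden.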

Measures of Central Tendency (contd)
Which measure do I use?
Depends on two factors:
1. Scale of measurement (ordinal or numerical) and
2. Shape of the Distribution of Observations

Measures of Central Tendency (contd)
Shape of the distribution
• Symmetric
• Skewed to the Left (Negative)
• Skewed to the Right (Positive)
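
The two-factor choice can be written down as a small decision rule. `suggest_measure` is a hypothetical helper, not part of any library, and it encodes only the rules of thumb stated on these slides (mean for symmetric numerical data, median for ordinal or skewed numerical data).

```python
def suggest_measure(scale: str, shape: str = "symmetric") -> str:
    """Suggest a measure of central tendency from scale and shape.

    Hypothetical helper: scale is "ordinal" or "numerical";
    shape is "symmetric", "left-skewed", or "right-skewed".
    """
    if scale == "ordinal":
        return "median"
    if scale == "numerical":
        # The mean suits symmetric data; the median is less affected by skew.
        return "mean" if shape == "symmetric" else "median"
    raise ValueError(f"unsupported scale: {scale!r}")


print(suggest_measure("numerical", "symmetric"))
print(suggest_measure("numerical", "right-skewed"))
print(suggest_measure("ordinal"))
```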

Measures of Dispersion
• Measures that describe the spread or variation in the observations
• Common measures of dispersion
  - Range
  - Standard Deviation
  - Coefficient of Variation
  - Percentiles
  - Inter-quartile Range

Measures of Dispersion (contd)
Range = difference between the largest and the smallest observation
• Used with numerical data to emphasize extreme values
• Serum cholesterol example
  Minimum = 3.8, Maximum = 7.1
  Range = 7.1 - 3.8 = 3.3
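
The range calculation for the serum cholesterol example, as a minimal sketch (variable names are assumptions):

```python
# Serum cholesterol levels (mmol/L) from the example.
cholesterol = [6.8, 5.1, 6.1, 4.4, 5.0, 7.1, 5.5, 3.8, 4.4]

minimum = min(cholesterol)
maximum = max(cholesterol)

# Range: largest observation minus smallest observation.
data_range = maximum - minimum

print(f"Range = {maximum} - {minimum} = {data_range:.1f}")
```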

Measures of Dispersion (contd)
Standard Deviation
  - Measure of the spread of the observations about the mean
  - Used as a measure of dispersion when the mean is used to measure central tendency for symmetric numerical data
  - Standard deviation, like the mean, requires numerical data
  - Essential part of many statistical tests
  - Variance = s^2

Measures of Dispersion (contd)
Standard Deviation
  6.8  5.1  6.1  4.4  5.0  7.1  5.5  3.8  4.4
  n = 9
  Mean = 5.35
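
The standard deviation of the nine cholesterol values can be worked through step by step. The sketch below assumes the sample formula with an n - 1 divisor, s = sqrt(sum((x - mean)^2) / (n - 1)); the slide's own formula did not survive extraction, but this choice reproduces the s = 1.126 quoted in the coefficient of variation example.

```python
from statistics import mean, stdev, variance

# Serum cholesterol levels (mmol/L) from the example.
cholesterol = [6.8, 5.1, 6.1, 4.4, 5.0, 7.1, 5.5, 3.8, 4.4]
n = len(cholesterol)
x_bar = mean(cholesterol)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
s_squared = sum((x - x_bar) ** 2 for x in cholesterol) / (n - 1)

# Standard deviation: square root of the variance.
s = s_squared ** 0.5

# Cross-check against the standard library's sample statistics.
assert abs(s_squared - variance(cholesterol)) < 1e-9
assert abs(s - stdev(cholesterol)) < 1e-9

print(f"n = {n}, s^2 = {s_squared:.3f}, s = {s:.3f}")
```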

Measures of Dispersion (contd)
Coefficient of Variation
• Measure of the relative spread in data
• Used to compare variability between two sets of numerical data measured on different scales
• Coefficient of Variation (C of V) = (s / mean) x 100%
• Example:
                                    Mean    Std Dev (s)    C of V
  Serum Cholesterol (mmol/L)        5.35    1.126
  Change in vessel diameter (mm)    0.12    0.29

Measures of Dispersion (contd)
The Normal (Gaussian) Distribution
If the observations have a Bell-Shaped Distribution, then the following is approximately true:
  - 68% of the observations lie between X̄ - 1s and X̄ + 1s
  - 95% of the observations lie between X̄ - 2s and X̄ + 2s
  - 99.7% of the observations lie between X̄ - 3s and X̄ + 3s
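
The percentages for a bell-shaped distribution can be recovered from the normal distribution itself. The sketch below uses the identity P(|Z| < k) = erf(k / sqrt(2)) for a standard normal variable Z; `fraction_within` is a hypothetical helper name.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k}s of the mean: {100 * fraction_within(k):.1f}%")
```

These are properties of the theoretical normal curve; an observed sample that is only roughly bell-shaped will match them approximately.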

Measures of Dispersion (contd)
Coefficient of Variation
• Example:
                                    Mean    Std Dev (s)    C of V
  Serum Cholesterol (mmol/L)        5.35    1.126          21%
  Change in vessel diameter (mm)    0.12    0.29           241.7%
• Relative variation in Change in Vessel Diameter is more than 10 times greater than that for Serum Cholesterol

Measures of Dispersion (contd)
Coefficient of Variation
e.g. DiMaio et al. evaluated the use of the test measuring maternal serum alpha-fetoprotein (for screening neural tube defects) in a prospective study of 34,000 women.
Reproducibility of the test procedure was determined by repeating the assay 10 times in each of four pools of serum. The mean and standard deviation (s) of the 10 assays were calculated in each of the 4 pools. Coefficients of variation were computed for each pool: 7.4%, 5.8%, 2.7%, and 2.4%. These values indicate relatively good reproducibility of the assay, because the variation, as measured by the standard deviation, is small relative to the mean. Hence readers of their article can be confident that the assay results were consistent.
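
The coefficient of variation for the serum cholesterol and vessel diameter example can be reproduced from the quoted means and standard deviations. `coefficient_of_variation` is a hypothetical helper that implements the slide's formula (s / mean) x 100%.

```python
def coefficient_of_variation(s: float, mean: float) -> float:
    """Return the coefficient of variation, (s / mean) x 100, as a percentage."""
    return s / mean * 100


# Means and standard deviations quoted in the example table.
cv_cholesterol = coefficient_of_variation(s=1.126, mean=5.35)
cv_vessel = coefficient_of_variation(s=0.29, mean=0.12)

# How many times larger the relative variation in vessel diameter is.
ratio = cv_vessel / cv_cholesterol

print(f"Serum cholesterol:         {cv_cholesterol:.0f}%")
print(f"Change in vessel diameter: {cv_vessel:.1f}%")
print(f"Ratio: {ratio:.1f}")
```

Because the result is a unitless percentage, the two measurements can be compared even though one is in mmol/L and the other in mm.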