Biometry - STAT 305



Biometry - STAT 305 ~ Assignment #4  (51 points)

1.  Albino Rats

Albino rats used to study the hormonal regulation of a metabolic pathway are injected with a drug that inhibits body synthesis of protein.  Usually 4 out of 20 rats die from the drug before the experiment is over.  If 10 animals are treated with the drug, find the following:

a) The probability that at least 8 will be alive at the end of the experiment. (2 pts.)

If we let X = # of rats that die, then P(8 or more are alive) = P(X ≤ 2) = .6778, found using the binomial probability function or the Binomial Table Generator from JMP.

b) The probability that more than 4 die. (2 pts.)

P(X > 4) = P(X ≥ 5) = .0328

c) The probability that at least one dies. (2 pts.)

P(at least one dies) = P(X ≥ 1) = .8926 (Note: this is also 1 – P(X = 0))

d) If we used 100 rats in a study, what are the mean and standard deviation of the number of rats that would die during the course of the experiment? Use the mean and standard deviation to give an interval that is very likely, say with a 95% chance, to cover the number of rats that would actually die. (3 pts.)

E(X) = mean of X = np = 100(.20) = 20 rats

SD(X) = standard deviation of X = $\sqrt{np(1-p)} = \sqrt{100(.20)(.80)} = \sqrt{16} = 4$ rats

When n is “large” the binomial probability distribution is well approximated by a normal distribution. To get an interval that has roughly a 95% chance of covering the number of rats that would die out of 100 in a given experiment we simply take the mean plus or minus two standard deviations which gives the interval 12 to 28 rats.
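The binomial probabilities in parts (a) through (d) can be checked without JMP. The following is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative and are not part of the assignment.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 4 / 20  # 10 treated rats; death probability 4/20 = .20

# (a) at least 8 alive is the same event as at most 2 deaths
p_at_least_8_alive = binom_cdf(2, n, p)
# (b) more than 4 die is the same event as 5 or more deaths
p_more_than_4_die = 1 - binom_cdf(4, n, p)
# (c) at least one dies is the complement of no deaths
p_at_least_1_dies = 1 - binom_pmf(0, n, p)

# (d) mean, SD, and a rough 95% interval for n = 100 rats
n_large = 100
mean = n_large * p
sd = sqrt(n_large * p * (1 - p))
interval = (mean - 2 * sd, mean + 2 * sd)

print(round(p_at_least_8_alive, 4),
      round(p_more_than_4_die, 4),
      round(p_at_least_1_dies, 4))
print(mean, sd, interval)
```

The interval in part (d) relies on the normal approximation described above, so it is a rule of thumb rather than an exact 95% probability statement.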

2. Darwin’s Study of Cross- and Self-Fertilization (5 pts.)

These data are from Charles Darwin’s study of cross- and self-fertilization. Pairs of seedlings of the same age, one produced by cross-fertilization and the other by self-fertilization, were grown together so that members of each pair were reared under nearly identical conditions. The aim was to demonstrate the greater vigor of the cross-fertilized plants. The data, given in pairs, are the heights of each plant (in inches) after a fixed period of time.

|Pair Number |Cross |Self |Difference |Sign of Difference |
|1 |23.5 |17.4 | | |
|2 |12.0 |20.4 | | |
|3 |21.0 |20.0 | | |
|4 |22.0 |20.0 | | |
|5 |19.1 |18.4 | | |
|6 |21.5 |18.6 | | |
|7 |22.1 |18.6 | | |
|8 |20.4 |15.3 | | |
|9 |18.3 |16.5 | | |
|10 |21.6 |18.0 | | |
|11 |23.3 |16.3 | | |
|12 |21.0 |18.0 | | |
|13 |22.1 |12.8 | | |
|14 |23.0 |15.5 | | |
|15 |12.0 |18.0 | | |

Is there evidence to suggest that the cross-fertilized plants have greater height?

To answer this question, consider the sign (+ or - ) of the difference in the heights of the plants in each pair by first calculating Difference = Cross – Self and then looking at the sign of the difference (+ or -).

a) If there is no difference between the cross- and self-fertilized plants in terms of height, what is P(Cross – Self > 0) = P(+)? Explain your answer.  (1 pt.)

b) Count how many positive differences there are and compute the probability of getting that many or more using the probability of a positive difference from part (a). (2 pts.)

In 13 out of the 15 plant pairs the cross-fertilized plant was taller. Getting this many or more positive differences has the same probability as obtaining 13 or more heads in 15 flips of a fair coin. P(X ≥ 13) = .0037 or a .37% chance.

c) Use the probability found in part (b) to answer the question of interest to the researchers.  Explain your reasoning. (3 pts.)

If there were no difference in the height of the cross-fertilized and self-fertilized plants, i.e. P(+) = .50, the probability that the cross-fertilized plant would be taller in 13 or more of the 15 pairs is only .0037 or a .37% chance. Because the observed results are quite unlikely to occur assuming there is no tendency for one plant type to be taller than the other, we conclude that there must be a tendency toward cross-fertilized plants being taller than the self-fertilized plants when grown under the same conditions.
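The sign-test calculation for Darwin's data can be reproduced directly from the table. This is a minimal sketch, assuming a positive sign is as likely as a negative one (P(+) = .50) when there is no difference; the variable names are illustrative.

```python
from math import comb

# Heights in inches, in pair order 1..15, copied from the table
cross_fert = [23.5, 12.0, 21.0, 22.0, 19.1, 21.5, 22.1, 20.4,
              18.3, 21.6, 23.3, 21.0, 22.1, 23.0, 12.0]
self_fert = [17.4, 20.4, 20.0, 20.0, 18.4, 18.6, 18.6, 15.3,
             16.5, 18.0, 16.3, 18.0, 12.8, 15.5, 18.0]

# Difference = Cross - Self for each pair
diffs = [c - s for c, s in zip(cross_fert, self_fert)]
n_pairs = len(diffs)
n_positive = sum(d > 0 for d in diffs)

# P(X >= n_positive) for X ~ Binomial(n_pairs, .5): each of the
# 2**n_pairs sign patterns is equally likely when P(+) = .5
p_value = sum(comb(n_pairs, k)
              for k in range(n_positive, n_pairs + 1)) / 2**n_pairs

print(n_positive, "positive differences out of", n_pairs)
print("P(X >= %d) = %.4f" % (n_positive, p_value))
```

Counting equally likely sign patterns avoids floating-point powers of .5 and gives the one-sided probability used in part (b).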

3. Radioactive Gas Emissions at Prairie Island Reactor

The Prairie Island nuclear power plant releases a detectable amount of radioactive gases twice a month on average.

a) Find the probability that there will be no emissions during a 3-month period. You will need to adjust μ so that it reflects the average number of emissions during a 3-month period. (2 pts.)

Looking at 3-month periods we expect there to be 6 detectable radioactive gas leaks; therefore we can use a Poisson with μ = 6 to model the probability associated with X = # of detectable radioactive gas leaks per 3-month period.

$P(X = 0) = \dfrac{e^{-6}\,6^{0}}{0!} = e^{-6} = .0025$

b) Find the probability there will be at most four such emissions during the 3-month period. (2 pts.)

Using the Poisson Table Generator, P(at most four) = P(X ≤ 4) = .2851

c) If during the 3-month period 12 or more emissions were detected, do you feel that there is reason to suspect the reported average figure of twice a month? Explain/justify your answer. (3 pts.)

P(X ≥ 12) = .0201, which tells us it is fairly unlikely to observe 12 or more detectable radioactive gas leaks over a 3-month period. So either we have observed an event that will happen about 2% of the time, or the assumption that the mean is 6 per 3-month period is wrong and the mean rate of occurrence is actually greater than 6, which would make the 12 gas leaks observed more plausible.
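The Poisson probabilities in parts (a) through (c) can be checked with a few lines of Python. This is a sketch under the stated model of 2 emissions per month on average, so μ = 6 per 3-month period; the helper names `poisson_pmf` and `poisson_cdf` are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu)."""
    return exp(-mu) * mu**k / factorial(k)

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(poisson_pmf(i, mu) for i in range(k + 1))

mu = 2 * 3  # 2 emissions per month on average, over 3 months

p_none = poisson_pmf(0, mu)             # (a) no emissions
p_at_most_4 = poisson_cdf(4, mu)        # (b) four or fewer emissions
p_12_or_more = 1 - poisson_cdf(11, mu)  # (c) complement of 11 or fewer

print(round(p_none, 4), round(p_at_most_4, 4), round(p_12_or_more, 4))
```

Part (c) uses the complement of the cumulative probability because the upper tail of a Poisson distribution has no finite last term to sum to.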

4. Diabetes Screening Using Fasting Glucose Levels

A standard test for diabetes is based on glucose levels in the blood after fasting for a prescribed period. For healthy people the mean fasting glucose level is found to be 5.31 mmol/liter with a standard deviation of 0.58 mmol/liter. For untreated diabetics the mean is 11.74, and the standard deviation is 3.50. In both groups the levels appear to be approximately Normally distributed.

To operate a simple diagnostic test based on fasting glucose levels we need to set a cutoff point, C, so that if a patient’s fasting glucose level is at least C we say they have diabetes. If it is lower, we say they do not have diabetes. Suppose we use C = 6.5.

a) What is the probability that a diabetic is correctly diagnosed as having diabetes, i.e. what is the sensitivity of the test? (2 pts.)

Let X = glucose level of a randomly selected diabetic. We know X ~ N(11.74, 3.50), so P(X > 6.5) = P(Z > -1.50) = .9332 or a 93.32% chance.

b) What is the probability that a nondiabetic is correctly diagnosed as not having diabetes, i.e. what is the specificity? (2 pts.)

Let X = glucose level of a randomly selected non-diabetic.

We know X ~ N(5.31, .58), so P(X < 6.5) = P(Z < 2.05) = .9798 or a 97.98% chance.

Suppose we lower the cutoff value to C = 5.7.

c) What is the sensitivity now? (2 pts.)

P(X > 5.7) = P(Z > -1.73) = .9582 or a 95.82% chance.

d) What is the specificity now? (2 pts.)

P(X < 5.7) = P(Z < .67) = .7486 or a 74.86% chance.

In deciding what C to use, we have to trade off sensitivity for specificity. To do so in a reasonable way, some assessment is required of the relative “costs” of misdiagnosing a diabetic and misdiagnosing a nondiabetic. Suppose we required a 98% sensitivity.

e) What value of C gives a sensitivity of .98 or 98%? How specific is the test when C has this value? (4 pts.)

We want to find C, so that P(X > C) = .9800 for diabetic.

The z-score for C is found by finding z so that P(Z > z) = .9800, or equivalently P(Z < z) = .0200, which gives z = -2.05. Thus C = 11.74 + 3.50(-2.05) = 4.56.

If we use C = 4.56 the specificity is found by using X ~ N(5.31, .58), which gives P(X < 4.56) = P(Z < -1.29) = .0985 or only a 9.85% chance. Thus we obviously cannot use C = 4.56 as a cutoff.
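The sensitivity and specificity calculations in parts (a) through (e) can be sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The function names `sensitivity` and `specificity` are illustrative. The code uses unrounded z-scores, so its results can differ from the table-based answers above in the third or fourth decimal place.

```python
from statistics import NormalDist

# Fasting glucose models from the problem statement (mean, SD)
diabetic = NormalDist(mu=11.74, sigma=3.50)
healthy = NormalDist(mu=5.31, sigma=0.58)

def sensitivity(cutoff):
    """P(diabetic's level is at least the cutoff)."""
    return 1 - diabetic.cdf(cutoff)

def specificity(cutoff):
    """P(healthy person's level is below the cutoff)."""
    return healthy.cdf(cutoff)

for c in (6.5, 5.7):
    print("C = %.1f: sensitivity %.4f, specificity %.4f"
          % (c, sensitivity(c), specificity(c)))

# (e) cutoff with 98% sensitivity: the 2nd percentile of diabetic levels
c_98 = diabetic.inv_cdf(0.02)
print("C for 98%% sensitivity: %.2f, specificity there: %.4f"
      % (c_98, specificity(c_98)))
```

Looping over several cutoffs makes the trade-off visible: lowering C raises sensitivity and lowers specificity.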

5. Mercury Levels Found in Minnesota Walleyes (1990-1998)

Data Files: Walleyes (1990-1998).JMP

Walleyes (1990-1998) Major Waterways.JMP

Keywords: Assessing Normality, Transformations, Normal Probability Distribution, Comparative Displays

Assuming that log mercury levels are normally distributed with the mean and standard deviation equal to the sample based estimates do the following.

c) If a walleye is picked at random from a Minnesota waterway what is the probability that the mercury level found in its tissues exceeds the EPA safe limit of 1 ppm? (2 pts.)

Note: This cutoff is in the original scale, not the log scale!

First you must realize that Hg levels are NOT normally distributed, so we cannot use a normal probability model to calculate this probability in the original scale. However, the log(Hg) levels (base-10 logs) are approximately normally distributed. The sample mean and standard deviation of the log(Hg) levels are -.584 log(ppm) and .354 log(ppm) respectively, so we have

X = log(Hg) level of a randomly selected MN walleye ~ N(-.584, .354)

Then P(Hg level > 1.00) = P(log(Hg) level > 0) = P(X > 0) = P(Z > 1.65) = .0495 or a 4.95% chance.

d) How does your answer compare to proportion of fish in the data that have levels exceeding 1 ppm? (1 pt.)

Of the n = 2824 fish that were sampled, 125 had Hg levels of 1 ppm or more, so the sample proportion is 125/2824 = .0443 or 4.43%, slightly below the 4.95% given by the normal model.

e) Estimate the 75th percentile of the mercury levels in the original scale assuming normality. How does this compare to the 75th percentile from the sample? (3 pts.)

Because log(Hg) is approximately normal, we can find the 75th percentile of the log(Hg) levels and then convert this value back to the original scale.

We want to find a log(Hg) level, x, so that P(X < x) = .7500. This is done by first finding z so that P(Z < z) = .7500 and then using the fact that $z = (x - \mu)/\sigma$, i.e. $x = \mu + \sigma z$, to find x.

z = .675 ⇒ x = -.584 + .354(.675) = -.3451, which gives $10^{-.3451} \approx .452$ ppm.

From the Quantiles for HGPPM in JMP we have .4700 ppm as the 75th percentile.

f) What is the probability a randomly selected walleye would have a mercury level between .5 ppm and 1 ppm? (2 pts.)

P(.5 < Hg level < 1) = P(-.301 < log(Hg) level < 0) = P(-.301 < X < 0) = P(.80 < Z < 1.65) = .9505 - .7881 = .1624 or a 16.24% chance.

g) What is the probability a randomly selected walleye would have a mercury level between 0 ppm and .5 ppm? (2 pts.)

P(0 < Hg level < .5) = P(-∞ < log(Hg) level < -.301) = P(X < -.301) = P(Z < .80) = .7881 or a 78.81% chance.
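Parts (c), (e), (f), and (g) all follow the same pattern: convert the ppm cutoff to the log scale, then use the normal model for log(Hg). The sketch below assumes base-10 logs, which is what log(.5) = -.301 implies, and uses unrounded z-scores, so its results can differ slightly from the table-based answers.

```python
from math import log10
from statistics import NormalDist

# Normal model for log10(Hg) using the sample mean and SD quoted above
log_hg = NormalDist(mu=-0.584, sigma=0.354)

# (c) P(Hg > 1 ppm) = P(log10(Hg) > 0)
p_exceeds_1ppm = 1 - log_hg.cdf(log10(1.0))

# (e) 75th percentile on the log scale, converted back to ppm
p75_ppm = 10 ** log_hg.inv_cdf(0.75)

# (f) P(.5 ppm < Hg < 1 ppm)
p_half_to_1 = log_hg.cdf(log10(1.0)) - log_hg.cdf(log10(0.5))

# (g) P(0 ppm < Hg < .5 ppm); log10 of 0 is minus infinity,
# so this is simply the cumulative probability at log10(.5)
p_below_half = log_hg.cdf(log10(0.5))

print(round(p_exceeds_1ppm, 4), round(p75_ppm, 3),
      round(p_half_to_1, 4), round(p_below_half, 4))
```

The three probabilities from (c), (f), and (g) sum to 1, which is a quick consistency check on the calculations.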

Now look at the file Walleyes (1990-1998) Major Waterways.

h) In which waterway do you have the greatest chance of catching a walleye exceeding the safe limit? What is this probability? (3 pts.)

Sand Point, because it has the largest sample mean log(Hg) level. Let X = log(Hg) level for a randomly selected walleye from Sand Point. If we assume X has a normal distribution with mean -.086 log(ppm) and a standard deviation of .254 log(ppm), i.e. X ~ N(-.086, .254), we find...

P(X > 0) = P(Z > .34) = .3669 or a 36.7% chance.

i) In which waterway do you have the smallest chance of catching a walleye exceeding the safe limit? What is this probability? (3 pts.)

Mississippi River, because it has the smallest sample mean log(Hg) level. Using the same argument as above, X ~ N(-.918, .270).

P(X > 0) = P(Z > 3.40) = .0003 or .03% chance.
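The two waterway comparisons use the same calculation with different means and standard deviations. This is a minimal sketch using only the two waterways whose summary statistics are quoted above; the dictionary name `waterways` is illustrative, and the remaining waterways in the JMP file would need to be added to compare all of them.

```python
from statistics import NormalDist

# Sample mean and SD of log10(Hg) for the two waterways quoted above
waterways = {
    "Sand Point": NormalDist(mu=-0.086, sigma=0.254),
    "Mississippi River": NormalDist(mu=-0.918, sigma=0.270),
}

# P(Hg > 1 ppm) = P(log10(Hg) > 0) under each normal model
p_exceed = {name: 1 - dist.cdf(0.0) for name, dist in waterways.items()}

for name, prob in sorted(p_exceed.items(), key=lambda kv: -kv[1]):
    print("%-18s P(Hg > 1 ppm) = %.4f" % (name, prob))
```

Computing the exceedance probability for each waterway, rather than ranking by mean alone, also accounts for differences in the standard deviations.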

j) Looking only at the data from Sand Point, construct a plot of Log Scale Mercury (Y) vs. weight (X) using Fit Y by X. What weight limit would you set for eating walleyes caught in the Sand Point waterway? (2 pts.)

[Figure: Fit Y by X plot of Log Scale Mercury (Y) vs. weight (X) for Sand Point walleyes]

Eyeballing it, we conclude a fish weighing approximately 2.5 lbs. or more is likely to have a Hg level exceeding 1 ppm.
