Chapter 3 Handouts: Statistical Analysis of Economic Relations




(Some portions of the chapter are omitted and will be covered later).

Population Parameters vs. Sample Statistics

• Population Parameters: descriptive measures of the entire population that you’re interested in

o Ex: All US households

o Ex: All Illinois households with income over $25,000

o Ex: Nielsen ratings estimate the popularity of TV shows by surveying the viewership of about 5,000 US households.

o In the absence of complete and detailed information on every household you are interested in, you must estimate the population parameters. The most common way is to use sample statistics.

• Sample Statistics: descriptive measures of a representative sample, or subset, of the population.

o Ex: instead of surveying every US household we send out surveys to a subset of the population and use that basic information to estimate what the values would be for the overall population.
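The idea of estimating a population parameter from a sample statistic can be sketched in a few lines of Python. The population below is hypothetical (10,000 made-up household incomes), and the sample size of 500 and the random seed are arbitrary assumptions chosen only for illustration.

```python
import random
import statistics

# Hypothetical population: 10,000 household incomes from $20,000 to $119,990.
# These numbers are invented for illustration; they are not real survey data.
population = list(range(20_000, 120_000, 10))

# Population parameter: the true mean, which we normally cannot observe.
population_mean = statistics.mean(population)

# Sample statistic: the mean of a random subset of 500 households.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 500)
sample_mean = statistics.mean(sample)

print(f"Population mean (parameter): {population_mean:,.2f}")
print(f"Sample mean (statistic):     {sample_mean:,.2f}")
```

The sample mean will generally land near, but not exactly on, the population mean; that gap is the estimation error a representative sample is meant to keep small.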

Most Common Sample Statistics

Measures of Central Tendency:

1. Mean (or “arithmetic mean” or “average”): the sum of numbers included in the sample divided by the number of observations, n.

a.

b.

c. limitation of the mean: because it is only an average, you can expect that actual data will rarely coincide exactly with your estimate. If there is high variation in your data the average may not be very useful in estimation.

2. Median: the middle observation in your data.

a. Indicates that half of your observations are above this value and half of your observations are below this value

b. to find the value of the median, rank your observations by value in ascending or descending order. The observation in the middle is the median.

c.

3. Mode: the most frequent value in the sample.

a. useful when there is little variation in the data (values tend to be continuous and close to one another, e.g., sales)

• ex: sales data of ice cream in gallons over 8 weeks:

b. can identify the most common value for marketing purposes such as color or size of an item

c. if there are two modes, the data is called bimodal

• ex: grades in some upper-level economics courses

If the mean, median, and mode are all the same value then the data is called symmetrical.

• The distribution is symmetrical around the sample values so that the values above and below the mean, median, and mode are a mirror image of one another

• (example p. 75 in text).

• Example: course grades: 5A, 10B, 15C, 10D, 5F
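As a quick check of the course-grade example, the sketch below computes all three measures with Python's standard `statistics` module. Converting letter grades to points (A=4 down to F=0) is an assumption made here so the grades can be averaged; the handout does not specify a scale.

```python
import statistics

# Grade distribution from the handout: 5 A, 10 B, 15 C, 10 D, 5 F.
# Mapping letters to points (A=4 ... F=0) is an assumption made for this sketch.
grade_points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}
grade_counts = {"A": 5, "B": 10, "C": 15, "D": 10, "F": 5}

# Expand the counts into one observation per student.
observations = [grade_points[g] for g, n in grade_counts.items() for _ in range(n)]

mean = statistics.mean(observations)
median = statistics.median(observations)
mode = statistics.mode(observations)

print(mean, median, mode)  # all three equal 2, i.e. a grade of C
```

Because the counts above and below C mirror each other, the mean, median, and mode coincide, which is what the notes mean by symmetrical data.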

If the mean, median, and mode are different, the data is skewed (skewness).

• If the greater bulk of the data is below the mean then we say the data is skewed downward or skewed rightward

o

o the median should be below the mean value or the mean is to the right of the median.

• If the greater bulk of the data is above the mean then we say the data is skewed upward or skewed leftward

o

o the median should be above the mean value or the mean is to the left of the median.
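The mean-versus-median comparison described above can be sketched as a small helper. The function name `skew_direction` and all three data sets are hypothetical, invented for illustration; this is only the rule of thumb from these notes, not a formal skewness statistic.

```python
import statistics

def skew_direction(data):
    """Classify skew with the mean-versus-median rule of thumb from the notes."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "right"   # mean lies to the right of the median
    if mean < median:
        return "left"    # mean lies to the left of the median
    return "symmetric"   # mean equals median (suggestive, not proof, of symmetry)

# Hypothetical samples, invented for illustration.
long_right_tail = [10, 12, 13, 14, 15, 16, 60]  # mean 20, median 14
long_left_tail = [0, 44, 45, 46, 47, 48, 50]    # mean 40, median 46
mirror_image = [1, 2, 3, 4, 5]                  # mean 3, median 3

for sample in (long_right_tail, long_left_tail, mirror_image):
    print(skew_direction(sample))
```

In the first sample a single large value pulls the mean above the median, so most observations sit below the mean; the second sample is the reverse case.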

Measures of Dispersion

1. Range: difference between the largest and the smallest sample observation.

• Our firm’s highest profit this year was $20 million, and the lowest profit this year was $12 million. The range is $20 million - $12 million = $8 million.





• Limitation: only focuses on the extreme values and may not be really representative of the entire sample.

2. Variance (σ² or s²): arithmetic mean of the squared deviation of each observation from the overall mean

• How far observation values are from the average or how far they deviate from the average value; whether they are above or below doesn’t matter; squaring the deviations makes sure positive and negative deviations don’t cancel out each other.

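• Variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$ for a population, or $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ for a sample

o Where $x_i$ is a value in your data; μ is the population average or mean ($\bar{x}$ is the sample mean), so $(x_i - \mu)$ is how far your value deviates from the average; N (or n for a sample) is the number of observations. Dividing by n - 1 rather than n is the usual convention for a sample.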

• if you’re using sales data, then calculating the variance will give you a number such as 24 million dollars squared, which isn’t very informative!

3. Standard Deviation (σ or s): the square root of the variance

• if your variance is 24,000,000 dollars squared, then the standard deviation is the square root of this number, or about $4,898.98.

• Expressed in terms that are more convenient to use.

• Often used as a measure of potential risk when there is uncertainty.

4. Coefficient of Variation (V): compares the standard deviation to the mean.

• Used often by managers because the value is unaffected by the size or the unit of measure (such as thousands of dollars vs. millions of dollars).

• For example: a manager is comparing two projects, one that costs thousands of dollars and one that costs millions of dollars, and projecting profits for each. Comparing their standard deviations directly doesn’t compare apples to apples; you need a measure that isn’t affected by the unit of measurement. The coefficient of variation is such a measure.

• $V = \sigma / \mu$ or $V = s / \bar{x}$

• Numerator is a measure of risk; denominator is a central tendency measure—average outcome.

• Hence, in capital budgeting it is used to compare “risk-reward” ratios for different projects that differ widely in profitability or investment requirements.
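The four dispersion measures above can be computed side by side with Python's standard `statistics` module. The profit figures are hypothetical: only the high of 20 and the low of 12 (in $ millions) come from the range example, and the values in between are invented so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical annual profits in $ millions; the high (20) and low (12)
# match the range example above, the values in between are invented.
profits = [12, 14, 16, 18, 20]

value_range = max(profits) - min(profits)       # 20 - 12 = 8
mean = statistics.mean(profits)                 # 16
pop_variance = statistics.pvariance(profits)    # divides by N:     40 / 5 = 8
sample_variance = statistics.variance(profits)  # divides by n - 1: 40 / 4 = 10
sample_sd = statistics.stdev(profits)           # sqrt(10), about 3.16
coeff_variation = sample_sd / mean              # about 0.198

# Restating the same profits in $ thousands scales the standard deviation
# by 1,000 but leaves the coefficient of variation unchanged.
profits_thousands = [p * 1000 for p in profits]
sd_thousands = statistics.stdev(profits_thousands)
cv_thousands = sd_thousands / statistics.mean(profits_thousands)

print(value_range, pop_variance, sample_variance)
print(round(sample_sd, 2), round(sd_thousands, 2))
print(round(coeff_variation, 4), round(cv_thousands, 4))
```

The last line is the point of the coefficient of variation: changing the unit of measure changes the standard deviation but not V, so V can be compared across projects of very different size.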

Example with Uncertainty: (Not in Textbook)

10% probability of $4000 profit.

20% probability of $3000 profit

70% probability of $1000 profit

A. With uncertainty calculate the Expected Value (EV):



• Weight each outcome by the probability

o

B. Calculate the level of risk or standard deviation:

• Determine the deviation of each potential outcome from the mean (EV)

o

o

o

o

• Square each deviation

o

o

o

• Weight the squared deviation by the probability of occurring

o

o

• Take the square root to get your standard deviation

o

o The higher the number the more risk is involved.

o Used in comparison to other alternatives

o If two separate projects have the same EV but different standard deviations, the one with the higher standard deviation implies greater risk.

o If two projects have different EVs, you must use more complicated analysis to determine which has greater risk.
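Steps A and B can be worked through numerically with the three outcomes listed above. This is a minimal sketch using only the probabilities and profits given in the example; the variable names are illustrative.

```python
import math

# Outcomes from the handout: (probability, profit in dollars).
outcomes = [(0.10, 4000), (0.20, 3000), (0.70, 1000)]

# A. Expected value: weight each outcome by its probability and sum.
expected_value = sum(p * x for p, x in outcomes)

# B. Risk: probability-weighted squared deviations from the EV, then square root.
variance = sum(p * (x - expected_value) ** 2 for p, x in outcomes)
std_dev = math.sqrt(variance)

print(f"EV = {expected_value:,.0f}")            # about 1,700
print(f"Variance = {variance:,.0f}")            # about 1,210,000
print(f"Standard deviation = {std_dev:,.0f}")   # about 1,100
```

By hand: EV = 0.1(4000) + 0.2(3000) + 0.7(1000) = 1700; the deviations are 2300, 1300, and -700; the weighted squared deviations sum to 529,000 + 338,000 + 343,000 = 1,210,000; and the square root is 1100.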

Regression Analysis: uses data to describe how variables are related to one another.

• Example: Q=f( P, I, Psub, r)

Where Q=sales of Dell computers (dependent variable)

P=price of a Dell computer

I=Income

Psub = price of a substitute (competing) brand

r = interest rate (for financing options)

The right-hand side variables are called “independent variables”

• Using data gathered on all variables, regression analysis allows us to see the relative effect of each independent variable (price, income, etc.) on the dependent variable, sales or quantity.

• Example: Sales Data on Dell Computers

|Month    |Sales of Dell |Price of Dell |Income |Price of Sub |Interest Rate |
|January  |20,000        |$700          |30,000 |$750         |2%            |
|February |30,000        |$600          |30,000 |$725         |2%            |
|March    |22,500        |$700          |30,000 |$750         |1%            |
|April    |24,000        |$600          |30,000 |$710         |1%            |
|May      |21,250        |$700          |32,000 |$750         |2%            |

• As managers, we want to know what influences sales (Q) of Dell computers and how big an impact each variable has.

o Without statistical models, you could still gather some information from the data. For example, compare January to May: sales of computers increased from 20,000 to 21,250, while the price of Dell, the price of the substitute, and the interest rate were all constant. The only variable that changed is income, which rose from 30,000 to 32,000. Income increased and sales increased, so we know computers are a normal good.

o In markets, many variables change simultaneously and regression analysis accounts for multiple changes.

o Linear regression: Q= a + bP + cI + dPsub + er

o a is the intercept (constant term)

o b is the coefficient for the price of a Dell computer

o c is the coefficient for income

o d is the coefficient for the price of substitutes

o e is the coefficient for the interest rate

o Show in Excel how to run the regression

o Copy the data into Excel

o Under the Data tab, use “Data Analysis”

o Select “Regression” from the drop-down list

o Select the y range of data (dependent variable Q; select only the data, not the title)

o Select the x range of data (all independent variable data)

o Click OK

o Results pop into another window showing coefficients for our variables

|             |Coefficients |
|Intercept    |-249583.3    |
|X Variable 1 |-241.6667    |
|X Variable 2 |0.625        |
|X Variable 3 |566.66667    |
|X Variable 4 |-250000      |

This means our regression is Q= -249583.3 – 241.67P + 0.625I + 566.67Psub – 250000r
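For readers without Excel, the same coefficients can be reproduced in Python. The sketch below is one possible approach, not the method used in class: it solves the least-squares normal equations with exact fractions from the standard library so that no outside packages are needed. The function name `ols` is hypothetical. Note that with five observations and five coefficients the line fits the data exactly, so this is an illustration of the mechanics rather than a statistically meaningful estimate.

```python
from fractions import Fraction

# Rows from the handout's table: (Q, P, I, Psub, r), with r as a decimal fraction.
data = [
    (20000, 700, 30000, 750, Fraction(2, 100)),  # January
    (30000, 600, 30000, 725, Fraction(2, 100)),  # February
    (22500, 700, 30000, 750, Fraction(1, 100)),  # March
    (24000, 600, 30000, 710, Fraction(1, 100)),  # April
    (21250, 700, 32000, 750, Fraction(2, 100)),  # May
]

def ols(rows):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    X = [[Fraction(1)] + [Fraction(v) for v in row[1:]] for row in rows]
    y = [Fraction(row[0]) for row in rows]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with exact fractions (no rounding error).
    for col in range(k):
        pivot = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

coefficients = ols(data)
names = ["Intercept", "P", "I", "Psub", "r"]
for name, value in zip(names, coefficients):
    print(f"{name:<10}{float(value):>14.4f}")
```

The printed values should agree with the Excel output up to rounding: the intercept, then the coefficients on price, income, substitute price, and interest rate.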

• You may confirm that this is the appropriate equation by taking any row in your table and plugging in the values of the independent variables (for interest rates, use .02 or .01 rather than percents); you will get the sales quantity listed in the table, up to rounding.

• Interpreting results:

o

o

o Interpret other variables similarly as a 1 unit change.
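The plug-in check and the "1 unit change" reading of a coefficient can both be sketched with the coefficients from the Excel output table. The function name `predict_sales` and the constant names are hypothetical, chosen for this illustration.

```python
# Coefficients as reported in the Excel output table.
INTERCEPT, B_PRICE, B_INCOME, B_SUB, B_RATE = (
    -249583.3, -241.6667, 0.625, 566.66667, -250000)

def predict_sales(price, income, price_sub, rate):
    """Predicted Q from the fitted linear equation; rate is a decimal (2% -> 0.02)."""
    return (INTERCEPT + B_PRICE * price + B_INCOME * income
            + B_SUB * price_sub + B_RATE * rate)

# (month, actual sales, price, income, substitute price, interest rate)
table = [
    ("January", 20000, 700, 30000, 750, 0.02),
    ("February", 30000, 600, 30000, 725, 0.02),
    ("March", 22500, 700, 30000, 750, 0.01),
    ("April", 24000, 600, 30000, 710, 0.01),
    ("May", 21250, 700, 32000, 750, 0.02),
]

for month, actual, *x in table:
    print(f"{month:<9} actual={actual:>7,}  predicted={predict_sales(*x):>10,.1f}")

# A one-unit ($1) increase in price, holding everything else constant,
# changes predicted sales by the price coefficient.
base = predict_sales(700, 30000, 750, 0.02)
effect = predict_sales(701, 30000, 750, 0.02) - base
print(round(effect, 2))  # about -241.67
```

Each predicted value matches the table to within rounding of the coefficients, and the last line shows the interpretation: raising the price by $1 while holding income, the substitute's price, and the interest rate fixed lowers predicted sales by roughly 242 units.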
