Statistics with Excel

Statistics with Excel

These instructions tell you how to do some basic statistics operations on Excel. There is much

better software for statistics, but Excel is commonly available, and can do the thins we need in this

course.

When I use the word select below, it means clicking on a cell, or to select a bunch of cells,

clicking on the first one and dragging down to the last one.

After you select a group of cells in Excel, they should be highlighted.

? DATA: The first thing you need to know how to do is how to get data into Excel. There are

three ways:

1. You may already have the data in an Excel compatiable spreadsheet, in which case you

can simply open the spreadsheet in Excel.

2. You may have the data in a text file, in which case you should follow the following steps:

- Start Excel.

- go to the FILE menu and choose OPEN .

- a dialogue window should come up. You will have to set the Enable line at the top

to either Enable Text Files or Enable All Documents.

- Find the document you want and OPEN it.

3. You may have a small list of numbers, in which case you can open an empty Excel

worksheet, and enter the numbers by hand. Click on the top cell in the column where

you wish to enter numbers, type the first number in, then hit down-arrow, type the

second number, etc.

? BASIC OPERATIONS I cant give an excel tutorial here, but there is one common thing

which you will need to do. Suppose you have a column of data in A2:A99, and another

column of data in B2:B99, and you wish to put the sum of the columns in column C.

Click on C2, and type =a2+b2 (without quotes) and then return .

Now click on C2 again, and drag the mouse down to C99 so that all those cells are selected.

Then go up to the menu bar, and then choose EDIT , FILL , DOWN . This will enter in

each cell Cn the sum An+Bn.

? STATISTICS The basic procedure is almost the same for most of the statistics you need to

computer (mean, median, quartiles, minimum, maximum, standard deviation, etc). Suppose

your data is in cells A1 through A67.

1. Click on an empty cell in which you wish to enter the statistic you are computing.

2. Now click on the = sign next to the area in the toolbar for entering data. (This tells

Excel you will be entering some mathematical expression.) After you do this, an equals

sign will appear in the input area in the toolbar, and you can type in the statistic you

need to compute.

3. Now just type in the right command to get the statistic you need.

MEDIAN:

MEAN:

median(A1:A67)

average (a1:a67)

MINIMUM:

MAXIMUM:

STANDARD DEVIATION:

FIRST QUARTILE:

THIRD QUARTILE:

min(a1:a67)

max(a1:a67)

stdev(a1:a67)

quartile(a1:a67,1)

quartile(a1:a67,3).

? HISTOGRAMS: Making a histogram is a little bit annoying, but not too hard. The main

problem is that the histogram package is an add-on which has to be turned on in Excel.

Again, I assume you are working on a spreadsheet where your data is already entered, say as

cells A1 to A67 again.

1. First you need to select your bins. Choose dividing points for your data. Enter them in

some empty column on the worksheet. Ill suppose they are D1 to D9

2. Click on Tools in the toolbar and go down to Data Analysis .

If Data Analysis doesnt appear, you have to click on Tools , then Add Ins and then

check the buttons next to the Analysis ToolPak entries on the menu which comes up,

and then click on OK.

3. Now choose Histogram from the menu which came up when you clicked on Tools and

4.

5.

6.

7.

8.

9.

Data Analysis in the step above. And click OK.

Now enter a1:a67 for the Input Range on the dialog box which comes up. Also click

on Chart output in that box, and to be safe, on New worksheet ply so that your

histogram wont cover up any of your data.

Enter d1:d9 for the Bin Range. (If you leave it blank, Excel will choose a bin range

for you.)

Click OK. Excel will now make a really bad histogram which you need to fix.

First notice that you can move the histogram by clicking on it and dragging. You can

also change the size by clicking on an edge or a corner and dragging.

You need to get rid of the gaps. Double-click on one of the bars. A menu will come

up. Click on Options button, set Gap Width to 0, and click OK.

Now you need to label and title things decently. You can change the name from histogram to something resonable by clicking on the title, and typing in what you want.

Similarly for other labels.

Finally, if you wish to print, click on the body of the histogram, and then choose Print

from the File menu.

? SCATTERPLOTS: To make a scatterplot, youll need to be examining two variables, x

and y. Typically x will be in one column of the spreadsheet, and y will be in another.

There are several ways to make a scatterplot.

C The easiest is if x and y are in adjoining columns with x to the left of y. Then select the

data from both columns, click on the Chart Wizard button in the toolbar, select XY

(Scatter) under Chart type and the picture which looks like a bunch of dots under

Chart sub-type. Then click Next. You can probably click next again until you get to

the page which allows you to enter a title and labels for the axes.

C If x and y are not in adjoining columns but x is still left of y, you can select the data

from both columns (wihout selecting the intermediate columns) and proceed as in the

previous step.

C If x is not to the left of y, or if you are having trouble selecting the data you can enter

the data ranges directly into the scatterplot command. Suppose x is in cells E6 through

E123 and y is in cells C6 through C123. Then click on the Chart Wizard button in

the toolbar, select XY (Scatter) under Chart type and the picture which looks like

a bunch of dots under Chart sub-type. Then click Next. Youll get a spot to enter

the x data (enter E6:E123 ) and the y data (enter Y6:Y123 ). Then continue as above

with the title and axes labels.

? CORRELATION COEFFICIENTS: Assuming, as above, that you have your data in E6

through E123 and in C6 through C123, double click on a cell where you wish to enter the

correlation coefficent and type =correl(e6:e123,c6:c123).

? REGRESSION LINE (least squares): Once you have made your scatterplot, you can click

on the plot, and go up to the Chart menu and select Add Trendline. . . . Click on Linear.

If you want to display the equation for the regression line or the r2 value, click on Options

and check the appropriate checkboxes.

? NORMAL DISTRIBUTION PROBABILITIES:

C To work with the standard normal distribution: if you double click on a cell, and enter

=normsdist(z) you will get the area under the standard normal curve to the left of z.

This is the equivalent of looking z up in Table A.

C To work with some other normal distribution: if you double click on a cell and enter

=normdist (x,?,,TRUE) you will get the area under the normal curve with mean ?

and standard deviation to the left of x.

C Double-clicking on a cell and entering =normsinv(p) will give the z-vaue so that the

part of the curve to the left of z has area p. (This is using Table A backwards.)

? t-DISTRIBUTION PROBABILITIES: For a t-distribution of n degrees of freedom, to

get the probability that t a (or that t ?a), double click on a cell, and enter =tdist

(a,n,1).

The 1 describes that you are asking for the area of one tail. If you want the area of two

tails, (this will be the probability of t a and t ?a) enter =tdist (a,n,2).

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download