AP Statistics First Semester Project 2: Linear Regression

The Project: You and a partner (different from your partner on the first project), or you alone, will perform a linear regression analysis with residual diagnostics using a data set from Census at School. Once you have decided whom you will work with, you will be assigned a state from which to pull 25 data points. With this data you will create and analyze two scatter plots: one with an underlying cause-and-effect relationship (e.g., SAT scores and GPA) and one without (e.g., AP Stat exams and the national crime rate). The project is due on 12/12. The final product will be a 5-minute PowerPoint presentation given to the class.

Gathering Data:

• Log on to

• Select a sample size of 25

• Select your assigned state or territory

• Select ‘All Grade Levels,’ ‘All’ genders, and ‘All’ collection years

• Download the generated Excel file to your computer

Analyzing Data

• Find two quantitative variables in your data set that you believe should have an underlying relationship

• Find two quantitative variables in your data set, different from the first two you selected, that you believe have no underlying relationship

• Use this data to create and analyze two separate scatter plots and two separate residual plots using Microsoft Excel (or some similar spreadsheet program) and the steps we discussed in class

• An example can be found on the class webpage

The PowerPoint should include:

• Title slide that includes your names and class section

• Introduction slide that explains where your data came from and what two scatter plots you analyzed

• A slide with “fun facts” about your state

• Analysis of each scatter plot (to include):

o An image of the scatter plot with appropriate scales and labels, including the regression line, its equation, and the r² value

o A description of the scatter plot (F.U.D.S.)

o A calculation and interpretation of s

o An interpretation of the slope and r²

o A residual plot and an analysis of whether a linear model is appropriate for the data

o Identification of outliers and influential data points

o Calculation and interpretation of your own residuals*

• A final summary slide that discusses whether or not your results were expected or surprising and makes some mention of the relationship between causation and correlation.
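
For reference, the quantities named in the list above can be written out as follows. This is a sketch using standard least-squares definitions; the symbols a (intercept), b (slope), and n (number of data points, 25 for this project) are notation assumed here and may differ from the notation used in class.

```latex
\begin{align*}
\hat{y}_i &= a + b\,x_i && \text{(predicted value from the regression line)} \\
e_i &= y_i - \hat{y}_i && \text{(residual: observed minus predicted)} \\
s &= \sqrt{\frac{\sum_{i=1}^{n} e_i^{2}}{n - 2}} && \text{(standard deviation of the residuals)} \\
r^{2} &= 1 - \frac{\sum_{i=1}^{n} e_i^{2}}{\sum_{i=1}^{n} (y_i - \bar{y})^{2}} && \text{(fraction of variation in } y \text{ explained by the line)}
\end{align*}
```

With n = 25 data points, the divisor in the formula for s is 23.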

Presentation: Each pair (or individual) will be required to give a 5-minute oral presentation to the class. Both partners must participate equally and should be prepared to answer questions.

*Giving Back: Once you have completed your project, go back to the Census at School website and fill out the online survey to add your own data to the pool:



Class ID – F Block: 302067, G Block: 302068

Password: FUDS

Brief Overview of Scatter Plots in Excel (also see pp. 26-28 of the document “How to Use Excel”)

1) Highlight the data columns. Use the CTRL key to highlight multiple columns that are not touching.

2) Click the image shortcut to create a chart (or go to Insert – Chart). Choose ‘Scatter’ and follow the prompts to create labels and scales.

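3) Once the chart is created, right click on a data point and select ‘Add Trendline.’ Choose a linear trendline and check the options to display the equation and the R-squared value on the chart.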

4) To create a residual plot, use your regression equation from step 3 to create a new column of data that calculates each residual (observed y minus the y predicted by the equation).

5) Repeat steps 1 and 2 using the x column and the residual column of data.
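
If you want to double-check the numbers Excel gives you, the same calculations can be sketched in a few lines of Python. This is only an illustrative check, not part of the required Excel workflow. The function names (least_squares_line, regression_summary) and the sample height and arm span values are made up for this example and do not come from Census at School.

```python
import math


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def regression_summary(xs, ys):
    """Compute residuals, s, and r-squared for the least-squares line."""
    intercept, slope = least_squares_line(xs, ys)
    predicted = [intercept + slope * x for x in xs]
    # Residual = observed y minus predicted y (the new column in step 4).
    residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]
    sse = sum(e ** 2 for e in residuals)
    y_bar = sum(ys) / len(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)
    return {
        "intercept": intercept,
        "slope": slope,
        "residuals": residuals,
        "s": math.sqrt(sse / (len(xs) - 2)),  # divide by n - 2
        "r_squared": 1 - sse / sst,
    }


# Hypothetical sample data: height (cm) and arm span (cm) for five students.
heights = [150, 160, 165, 170, 180]
arm_spans = [152, 158, 166, 171, 178]

summary = regression_summary(heights, arm_spans)
print("slope:", round(summary["slope"], 3))
print("intercept:", round(summary["intercept"], 3))
print("s:", round(summary["s"], 3))
print("r squared:", round(summary["r_squared"], 3))
```

Plotting the residuals list against the x values gives the same residual plot described in step 5. The slope, intercept, and r squared should match the trendline values Excel displays, up to rounding.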

|Rubric for Statistics Project 2 |Points Possible |Points Earned |

|Gathering Data: |

|The process of collecting data from the web was completed correctly |3 | |

|The amount of data gathered was appropriate (at least 25 points) |3 | |

|Quantitative and not categorical variables were used |2 | |

|The data hypothesized to have a relationship is reasonable |6 | |

|The data hypothesized to not have a relationship is reasonable |6 | |

| |20 | |

|Graphs: |

|Scatter plots are drawn correctly |5 | |

|LSRLs are calculated correctly and included on the scatter plots |5 | |

|R squared and s are calculated correctly and included on the graph |5 | |

|Residual plots are drawn correctly |5 | |

| |20 | |

|Data Analysis: |

|Slope is interpreted correctly for both graphs |4 | |

|R squared is interpreted correctly for both graphs |4 | |

|S is interpreted correctly for both graphs |4 | |

|The appropriateness of a linear model is discussed correctly |4 | |

|Individual residuals are calculated and interpreted correctly |4 | |

| |20 | |

|Personal Contribution: |

|Completion of the Census at School Survey |10 | |

| |10 | |

|PowerPoint: |

|All required information is included on slides |5 | |

|Introduction relays appropriate information |5 | |

|Conclusion discusses results |5 | |

|Overall quality of slides demonstrates effort |5 | |

| |20 | |

|Oral Presentation: |

|Presentation is well organized |4 | |

|Questions are handled appropriately |2 | |

|Presentation is thorough |4 | |

| |10 | |

| | | |

|Total: |100 | |
