1. (5 points) Create a scatter plot of CO2 emissions per ...



ECOM2000: Econometric PrinciplesData Analysis ProjectLuke Johnstone183623741. (5 points) Create a scatter plot of CO2 emissions per capita (vertical axis) against per capita GDP (horizontal axis).??Please label each axis clearly.To create the scatter plot, columns D and E were highlighted, scatter plot was selected from the ‘insert’ tab, and data was appropriately selected. The chart title and axis titles were then added before changing the maximum axis values to improve the appearance of the scatter plot.2. (6 points) For each of the four variables, calculate sample statistics including mean, variance, standard deviation, minimum, maximum, and range.These values were derived using the ‘Data Analysis” tool in Excel. Descriptive statistics was selected, input range selected and the output was sent to a new worksheet named ‘descriptive statistics’. The Values specified in the question were then collected and moved into the table below.UnitKG per capita1000 USDhabitant per square kilometerPercentVariableCO2 emission per capitaGDP per capitaPopulation densityUrban populationMean4921.24038812.7886206295.925628156.80603015Variance43322463.94360.96980311904823.486558.3160341Std. Dev.6581.98024518.999205331380.15342923.62871207Minimum20.754716980.0910.19.4Maximum 59252.32558156.12717958.8100Range59231.57086156.03617958.790.63. (10 points) Under the assumption that CO2 emissions per capita are distributed independently and identically in population, construct a 90% confidence interval of the population mean of CO2 emissions per capita manually (that is, based on the sample mean and sample variance calculated in question 2 and not using “Descriptive Statistics” command of Data Analysis Tool). Interpret the calculated confidence interval.As this question is quite equation heavy and time consuming to prepare on Word, I have completed on pen and paper and pasted pictures below (as per email with unit coordinator).After the above calculations were conducted the confidence intervals at the 90% level for the population mean of CO2 Emissions per capita (KGs per capita) are 4377.194 and 5465.286. This 90% confidence interval is a range of values that we are 90% certain contains the true population mean, based on the data tasted. We cannot say what the exact value of the population mean is due to sampling uncertainty when analysing sample data (sample data does not contain every single population member).4. (8 points) Using Excel’s Regression command (from Data Analysis Tool), estimate a multiple regression model with CO2 emissions per capita as a dependent variable and GDP per capita, GDP per capita squared, population density, and percent of urban population as explanatory variables. Report the regression output from Excel and write the estimated sample regression equation.To get this regression model, first a new column was created displaying GDP per capita squared, as shown below. As one or more of the variables is now squared, the model is now considered quadratic and allows the model to follow a non-linear effect.The data analysis command was used and inputs added accordingly, the results from excel are reported below.According to the data derived from the data analysis command from excel, the estimated sample regression equation is:C02 Emissions(Per captia)i= -921.2413+332.287GDPpercapi -2.16GDPpercapi2-0.606Popdensi+51.114Urbpopi+εi5. (5 points) For the regression model estimated in question 4, interpret the reported R‐square value. Briefly comment on the model’s goodness of fit to the observed data. To describe it simply, the R-square value is a measure of how well the model fits the data. The R-square value for the model above is 0.46 or 46%. A score closer to zero indicates a ‘poor’ fit, while a score closer to 1 indicates a ‘good’ fit. Collectively, GDP per capita, GDP per capita squared, population density and the percentage of population living in an urban area collectively explain 46%. This result clearly doesn’t have a very high ‘goodness of fit’, this means that CO2 emissions per capita may not be so easily determined by the observed variables and implies that there may be other factors that influence this. Often when the variables on the left-hand side of the equation are squared and included in the regression equation, the R-square value rises. This is the same in this case, if the sample regression equation didn’t include GDP per capita squared, the R-square value would be lower, at 39%, as reported below. 6. (8 points) For the regression model estimated in question 4, provide economic interpretations of the coefficients for (i) population density and (ii) percent of urban population.Population DensityThe estimated sample regression model in in question 4 shows that the coefficient for population density takes the value of -0.606. In the data, population density is measured in habitants per square kilometre. Therefore, this value means that according to the estimated sample regression model, for every additional 1 habitant per square kilometre, the CO2 Emissions per capita will decrease by 0.606 KGs per capita, holding the other factors (GDP per capita, GDP per capita squared, and urban population percentage) constant. This is the marginal effect of population density on CO2 Emissions per capita.Percentage of Urban PopulationSimilarly to the answer above, the estimated sample regression model in question 4 shows that the coefficient for the Percentage of Urban Population takes the value of 51.114. This value means that according to the estimated sample regression model, for every additional 1 percent of population living in urban areas, the CO2 Emissions per capita will increase by 51.114 KG’s per capita, holding the other factors (GDP per capita, GDP per capita squared, and population density) constant. This is the marginal effect of percentage of urban population on CO2 Emission per capita.7. (10 points) For the regression model estimated in question 4, test if the coefficient of population density is negative at a 1% test size, using a critical value approach.??State clearly the null and alternative hypothesis.The critical value approach requires 4 steps, each of which has been listed below.State the null and alternative hypothesisSet significance level and find associated critical value. Note, this is a one-sided test (finding whether the coefficient is above or below 0) so the Excel command used is “=T.INV.2T(0.02,397)” at the 1% significance level. “0.02” is used in the excel command because the in this case is not being used for a 2-sided testCalculate F-test statistic or retrieve from regression statistics in ExcelMake a decision.The question has been completed on pen and paper and pasted below.8. (10 points) For the regression model estimated in question 4, construct a 90% confidence interval of the coefficient of percent of urban population manually (that is, by using the coefficient estimate and standard error reported by the regression command, not by using “Confidence Interval” option on regression command).??Interpret the confidence interval.To construct a confidence interval for a coefficient in a regression model, we must know the variable coefficient and the standard error of the coefficient. The regression statistics from Excel provide us with this information, highlighted in yellow below. The coefficient value is 51.114 and the standard error value is 14.017. The confidence level to be found is defined in the question as 90%. The critical values must now be found, to find this we must know alpha which is 0.10 (90% confidence interval), and the degree of freedom, which in an MLR is calculated by n-k-1. In this case the d.f= 398-4-1 = 393. Armed with alpha and the d.f the t statistic can be found through the excel command T.INV.2T(0.10,393). The next steps are outlined below:1. Find critical values (t statistic)T.INV.2T(0.10,392) = 1.6492. Calculate margin of error= critical value x standard error= 1.6489 x 14.017 = 23.1113. Specify the confidence intervalThe 90% confidence interval for the coefficient of the percentage of urban population is:51.114 ± 23.111= 28.003, 74.225.The confidence intervals calculated above correlate with the upper and lower confidence intervals calculated using the regression command in excel, shown below (highlighted in red):To provide interpretation of the confidence interval it must be known that our coefficients only provide an estimate based on certain samples, therefore they are subject to sampling uncertainty. The true exact value cannot be found from the data in question. However, we can say that we are 90% confident that the regression coefficient for percent of urban population will be in-between the calculated parameters 28.003 and 74.225. 9. (15 points) Using the regression model estimated in question 4, calculate the predicted values of the CO2 emissions per capita for a range of GDP per capita values observed in the sample (by 5,000 USD increment) while setting the other two explanatory variables at their respective sample means.??Plot the resulting predicted values in a two dimensional diagram (with CO2 emissions per capita and GDP per capita against the vertical and horizontal diagram, respectively) and illustrate the relationship of CO2 emissions per capita and GDP per capita as implied by the estimated regression model.Firstly, underneath the raw data from the original Excel file, new GDP per capita values were made, increasing the entire range of the sample, in $5000USD increments. Then, the other two exploratory variable means were added, as well as the GDP per capita squared value (as the question states that the regression model from question 4 should be used). After this, the regression model from question 4 was used and the appropriate coefficients from the regression statistics multiplied into the equation (shown below) to give the predicted values of CO2 emissions per capita for the ‘made up’ GDP per capita levels.A two-dimensional diagram was then created to illustrate the relationship of CO2 Emissions per capita and GDP per capita, as implied by the estimated regression model. The “scatter chart” option was selected in Excel, and the data was selected manually so that the correct variables were on the X and Y axis’, as specified in the question. The chart and axis titles were added, the maximum X value altered slightly to improve the appearance of the diagram and finally a line of best fit included to help better illustrate the relationship between the two variables. As hypothesised in the project introduction, a countries relationship between GDP and CO2 Emissions (per capita) takes the form of an “inverted U-shape”. The analysis so far in this report supports this theory, as the diagram below clearly displays an “inverted U-shape” or “hill shape” curve. By simply “eye-balling” the diagram, it can be seen that as a country’s GDP per capita rises, as does it’s CO2 Emissions per capita, but it only rises to a certain “turning point” and the increase in CO2 emissions is finishing. Put simply; increased GDP has an increasing, but diminishing effect on CO2 Emissions up to around $75,000USD.10. (10 points) For the model estimated in question 4, calculate the level of per capita GDP where the marginal effect of per capita GDP on CO2 emissions changes its sign (either from negative to positive or positive to negative).At the turning point, increasing GDP per capita starts to lower CO2 Emissions per capita, the question asks for the exact turning point (as opposed to the “around $75,000USD” mark found by eye in the previous question). Calculus can be used to find turning points, which in this case will be a maximum turning point.Again, due to the time-consuming nature of completing mathematical formulas on MS Word, the calculus has been completed on pen and paper and posted below.The calculus above shows that, according to the estimated model, when a country’s GDP per capita reaches $76,918USD, it’s CO2 emissions per capita begin to decrease. This is the maximum point on the diagram in the previous question, where the marginal effect of an additional unit of GDP per capita on CO2 per capita changes its sign from positive to negative.11. (13 points) On the regression model estimated in question 4, test if the slope coefficients for population density and share of urban population are both equal to zero at a 5% test size.?? State clearly the null and alternative hypothesis.Firstly, a new restricted model was estimated using population density and share of urban population as exploratory variables. This was done by creating a new worksheet, copying the original data, and removing the GDP per capita and GDP per capita squared. Regression statistics were calculated through the “data analysis” tool and a new restricted regression equation was created.Unrestricted Regression Model:C02 Emissions(Per captia)i= -921.2413+332.287GDPpercapi -2.16GDPpercapi2-0.606Popdensi+51.114Urbpopi+εiRestricted Regression Model:C02 Emissions(Per captia)i= -3548.574-0.444Popdensi+151.413Urbpopi+εiNow that the restricted regression model has been derived, we can test if the slope coefficients for population density and share of urban population are both equal to zero at a 5% test size.?? This is done in a 4-step process:State the null and alternative hypothesisSet significance level and find associated critical valueCalculate F-test statisticMake a decision.The listed steps have been completed and pasted below.Step 4: Since the f-test statistic exceeds the f-critical statistic, we reject the null hypothesis at the 5% significance level. By rejecting the null hypothesis, we indirectly accept that at least one of the slope coefficients for either population density or percent of urban population is not equal to zero.ReferencesSuenaga, Hiroaki. 2019. “ECOM2000: Economics Principles.” Modules 1-6. PowerPoint Slides. ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download