Course1.winona.edu



DSCI 415 – Multivariate Statistics – Assignment #1 (105 points)

1. Let $A = \begin{bmatrix} 9 & -2 \\ -2 & 6 \end{bmatrix}$.
   (a) Is $A$ symmetric? (1 pt.)
   (b) Determine the eigenvalues and eigenvectors of $A$. (4 pts.)
   (c) Write out the spectral decomposition $A = P \Lambda P'$, i.e. find the matrices $P$ and $\Lambda$. (4 pts.)
   (d) Verify that $PP' = P'P = I$ using $P$ from part (c). (2 pts.)
   (e) Find $A^{-1}$. (2 pts.)
   (f) Find the eigenvalues and eigenvectors of $A^{-1}$. (4 pts.)
   (g) Write out the spectral decomposition $A^{-1} = P \Lambda^{-1} P'$, i.e. find the matrices $P$ and $\Lambda^{-1}$. (4 pts.)
   (h) Find the matrices $A^{1/2}$ and $A^{-1/2}$. Verify that $A^{1/2} A^{1/2} = A$ and $A^{-1/2} A^{-1/2} = A^{-1}$. (5 pts.)

2. Let $A = \begin{bmatrix} 4 & 8 & 8 \\ 3 & 6 & -9 \end{bmatrix}$.
   (a) Calculate $AA'$ and obtain its eigenvalues and eigenvectors. (4 pts.)
   (b) Calculate $A'A$ and obtain its eigenvalues and eigenvectors. Confirm that the nonzero eigenvalues are the same as in part (a). (4 pts.)
   (c) Obtain the singular-value decomposition (SVD) of $A$, i.e. determine the matrices $U$, $D$, and $V'$, then verify that $A = UDV'$. (6 pts.)

3. Let $A = \begin{bmatrix} 1 & 1 \\ 2 & -2 \\ 2 & 2 \end{bmatrix}$.
   (a) Calculate $AA'$ and obtain its eigenvalues and eigenvectors. (4 pts.)
   (b) Calculate $A'A$ and obtain its eigenvalues and eigenvectors. Confirm that the nonzero eigenvalues are the same as in part (a). (4 pts.)
   (c) Obtain the singular-value decomposition (SVD) of $A$, i.e. determine the matrices $U$, $D$, and $V'$, then verify that $A = UDV'$. (6 pts.)

4. Fatty Acid Analysis of Italian Olive Oils

   Researchers are interested in characterizing differences in the fatty acid content of olive oils made from olives grown in different regions of Italy. There are two geographic classifications in these data. The first classification is nine individual growing areas in Italy (Area Name) – East Liguria, West Liguria, Umbria, North-Apulia, South-Apulia, Sicily, Coastal Sardinia, Inland-Sardinia, and Calabria. A broader classification is the growing region in Italy (Region Name) – Northern, Southern, and Sardinia. The map below should help in your understanding of where these areas/regions are located in Italy.
   [Figure: map of Italy. Puglia = Apulia, Sardegna = Sardinia, Sicilia = Sicily. The bar graph above shows the number of olive oils in these data from each area.]

   The fatty acids measured are as follows:

   Palmitic      C16H32O2
   Palmitoleic   C16H30O2
   Stearic       C18H36O2
   Oleic         C18H34O2
   Linoleic      C18H32O2
   Linolenic     C18H30O2
   Arachidic     C20H40O2
   Eicosenoic    C20H38O2

   Molecular formulae taken from Wikipedia, so if these are wrong it is not my fault. I don’t understand how small differences in the number of carbon and hydrogen atoms make distinct fatty acids. Chemistry is weird!

   (a) Use visualization methods to identify the fatty acids that would be most useful in discriminating between olive oils grown in the nine different growing areas represented in these data. Include at least one 1-D plot, one 2-D plot, and one pseudo 3-D plot that you found useful in justifying your choice of fatty acids that are good discriminators. (10 pts.)
   (b) There are several outliers in these data, but one olive oil (within a specific growing area) in particular stands out from the rest. What area is this olive oil from? What combination of characteristics makes this olive oil unique? (4 pts.)
   (c) Which of the nine growing areas would you say produces the most homogeneous olive oils in terms of their fatty acid composition? Include an appropriate plot or collection of plots to justify your answer. (4 pts.)
   (d) Would we categorize the fatty acid compositions of the olive oils from each of the growing areas as having a multivariate normal distribution? Why or why not? Provide graphical evidence to support your answer. (5 pts.)

5. Use methods in ggplot2 to produce the following plots from Section 1 – Basic and Advanced Graphics in R. (4 pts. each)
   (a) The histogram of median income with the kernel density estimate added on the top of page 5.
   (b) One of the comparative boxplots on the bottom of page 6.
   (c) The violin plot on page 8. You do not need to add the annotations.
   (d) The pie charts and bar graphs on page 12.
   (e) A plot of Sepal Length vs. Petal Length, faceted by Species, for Fisher’s Iris data like the one shown on pg. 23.
   (f) A 2-D density estimate plot like the one on the top of pg. 31 for the Swiss France data.
   (g) The bubble plot on the bottom of pg. 35 for the NHL data.
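
For the spectral-decomposition and SVD problems at the start of the assignment, the verification steps rest on a few standard identities, sketched below in general form with no numbers filled in. The sketch assumes that $A$ in the first problem is symmetric with strictly positive eigenvalues, that $A^{1/2}$ is defined through the spectral decomposition, and that the SVD is written in its compact form with a square diagonal $D$ of positive singular values. These conventions are assumptions and may differ from the ones used in the course notes.

```latex
% Spectral decomposition: A symmetric, P orthogonal (P'P = PP' = I),
% \Lambda = diag(\lambda_1, ..., \lambda_k) with every \lambda_i > 0 (assumed).
\begin{align*}
A        &= P \Lambda P' \\
A^{-1}   &= P \Lambda^{-1} P' \\
A^{1/2}  &= P \Lambda^{1/2} P' \\
A^{-1/2} &= P \Lambda^{-1/2} P' \\
A^{1/2} A^{1/2}   &= P \Lambda^{1/2} (P'P) \Lambda^{1/2} P' = P \Lambda P' = A \\
A^{-1/2} A^{-1/2} &= P \Lambda^{-1/2} (P'P) \Lambda^{-1/2} P' = P \Lambda^{-1} P' = A^{-1}
\end{align*}

% Compact SVD (assumed convention): U'U = I, V'V = I, D diagonal with positive entries.
\begin{align*}
A   &= U D V' \\
AA' &= U D (V'V) D U' = U D^{2} U' \\
A'A &= V D (U'U) D V' = V D^{2} V'
\end{align*}
```

Under these assumptions, $AA'$ and $A'A$ share the same nonzero eigenvalues (the diagonal entries of $D^{2}$), which is what the "confirm that the nonzero eigenvalues are the same" parts ask you to check numerically.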