Studyfruit



Sample Mean and Variance

- Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
- Linear transformation: for constants $a$ and $b$, if $y_i = a x_i + b$ for $i = 1, \dots, n$, then $\bar{y} = a\bar{x} + b$. Example: to report an average temperature in °C from data in °F, compute the average in °F, then convert that average to °C.
- Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$.

Elements of Probability

- Sample space $S$: the set of all possible outcomes.
- Event $E$: any subset of the sample space.
- Union: $E \cup F$ ("E or F"). Intersection: $E \cap F$, also written $EF$ ("E and F").
- Null event $\emptyset$: contains no outcomes.
- Complement $E^c$: all outcomes in $S$ but not in $E$. $E$ and $E^c$ are mutually exclusive, and $S^c = \emptyset$.
- Containment $E \subset F$: all outcomes in $E$ are also in $F$, so the occurrence of $E$ implies that of $F$. Example: getting two heads implies getting a head at all.
- Equality: $E = F$ if $E \subset F$ and $F \subset E$.
- Commutative law: $E \cup F = F \cup E$, $EF = FE$.
- Associative law: $(E \cup F) \cup G = E \cup (F \cup G)$, $(EF)G = E(FG)$.
- Distributive law: $(E \cup F)G = EG \cup FG$, $EF \cup G = (E \cup G)(F \cup G)$.
- DeMorgan's laws: $(E \cup F)^c = E^c F^c$, $(EF)^c = E^c \cup F^c$.

Axioms of Probability

- Intuition: the probability of event $E$ is the proportion of times $E$ occurs (the outcome is in $E$) in repeated experiments.
- Axiomatic definition: for each event $E$ of an experiment with sample space $S$, assign a number $P(E)$ satisfying the following three axioms:
  1. $0 \le P(E) \le 1$
  2. $P(S) = 1$
  3. For any sequence of mutually exclusive events $E_1, E_2, \dots$: $P\!\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} P(E_i)$, $n = 1, 2, \dots, \infty$.
  We call $P(E)$ the probability of event $E$.
- Example: $P(E^c) = 1 - P(E)$. Proof: $E^c$ and $E$ are mutually exclusive, so $P(E^c) + P(E) = P(E^c \cup E) = P(S) = 1$.
- Example: $P(E \cup F) = P(E) + P(F) - P(EF)$. Proof: use a Venn diagram.
- Odds of event $E$: $\frac{P(E)}{P(E^c)} = \frac{P(E)}{1 - P(E)}$. If $P = 1/2$, the odds are 1 and $E$ is equally likely to occur or not; if $P = 3/4$, the odds are 3 and $E$ is 3 times as likely to occur as not.

Conditional Probability

- $P(E \mid F) = \frac{P(EF)}{P(F)}$, so $P(EF) = P(E)\,P(F \mid E)$.
- Law of total probability: $P(F) = \sum_{i=1}^{n} P(E_i)\,P(F \mid E_i)$; in particular, $P(F) = P(F \mid E)P(E) + P(F \mid E^c)P(E^c)$.
  Proof: $P(F) = P(FS) = P(F(E \cup E^c))$. Since $F(E \cup E^c) = FE \cup FE^c$, and $FE$ and $FE^c$ are mutually exclusive, $P(F) = P(FE) + P(FE^c)$.
- Bayes' formula: $P(E \mid F) = \frac{P(E)\,P(F \mid E)}{P(F)} = \frac{P(E)\,P(F \mid E)}{P(E)\,P(F \mid E) + P(E^c)\,P(F \mid E^c)}$, and in general $P(E_i \mid F) = \frac{P(E_i)\,P(F \mid E_i)}{\sum_{j=1}^{n} P(E_j)\,P(F \mid E_j)}$.
- Independent events: $E \perp F$ if $P(EF) = P(E)P(F)$; equivalently, $P(E \mid F) = \frac{P(EF)}{P(F)} = P(E)$.

Random Variables

- Discrete r.v.: pmf $p(x) = P(X = x) = P(\{e \in S : X(e) = x\})$; cdf $F(x) = P(X \le x) = \sum_{y \le x} P(X = y) = \sum_{y \le x} p(y)$.
- Continuous r.v.: pdf $f$ with $P(X \in B) = \int_B f(x)\,dx$; cdf $F(a) = \int_{-\infty}^{a} f(x)\,dx$.

Joint Random Variables

- Joint CDF: $F(x, y) = P(X \le x, Y \le y)$.
- Discrete r.v.'s: joint pmf $p(x_i, y_j) = P(X = x_i, Y = y_j)$.
- Marginal CDF: $F_X(x) = F(x, \infty)$, i.e. $Y$ can take any value.
- Marginal pmf: $p_X(x) = \sum_{j=1}^{\infty} p(x, y_j)$.
- Continuous r.v.'s: joint pdf $f$ with $P((X, Y) \in C) = \iint_{(x,y) \in C} f(x, y)\,dx\,dy$; joint CDF $F(a, b) = \int_{-\infty}^{a}\int_{-\infty}^{b} f(x, y)\,dy\,dx$.
- Marginal pdf: $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$.
- Independent r.v.'s: $X \perp Y$ if, for any two sets of real numbers $A$ and $B$, $P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B)$. If $X \perp Y$: $F(a, b) = F_X(a)F_Y(b)$; $p(x, y) = p_X(x)p_Y(y)$ for discrete r.v.'s; $f(x, y) = f_X(x)f_Y(y)$ for continuous r.v.'s.
- Conditional distribution. Discrete: $p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)}$. Continuous: $f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}$.

Expectation

- Discrete r.v.: $E[X] = \sum_i x_i\,p(x_i)$. Continuous r.v.: $E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$.
- Linearity: $E[X + Y] = E[X] + E[Y]$, and $E\!\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]$.
- Under independence, $E[XY] = E[X]E[Y]$ (a quick simulation check of these two rules follows below).
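As a numerical illustration of the two expectation rules just stated (not part of the original notes), here is a minimal Python sketch; the choice of distributions ($X \sim \mathrm{Unif}[0,1]$, $Y \sim \mathrm{Exp}$ with mean $1/2$) is an arbitrary assumption for the demo:

```python
# Simulation check (illustrative example, assumed distributions):
# E[X+Y] = E[X] + E[Y] always; E[XY] = E[X]E[Y] when X and Y are independent.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(0, 1, n)      # X ~ Unif[0,1], so E[X] = 1/2
y = rng.exponential(0.5, n)   # Y ~ Exp with mean 1/2 (numpy takes the scale 1/lambda)

print(np.mean(x + y), np.mean(x) + np.mean(y))  # both ~ 1.0
print(np.mean(x * y), np.mean(x) * np.mean(y))  # both ~ 0.25, since X and Y are independent
```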
Mean Squared Error

- To solve the optimization problem of choosing a single quantity $c$ to represent a random variable $X$, minimize the mean squared error $\mathrm{MSE}(c) = E[(X - c)^2]$.
- The minimizer is $c = E[X] = \mu$: $\mathrm{MSE}(\mu) \le \mathrm{MSE}(c)$ for all $c$.

Law of the Lazy Statistician

- $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$: the distribution of $g(X)$ may be unknown, but the density $f$ of $X$ is known.
- $E[aX + b] = aE[X] + b$; in general $E[g(X)] \ne g(E[X])$ except in the linear case.

Variance

- $\mathrm{Var}(X) = E[(X - \mu)^2] = E[X^2] - \mu^2$; always nonnegative, and 0 iff $X$ is constant.
- $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$.

Covariance

- $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y$.
- $\mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X)$, and $\mathrm{Cov}(X, X) = \mathrm{Var}(X) \ge 0$.
- If $X \perp Y$, then $\mathrm{Cov}(X, Y) = 0$, but the converse may be false.
- $\mathrm{Cov}(aX + b, Y) = a\,\mathrm{Cov}(X, Y)$; $\mathrm{Cov}(X + Z, Y) = \mathrm{Cov}(X, Y) + \mathrm{Cov}(Z, Y)$.

Variance of Sums and Correlation

- $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$; if $X \perp Y$, then $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$.
- For $X_1, \dots, X_n$ mutually independent ($X_i \perp X_j$ for all $i \ne j$): $\mathrm{Var}\!\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$.
- Correlation: $\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$; $-1 \le \mathrm{Corr}(X, Y) \le 1$, and zero correlation does not imply independence.

Markov's Inequality

- If $X$ is a nonnegative random variable, then for any $a > 0$, $P(X \ge a) \le \frac{E[X]}{a}$.
- Proof: $E[X] = \int_0^{\infty} x f(x)\,dx = \int_0^{a} x f(x)\,dx + \int_a^{\infty} x f(x)\,dx$. Since $\int_0^{a} x f(x)\,dx \ge 0$, we get $E[X] \ge \int_a^{\infty} x f(x)\,dx \ge \int_a^{\infty} a f(x)\,dx = a\,P(X \ge a)$.

Chebyshev's Inequality

- If $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, then $P(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2}$.
- Proof: the random variable $(X - \mu)^2$ is nonnegative; apply Markov's inequality with $a = k^2$: $P((X - \mu)^2 \ge k^2) \le \frac{E[(X - \mu)^2]}{k^2} = \frac{\sigma^2}{k^2}$.

Weak Law of Large Numbers

- Let $X_1, X_2, \dots$ be a sequence of independent and identically distributed (i.i.d.) random variables, each with mean $E[X_i] = \mu$. Then for any $\varepsilon > 0$, $P\!\left(\left|\frac{X_1 + \cdots + X_n}{n} - \mu\right| > \varepsilon\right) \to 0$ as $n \to \infty$: the average of i.i.d. r.v.'s converges to their mean.
- Proof (assuming $\mathrm{Var}(X_i) = \sigma^2$ exists): $E\!\left[\frac{X_1 + \cdots + X_n}{n}\right] = \mu$ and $\mathrm{Var}\!\left(\frac{X_1 + \cdots + X_n}{n}\right) = \frac{1}{n^2}\,\mathrm{Var}(X_1 + \cdots + X_n) = \frac{\sigma^2}{n}$. Apply Chebyshev's inequality: $P\!\left(\left|\frac{X_1 + \cdots + X_n}{n} - \mu\right| > \varepsilon\right) \le \frac{\sigma^2}{n\varepsilon^2} \to 0$ as $n \to \infty$.

Bernoulli and Binomial

- Bernoulli: $\mu = p$, $\sigma^2 = p(1 - p)$.
- Binomial: the sum of $n$ i.i.d. Bernoulli r.v.'s. $P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$ for $k = 0, 1, \dots, n$.
- The pmf sums to one: $\sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} = (p + 1 - p)^n = 1$.
- $\mu = E[Y_1 + Y_2 + \cdots] = np$ and $\sigma^2 = \mathrm{Var}(Y_1 + Y_2 + \cdots) = np(1 - p)$.

Example: Acceptance Sampling

- Given an operating characteristic (OC) curve, we want the probability of accepting or rejecting a lot.
- Four cases: accept an acceptable lot; reject an unacceptable lot (hit); reject an acceptable lot (supplier's risk, false alarm/false positive, type I error); accept an unacceptable lot (inspector's risk, miss, type II error).
- As the number of products inspected goes up, the OC curve gets steeper; ideally it is a step function (see the sketch below).
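A short sketch of how points on an OC curve could be computed from the binomial pmf; the sampling plan here (inspect $n = 50$ items, accept the lot if at most $c = 2$ are defective) is a made-up illustration, not a plan from the notes, and scipy is assumed available:

```python
# Hypothetical acceptance-sampling plan: inspect n items, accept if <= c are defective.
# P(accept) = P(X <= c) for X ~ Binomial(n, p), traced over lot defect rates p.
from scipy.stats import binom

n, c = 50, 2                        # assumed plan parameters
for p in (0.01, 0.05, 0.10, 0.20):  # lot fraction defective
    p_accept = binom.cdf(c, n, p)   # one point on the OC curve
    print(f"p = {p:.2f}: P(accept) = {p_accept:.3f}")
# Increasing n (with c scaled accordingly) makes the curve steeper,
# approaching the ideal step function described above.
```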
Poisson

- Models the number of event occurrences during a time period, e.g. the arrival of customers.
- Suppose we observe that on average $\lambda$ customers come to the store in 1 hour. Divide the hour into $n$ intervals (e.g. 3600 seconds). On average, $\lambda$ out of the $n$ intervals have a customer arrive; when $n$ is large, no interval has more than one arrival, and each interval has an arrival with probability $\lambda/n$. The number of arrivals in one hour is then binomial:
  $p_k = \binom{n}{k} \left(\frac{\lambda}{n}\right)^k \left(1 - \frac{\lambda}{n}\right)^{n-k}$
- As $n \to \infty$, $p_k \to e^{-\lambda} \frac{\lambda^k}{k!}$. The Poisson distribution replaces the binomial when $n$ is large and $p$ is small.
- $\mu = E[\mathrm{Bin}(n, \lambda/n)] = \lambda$; $\sigma^2 = \lambda(1 - \lambda/n) \approx \lambda$.

Uniform

- If we want the range $[\alpha, \beta]$, start from $X \sim \mathrm{Unif}[0, 1]$ and consider $Y = \alpha + (\beta - \alpha)X$.
- CDF: $Y \le y \iff X \le \frac{y - \alpha}{\beta - \alpha}$, so $F_Y(y) = F_X\!\left(\frac{y - \alpha}{\beta - \alpha}\right) = \frac{y - \alpha}{\beta - \alpha}$ for $y \in [\alpha, \beta]$.
- PDF: $f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{1}{\beta - \alpha}$.
- Mean: $E[Y] = \alpha + (\beta - \alpha)E[X] = \frac{\alpha + \beta}{2}$. Variance: $\mathrm{Var}(Y) = (\beta - \alpha)^2\,\mathrm{Var}(X) = \frac{(\beta - \alpha)^2}{12}$.

Exponential

- PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$. CDF: $F(x) = \int_0^{x} \lambda e^{-\lambda y}\,dy = \left[-e^{-\lambda y}\right]_0^{x} = 1 - e^{-\lambda x}$.
- For a nonnegative random variable $X$ with tail distribution $\bar{F}(x) = 1 - F(x)$: $E[X] = \int_0^{\infty} \bar{F}(x)\,dx$.
- Hence $E[X] = \int_0^{\infty} e^{-\lambda x}\,dx = \left[-\frac{1}{\lambda} e^{-\lambda x}\right]_0^{\infty} = \frac{1}{\lambda}$, and $\mathrm{Var}(X) = \frac{1}{\lambda^2}$.

Normal

- PDF: $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2 / (2\sigma^2)}$.
- If $X \sim N(\mu, \sigma^2)$ and $Y = a + bX$, then $Y \sim N(a + b\mu, b^2\sigma^2)$.
- $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$, the standard normal distribution, with CDF $\Phi$.
- $F_X(x) = P(X \le x) = P\!\left(Z \le \frac{x - \mu}{\sigma}\right) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$.
- $\Phi(-x) = P(Z \le -x) = P(Z \ge x) = 1 - \Phi(x)$.
- $z_\alpha$: the value with $1 - \Phi(z_\alpha) = \alpha$, i.e. $\Phi(z_\alpha) = 1 - \alpha$; so $z_\alpha = \Phi^{-1}(1 - \alpha)$ is the $(1 - \alpha) \cdot 100$-th percentile of $Z$.
- A sum of independent normal r.v.'s is still normal: if $X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$, and $X_1 \perp X_2$, then $X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.
- What about $X_1 - X_2$? Since $-X_2 \sim N(-\mu_2, \sigma_2^2)$: $X_1 - X_2 \sim N(\mu_1 - \mu_2, \sigma_1^2 + \sigma_2^2)$.

Samples

- Sample mean: $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$.
- $E[\bar{X}] = \frac{1}{n}\big(E[X_1] + \cdots + E[X_n]\big) = \mu$; $\mathrm{Var}(\bar{X}) = \frac{1}{n^2}\big(\mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)\big) = \frac{\sigma^2}{n}$.
- Sample variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$.
- For an arbitrary distribution with mean $\mu$ and variance $\sigma^2$, we know: $E[\bar{X}]$ and $\mathrm{Var}(\bar{X})$; the approximate distribution of $\bar{X}$ when $n$ is large (central limit theorem); and $E[S^2]$. For a normal population $N(\mu, \sigma^2)$, we know the exact distributions of $\bar{X}$ and $S^2$.

Central Limit Theorem

- Let $X_1, X_2, \dots, X_n$ be a sequence of i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. Then the distribution of $\sqrt{n}\,\frac{\bar{X} - \mu}{\sigma}$ converges to the standard normal as $n \to \infty$: $P\!\left(\sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} \le x\right) \to \Phi(x) = P(Z \le x)$, i.e. $\sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} \to N(0, 1)$.
- Generally works as long as $n \ge 30$.
- Approximate distribution of the sum: $X_1 + X_2 + \cdots + X_n \approx N(n\mu, n\sigma^2)$. Approximate distribution of the sample mean: $\bar{X} \approx N(\mu, \sigma^2/n)$.

Example

- The College of Engineering decided to admit 450 students, and each admitted student actually enrolls with probability 0.3. What is the probability that the incoming class has more than 150 students?
- Solution: $X_i \sim \mathrm{Bernoulli}(0.3)$ and $S = X_1 + X_2 + \cdots + X_{450} \sim \mathrm{Binomial}(450, 0.3)$, with $P(S > 150) = P(S = 151) + P(S = 152) + \cdots + P(S = 450)$.
- $S$ is approximately normal with mean $450 \times 0.3 = 135$ and variance $450 \times 0.3 \times 0.7 = 94.5$.
- Approximate the discrete r.v. $S$ with a continuous r.v. $Y \sim N(135, 94.5)$. Question: $P(S = k) \approx P(Y \in\ ?)$. Answer: $P(S = k) \approx P(k - 0.5 \le Y \le k + 0.5)$, the continuity correction.
- $P(S > 150) = \sum_{k=151}^{450} P(S = k) \approx \sum_{k=151}^{450} P(k - 0.5 \le Y \le k + 0.5) = P(Y \ge 150.5)$, and $P(Y \ge 150.5) = P\!\left(Z \ge \frac{150.5 - 135}{\sqrt{94.5}}\right) = 1 - \Phi(1.59) \approx 0.06$ (a numeric check follows after the rules below).

Continuity Correction

When using a continuous distribution to approximate a discrete one (e.g. the normal approximation to the binomial):
- $P(X = n)$: use $P(n - 0.5 < X < n + 0.5)$
- $P(X > n)$: use $P(X > n + 0.5)$
- $P(X \le n)$: use $P(X < n + 0.5)$
- $P(X < n)$: use $P(X < n - 0.5)$
- $P(X \ge n)$: use $P(X > n - 0.5)$
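As a numeric cross-check of the enrollment example above (not part of the original notes), a minimal sketch comparing the exact binomial tail with the continuity-corrected normal approximation, assuming scipy is available:

```python
# Check the worked example: S ~ Binomial(450, 0.3), find P(S > 150).
import math
from scipy.stats import binom, norm

n, p = 450, 0.3
mu, var = n * p, n * p * (1 - p)             # 135 and 94.5

exact = binom.sf(150, n, p)                  # P(S > 150) = P(S >= 151), exact
approx = norm.sf(150.5, mu, math.sqrt(var))  # continuity correction: P(Y >= 150.5)
print(exact, approx)                         # both ~ 0.056, matching the ~0.06 above
```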
Sample Variance Is Unbiased

- $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$, and $\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2$.
- $E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = E\!\left[\sum_{i=1}^{n} X_i^2\right] - E[n\bar{X}^2] = nE[X_1^2] - nE[\bar{X}^2]$.
- Since $\mathrm{Var}(X) = E[X^2] - (E[X])^2$, we have $E[X^2] = \mathrm{Var}(X) + (E[X])^2$. So $E[X_1^2] = \mathrm{Var}(X_1) + (E[X_1])^2 = \sigma^2 + \mu^2$ and $E[\bar{X}^2] = \mathrm{Var}(\bar{X}) + (E[\bar{X}])^2 = \frac{\sigma^2}{n} + \mu^2$.
- Therefore $E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = n\sigma^2 + n\mu^2 - \sigma^2 - n\mu^2 = (n - 1)\sigma^2$, so $E[S^2] = \sigma^2$: the sample variance is unbiased.

Sampling from a Normal Population

- $\bar{X} \sim N(\mu, \sigma^2/n)$ exactly.
- Chi-square distribution: $\chi_n^2 = Z_1^2 + Z_2^2 + \cdots + Z_n^2$ with $n$ degrees of freedom, where the $Z_i$ are independent standard normals; a sum of independent chi-squares is chi-square.
- Decomposition of the sum of squares: with $Y_i = X_i - \mu$ and $\bar{Y} = \bar{X} - \mu$,
  $\sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} \big((X_i - \mu) - (\bar{X} - \mu)\big)^2 = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2 = \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2$.
  Rearranging the terms: $\sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2$.
- Divide by $\sigma^2$:
  $\underbrace{\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2}}_{\chi_n^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} + \underbrace{\frac{(\bar{X} - \mu)^2}{\sigma^2/n}}_{\chi_1^2}$
- With $Z_i = \frac{X_i - \mu}{\sigma}$, the $Z_1, Z_2, \dots, Z_n$ are independent standard normal, so $\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2}$ is chi-square with $n$ degrees of freedom.
- If sampling from a normal population, then $\bar{X} \perp S^2$: the sample mean and sample variance are independent r.v.'s.
- $\bar{X} \sim N(\mu, \sigma^2/n)$ and $\frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} \sim \chi_{n-1}^2$.

Example

- CPU time for a type of job is normally distributed with mean 20 seconds and standard deviation 3 seconds. 15 such jobs are tested; what is the probability that the sample variance exceeds 12?
- Solution: $\frac{(15-1)S^2}{3^2} \sim \chi_{14}^2$, so $P(S^2 > 12) = P\!\left(\frac{14}{9}S^2 > \frac{14}{9} \times 12\right) = P(\chi_{14}^2 > 18.67) = 0.1779$.

t-Distribution

- If $Z \perp \chi_n^2$, then $T_n = \frac{Z}{\sqrt{\chi_n^2 / n}}$ has the t-distribution with $n$ degrees of freedom.
- Recall $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim Z$ and $\frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2$, so
  $\frac{(\bar{X} - \mu)\big/(\sigma/\sqrt{n})}{\sqrt{\frac{(n-1)S^2}{\sigma^2}\big/(n-1)}} = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$.

Parameter Estimation

- A population has a certain distribution with unknown parameter $\theta$; estimate $\theta$ based on a sample $X_1, \dots, X_n$.
- An estimator $\hat{\Theta}(X_1, \dots, X_n)$ is a function of the sample, hence a random variable; it is also called a point estimator.

Moment Generating Function

- $\phi(t) = E[e^{tX}] = \sum_x e^{tx} p(x)$ (discrete) or $\int e^{tx} f(x)\,dx$ (continuous); the $n$-th moment $E[X^n]$ is the $n$-th derivative $\phi^{(n)}(0)$.

Method of Moments

- Moments can be expressed as functions of the parameters: $E[X^m] = g_m(\theta_1, \theta_2, \dots, \theta_k)$.
- With $k$ unknown parameters, compute the sample moments $\overline{X^m} = \frac{X_1^m + \cdots + X_n^m}{n}$ and solve the system
  $g_1(\theta_1, \dots, \theta_k) = \overline{X}$, $g_2(\theta_1, \dots, \theta_k) = \overline{X^2}$, ..., $g_k(\theta_1, \dots, \theta_k) = \overline{X^k}$.

Maximum Likelihood Estimators

- Given parameter $\theta$, for each observation $x_1, x_2, \dots, x_n$ we can calculate the joint density $f(x_1, x_2, \dots, x_n; \theta)$; if the observations are independent, $f(x_1, \dots, x_n; \theta) = f_{X_1}(x_1; \theta)\,f_{X_2}(x_2; \theta) \cdots f_{X_n}(x_n; \theta)$.
- Given the observations, for each $\theta$ define the likelihood $L(\theta) = f(x_1, x_2, \dots, x_n; \theta)$. The maximum likelihood estimator (MLE) maximizes $L(\theta)$.
- To find the MLE, use the pdf of the distribution with $\theta$ in place of the unknown parameter, and multiply it across all $n$ samples. It is usually easier to maximize $\log L(\theta) \equiv l(\theta)$.
- For the exponential distribution, the MLE of the mean $1/\lambda$ is the sample mean.
- Example: Bernoulli distribution. $P(X = 1) = \theta = \theta^1$ and $P(X = 0) = 1 - \theta = (1 - \theta)^{1-0}$, so $p_X(x; \theta) = \theta^x (1 - \theta)^{1-x}$ and $\log p_X(x; \theta) = x\log\theta + (1 - x)\log(1 - \theta)$.
  $l(\theta) = \left(\sum_{i=1}^{n} x_i\right)\log\theta + \left(n - \sum_{i=1}^{n} x_i\right)\log(1 - \theta)$
  $\frac{d}{d\theta} l(\theta) = \frac{\sum_{i=1}^{n} x_i}{\theta} - \frac{n - \sum_{i=1}^{n} x_i}{1 - \theta} = 0 \;\Rightarrow\; (1 - \theta)\sum_{i=1}^{n} x_i = n\theta - \theta\sum_{i=1}^{n} x_i \;\Rightarrow\; \hat{\theta}(X_1, \dots, X_n) = \frac{\sum_{i=1}^{n} x_i}{n} = \bar{X}$
- Normal: $\hat{\mu}(X_1, \dots, X_n) = \bar{X}$ and $\hat{\sigma}(X_1, \dots, X_n) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}$, i.e. $\hat{\sigma}^2 = (n-1)S^2/n$.
- Uniform on $[0, \theta]$: $\hat{\theta}(X_1, \dots, X_n) = \max(X_1, \dots, X_n)$.
- Poisson: $\hat{\lambda}(X_1, \dots, X_n) = \bar{X}$.

Point Estimators

- Estimators can be biased or unbiased, and precise or imprecise.
- Bias of estimator $\hat{\theta}(X_1, \dots, X_n)$ for parameter $\theta$: $b(\hat{\theta}) = E[\hat{\theta}(X_1, \dots, X_n)] - \theta$.

Confidence Intervals

- Two-sided, or one-sided upper/lower.
- Example (normal, two-sided, 95% confidence):
  $P\!\left(-1.96 < \sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} < 1.96\right) = 2\Phi(1.96) - 1 = 0.95$
  $\Rightarrow P\!\left(-1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95 \Rightarrow P\!\left(-1.96\frac{\sigma}{\sqrt{n}} < \mu - \bar{X} < 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
  $\Rightarrow P\!\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
- If $\sigma$ is given and we have observed sample mean $\bar{x}$, then $\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$ is a 95% confidence interval estimate of $\mu$. Once observed, $\bar{x}$ is a constant and the interval is fixed; the 95% is not a probability.
- Two-sided $1 - \alpha$ confidence interval for a normal mean with known variance: from $P\!\left(-z_{\alpha/2} < \sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} < z_{\alpha/2}\right) = 1 - \alpha$, i.e. $P\!\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$, the interval is $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$.
- One-sided upper: $\left(\bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}},\ \infty\right)$. One-sided lower: $\left(-\infty,\ \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}}\right)$.

Estimating a Difference in Means

- $\bar{X} - \bar{Y} \sim N\!\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}\right)$; normalize: $\frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n + \sigma_2^2/m}} \sim N(0, 1)$.
- From $P\!\left(-z_{\alpha/2} \le \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n + \sigma_2^2/m}} \le z_{\alpha/2}\right) = 1 - \alpha$, the interval
  $\left(\bar{x} - \bar{y} - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}},\ \bar{x} - \bar{y} + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{m}}\right)$
  is a $1 - \alpha$ C.I. for the difference in the means of two normal populations when the variances are known.
- When the variances are unknown, assume they are equal and use the pooled sample variance $S_p^2 = \frac{(n-1)S_1^2 + (m-1)S_2^2}{n + m - 2}$; then $\frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n} + \frac{1}{m}\right)}} \sim t_{n+m-2}$, and
  $\left(\bar{x} - \bar{y} - t_{\alpha/2,\,n+m-2}\,s_p\sqrt{\tfrac{1}{n} + \tfrac{1}{m}},\ \bar{x} - \bar{y} + t_{\alpha/2,\,n+m-2}\,s_p\sqrt{\tfrac{1}{n} + \tfrac{1}{m}}\right)$
  is a $1 - \alpha$ C.I. for the difference in the means of two normal populations when the variances are unknown but equal (a sketch follows below).

Approximate Confidence Interval

- If the population is not normal but the sample size is relatively large, use the results for a normal population mean with known variance to obtain an approximate $1 - \alpha$ C.I.: $\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$.
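A minimal sketch of the pooled two-sample t interval described above; the data arrays are made-up placeholders (not from the notes), and scipy's `t.ppf` supplies $t_{\alpha/2,\,n+m-2}$:

```python
# Pooled-variance 95% C.I. for mu1 - mu2 (normal populations, equal unknown variances).
import numpy as np
from scipy.stats import t

x = np.array([19.2, 21.1, 20.5, 18.9, 22.0])  # placeholder sample 1
y = np.array([17.8, 19.5, 18.2, 20.1])        # placeholder sample 2
n, m = len(x), len(y)

sp2 = ((n - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (n + m - 2)  # pooled variance
half = t.ppf(0.975, n + m - 2) * np.sqrt(sp2 * (1 / n + 1 / m))          # margin of error
diff = x.mean() - y.mean()
print((diff - half, diff + half))
```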
Example: Monte Carlo Simulation

- Evaluate a definite integral $\theta = \int_a^b g(x)\,dx$.
- Consider a uniform random variable $U$ on $[a, b]$. By the law of the lazy statistician, $E[g(U)] = \int_a^b g(u) f_U(u)\,du = \frac{1}{b-a}\int_a^b g(u)\,du = \frac{\theta}{b - a}$.
- To estimate $\theta$, simulate uniform random variables $U_1, \dots, U_n$ and compute $\hat{\theta} = (b - a)\,\frac{\sum_{i=1}^{n} g(U_i)}{n}$.
- Let $\hat{\theta}$ and $s$ be the observed sample mean and standard deviation of the values $(b - a)g(u_i)$. An approximate C.I. is $\left(\hat{\theta} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \hat{\theta} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$ (see the sketch below).
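A runnable sketch of this estimator; the integrand $g(x) = e^{-x^2}$ on $[0, 1]$ is an arbitrary choice for illustration, not from the notes:

```python
# Monte Carlo estimate of theta = integral_a^b g(x) dx with an approximate 95% C.I.
import numpy as np

rng = np.random.default_rng(1)
a, b, n = 0.0, 1.0, 100_000
g = lambda x: np.exp(-x**2)          # assumed integrand for illustration

u = rng.uniform(a, b, n)             # U_1, ..., U_n ~ Unif[a, b]
vals = (b - a) * g(u)                # each term is an unbiased estimate of theta
theta_hat, s = vals.mean(), vals.std(ddof=1)
half = 1.96 * s / np.sqrt(n)         # z_{0.025} = 1.96
print(theta_hat, (theta_hat - half, theta_hat + half))  # true value ~ 0.7468
```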