Probability Theory: STAT310/MATH230, April 15, 2021
Amir Dembo
Email address: amir@math.stanford.edu Department of Mathematics, Stanford University, Stanford, CA 94305.
Contents

Preface
Chapter 1. Probability, measure and integration
  1.1. Probability spaces, measures and σ-algebras
  1.2. Random variables and their distribution
  1.3. Integration and the (mathematical) expectation
  1.4. Independence and product measures
Chapter 2. Asymptotics: the law of large numbers
  2.1. Weak laws of large numbers
  2.2. The Borel-Cantelli lemmas
  2.3. Strong law of large numbers
Chapter 3. Weak convergence, clt and Poisson approximation
  3.1. The Central Limit Theorem
  3.2. Weak convergence
  3.3. Characteristic functions
  3.4. Poisson approximation and the Poisson process
  3.5. Random vectors and the multivariate clt
Chapter 4. Conditional expectations and probabilities
  4.1. Conditional expectation: existence and uniqueness
  4.2. Properties of the conditional expectation
  4.3. The conditional expectation as an orthogonal projection
  4.4. Regular conditional probability distributions
Chapter 5. Discrete time martingales and stopping times
  5.1. Definitions and closure properties
  5.2. Martingale representations and inequalities
  5.3. The convergence of martingales
  5.4. The optional stopping theorem
  5.5. Reversed MGs, likelihood ratios and branching processes
Chapter 6. Markov chains
  6.1. Canonical construction and the strong Markov property
  6.2. Markov chains with countable state space
  6.3. General state space: Doeblin and Harris chains
Chapter 7. Ergodic theory
  7.1. Measure preserving and ergodic maps
  7.2. Birkhoff's ergodic theorem
  7.3. Stationarity and recurrence
  7.4. The subadditive ergodic theorem
Chapter 8. Continuous, Gaussian and stationary processes
  8.1. Definition, canonical construction and law
  8.2. Continuous and separable modifications
  8.3. Gaussian and stationary processes
Chapter 9. Continuous time martingales and Markov processes
  9.1. Continuous time filtrations and stopping times
  9.2. Continuous time martingales
  9.3. Markov and strong Markov processes
Chapter 10. The Brownian motion
  10.1. Brownian transformations, hitting times and maxima
  10.2. Weak convergence and invariance principles
  10.3. Brownian path: regularity, local maxima and level sets
Bibliography
Index
Preface
These are the lecture notes for a year-long, PhD-level course in Probability Theory that I taught at Stanford University in 2004, 2006 and 2009. The goal of this course is to prepare incoming PhD students in Stanford's mathematics and statistics departments to do research in probability theory. More broadly, the goal of the text is to help the reader master the mathematical foundations of probability theory and the techniques most commonly used in proving theorems in this area. This is then applied to the rigorous study of the most fundamental classes of stochastic processes.
Towards this goal, we introduce in Chapter 1 the relevant elements from measure and integration theory, namely, the probability space and the σ-algebras of events in it, random variables viewed as measurable functions, their expectation as the corresponding Lebesgue integral, and the important concept of independence.
Utilizing these elements, we study in Chapter 2 the various notions of convergence of random variables and derive the weak and strong laws of large numbers.
Chapter 3 is devoted to the theory of weak convergence, the related concepts of distribution and characteristic functions and two important special cases: the Central Limit Theorem (in short clt) and the Poisson approximation.
Drawing upon the framework of Chapter 1, we devote Chapter 4 to the definition, existence and properties of the conditional expectation and the associated regular conditional probability distribution.
Chapter 5 deals with filtrations, the mathematical notion of information progression in time, and with the corresponding stopping times. Results about the latter are obtained as a by-product of the study of a collection of stochastic processes called martingales. Martingale representations are explored, as well as maximal inequalities, convergence theorems and various applications thereof. Aiming for a clearer and easier presentation, we focus here on the discrete time setting, deferring the continuous time counterpart to Chapter 9.
Chapter 6 provides a brief introduction to the theory of Markov chains, a vast subject at the core of probability theory, to which many text books are devoted. We illustrate some of the interesting mathematical properties of such processes by examining a few special cases of interest.
In Chapter 7 we provide a brief introduction to Ergodic Theory, limiting our attention to its application for discrete time stochastic processes. We define the notion of stationary and ergodic processes, derive the classical theorems of Birkhoff and Kingman, and highlight a few of the many useful applications that this theory has.
Chapter 8 sets the framework for studying right-continuous stochastic processes indexed by a continuous time parameter, introduces the family of Gaussian processes and rigorously constructs the Brownian motion as a Gaussian process of continuous sample path and zero-mean, stationary independent increments.
Chapter 9 expands our earlier treatment of martingales and strong Markov processes to the continuous time setting, emphasizing the role of right-continuous filtration. The mathematical structure of such processes is then illustrated both in the context of Brownian motion and that of Markov jump processes.
Building on this, in Chapter 10 we reconstruct the Brownian motion via the invariance principle as the limit of certain rescaled random walks. We further delve into the rich properties of its sample path and the many applications of Brownian motion to the clt and the Law of the Iterated Logarithm (in short, lil).
The intended audience for this course should have prior exposure to stochastic processes, at an informal level. While students are assumed to have taken a real analysis class dealing with Riemann integration, and to have mastered this material well, prior knowledge of measure theory is not assumed.
These notes are much influenced by the textbooks [Bil95, Dur10, Wil91, KaS97] I have been using.
I thank my students out of whose work this text materialized and my teaching assistants Su Chen, Kshitij Khare, Guoqiang Hu, Julia Salzman, Kevin Sun and Hua Zhou for their help in the assembly of the notes of more than eighty students into a coherent document. I am also much indebted to Kevin Ross, Andrea Montanari and Oana Mocioalca for their feedback on earlier drafts of these notes, to Kevin Ross for providing all the figures in this text, and to Andrea Montanari, David Siegmund and Tze Lai for contributing some of the exercises in these notes.
Amir Dembo
Stanford, California April 2010
CHAPTER 1
Probability, measure and integration
This chapter is devoted to the mathematical foundations of probability theory. Section 1.1 introduces the basic measure theory framework, namely, the probability space and the σ-algebras of events in it. The next building blocks are random variables, introduced in Section 1.2 as measurable functions X(ω) and their distribution. This allows us to define in Section 1.3 the important concept of expectation as the corresponding Lebesgue integral, extending the horizon of our discussion beyond the special functions and variables with density to which elementary probability theory is limited. Section 1.4 concludes the chapter by considering independence, the most fundamental aspect that differentiates probability from (general) measure theory, and the associated product measures.
1.1. Probability spaces, measures and σ-algebras
We shall define here the probability space (Ω, F, P) using the terminology of measure theory. The sample space Ω is a set of all possible outcomes of some random experiment. Probabilities are assigned by A ↦ P(A) to A in a subset F of all possible sets of outcomes. The event space F represents both the amount of information available as a result of the experiment conducted and the collection of all subsets of possible interest to us, where we denote elements of F as events. A pleasant mathematical framework results by imposing on F the structural conditions of a σ-algebra, as done in Subsection 1.1.1. The most common and useful choices for this σ-algebra are then explored in Subsection 1.1.2. Subsection 1.1.3 provides fundamental supplements from measure theory, namely Dynkin's and Carathéodory's theorems and their application to the construction of Lebesgue measure.
1.1.1. The probability space (Ω, F, P). We use 2^Ω to denote the set of all possible subsets of Ω. The event space is thus a subset F of 2^Ω, consisting of all allowed events, that is, those subsets of Ω to which we shall assign probabilities. We next define the structural conditions imposed on F.
Definition 1.1.1. We say that F ⊆ 2^Ω is a σ-algebra (or a σ-field), if
(a) Ω ∈ F,
(b) if A ∈ F then A^c ∈ F as well (where A^c = Ω \ A),
(c) if A_i ∈ F for i = 1, 2, 3, . . . then also ∪_i A_i ∈ F.
Remark. Using DeMorgan's law, we know that (∪_i A_i^c)^c = ∩_i A_i. Thus the following is equivalent to property (c) of Definition 1.1.1:
(c') If A_i ∈ F for i = 1, 2, 3, . . . then also ∩_i A_i ∈ F.
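On a finite sample space the conditions of Definition 1.1.1 can be checked mechanically, since countable unions reduce to finite ones, and closure under pairwise unions (together with complements) implies closure under all finite unions. The following sketch illustrates this; the helper name and the example collections are ours, not from the text.

```python
from itertools import chain, combinations

def is_sigma_algebra(omega, F):
    """Check conditions (a)-(c) of Definition 1.1.1 on a finite sample
    space: Omega in F, closure under complement, closure under union."""
    F = {frozenset(A) for A in F}
    omega = frozenset(omega)
    if omega not in F:                        # (a) Omega belongs to F
        return False
    if any(omega - A not in F for A in F):    # (b) A in F implies A^c in F
        return False
    # (c) pairwise union closure suffices on a finite space (by induction)
    return all(A | B in F for A in F for B in F)

omega = {1, 2, 3, 4}
trivial = [set(), omega]                      # the trivial sigma-algebra
power_set = [set(s) for s in chain.from_iterable(
    combinations(omega, r) for r in range(len(omega) + 1))]
not_closed = [set(), {1}, {2}, {3, 4}, omega] # missing {2,3,4}, {1,3,4}, ...

assert is_sigma_algebra(omega, trivial)
assert is_sigma_algebra(omega, power_set)
assert not is_sigma_algebra(omega, not_closed)
```

The failing example shows that simply containing Ω and ∅ is not enough: `not_closed` lacks the complement of {1}, so condition (b) already rules it out.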
Definition 1.1.2. A pair (Ω, F) with F a σ-algebra of subsets of Ω is called a measurable space. Given a measurable space (Ω, F), a measure μ is any countably additive non-negative set function on this space. That is, μ : F → [0, ∞], having the properties:
(a) μ(A) ≥ μ(∅) = 0 for all A ∈ F.
(b) μ(∪_n A_n) = Σ_n μ(A_n) for any countable collection of disjoint sets A_n ∈ F.
When in addition μ(Ω) = 1, we call the measure μ a probability measure, and often label it by P (it is also easy to see that then P(A) ≤ 1 for all A ∈ F).
Remark. When (b) of Definition 1.1.2 is relaxed to involve only finite collections of disjoint sets A_n, we say that μ is a finitely additive non-negative set function. In measure theory we sometimes consider signed measures, whereby μ is no longer non-negative, hence its range is [-∞, ∞], and say that such a measure is finite when its range is R (i.e. no set in F is assigned an infinite measure).
Definition 1.1.3. A measure space is a triplet (Ω, F, μ), with μ a measure on the measurable space (Ω, F). A measure space (Ω, F, P) with P a probability measure is called a probability space.
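As a concrete instance of Definition 1.1.2, a probability measure on a finite space can be built from point masses, so that non-negativity and additivity over disjoint sets hold by construction. A minimal sketch, where the fair-coin weights are an assumed example rather than anything from the text:

```python
# Point masses p_w on the finite measurable space (Omega, 2^Omega);
# the fair-coin weights {0.5, 0.5} are an assumed example.
p = {"H": 0.5, "T": 0.5}
Omega = set(p)

def P(A):
    # mu(A) = sum of p_w over w in A: non-negative, and additive
    # over disjoint sets because each w contributes to exactly one term
    return sum(p[w] for w in A)

assert P(Omega) == 1.0                   # normalization: P(Omega) = 1
assert P(set()) == 0                     # P(empty set) = 0
assert P({"H"}) + P({"T"}) == P(Omega)   # additivity on disjoint events
```

The same construction works on any countable Ω provided the weights sum to one, which is the setting of Example 1.1.6 below in spirit, though that example is the author's own development.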
The next exercise collects some of the fundamental properties shared by all probability measures.
Exercise 1.1.4. Let (Ω, F, P) be a probability space and A, B, A_i events in F. Prove the following properties of every probability measure.
(a) Monotonicity. If A ⊆ B then P(A) ≤ P(B).
(b) Sub-additivity. If A ⊆ ∪_i A_i then P(A) ≤ Σ_i P(A_i).
(c) Continuity from below: If A_i ↑ A, that is, A_1 ⊆ A_2 ⊆ . . . and ∪_i A_i = A, then P(A_i) ↑ P(A).
(d) Continuity from above: If A_i ↓ A, that is, A_1 ⊇ A_2 ⊇ . . . and ∩_i A_i = A, then P(A_i) ↓ P(A).
Remark. In the more general context of measure theory, note that properties (a)-(c) of Exercise 1.1.4 hold for any measure μ, whereas the continuity from above holds whenever μ(A_i) < ∞ for all i sufficiently large. Here is more on this:
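Continuity from below can be observed numerically on a countable sample space. In the sketch below (the geometric weights are our assumed example, not from the text), Ω = {1, 2, ...} carries the point masses p_k = 2^(-k); taking A_i = {1, ..., i} increasing to Ω gives P(A_i) = 1 - 2^(-i), which indeed increases to P(Ω) = 1:

```python
# Continuity from below (Exercise 1.1.4(c)) on Omega = {1, 2, ...}
# with geometric point masses p_k = 2**-k, an assumed example.
def P(A):
    return sum(2.0 ** -k for k in A)

# A_i = {1, ..., i} increases to Omega, so P(A_i) should increase to 1.
probs = [P(range(1, i + 1)) for i in range(1, 21)]
assert all(a < b for a, b in zip(probs, probs[1:]))  # P(A_i) strictly increasing
assert abs(probs[-1] - 1.0) < 1e-5                   # P(A_20) is close to P(Omega) = 1
```

Note that the analogous check for continuity from above would require the decreasing sets to have finite measure eventually, exactly the caveat in the remark above.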
Exercise 1.1.5. Prove that a finitely additive non-negative set function μ on a measurable space (Ω, F) with the "continuity" property
B_n ∈ F, B_n ↓ ∅, μ(B_n) < ∞ ⟹ μ(B_n) → 0
must be countably additive if μ(Ω) < ∞. Give an example showing that this is not necessarily so when μ(Ω) = ∞.
The σ-algebra F always contains at least the set Ω and its complement, the empty set ∅. Necessarily, P(Ω) = 1 and P(∅) = 0. So, if we take F_0 = {∅, Ω} as our σ-algebra, then we are left with no degrees of freedom in choice of P. For this reason we call F_0 the trivial σ-algebra. Fixing Ω, we may expect that the larger the σ-algebra we consider, the more freedom we have in choosing the probability measure. This indeed holds to some extent, that is, as long as we have no problem satisfying the requirements in the definition of a probability measure. A natural question is when should we expect the maximal possible σ-algebra F = 2^Ω to be useful?
Example 1.1.6. When the sample space Ω is countable we can and typically shall take F = 2^Ω. Indeed, in such situations we assign a probability p_ω > 0 to each