Random numbers and stochastic simulations in Excel



Random numbers and stochastic simulations in Excel

• Functions in bold type require that PopTools be installed. When a PopTools function duplicates one of Excel’s built-in functions, the PopTools function is often easier to read and remember and is also based on algorithms that are more numerically robust.

• To generate a new set of random numbers, hit the ‘F9’ key. This replaces all the random numbers in the worksheet and updates any other calculations that depend on them.

Functions for generating random numbers

These are the distributions that you will most commonly be using in this class. There are a variety of other distributions available, both in Excel and PopTools, which use similar syntax. In the “Paste Function” dialog box, look under either “Statistical” or “PopTools random variables” to see what’s available.

Continuous distributions

RAND() Returns a pseudo-random number uniformly distributed between 0 and 1. Either implicitly or explicitly, this is the basis for all the other built-in random number functions.

dRandReal(l, u) Returns a pseudo-random number that is uniformly distributed between l and u. This is the basis (although you never see it) for all the other PopTools random number functions.

NORMINV(RAND(), m, s) Returns a pseudo-random number that is normally distributed with mean m and standard deviation s.

dNormalDev(m, s) Returns a pseudo-random number that is normally distributed with mean m and standard deviation s.

LNORMINV(RAND(), m, s) Returns a pseudo-random number that is log-normally distributed with mean m and standard deviation s.

dLogNormalDev(m, s) Returns a pseudo-random number that is log-normally distributed with mean m and standard deviation s.

BETINV(RAND(), m, s, l, u) Returns a pseudo-random number that is beta distributed with mean m and standard deviation s, and bounded between l and u (the beta distribution has a central tendency, like the normal, but only a finite range of allowable values; it is often used with l = 0 and u = 1).

dBetaDev(m, s) Returns a pseudo-random number that is beta distributed with mean m and standard deviation s, and bounded between zero and one.

Discrete distributions

BINOMINV(RAND(), N, p) Returns a pseudo-random number that is binomially distributed with sample size N and success probability p.

dBinomialDev(N, p) Returns a pseudo-random number that is binomially distributed with sample size N and success probability p.

POISINV(RAND(), m) Returns a pseudo-random number that is Poisson distributed with mean m.

dPoissonDev(m) Returns a pseudo-random number that is Poisson distributed with mean m.

Using PopTools to run a stochastic simulation

Building an Excel spreadsheet to do replicate simulations using built-in tools is tedious process, especially when you consider that instead of 100 replicates, we usually want 1000 or even 10,000! Fortunately, PopTools provides an easier way.

The model

We will use the Ricker model with the parameters given in Table 4.2 of the textbook, applicable to the checkerspot butterfly population. The following figure shows the initial setup for the model (note that I have named the cells that have the parameter values in them, so that the formulas are clearer; the parameter r0 is what is called r in the text – Excel does not allow us to use ‘r’ as a cell name). Cell B12 is the formula for the model. Since we are interested in how long it takes the population to cross a quasi-extinction threshold, we need to create two more columns: one giving the minimum value of N so far, and one just replicating the QE threshold.

[pic]

Using Monte Carlo analysis in PopTools

Now go to PopTools > Simulation tools > Monte Carlo analysis. Enter the range of the minimum population size into the “Dependent range” and the column of QE values into “test range”. Select “ Summary stats to get a histogram.

However, unless your QE threshold is set to zero, using the test column will not be useful, as a population could increase back over the threshold. The way to fix this is to modify your simulation model to force the population to stay extinct if it crosses the QE threshold, using the IF function. For example, if you replaced cell B12 with

=IF(B11>D11, B11*EXP(r0*(1-B11/K)+dNormalDev(0,s)), 0)

and carried this down, then the population would drop to zero the turn after it goes below the QE threshold. Then you can use the test column to create the CDF as you did before.

Bootstrapping the data to get confidence intervals

This is a somewhat tedious task in excel. The first step is to generate a set of bootstapped regression coefficients. We are going to do this by setting up a worksheet that uses dynamic resampling and regression tools (both from PopTools), and then use the Monte Carlo tool to replicate the output of this worksheet (the parameter estimates) many times.

Launch the Simple random sample tool from the PopTools→Sampling menu:

[pic]

The “Sample size” should be the same as the number of rows of data. Be sure to check “Sample with replacement” and uncheck “static result.”

Notice that the resulting output is an “array formula.” You can confirm that it’s dynamic by hitting F9 to update.

[pic]

Now we want to do a regression of column G on column F, using PopTools→Extra Stats→Regression:

[pic]

This also creates an array formula, and you can see that when the data in columns F and G are updated, so is the regression results.

[pic]

The quantities we are interested in are b0 (which is r0 in the model), b1 (which is -r0/K in the model) and Residual St. dev (which is s in the model). We will use the Monte Carlo tool to generate lots of replicate values of these:

[pic]

The summary output it creates isn’t useful; instead we are interested in the new sheet it creates (“Monte Carlo results x”). Let’s rename that sheet so that we remember what it is:

[pic]

We can now generate a simulation for each set of parameters – notice that time is now running from left to right rather than top to bottom:

[pic]

Fill this out to year 10 on the right and all the way down for the 1000 replicates:

[pic]

Now let’s use the Monte Carlo tool to determine the probability of quasi-extinction by year 10. First create a new column that is the minimum population size over the ten years (so we don’t have populations “recovering” from quasi-extinction. Then use this column as the “Dependent range” in the Monte Carlo tool.

It turns out we are limited by the number of columns, which is 256. This is the limit on the number of replicate parameter values we can use. So, we run using just the first 250 rows:

[pic]

This runs, and places values in the Monte Carlo results worksheet, but doesn’t generate the summary statistics. This is because, for a few bootstrap replicates, the estimate of b1 is positive (which corresponds to a negative value of K) – this positive density dependence leads to faster than exponential growth, and even within 10 years, the population often diverges to Excel’s version of infinity, generating “#NUM!” as the result… this causes the summary functions to fail. Since this is biologically unrealistic, it is best to go through and delete those rows with positive values of b1.

This run works. Now lets copy the column “ ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download