Introduction to Simulation



Introduction to Simulation

Q: What is simulation?

A: I guess I should start by saying it is an entirely different approach to solving problems.

Q: Different in what sense?

A: You know a host of ways to solve problems, from the straight-forward calculations you learned beginning in math class (that you now use in finance and accounting classes) to the decision analysis tools of tables and trees, up through the algorithms you have recently been mastering, but all of those focus on using what you know to reach a correct (or optimal) solution. Simulation focuses more on what you don’t know, and tries to make up for that.

Q: How can you focus on what you don’t know?

A: By acknowledging that you don’t know it and admitting that what you don’t know is the key to the decision situation you are facing.

Q: What good does that do?

A: Getting people to admit they don’t know something is fairly difficult. If you ask a question, even if they don’t know the answer, they will give you a guess – and then you tend to be stuck using that guess as if it were accurate.

Q: But how does confessing your ignorance help?

A: It is not just admitting you are less than omniscient; it is also accepting that the information you lack is vital to what you are trying to do.

Q: That’s a lot better. First I tell my boss that I don’t know something, and then I add that without that piece of information we can’t do anything. Are you trying to get me fired?

A: No, promoted. You’re not grasping everything I’ve said. Simulation involves doing something about your lack of knowledge. Whenever you present a problem, try to present a solution along with it.

Q: If it is something I don’t know (and I’m presuming that we can’t go look up whatever it is), how can we do anything about it?

A: Simulate it.

Q: Aren’t we working in circles here?

A: No. The above is an introduction, now we can talk a little about the mechanics of simulation.

Q: So, what is simulation?

A: You know the answer to that: making a copy of something and seeing how the copy reacts to changes.

D: Like most well-known answers, that one was has a flaw in it. It is true that you make a copy, but you don’t make an exact copy – typically (almost universally) the copy is simplified, sometimes enormously.

Q: Why don’t we make exact copies when doing simulations?

A: Often we can’t, and the rest of the time it would defeat the purpose of the simulation.

Q: What is the purpose of a simulation?

A: To get an answer to a question without the expense of trying it out in the real world.

Q: Wouldn’t an exact copy answer questions better than a simplified copy?

A: Absolutely, but the only way to create an exact copy is to duplicate the complete situation – which would be just as expensive as trying out your idea in the real world. Since the goal of simulation is to avoid that expense, we deal with simplified copies.

Q: Can’t computer models exactly copy the real world?

A: Not even close. Further, chaos theory (a mathematical model that claims small causes eventually have enormous effects) maintains that we never will be able to have computer models of sufficient complexity to exactly duplicate the real world. I’m not sure I agree with chaos theory (computer capabilities keep increasing geometrically), but they do make some strong arguments that apply to the current level of computer sophistication.

Q: Don’t we use simplified copies in all of our models (algorithms, decision trees, queuing theory, project management)?

A: Yes, but we use simulation when those models won’t work.

Q: Why wouldn’t the other models work?

A: Maybe the assumptions won’t fit or maybe one or more vital pieces of data are missing or maybe the output is wrong or maybe anything. Any time we can use one of the other models, we do. Simulation is usually a last resort, despite being superior to the other models.

Q: If simulation is so good, why don’t we use it all the time?

A: Allow me to answer your question with a question:

Q: Have you ever tried to mail some object in a box that was too small (or had the pleasure of watching someone else struggle with this problem)?

A: Most of us have, one or the other (and it is a lot more fun when it someone else’s problem).

Q: Why didn’t you (or the other person) simply custom-build a box that would perfectly fit the object to be mailed?

A: Usually three reasons: didn’t have the materials (or money to buy them), didn’t have the time, and doing that involved too much effort (that sounds a little better than saying we are too lazy).

Q: Does this have anything to do with simulation?

A: The reasons for not custom-building a box are the same three reasons simulation is not more widely used although it is vastly superior to the other models.

Q: Do you need special software to run a simulation?

A: It depends – some simulations can be run on a spreadsheet (we’ll do that later) but the best ones do require special software packages that very few companies have on the shelf.

Q: Why does a simulation take more time than using one of the other models?

A: All of the other models are pre-programmed; you simply gather the data, enter it into the model, and click “solve.” With a simulation, you still gather the data (and usually more), but then you must create the simulation model. Every simulation model is unique, because every business situation is unique, so there can be no pre-programming.

Q: Why does simulation take more effort than the other models?

A: Even with the sophisticated packages available today, it still involves a lot work to create the right relationships between the parts of the simulation and to get the computer to work through the relationships. Most of the time, the business world is unwilling to put forth all that effort.

Q: Are there any other problems with using simulation?

A: Just the normal problem we have when humans use a computer – programming errors. Just as most of you made mistakes when setting up the problems in Solver, you will make mistakes when setting up the relationships in a simulation. Some of the mistakes will be accidental (typographical errors, usually), but others will be deliberate.

Q: Why would I deliberately make a mistake?

A: If you don’t know something is a mistake, you can do it deliberately, but it is still wrong.

D: The simulation world has its own terminology, of course, and calls these two types of mistakes veracity and validity. Veracity refers to unintended errors (typographical, data entry, pointing to the wrong cell on a spreadsheet). You knew what to do, you simply made an error. Validity means that even if your simulation perfectly free of all veracity errors, it is still wrong because the model you choose to build is, itself, in error.

Q: Is there any way to catch veracity errors?

A: If you suspect this type of error and can’t find it, re-enter everything and then compare the two versions side-by-side. Where they differ is probably your mistake.

Q: Is there any way to catch validity errors?

A: One trick is to run the simulation for a data set where you know the outcome, such as last year. If the simulation fails to show something similar to what you know happened, the odds are good that your model is invalid. Unfortunately, this can test validity for only last year. If anything has changed since then, a model valid for last year may not be valid for next year.

Q: Are you saying we should never use simulation?

A: Yes, and that is the end of this lecture (don’t you wish). No, we’ve simply been focusing on the negatives, which got us sidetracked into the issues of veracity and validity. You haven’t asked me about any benefits from simulation.

Q: You said that simulation was superior to the other models. What makes simulation better?

A: Every simulation is custom designed, so it has no assumptions to meet before you begin to use it (I heard your sigh of relief, but hold on – it’s not as good as it sounds), the calculations are not based on averages and simulation gives better output.

Q: What makes custom design an advantage?

A: All of the other models are general models – when we use them we spend all our time trying to get the problem to fit the model (those are the assumptions that I have spent all semester explaining to you). With simulation, we work to get the model to fit the problem – custom designing the model to the situation. Thus, we eliminate a potential source of error.

Q: What’s the catch to having no assumptions?

A: I didn’t exactly say that simulation has no assumptions; I said it has no assumptions before you begin to use it. Above, I mentioned that all simulations are simplifications – and each simplification is an assumption. The difference is that you get to choose the assumptions as you build the simulation. That’s a little better than having the assumptions forced on you. You may still have to defend those assumptions to your boss, but at least you had a reason for selecting them in the first place.

Q: How did the other models base their decisions on averages?

A: Think ‘way back to the misty dawns of time, about eight weeks ago when we started the semester (remember how much younger you were then, full of energy and vitality, and look at you now - a few weeks of management science and you’re a new person). When we first talked about the assumption of determinism, I said that we would be a lot more accurate if we used data ranges instead of point estimates for each of the payoffs. We can’t use data ranges, though, because our models lack the mathematics to perform the calculations on data ranges, so we use a single number to describe the range and that single number is an average. Since then, every calculation we have made has been based on those averages, the single estimate to describe a range of values.

Q: How does simulation avoid the use of averages?

A: I’m so glad you asked. Simulation uses random numbers to select parameters from a data range.

Q: How can that be accurate?

A: It is accurate because simulation doesn’t do that once. It does it over and over, selecting a different value from the data range each time, and keeping track of the results for different value.

Q: Where does the better output come from?

A: For one thing, in the course of developing a simulation you get to ask for whatever information you want, and the simulation will calculate it for you. More importantly, though, it keeps track of the information you want through all the various values of the things you don’t know and then it shows you the results of all the different values (average result, max, min, variation, anything you want) and that helps you understand the problem you are facing.

Q: Is that what you meant earlier, about focusing on what you don’t know?

A: Exactly. If we knew a value for the number we need, we could simply use it. Even if we don’t know a precise value, though, we are often able to tell a range of values within which the true value will fall. Simulation tries out different values from the range, determines the results as if that were the true value, then tries a different value as if that were the true value, over and over. In the course of looking at hundreds of different values, you can often find a solution that will be satisfactory for a large number of values. That is the one you recommend.

Q: Would this be a good time to ask you to work through an example?

A: Probably. Let’s consider an example that you all hope to be facing in the near future: are you saving enough for retirement? This question comes up only when you have a job and are able to save for retirement, so, overall, it is a good question.

Q: Isn’t there a formula, from finance, that answers that question?

A: Oh, yes, it’s called the future value calculation. It is based on three parameters: interest rate, number of periods, and periodic payment. You can also allow for an initial lump sum investment. There are minor flaws with this formula, such as the periodic payment is not allowed to change and the number of periods to retirement might vary, but the major flaw is that the interest rate is assumed to be constant.

Q: What’s wrong with that assumption?

A: We all know that interest rates aren’t constant.

Q: Yeah, but can’t we just use an average value to …

A: Hopefully, you just stopped yourself from telling me to use an average value, since my diatribe above is all about how not using average values is one of the strengths of simulation.

Q: Does it really matter, in this case? Aren’t we talking about a long period of time, so that fluctuations in the interest rate will tend to average out?

A: That has always been the argument, although it does leave the question of exactly what value to use for the average interest rate. Given the importance of this question, you want to be sure the argument is correct. We can test it using a simulation. If the formula and simulation give similar results, then the argument is correct. If the results are significantly different, then we may not want to use the formula.

Q: So how do we start building a simulation?

A: First, let me tell you that this type of simulation is called a Monte Carlo simulation. Monte Carlo is a city in Europe famous for its casinos. All casinos have a roulette wheel. This is a wheel that spins around and a small marble is setting floating on the rim of the spinning wheel. As the wheel slows, the marble drops into one of the numbered slots on the wheel. In a fair game, this is about as random a way as you can get of selecting a numerical value and that was how early simulations were run and that is where the name comes from. Now-a-days we use computer generated random numbers, properly called pseudo-random numbers, but still keep the name.

Q: What does “pseudo” mean?

A: “Pseudo” is a Greek word meaning false. If you told your boss your simulation was based on fake random numbers, you might get fired, but if you say you are using pseudo-random numbers, they be impressed with you (for one thing, you must be able to spell “pseudo”). There are good points and bad points to using pseudo-random numbers, and if there is time (I doubt it), I’ll talk about them in a different set of notes. For now, we will simply use them.

Q: If Monte Carlo is one type of simulation, is there another?

A: Yes. It is called an event-driven simulation and you don’t want to touch that without a two-semester course dedicated to simulation. That is the type of simulation you use to copy a system, such as a hospital, of traffic flow patterns, or a factory or anything, really. The big difference between the two types of simulations is the trickiness of getting all the connections between the parts of the simulated system to mimic the real world. That takes a lot of work and a lot of experience. Hire a consultant (not me) if you ever need to do this. Oddly, though, the essence of an event-driven simulation is the same as a Monte Carlo simulation and you can learn to do that.

Q: If a Monte Carlo simulation doesn’t look at systems, what does it simulate?

A: Usually a single value, sometimes several related values, but the important point is that the values are an end in themselves. Those values, once simulated, can be used to directly calculate something else, and it pretty much ends there. A system, of course, would take a simulated value, perform a calculation, feed that result into something else, which leads to choices (different paths through the simulation) and the effects of any single simulated value are lost in the complexity of the simulation itself.

Q: So, how do we do simulate the interest rates for our future value problem?

A: By using a spreadsheet.

Q: And where do we get the pseudo-random numbers from?

A: Excel has a built-in function called “Rand()” that will generate a pseudo-random number from a uniform distribution between zero and one.

D: This might be a good time to tell you about @RISK, an Excel add-in that the College of Business has. It can be used to do simulations and has a lot of neat stuff that makes getting the output you want easier. For this semester, though, I think I’ve dragged you through enough by hitting you with Solver. Also, I didn’t warn my on-line students about the need for this add-in (you have to buy it), and it’s a little late to ask them to get it now. I may use that in future semesters, but we can do everything we need on our own, so we will.

Q: How do we begin setting up the spreadsheet for our simulation?

A: The same way we have begun every other problem: organizing our data.

Q: What data do we have?

A: We know the question deals with future values, so we need to know the same information we would use for the future value calculation: term, payment, rate and down payment. You should provide the information for your own situation. I am going to make up information for a pretend situation.

Q: What will be the term for our simulation?

A: I am going to pretend you are 30 years old and want to retire at age 60, so our term is 30 years or 360 months.

Q: What will be the payment for our simulation?

A: I am going to pretend you are putting aside $100 per month for retirement. I’m also pretending you are starting from scratch – no down payment.

Q: What will be the interest rate for our simulation?

A: That is what we are going to simulate.

Q: You said, above, that we should be able to fix a range for the value we are trying to simulate. What range so we have for the interest rate?

A: First of all, we aren’t really simulating an interest rate. Most retirement accounts are invested in the stock market, so what we are really simulating is the return on investment for a portfolio of stocks.

Q: OK, so how do we set a range for the return on the stock market?

A: Well, the number you hear most often quoted about the stock market’s return is that it averages 10% for any 30-year period. We know annual returns have been as high as 25% and we all know that return are sometimes negative, though it would unusual for the stock market to lose more than 5% of its value in a year. So we have a range of -5% to 25%, with an average value of 10%.

Q: What is the distribution within that range?

A: What an excellent question! It could be uniform (any value in the range has the exact same chance of occurring), it could be skewed to the upper end, or to the lower end, or it could be normally distributed. Of that list, I doubt the correct distribution is the uniform. I really don’t see a -5% return as being as likely as a 10% return. For that matter, I rather expect the return for any given year to be around 10%. Returns above or below that are usually seem as less likely. So, while we might like to think that the higher returns are more likely to occur, the best description is the normal curve, no bias up or down.

Q: When do we start making up assumptions?

A: You already have.

Q: Were the term and payment our first assumptions?

A: No, those were parameters. Once the simulation is up and running, we can change those very easily to see how the results change – in other words, simulate those values as well.

Q: What assumptions have we made, then?

A: Using the normal curve to describe the distribution of the returns is probably a pretty bad assumption. For one thing, the normal curve treats every new observation as an independent one, meaning that the return for one period has no effect on the return for the next period. Most of us probably guess that there are trends to the stock market, so there are inter-period links, but for this simple simulation, we’ll pretend there aren’t any. It as an assumption, though.

Q: How do we use this information?

A: We place it in a spreadsheet in a data area as in cells C2:F2 in the spreadsheet “Simulation Example” which you should have printed out when you printed these lecture notes.

Q: Now what?

A: Well, since Excel has a future value function, we could go ahead and work out the formula version of this set of data, just to have as basis of comparison. I’ve place that in cell H2.

Q: Now what?

A: Now we get to work. The first thing we have to do is set up a calendar. One of the hardest things to do in any simulation is to keep track of where you are. So, columns A and B, beginning in row 8, show a year and month counter. Next, we have to set up a series of month by month calculations to mimic the results of investing the money. Since we have chosen to have no lump sum available at the start, we start with a zero balance and make our first monthly payment ($100) into the retirement fund at the start of the month. The return is calculated at the end of the month, though, and for that we need a random number.

Q: Is this where we use the RAND() function?

A: Yes. RAND() will give you a value between 0 and 1, which means it gives you a decimal, such as .5 or .01 or .349287367261937463820. Unfortunately, that’s not what we need. We need an interest rate centered on a mean of 10%. That means we have to treat the random number as a probability. You should recall that the probability of being less than or equal to some given value form your normal curve is the same as the area under the curve to the left of the observation point. If that was clear as mud, look at Figure 1:

Figure 1: Area Under a Normal Curve

The line to the right of the midpoint represents a value from the normal distribution. The shaded area represents the probability of reaching that value. We let the pseudo-random number generated by Excel represent the area, or probability, and convert that into a normal distribution observation.

Q: How do we convert a pseudo-random number into a normal distribution observation?

A: Excel has another function called “NORMINV” which does this for you. It has three things it needs to know – the probability (we will use a pseudo-random number from the function RAND), the mean of the distribution (we are using 10%), and the standard deviation of the distribution. We don’t have one of those, yet.

Q: How do we estimate a standard deviation?

A: By remembering the +/- 1 standard deviation gets about 2/3’s of all observations with a normal distribution. If we used a standard deviation of 5%, then we would expect the stock market return to be between within +/- 5 points of 10% 2/3’s of the time. That is a range of 5% to 15%, which seems a little tight. A standard deviation of 10% would give us a range of 0% to 20%, which seems a little wide. Let’s try 8%, with a range of 2% to 18%. That seems a little off on the high end, but that’s the least of our problems, so we’ll make the assumption (another one!).

Q: So, how do we get a simulated rate of return?

A: The function “NORMINV(RAND(),mean,std dev)/12” will give it to us.

Q: Why did you divide by 12?

A: To convert the annual rate of return to a monthly rate of return.

Q: Now that we have the simulated rate of return, what do we do with it?

A: Multiply the current value of the retirement account (starting off, the first payment, after that whatever the total was at the end of the previous period plus that period’s payment) by the simulated rate of return to get the return on investment for that period (which could be negative). Add that return to the previous total plus the current period’s payment, and you have the simulated value of your retirement account after that period. Repeat this calculation (including the simulated rate of return) for the next period, and the next, and the next, until you reach the end of your term. The final value is the amount you have to retire on.

Q: Isn’t this a lot of work?

A: Not in Excel. We’ll set up the first calculation, then the second, then simply copy the calculation down (fill-down). That will give us one simulation of our retirement account. You can see this in column C of the spreadsheet.

Q: So what was our first simulated value?

A: I can’t tell you that.

Q: You mean you won’t tell us that, we have to go look for ourselves?

A: No, I mean I can’t tell you that. Every time you hit the “Enter” key in Excel, the RAND function re-calculates and gives you a new random value. I left all the formulae in place, so you could see what I did. That means that if you did almost anything to the spreadsheet, it is now showing a different value from the one I saw?

Q: What good is that to us?

A: If it is really important for you to have a particular value, then simply copy it and when you paste, pick “paste special” and “value only.” This gets rid of the RAND function and your values will stop changing. We don’t need to do that, though, because we still have a lot more work to do.

Q: Just for the sake of argument, will you tell us what total you got?

A: Just remember that you will not see this number anywhere on the spreadsheet. The number I got for the first simulation was $206,752.61.

Q: How did that compare to the future value calculation?

A: The future value calculation gave a value of $226,049, which was $19,296.19 higher than the simulated value.

Q: So, is the future value calculation wrong?

A: Not necessarily. This was only one simulation. A problem with simulation is that the random numbers are, well, random. That being true, it is possible that any given simulation is an outlier, not representative of the true value.

Q: How do we get a good representation of the true value for the retirement fund?

A: Repeat your 30-year simulation over and over, as often as you can stand (or your computer will allow) and then average together all the totals. That average should be a very good estimate of the true total value of the retirement fund.

Q: Wasn’t simulation supposed to get us away from using averages?

A: Simulation got us away from using averages in the calculations. An average can still be a good tool to for looking at a large group of numbers, like a lot of different simulation results.

Q: Are you saying we have to set up spreadsheet after spreadsheet to do this?

A: You can if you want to, but it would be simpler just to copy column C (which contains all our simulation calculations) over into column D, then into column E and F and … for as many times as we want to run this simulation.

Q: Is it really that easy?

A: Well, this does make for a pretty large spreadsheet, and with all those RAND functions in the cells, it can take a long time for the spreadsheet to recalculate (you will probably think the spreadsheet has locked up, but give it a few minutes and it will come back, maybe).

Q: How many times should we run the simulation?

A: At least thirty (remember the central limit theorem), but how about a nice round 50? Actually, you should run it for hundreds of columns, but I won’t do that to you. Notice I put a counter at the top of each column (C8) so we can keep track of how many simulation columns we have.

Q: So how far out does this spreadsheet run?

A: All the way out to column AZ. That really isn’t all that big, it just takes a while to recalculate.

Q: What do we learn from all this?

A: I’ve put a few results in cells E4:G6, the average value of the retirement fund, the percentage the value was below the future value calculation and the percentage the value was less than 90% of the future value calculation. While I was setting up these results, the numbers kept changing, of course, but the percentage below the future value calculation result hovered around 50%, +/- 10 points. Now all we have to do is decide whether or not we have a valid simulation.

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download