Thursday, October 31, 2013

30-Year Return Probability Simulation


Ahh, the wonder of statistics; a great tool they can be while remaining a complete mystery to many. Now take basic statistical methods and couple them with the power of Excel and what do you get? An interesting simulation! Okay, some might really question my motives for completing this; however they are actually quite simple, I am seeking to improve my modeling and simulation skills within Excel and what a better way than to practice.

I began the process by determine what I wanted to simulate, in this case it was the expected return of five assets over a thirty year holding period. Why five assets over thirty years? In all honesty, they were arbitrary; essentially they served as a simple basis for the creation of my model. In addition, the inputs that found in the download file have also been arbitrarily chosen for illustrative purposes only.  

After getting my basic idea down, I outlined how the simulation would work. I wanted to use expected return and risk as the metrics for each individual investment. Expected return would be measured as a percent change on an annual basis. Risk would be measured as the standard deviation of the expected return. In addition to these two important asset specific measures, each asset would be assigned a weight based on their allocation in the portfolio as a whole. Lastly, to help better represent the expected return on a monetary basis, a starting portfolio balance would be necessary. While a monetary measure was not essential, I felt it would allow a better “feel” for the expected returns as opposed to solely a percentage change being displayed.

The next step would be to create a random expected return for a single year, then repeat this over thirty years and compound the growth. In Excel I used the “=NORM.INV” function with requires the following syntax (probability,mean,standard_dev). The probability portion corresponds to the location on a normal distribution. The mean in this case would be the represented by the expected annual return for each respective asset. Standard_dev would be the assigned risk measure of standard deviation of return for each respective investment. By using the “=RAND()” functions, I was able to generate random numbers between 0 and 1 which represent probabilities. Now would be a great time for a quick Statistics-101 review.

A normal distribution is a commonly occurring continuous probability distribution in which data is equally distributed above and below the mean in a bell curve pattern. The central limit theorem states that under mild conditions the mean of randomly selected independent variables will be approximately normally distributed. For this model we assume the returns of each asset will be normally distributed. This essentially means that 99% of all returns will fall within plus or minus three standard deviations from the mean or expected return. This theory is widely used in practice with what is referred to as a normal distribution table. This table shows the relationship between the number of standard deviations above or below the mean and the cumulative probability of this occurring. For example, there is a 1% chance of experiencing a return that is three standard deviations below the mean. Since cumulative probability is used, a 99% probability would represent three standard deviations above the mean; this reflects that 99% of all occurrences fall below this level.

Back to the simulation. By using the excel functions as described above, a random probability will be drawn for each annual return for each individual asset. The Excel formula then takes this random probability and matches it with the respective location on the cumulative normal distribution table; this will yield the number of standard deviations above or below the mean the random sample will fall. Next, this randomly determined number of standard deviation is then multiplied by the standard deviation of returns for each respective asset and then added to the expected (mean) return to yield a random annual return for each asset.

To help better explain this, let us walk through an example. We will use “asset X” which has an expected return of 10% and a standard deviation of returns of 3%. Following the central limits theorem, 99% of all returns for this asset will fall between plus or minus three standard deviation or in this case 9%. Thus, it can be said with 99% confidence that the expected return will be between 1% and 19% (10-9=1, 10+9=19). Now if our random number generator in Excel returns .24 we will look this number up on a cumulative normal distribution table and find the corresponding standard deviations to be negative 0.0948. Next, we will take this number and multiply it by the standard deviation of 3% to get negative 0.284%. Lastly, we will add negative 0.284% to the expected return of 10% to get a random return of 9.7156%.

Clearly, this would be a dunging task if done with manual calculations, luckily we have Excel. This simulation completed the aforementioned process of determining a random return for each asset each year over thirty years. For each year, the respective return on each individual asset is weighted by its allocation within the portfolio to determine the total expected annual return based on the random returns. Finally, the ending value of the portfolio is determined based on the starting balance multiplied by one plus the annual return. This is completed over thirty years to determine the ending portfolio balance after thirty years of random returns calculated from the expected return of each asset and the respective standard deviations of return.

This process of 150 random returns based on expected return and risk for each respective asset yielded a possible outcome for the portfolio. While this would give us a single possible estimate, I wanted to allow for a deeper analysis of total possible outcomes. To do this, I used Excel’s Data Table function to create 1,000 replications of this process. This causes Excel to create the 150 random variables needed to calculate a possible outcome for a single set of thirty years and record the ending value as one data point then complete the process again to get a second data point, so on and so forth. Clearly, this would not be feasible to complete this process 1,000 times without the use of Excel; a total of 150,000 calculations were made.

After completing the simulation of 1,000 possible ending balances for the portfolio, I then calculated the mean value of the possible ending balanced of the portfolio, this served as the expected ending balance; the standard deviation was also measured off this sample. The compound annual growth rate (CAGR) was calculated by taking the expected ending portfolio balance and dividing by the beginning balance to determine the delta then taking the 30th root (for the number of years of returns being calculated).


Lastly, I wanted to also include a visual illustration of the return distributions using a histogram. I chose to use 201 bins with the expected return being the 101st bin. By using 201 bins, I was able to have 100 bins above and below the expected return. To determine width of each bin, I took the calculated standard deviation of the returns and multiplied it by two then divided it by 100. This allows the histogram to display return that fall within plus or minus two standard deviations of the mean; according to the empirical rule of normal distribution, two standard deviations will represent approximately 95% of all data points. 

-----------------

While this serves as a simple explanation of the process used to complete this simulation, I intend on creating a second post analyzing the results. I have included a link to download the simulation below. Enjoy.

No comments:

Post a Comment