Econ201- Final Paper



Stochastic Volatility Model:

Bayesian Framework with High Frequency Data

Haolan Cai

Econ201- Spring 2009

Academic honesty pledge that the assignment is in compliance with the Duke Community Standard as expressed on pp. 5-7 of "Academic Integrity at Duke: A Guide for Teachers and Undergraduates."

____________________________________

1 Introduction

1.1 Motivation

Volatility, or the risk of an asset, is an important feature that needs to be well understood for a variety of reasons, such as accounting for risk when assembling a basket of assets and pricing options. While some models, like the basic Black-Scholes option pricing model, assume constant volatility over time, the data suggest that allowing volatility to change over time is much more sensible. Simply eyeballing the time series of any financial asset's returns reveals clustering: periods of large-magnitude returns group together, as do periods of small-magnitude returns.

Stochastic volatility models have been introduced as a way of modeling this changing variance over time. While stochastic volatility models have been around since the 1980s, stochastic volatility models in a Bayesian framework are relatively new and less explored. Thus, I chose a model estimated by Gibbs sampling, with an underlying autoregressive process of order one, to explore patterns in volatility.

The use of high frequency data is also relatively new. There are obvious advantages to using such datasets. For instance, one could use a smaller and more relevant time span of data but have the same number of data points to work with. In particular, using within-day returns to train a stochastic volatility model allows within-day predictions. Using within-day returns, however, poses its own problems in the context of a stochastic volatility model.

1.2 Background

There has been significant work on assessing the validity of stochastic volatility models in the frequentist framework. Among the issues raised by Gallant, Hsieh, & Tauchen (1997) are the need for an asymmetric, thick-tailed distribution for innovations and long-term dependence in the model for volatility. Stochastic volatility models in the Bayesian framework solve some of these issues while creating a few of their own. One such disadvantage is that the likelihood function is not tractable, so more careful estimation is needed.

Jacquier et al. (1994) developed a Bayesian method for estimating a simple, univariate stochastic volatility model. This was extended to use a Gibbs sampler, for simpler and faster implementation, by Carter and Kohn (1994). Aguilar & West (2000) furthered this work with extensions to improve model fit. My work builds on the current literature by applying Aguilar & West's model to a real high frequency data set and assessing the model on two rubrics: in-sample fit and out-of-sample predictive power.

1.3 Data

The data comes from minute-by-minute prices of General Electric stock from 1997 to 2008. Using the last 2 years of data, I computed two-hourly returns on which to assess in-sample fit, while reserving the last month of data for out-of-sample forecasting evaluation. This gives exactly 2000 time points on which to train the model and 60 time points for out-of-sample prediction. Figure 1 shows the two-hourly returns of GE for the entire time period, with vertical lines marking the section used for training the model; the remaining data are used for assessing predictive fit.

2 Methods

2.1 Model

The model I chose is from Aguilar and West (2000). The model transforms the returns so that the volatility can be modeled as an AR process with a latent mean volatility level. The canonical model is as follows:
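
A sketch of the standard specification, consistent with the parameters defined below (here r_t denotes the return at time t, epsilon_t a standard normal shock, x_t the latent AR(1) state, and nu the AR(1) innovation variance; the notation is my own reconstruction, not the paper's original display):

r_t = \exp(\mu + x_t)\,\epsilon_t, \qquad \epsilon_t \sim N(0, 1),
x_t = \phi\, x_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \nu).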

Applying the transformation y_t = log(r_t^2)/2 to the returns gives the following linearized model:
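
A sketch of the implied linear form, again a reconstruction under the assumptions above rather than the paper's own display:

y_t = \mu + x_t + \gamma_t, \qquad \gamma_t = \tfrac{1}{2}\log(\epsilon_t^2) \sim \tfrac{1}{2}\log\chi^2_1.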

The parameters to be estimated are (mu, phi, nu). The model is thus interpreted as a linear combination of a baseline volatility on the log scale and a latent AR(1) process, plus an added error term. Gamma is that error term, distributed as one-half the log of a chi-squared random variable with one degree of freedom.

2.2 Differentiation from HMM

This model differs from the traditional Hidden Markov Model in two respects. The first, a relatively minor detail, is the baseline volatility estimated by mu. Second, the gamma error replaces the usual Gaussian observational noise. The one-half log chi-squared distribution with one degree of freedom is non-Gaussian and left-skewed, and it can be approximated with a discrete mixture of normal distributions with known parameters, i.e.,
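
One plausible way to write this approximation, using the weights, means, and standard deviations described immediately below:

p(\gamma_t) \;\approx\; \sum_{j=1}^{J} q_j \, N(\gamma_t \mid b_j, w_j^2), \qquad J = 7,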

where q_j is the weight of the j-th normal component, with mean b_j and standard deviation w_j.

Estimating the error with a normal distribution (or, in this case, a mixture of normals) puts the estimation algorithm into the Gaussian hidden AR model framework. The shape of the error distribution has been well studied and can be very accurately approximated by a mixture of J = 7 normals with the following component moments:

[Table: mixture weights q_j, means b_j, and variances w_j^2 of the seven normal components]

These values are taken from Kim, Shephard, and Chib (1998). Minor changes to the error approximation, such as adjustments in tail size or a finer approximation with more components, can be made without changing the Gibbs sampling structure.

2.3 Sampling Framework

I chose to run this model in a Bayesian framework to allow uncertainty in the parameters to carry through the model. This means running a Gibbs sampling algorithm in which the parameters are iteratively sampled, combining the data and the priors to construct a posterior distribution for each parameter. The point estimate for each parameter is taken as the mode of its posterior distribution. The following standard priors were specified: mu and phi are each given normal priors, and nu is given an inverse gamma prior.
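
In symbols, a typical set of priors of this kind would be the following (the hyperparameter names m_0, M_0, c, C, a, and b are my own notation, introduced only for the sketches below):

\mu \sim N(m_0, M_0), \qquad \phi \sim N(c, C), \qquad \nu \sim IG(a, b).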

Sampling in this framework gives the following updating procedure (in no particular order) for each of the conditional posterior distributions of the parameters:

1) [pic]

It is easy to see that the gammas are conditionally independent across time, and normalizing over j updates the probabilities q_j at each time point t. This defines a posterior for gamma that can be sampled at each time point to obtain new mixture component indicators.
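
A sketch of this conditional, in the mixture notation above, with j_t denoting the component indicator at time t:

P(j_t = j \mid y_t, x_t, \mu) \;\propto\; q_j \, N\!\left(y_t \mid \mu + x_t + b_j,\; w_j^2\right), \qquad j = 1, \dots, 7.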

2) [pic]

This is the standard posterior for the AR(1) persistence coefficient under the normal prior. It can be sampled directly at each Gibbs iteration to generate new values of phi.
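
As a sketch, under the N(c, C) prior assumed above and conditional on the latent states x_{0:n} and the innovation variance nu, the standard conjugate update is (in practice the draw may also be restricted to the stationary region |phi| < 1):

\phi \mid x_{0:n}, \nu \;\sim\; N(m_\phi, V_\phi), \qquad
V_\phi = \left(\frac{1}{C} + \frac{\sum_{t=1}^{n} x_{t-1}^2}{\nu}\right)^{-1}, \qquad
m_\phi = V_\phi\left(\frac{c}{C} + \frac{\sum_{t=1}^{n} x_{t-1} x_t}{\nu}\right).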

3) [pic]

This is the posterior for the AR(1) innovation variance under the inverse gamma prior. It is also sampled directly at each Gibbs iteration for new values of nu.
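
A sketch of the corresponding update, assuming the IG(a, b) prior above with density proportional to \nu^{-(a+1)} e^{-b/\nu}:

\nu \mid x_{0:n}, \phi \;\sim\; IG\!\left(a + \frac{n}{2},\;\; b + \frac{1}{2}\sum_{t=1}^{n}\left(x_t - \phi\, x_{t-1}\right)^2\right).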

4) [pic]

This is the posterior for mu under the normal prior, based on conditionally normal, independent observed values of y_t from the data. Thus, the posterior is also normal.
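
A sketch under the N(m_0, M_0) prior assumed above: conditional on the states and the mixture indicators j_t, each residual y_t - x_t - b_{j_t} is normal with mean mu and variance w_{j_t}^2, so

\mu \mid \cdot \;\sim\; N(m_\mu, V_\mu), \qquad
V_\mu = \left(\frac{1}{M_0} + \sum_{t} \frac{1}{w_{j_t}^2}\right)^{-1}, \qquad
m_\mu = V_\mu\left(\frac{m_0}{M_0} + \sum_{t} \frac{y_t - x_t - b_{j_t}}{w_{j_t}^2}\right).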

5) [pic]

Under the above conditioning, the model is a linear, conditionally normal AR(1) HMM. The error variances of the distribution for y_t are known and differ over time (each is one of the 7 normals in the mixture that approximates the log chi-squared distribution). To construct a posterior distribution for the latent states, a forward-filtering, backward-sampling (FFBS) algorithm is used.

Following this sampling algorithm iteratively gives the Gibbs sampling process for this SV model.

2.4 Forward-Filtering Backward-Sampling

Since the error variance varies over time, a Kalman-filter-style forward filtering pass, from t = 0 to t = n, followed by backward sampling of the states x_n, x_{n-1}, ..., x_0, is needed to generate the full sample x_{0:n}. It is possible (and simpler) to use a Gibbs sampler on each of the complete conditionals to generate the latent x_t values one at a time instead. However, that technique tends to be less effective in practice, especially where AR dependence is high (phi close to 1), as is typical of financial data. Due to the high degree of dependence between successive iterations, such a Gibbs sampler moves around the state space very slowly and thus also converges very slowly. The FFBS approach samples all x_t variates together, moves through the state space quickly, and generally converges rapidly as well.
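
The following is a minimal numpy sketch of one FFBS draw for the conditionally Gaussian AR(1) state described above. The function name and interface are my own; in particular, it assumes the observations have already had mu and the selected mixture means subtracted, and it places a simple N(m0, C0) prior on the first state.

```python
import numpy as np

def ffbs_ar1(y_star, s2, phi, nu, m0=0.0, C0=1.0, rng=None):
    """Draw latent AR(1) states x_0..x_{T-1} by forward filtering, backward sampling.

    Conditional model (mixture indicators held fixed):
        y_star[t] = x[t] + v[t],         v[t]   ~ N(0, s2[t])  (known, time-varying)
        x[t]      = phi*x[t-1] + eta[t], eta[t] ~ N(0, nu)
    y_star[t] is assumed to be y[t] - mu - b[j_t], so the state enters directly.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(y_star)
    m, C, a, R = (np.empty(T) for _ in range(4))

    # Forward filtering: Kalman recursions with a time-varying observation variance.
    for t in range(T):
        if t == 0:
            a[t], R[t] = m0, C0                  # prior on the first state
        else:
            a[t] = phi * m[t - 1]                # one-step-ahead state mean
            R[t] = phi ** 2 * C[t - 1] + nu      # one-step-ahead state variance
        Q = R[t] + s2[t]                         # predictive variance of y_star[t]
        A = R[t] / Q                             # Kalman gain
        m[t] = a[t] + A * (y_star[t] - a[t])     # filtered mean
        C[t] = R[t] * s2[t] / Q                  # filtered variance

    # Backward sampling: draw x_{T-1}, then each x_t given the draw of x_{t+1}.
    x = np.empty(T)
    x[T - 1] = rng.normal(m[T - 1], np.sqrt(C[T - 1]))
    for t in range(T - 2, -1, -1):
        B = C[t] * phi / R[t + 1]
        h = m[t] + B * (x[t + 1] - a[t + 1])     # conditional mean given x[t+1]
        H = C[t] - B ** 2 * R[t + 1]             # conditional variance
        x[t] = rng.normal(h, np.sqrt(H))
    return x
```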

2.5 Predictive Model

The predictive model was built into the sampling algorithm to allow the uncertainty in the parameter estimates to carry through. For each iteration, the next 60 time steps were simulated using that iteration's sampled parameters. Each step of the AR(1) process is simulated using the following model:

[pic]

[pic]
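
A plausible form of these forecasting equations, consistent with the AR(1) specification above (h indexes forecast steps beyond the last training point n; the notation is my reconstruction):

x_{n+h} = \phi\, x_{n+h-1} + \eta_{n+h}, \qquad \eta_{n+h} \sim N(0, \nu), \qquad h = 1, \dots, 60,

with the forecast log volatility at each step given by \mu + x_{n+h}, the baseline \mu being added afterwards as described below.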

The baseline volatility for each iteration was added after all time step estimates were generated.

2.6 Assessing Fit

Assessment of the model centers on two metrics. The in-sample fit is computed as the coefficient of determination from regressing the absolute value of the realized returns on the volatility estimates produced by the model for that period. While this R2 value is expected to be low, it can still be considered a valid metric, following Andersen & Bollerslev (1998). The out-of-sample predictive value of the model is calculated as the correlation between the average predicted volatility over the 60 forecast steps and the actual realized volatility during that period. This is similar to a metric used in Andersen & Bollerslev (1998).
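
In symbols, a sketch of the two metrics (coefficient names and symbols are my own notation):

|r_t| = \beta_0 + \beta_1 \hat{\sigma}_t + e_t \quad \text{(in-sample fit measured by the } R^2 \text{ of this regression)},
\mathrm{corr}\!\left(\mathrm{RV}_t,\; \bar{\hat{\sigma}}_t\right) \quad \text{(out-of-sample, over the 60 forecast points)},

where \hat{\sigma}_t is the model's estimated volatility, RV_t the realized volatility, and \bar{\hat{\sigma}}_t the predicted volatility averaged over Gibbs iterations.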

2.7 Normalization

As mentioned before, some problems arise when using high frequency data in stochastic volatility models. The intraday volatility smile is well documented in the literature on intraday returns. To control for this within-day pattern, I normalized each return by the average realized volatility for its two-hour window. Figure 2 illustrates the normalized returns over the given time period.
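
As a sketch of this adjustment (my notation: h(t) is the two-hour window of the trading day that observation t falls in, and \bar{s}_{h(t)} that window's average realized volatility):

\tilde{r}_t = \frac{r_t}{\bar{s}_{h(t)}}.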

2.8 Error Term

Initial implementation of the model revealed problems with the model specification, particularly the size of the tails in the error term. Analysis of the conditional posterior probabilities showed the left-most component of the mixture approximation being selected more often than expected. The model was then modified to allow increased tail area by uniformly scaling the mixture variances by a constant greater than one. The optimal scaling factor was found to be close to 1.75.
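
In terms of the mixture notation above, this amounts to a common inflation of the component variances:

w_j^2 \;\longrightarrow\; c\, w_j^2, \qquad c \approx 1.75.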

2.9 Random Noise

In theory, price movements are continuous and there should be no zero returns. In practice, due to events such as missing data and the discreteness of reported prices, we see zero returns from time to time. Failing to account for these zero returns breaks the model, since the log of zero is undefined. To deal with this issue, a small amount of random white noise is added to all prices. The white noise has mean zero and variance 10 x 10^-7. This preserves small (discretely indistinguishable) changes in price without affecting the return structure.
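
A minimal Python sketch of this adjustment, using hypothetical prices and the variance stated above:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical minute-by-minute prices; the repeated values would otherwise
# produce exact zero returns and an undefined log(r_t^2).
prices = np.array([28.50, 28.50, 28.51, 28.49, 28.49])
# Add tiny white noise with mean 0 and variance 10 x 10^-7 so that no exact
# zero returns remain, without visibly changing the return structure.
jittered = prices + rng.normal(0.0, np.sqrt(10e-7), size=prices.shape)
returns = np.diff(np.log(jittered))
```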

3 Results

The stochastic volatility model was estimated using the Gibbs sampling algorithm outlined above. A burn-in of 500 iterations was applied, and 5000 posterior samples were then generated from the model. The posterior distributions for the model parameters, calculated from the 2 years of two-hourly normalized returns, are shown in Figure 3. The corresponding posterior modes are mu = .4535, phi = .9458, and nu = .3290. The high value of phi implies high persistence in volatility, congruent with the volatility clustering observed in the data.

The in-sample fit is shown in Figure 4. The model shows a reasonable approximation of the movements in volatility throughout the time period. The importance of using a time-varying volatility model is clear from this figure, as there are distinct periods of high and low volatility.

The in-sample fit was calculated as the coefficient of determination from the regression |r_t| = beta_0 + beta_1 * sigma_hat_t + e_t, where sigma_hat_t is the estimated volatility from the model. The estimated volatility is taken to be a (biased) estimator of the absolute value of the returns. The stochastic volatility model explains 10.39% of the variation in the absolute value of the returns. This value compares favorably to the range of values reported by Andersen & Bollerslev (1998).

The out-of-sample predictive value was calculated as the correlation between the realized volatility seen in the data and the predicted volatility from the model; the estimated correlation is .0860. Figure 5 shows the true realized volatility on the x-axis and the predicted volatility on the y-axis.

The model predicts a significantly narrower range of volatility than is realized. However, volatility is notoriously difficult to predict. The R2 value for this fit is .0074, which compares favorably to the range reported by Andersen & Bollerslev (1998).

4 Conclusions

The model provides a good in-sample fit for high frequency data that is comparable to other assessments of stochastic volatility models with lower frequency returns in the literature. The model correctly captures the heteroskedastic nature of returns and the high persistence of the volatility structure. The out-of-sample fit for intraday volatility is also comparable to other models in the literature. However, the model predicts much less volatility than is realized in the data.

Further work includes extending the AR process to a higher-order model, allowing longer memory, to better approximate the volatility structure. A higher-order AR process or a mixture of AR(1) processes could greatly enhance the fit and predictive value of this model. Processes that incorporate more than the price movement itself, perhaps additional macroeconomic factors, could also enhance the predictive quality of the model. Further extensions include the development and application of new rubrics for assessing the performance of stochastic volatility models, for greater comparative strength. Finally, work could be done to ascertain the optimal high-frequency sampling intervals for modeling and prediction.

Figures

[pic]

Figure 1: GE returns from 1997 to 2008. Dashed lines indicate the start and end of the time period used for model training; the remaining data are used for predictive comparison.

[pic]

Figure 2: Comparison of the normalized returns and original returns. Note the price structure has not been changed.

[pic][pic][pic]

Figure 3: Posterior distributions of mu, phi, and the innovation variance of the AR process.

[pic]

Figure 4: Plot shows the mean estimated volatility from the model at each time point.

[pic]

Figure 5: Predicted volatility values vs. real volatility values.

References

Aguilar, O. & West, M. (2000). Bayesian dynamic factor models and portfolio allocation. Journal of Business and Economic Statistics 18, 338-357.

Andersen, T.G. & Bollerslev, T. (1998). Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review 39, 885-905.

Carter, C. & Kohn, R. (1994). On Gibbs sampling for state space models. Biometrika 81, 541-553.

Gallant, A.R., Hsieh, D., & Tauchen, G. (1997). Estimation of stochastic volatility models with diagnostics. Journal of Econometrics 81, 159-192.

Jacquier, E., Polson, N.G. & Rossi, P.E. (1994). Bayesian analysis of stochastic volatility models. Journal of Business and Economic Statistics 12, 69-87.

Kim, S., Shephard, N., & Chib, S. (1998). Stochastic volatility: likelihood inference and comparison with ARCH models. Review of Economic Studies 65, 361-393.
