


A Direct Monte Carlo Approach for Bayesian Analysis of the Seemingly Unrelated Regression Model

By Arnold Zellner and Tomohiro Ando[1]

Abstract

Computationally efficient methods for Bayesian analysis of seemingly unrelated regression (SUR) models are described and applied that involve use of a direct Monte Carlo (DMC) approach to calculate Bayesian estimation and prediction results using diffuse or informative priors. This DMC approach is employed to compute Bayesian marginal posterior densities, moments, intervals and other quantities, using data simulated from known models and also using data from an empirical example involving firms' sales. The results obtained by the DMC approach are compared to those yielded by use of a Markov chain Monte Carlo (MCMC) approach. It is concluded from these comparisons that the DMC approach is very worthwhile and applicable to many SUR and other problems.

Keywords: Bayesian multivariate analysis; Bayesian Monte Carlo techniques; MCMC; direct Monte Carlo methods.

1. INTRODUCTION

In many areas of economics and other sciences, the seemingly unrelated regression (SUR) model, introduced by Zellner (1962), is used as a tool to study a wide range of phenomena. Many studies have contributed to the development of estimation, testing, prediction and other inference techniques for the analysis of SUR models, including Zellner (1962, 1963), Gallant (1975), Rocke (1989), Neudecker and Windmeijer (1991), Mandy and Martins-Filho (1993), Kurata (1999), Liu (2002), Ng (2002) and Carroll et al. (2006). Also, the SUR model and inference techniques for analyzing it are described in almost all Bayesian and non-Bayesian textbooks, which provide many references to the literature; see, e.g., Judge et al. (1988), Greene (2002), Geweke (2005), Lancaster (2004), Rossi, McCulloch and Allenby (2005) and other texts. The first analysis of the SUR model appeared in Zellner (1962, 1963), who employed a generalized least squares approach. Later, likelihood and traditional Bayesian approaches were developed, followed by various other inference approaches; see, e.g., the highly accurate likelihood analysis of Fraser et al. (2005), Bayesian analyses including the Bayesian method of moments, van der Merwe and Viljoen (1988), and so on.

One of the most popular approaches for estimating the SUR model in a Bayesian framework involves use of a Markov chain Monte Carlo (MCMC) simulation approach to calculate posterior densities for parameters and Bayesian predictive density functions for future data. A Bayesian estimation approach for the SUR model was first introduced by Zellner (1971), who analytically derived conditional posterior densities for the parameters that were later used in a Gibbs sampling approach to produce draws from the joint posterior density of the parameters. Subsequently, applications of MCMC methodology to the SUR model, under various assumptions, were considered in many studies, e.g., Percy (1992, 1996), Chib and Greenberg (1995) and Smith and Kohn (2000), and in recent Bayesian econometric and statistics texts.

Thanks to past and recent advances in computer technology, Bayesian analyses of the SUR model using MCMC techniques have become feasible and widely employed in applied studies. However, MCMC methods are rather complicated and involve many decisions to be made by users, e.g., the length of the burn-in period, the choice of an appropriate proposal density so that the MCMC algorithm will have a high acceptance rate, the determination of the number of MCMC samples to employ, and how to check not only for convergence but also for convergence to an appropriate result. With respect to all of these topics, much research has been done. For example, with respect to convergence issues, some of the work in this area is reported in the following papers: Geweke (1992), Raftery and Lewis (1992), Heidelberger and Welch (1983), Schruben (1982), Gelman and Rubin (1992), Brooks and Gelman (1997) and Zellner and Min (1995).

In view of these widely recognized problems involved in the use of MCMC techniques in the Bayesian analysis of the SUR model, in this paper we develop a direct Monte Carlo (DMC) approach for Bayesian analysis of the SUR model; see, e.g., Geweke (2005, p. 106) for a brief discussion of this well-known computational approach, which has been available for many years. That is, we show how the DMC approach can be employed efficiently to compute posterior and predictive density functions for the SUR model using diffuse or informative priors and the usual normal likelihood functions. It is shown theoretically and in applications that the DMC approach is easily applicable and provides computational results directly, without many of the concerns associated with the MCMC approaches mentioned above. That is, marginal posterior densities for parameters and functions of parameters, as well as predictive densities for future observations and functions of them, are easily and accurately computed.

To summarize, the main objective of our work is to develop accurate Bayesian inference procedures for the SUR model using a DMC approach. We show that our DMC approach leads to easily computed posterior and predictive densities for parameters and future values of the output variables without the difficulties mentioned above that are associated with use of MCMC methods.

Below, in Section 2, we provide a brief review of the SUR model and several Bayesian estimation procedures. Section 3 presents our SUR model DMC computational procedures for obtaining Bayesian estimation and prediction results. Applications of these results are presented in Section 4, using data generated from known models and sample data pertaining to promotion performance and industrial sales in Japan. For comparative purposes, the properties and results of our DMC SUR Bayesian algorithm are compared to those of the widely used MCMC algorithm for Bayesian analysis of the SUR model. In Section 5, some advantages of the proposed DMC approach are described. Section 6 presents a summary of our conclusions and some thoughts about future research on Bayesian analysis of the SUR model using a DMC approach.

2. PRELIMINARIES: THE SEEMINGLY UNRELATED REGRESSION MODEL

As has been widely appreciated in the literature, the seemingly unrelated regression (SUR) model is useful in analyzing a broad range of problems. The linear SUR model involves a set of regression equations with cross-equation parameter restrictions and correlated error terms having differing variances. Algebraically, the SUR model is given by:

$y_j = X_j \beta_j + \varepsilon_j, \qquad j = 1, \dots, m, \qquad (1)$

with

$E[\varepsilon_j] = 0, \qquad E[\varepsilon_j \varepsilon_k'] = \sigma_{jk} I_n, \qquad j, k = 1, \dots, m.$

Here $y_j$ and $\varepsilon_j$ are $n \times 1$ vectors, $X_j$ is the $n \times k_j$ matrix of rank $k_j$, and $\beta_j$ is a $k_j$-dimensional coefficient vector. As shown in the model (1), the equations of the model have different independent variables and error term variances. Also, the model permits error terms in different equations to be correlated. The model is also referred to as a "generalized multivariate regression model" in Press (1972), and Box and Tiao (1973) called it the "general linear multivariate model." While many assume that the error terms are normally distributed, it is noteworthy that Kowalski et al. (1999) and Ng (2002) considered the SUR model with errors assumed to follow a heavy-tailed distribution.

In matrix form, the SUR model in (1) is expressed as:

$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \Sigma \otimes I_n),$

where $N(\mu, V)$ denotes the normal distribution with mean $\mu$ and covariance matrix $V$, $\otimes$ is the tensor (Kronecker) product, $\Sigma$ is an $m \times m$ matrix with diagonal elements $\sigma_{jj}$ and off-diagonal $(j,k)$th elements $\sigma_{jk}$, $j \neq k$, and $y = (y_1', \dots, y_m')'$, $X = \operatorname{diag}(X_1, \dots, X_m)$, $\beta = (\beta_1', \dots, \beta_m')'$, $\varepsilon = (\varepsilon_1', \dots, \varepsilon_m')'$.

The likelihood function is, with D denoting the given data,

$L(\beta, \Sigma \mid D) \propto |\Sigma|^{-n/2} \exp\left\{-\tfrac{1}{2}\operatorname{tr}(S\Sigma^{-1})\right\},$

where "tr" denotes the trace of a matrix, $|\Sigma|$ is the value of the determinant of $\Sigma$, and the $(j,k)$th element of the $m \times m$ matrix $S$ is $s_{jk} = (y_j - X_j\beta_j)'(y_k - X_k\beta_k)$.

Zellner (1971), Press (1972), Box and Tiao (1973), Percy (1992), and Srivastava and Giles (1987) studied the posterior distributions of parameters in the SUR model. In the absence of prior knowledge, Bayesian analysis with non-informative priors is very common in practice. One of the most widely used non-informative priors, introduced by Jeffreys (1946, 1961), is Jeffreys’s invariant prior:

$p(\beta, \Sigma) \propto |\Sigma|^{-(m+1)/2}, \qquad (2)$

which, as is well known, is proportional to the square root of the determinant of the Fisher information matrix. The joint posterior density function for the parameters is then:

$p(\beta, \Sigma \mid D) \propto |\Sigma|^{-(n+m+1)/2} \exp\left\{-\tfrac{1}{2}\operatorname{tr}(S\Sigma^{-1})\right\}.$

The conditional posteriors $p(\beta \mid \Sigma, D)$ and $p(\Sigma \mid \beta, D)$ are given by:

$\beta \mid \Sigma, D \sim N\left(\hat\beta(\Sigma),\; \left(X'(\Sigma^{-1} \otimes I_n)X\right)^{-1}\right), \qquad \Sigma \mid \beta, D \sim IW(S, n), \qquad (3)$

with

$\hat\beta(\Sigma) = \left(X'(\Sigma^{-1} \otimes I_n)X\right)^{-1} X'(\Sigma^{-1} \otimes I_n)\,y.$

Note that the conditional posterior pdfs of $\beta$ and $\Sigma$ depend upon each other. Currently, one of the most widely used Bayesian estimation methods for the SUR model is the MCMC approach, which is described and applied in many recent Bayesian econometrics and statistics texts and will be employed below in our computed examples.

Maximum likelihood estimates of $\beta$ and $\Sigma$ are obtained by maximizing the likelihood function using appropriate iterative computational techniques, say an "iterative feasible generalized least squares" algorithm. Zellner (1962, 1963) considered the SUR parameter estimation problem from a frequentist point of view. If $\Sigma$ is known, a coefficient estimate can be obtained by applying the well-known generalized least squares (GLS) approach to obtain $\hat\beta(\Sigma)$. In practice, however, $\Sigma$, which appears in the expression for $\hat\beta(\Sigma)$, is usually unknown, and a feasible generalized least squares estimator has been proposed. That is, the ordinary least squares residuals of each equation can be used to estimate $\Sigma$ consistently, and this estimate is inserted in the expression for the GLS coefficient estimate. The maximum likelihood estimates of $\beta$ and $\Sigma$ can be obtained by using the iterative SUR approach, as has been recognized in the literature. When we use a flat prior, the posterior modal values of the coefficients are exactly equal to the maximum likelihood estimates, and it is well known that posterior modal values are optimal in terms of minimizing expected loss relative to a zero-one loss function in the Bayesian approach. Also, when the sample size is large, in general, under well-known conditions (see, e.g., Jeffreys 1961), the posterior density for $\beta$ is normal with mean equal to the maximum likelihood estimate and variance-covariance matrix equal to the inverse of the estimated Fisher information matrix. Thus in large samples, both Bayesian and maximum likelihood inference techniques are generally available. However, in finite samples these large-sample inference methods yield only approximate results that are often not very accurate. Thus, many have recognized a need for exact finite-sample inference results for the SUR and other models.
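To illustrate the feasible GLS step just described, here is a minimal R sketch (ours, for illustration only; the data and the names y1, y2, X1, X2 are simulated and hypothetical). It estimates $\Sigma$ from equation-by-equation OLS residuals and inserts the estimate into the GLS formula; iterating the two steps to convergence yields the iterative ML estimates mentioned above.

## Feasible GLS for a two-equation SUR model (illustrative sketch).
set.seed(1)
n  <- 100
X1 <- cbind(1, runif(n)); X2 <- cbind(1, runif(n))
y1 <- X1 %*% c(3, -2) + rnorm(n, sd = 0.3)   # equation 1
y2 <- X2 %*% c(2, 1)  + rnorm(n, sd = 0.5)   # equation 2

## Step 1: OLS residuals give a consistent estimate of Sigma.
e1 <- residuals(lm(y1 ~ X1 - 1)); e2 <- residuals(lm(y2 ~ X2 - 1))
S  <- crossprod(cbind(e1, e2)) / n           # estimated error covariance

## Step 2: insert the estimate into the GLS formula for the stacked system.
X  <- rbind(cbind(X1, matrix(0, n, 2)), cbind(matrix(0, n, 2), X2))
y  <- c(y1, y2)
Om <- solve(S) %x% diag(n)                   # Sigma^{-1} (x) I_n
beta_fgls <- solve(t(X) %*% Om %*% X, t(X) %*% Om %*% y)
## Repeating Steps 1-2 with the FGLS residuals until convergence gives the
## iterative (ML) estimates.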

Although the MCMC method is now one of the most popular computational approaches for obtaining finite sample inference results, it has several recognized disadvantages. First, the length of the burn-in period is unclear. Second, one has to use an appropriate proposal density so that the MCMC algorithm has a high acceptance rate. Third, there is no universal rule for determining the number of MCMC samples to generate. Last, checking for convergence remains difficult, although many methods have been proposed, e.g., Geweke (1992), Raftery and Lewis (1992), Heidelberger and Welch (1983), Schruben (1982), Gelman and Rubin (1992), Brooks and Gelman (1997) and Zellner and Min (1995). Therefore, there is no guarantee that MCMC algorithms always correctly sample from the desired posterior distributions in a finite run.

The predictive density function of the future observation vector $\tilde y$, given the corresponding future regressor values $\tilde X$ and the data $D$, is

$p(\tilde y \mid \tilde X, D) = \int p(\tilde y \mid \beta, \Sigma, \tilde X)\, p(\beta, \Sigma \mid D)\, d\beta\, d\Sigma, \qquad (4)$

where $\tilde y$ is an $m$-dimensional vector and $p(\tilde y \mid \beta, \Sigma, \tilde X)$ is the conditional density function of $\tilde y$. With regard to prediction, Percy (1992) pointed out the difficulty of evaluating the Bayesian predictive density for the SUR model; in that paper, Gibbs sampling and first-order approximation methods were put forward as solutions to this problem.

As another case, use of the usual normal and inverse Wishart priors for $\beta$ and $\Sigma$, $p(\beta, \Sigma) = p(\beta)p(\Sigma)$ with $\beta \sim N(\beta_0, \Sigma_\beta)$ and $\Sigma \sim IW(S_0, \nu_0)$, leads to the following conditional posteriors $p(\beta \mid \Sigma, D)$ and $p(\Sigma \mid \beta, D)$:

$\beta \mid \Sigma, D \sim N(\bar\beta, \bar V), \qquad \Sigma \mid \beta, D \sim IW(S + S_0,\; n + \nu_0),$

where $IW$ denotes the inverse Wishart distribution, $S$ is the residual cross-product matrix defined above, and

$\bar V = \left(\Sigma_\beta^{-1} + X'(\Sigma^{-1} \otimes I_n)X\right)^{-1}, \qquad \bar\beta = \bar V\left(\Sigma_\beta^{-1}\beta_0 + X'(\Sigma^{-1} \otimes I_n)y\right).$
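As a concrete illustration, one draw from an inverse Wishart conditional of this kind can be made with base R's rWishart; the following sketch reuses the residual objects from the FGLS sketch above, and the hyperparameter values are illustrative assumptions, not taken from the paper.

## One draw from Sigma | beta, D ~ IW(S + S0, n + nu0) (a hedged sketch).
draw_iw <- function(df, scale) {
  ## If W ~ Wishart(df, scale^{-1}), then W^{-1} ~ inverse Wishart(df, scale).
  solve(rWishart(1, df, solve(scale))[, , 1])
}
nu0 <- 5; S0 <- diag(2)            # illustrative prior hyperparameters
S   <- crossprod(cbind(e1, e2))    # residual cross-products (in a full sampler,
                                   # evaluated at the current draw of beta)
Sigma_draw <- draw_iw(n + nu0, S + S0)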

In this case too, we can use the MCMC approach, but, as already pointed out, its use involves some practical difficulties. One simulation approach that does not suffer from the computational problems of MCMC is the direct Monte Carlo procedure (Zellner and Chen 2002). This procedure can be applied to AR models, simultaneous equation models and so on. In the next section, we develop a new Bayesian inference approach for the SUR model based on the direct Monte Carlo procedure.

3. BAYESIAN INFERENCE FOR THE SUR MODEL USING A SUR DMC PROCEDURE

In this section we develop a SUR DMC computational algorithm that permits a complete Bayesian analysis of the SUR model. The SUR DMC approach provides posterior densities for parameters and functions of parameters, posterior moments and intervals, and predictive densities, moments and intervals.

3.1. Transformed model

In the previous section, the SUR model was briefly reviewed. The SUR model is a set of $m$ regression equations which may seem unrelated, but in which the error terms of different equations are assumed correlated, as is often the case in practice (see Zellner and Chen 2002). As is well known, when the error terms are correlated, non-Bayesian "optimal" estimates, say generalized least squares or maximum likelihood estimates of the coefficients $\beta$, depend on the elements of the error term covariance matrix $\Sigma$. Similarly, in Bayesian approaches to the analysis of the SUR model, marginal posterior densities for the regression coefficients are difficult to derive analytically, and thus numerical methods are needed to obtain marginal posterior densities and good estimates of the regression parameters. In what follows, we demonstrate how a DMC approach can be employed to compute posterior marginal densities, moments and intervals for the parameters of the SUR model, as well as predictive densities for future observations.

In order to produce a direct Monte Carlo procedure, we reformulate the standard seemingly unrelated regression model (1) as follows:

$y_j = X_j\beta_j + \sum_{k=1}^{j-1}\gamma_{jk}\,u_k + u_j, \qquad j = 1, \dots, m, \qquad (5)$

with

$u_j \sim N(0, \sigma_j^2 I_n), \qquad u_1, \dots, u_m \text{ mutually independent},$

which is a direct re-parameterization of (1). Zellner et al. (1988) and Zellner and Chen (2002) used this transformation in the context of simultaneous equation modeling. A similar transformation was considered in Fraser et al. (2005). Note that the diagonal elements of $\Sigma$ and of $\operatorname{diag}(\sigma_1^2, \dots, \sigma_m^2)$ are different.

$y_j = W_j\delta_j + u_j, \qquad W_j = (X_j, u_1, \dots, u_{j-1}), \quad \delta_j = (\beta_j', \gamma_{j1}, \dots, \gamma_{j,j-1})', \qquad (6)$

where the $n \times (k_j + j - 1)$ matrices $W_j$ depend on the coefficients of the preceding equations.

We emphasize that one can transform in a one-to-one way from the parameters of the transformed model in (6) back to the parameters of the original formulation in equation (1). As shown in the following section, there is a one-to-one relation between the SUR model (1) and the transformed model (6). It should also be mentioned that this transformation dramatically simplifies the estimation problem.

The likelihood function of the parameters $\theta = (\delta_1, \dots, \delta_m, \sigma_1^2, \dots, \sigma_m^2)$ is

$L(\theta \mid D) \propto \prod_{j=1}^{m}(\sigma_j^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma_j^2}(y_j - W_j\delta_j)'(y_j - W_j\delta_j)\right\}.$

In contrast to the standard model in (1), the likelihood function factors equation by equation thanks to the mutual independence of $u_1, \dots, u_m$.

3.2. Prior and posterior analysis

Exact posterior densities for the parameters can be calculated in the Bayesian approach by using the above likelihood function and a diffuse prior for the parameters. Taking the prior density to be proportional to the square root of the determinant of the Fisher information matrix of model (6), that is, Jeffreys's (1946) diffuse prior, we obtain the following diffuse prior for the parameters of the model specified in (6):

[pic] (7)

The joint posterior density of the parameters is then:

[pic]

which is equivalent to the conditional normal and inverse-gamma posteriors

[pic] (8)

[pic] (9)

for [pic], where [pic] denotes the inverse gamma distribution, and

[pic]

Here again [pic] is the dimension of [pic].

The direct Monte Carlo procedure yields draws from the joint posterior density in the following way. We draw [pic] from the inverse gamma density [pic], insert the drawn value in [pic], and make a draw from it. The drawn value of [pic] is then inserted in [pic], and a draw of [pic] is made from it. Putting the draws [pic] and [pic] into [pic], we then draw [pic]. In the same way, this procedure is applied for [pic]. The whole procedure is then repeated many times. The algorithm is summarized as follows:

A direct Monte Carlo (DMC) sampling procedure:

1. (Initialization) Fix the order of the set of [pic] equations. Set the number of samples [pic] to be generated, and set [pic].

2. Generate [pic], [pic], and insert the drawn values in [pic]. Then make a draw [pic] from [pic], for [pic].

3. Increase the index [pic] by one ([pic]). Draw [pic] from the conditional inverse gamma density [pic], and then generate [pic] from [pic], for [pic].

4. Repeat Step 3 until [pic].

Thus we obtain the quantities of interest, say a predictive density, a posterior interval and so on. See also Algorithm A in the Supplement, where we describe the DMC algorithm for the Bayesian analysis of the general [pic]-system SUR model in (1) with Jeffreys's invariant prior (2). The following sections explain the details of the DMC algorithm; a small R sketch of the two-equation case follows.
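To make the procedure concrete, here is a minimal R sketch of the two-equation case (reusing the simulated y1, y2, X1, X2 from the sketch in Section 2). It assumes, for illustration, a diffuse prior of the form $p(\delta_1, \delta_2, \sigma_1^2, \sigma_2^2) \propto \sigma_1^{-2}\sigma_2^{-2}$ as a stand-in for prior (7); a different power of $\sigma_j$ in the prior would change only the inverse-gamma shape parameters.

## Direct Monte Carlo for a two-equation SUR model (illustrative sketch).
rinvgamma <- function(shape, rate) 1 / rgamma(1, shape = shape, rate = rate)

draw_nig <- function(y, W) {
  ## Draw (sigma^2, delta) from the normal-inverse-gamma posterior of the
  ## regression y = W delta + u, u ~ N(0, sigma^2 I), under prior 1/sigma^2.
  k <- ncol(W); n <- length(y)
  WtW_inv   <- solve(crossprod(W))
  delta_hat <- WtW_inv %*% crossprod(W, y)
  rss   <- sum((y - W %*% delta_hat)^2)
  s2    <- rinvgamma((n - k) / 2, rss / 2)
  delta <- delta_hat + t(chol(s2 * WtW_inv)) %*% rnorm(k)
  list(s2 = s2, delta = drop(delta))
}

L <- 10000
draws <- vector("list", L)
for (l in 1:L) {
  d1 <- draw_nig(y1, X1)               # Step 2: draw sigma_1^2, then beta_1
  u1 <- y1 - X1 %*% d1$delta           # residual of equation 1
  d2 <- draw_nig(y2, cbind(X2, u1))    # Step 3: sigma_2^2, then (beta_2, gamma_21)
  draws[[l]] <- c(d1$delta, d2$delta, d1$s2, d2$s2)
}

Each pass through the loop is one independent draw from the joint posterior, so no burn-in period, convergence check or proposal density is required.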

3.3. Some Remarks

Remark 1

In order to provide some intuition, we consider the simple case [pic]. The joint posterior density function of the parameters [pic] is

[pic]

Integrating [pic] with respect to [pic] and [pic] yields the marginal posterior

[pic] (10)

where [pic].

Following Zellner et al. (1988), we further investigate the properties of the marginal posterior (10).

By making use of the properties of multivariate Student-t density functions (Zellner 1971), one can re-express (10) as

[pic]

where

[pic]

with [pic], where [pic] is the normalizing constant of the multivariate Student-t density function (see Zellner 1971, Appendix B), and

[pic]

Therefore, the conditional posterior density for [pic], given [pic] and [pic], is in the form of a [pic]-variate Student-t probability density function with [pic] degrees of freedom, mean [pic], and covariance matrix [pic]. Noting that the matrix [pic] contains [pic], both moments depend on [pic].

The marginal density [pic] is given as

[pic] (11)

where [pic], and [pic] is the normalizing constant of the [pic]-variate Student-t probability density function with [pic] degrees of freedom, mean [pic], and covariance matrix [pic], with

[pic]

and

[pic]

Note that the marginal posterior density for [pic], [pic], given in (11), is written as [pic] times a [pic]-variate Student-t probability density function. The normalizing constant [pic] is not of a standard, well-known form (Zellner et al. 1988). Zellner et al. (1988) also proposed this method for obtaining the unconditional posterior marginal moments of [pic].

We next investigate the relationship between the original model (1) and the transformed model (6), introduced in the previous section. The parameters in the original model (1) are [pic]. The unknown parameters in the transformed model are [pic]. The relationship between [pic] in [pic] and [pic] is [pic]. The parameter [pic] can be expressed as [pic]. Therefore, we can make inferences about the parameters in [pic] based on the samples [pic] generated by the direct Monte Carlo sampling procedure. We just use the following relations:

[pic]

Thus the posterior samples of [pic], [pic] can easily be obtained. With regard to the parameter [pic], note that there is a one-to-one relationship between the sub-elements of [pic] and [pic]. Therefore, using this relationship, we can generate the posterior samples of [pic].

In general, the following recursive relations hold between [pic] and [pic]:

[pic] (12)

These equations are essential for implementing the direct Monte Carlo sampling procedure for the Bayesian analysis of the general [pic]-system SUR model in (1) with Jeffreys's invariant prior (2).

For example, consider the calculation of [pic], [pic] and [pic]:

[pic]

As the right-hand sides of these formulae show, the calculation of [pic], [pic] and [pic] depends only on [pic], [pic] and [pic]. Fortunately, the values of [pic], [pic] and [pic] based on [pic] and [pic] are already known. Therefore, we can calculate the elements of [pic] recursively, whatever its dimension. We can conduct the Bayesian analysis of an [pic]-equation SUR model by using equation (12) for any value of m. In the two-equation case, these relations reduce to the simple mapping used in the sketch below.
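For instance, in the two-equation case the triangular structure of (5) implies, in our illustrative notation, $\sigma_{11} = \sigma_1^2$, $\sigma_{21} = \gamma_{21}\sigma_1^2$ and $\sigma_{22} = \sigma_2^2 + \gamma_{21}^2\sigma_1^2$, so the DMC draws stored in the sketch above map directly into draws of the elements of $\Sigma$:

## Map transformed-model draws back to the original covariance elements.
Sigma_draws <- t(sapply(draws, function(d) {
  gamma21 <- d[5]; s2_1 <- d[6]; s2_2 <- d[7]
  c(s11 = s2_1,                      # var(eps_1)
    s12 = gamma21 * s2_1,            # cov(eps_1, eps_2)
    s22 = s2_2 + gamma21^2 * s2_1)   # var(eps_2)
}))
colMeans(Sigma_draws)                # posterior means of sigma_11, sigma_12, sigma_22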

Remark 2

Consider again the case [pic]. Note that, transformed from the space [pic] to [pic], the original Jeffreys prior [pic] in (2) becomes

[pic]

where the Jacobian of the transformation from [pic] to [pic] is given as

[pic]

The prior derived from (2) is not equal to the Jeffreys prior [pic] in (7); that is, [pic].

Consider the general case, where the number of seemingly unrelated equations is [pic]. When we use the original diffuse prior (2) for the parameters, the prior should be transformed from the prior density on [pic] to a prior density on the parameter set [pic]. The relevant part is the transformation from the off-diagonal set of [pic] to [pic], which gives as Jacobian

[pic]

As a consequence, the prior information specified in (2), expressed in terms of [pic], is

[pic] (13)

Using the same arguments as in the previous section, we obtain the following conditional posteriors

[pic] (14)

[pic] (15)

for [pic], with

[pic]

When we make inferences about the parameters of model (1) with Jeffreys's prior (2) using a direct Monte Carlo approach, this is easily done by replacing the conditional posterior densities (8) and (9) with (14) and (15) in the direct Monte Carlo sampling procedure. Algorithm A in the Supplement provides a step-by-step guide to the use of the DMC approach. The next section applies the method developed here to simulated and real data.

4. NUMERICAL RESULTS

In order to assess the performance of the proposed procedure, we present numerical results based on simulated data and real data applications. All calculations were performed on a Pentium 2.0 GHz machine running Microsoft Windows XP and R version 2.5.0. Each of the random draws from the probability density functions (normal, inverse-gamma, inverse-Wishart) is generated using the elementary functions incorporated in R.

Below, we compare the properties of alternative Bayesian estimation procedures for the SUR model, namely the standard MCMC approach and the two DMC methods developed in the last section. The first DMC method is based on use of the transformed SUR model in (6) and Jeffreys's prior in (7); we call it DMC algorithm 1, denoted by DMC1. The second DMC algorithm, denoted by DMC2, is based on use of the SUR model (1) with Jeffreys's prior (2), together with the conditional posterior densities in (14) and (15).

4.1. Simulation results

To compare the properties of the Bayesian model estimation procedures, we simulate data sets from the [pic]-dimensional SUR model. Without loss of generality in the model structure, we set [pic], [pic] in model (1), which gives the simplest SUR model, corresponding to a bivariate response. This model can thus be written as follows:

[pic]

for [pic], where [pic] and [pic] are [pic] vectors, [pic] is an [pic] matrix and [pic] is a [pic]-dimensional vector. The elements of [pic] are given the following values:

[pic]

The covariate matrices [pic], [pic], were generated from a uniform density over the interval [pic]. The coefficient vectors were set to be [pic] and [pic]. This enabled us to generate the simulated response observations. In this simulation we set the number of observations to be [pic].

For the simulated data set, we calculated the posterior densities using two methods. In the MCMC method, the Metropolis-Hastings algorithm is used; the details of this MCMC algorithm are described in Algorithm B in the Supplement. To save computational time, the initial values of the parameters are chosen to be generalized least squares estimates. For the proposal densities, the conditional posterior densities given in (3) were used. Since the conditional posterior of [pic] depends on [pic], we replaced [pic] by the maximum likelihood estimate [pic]. When we generate [pic], [pic] is also replaced by [pic].

In our application, we generated 10,000 posterior samples using the direct Monte Carlo approaches. The total number of Markov chain Monte Carlo iterations was chosen to be 11,000, of which the first 1,000 were discarded. It is necessary to check whether the generated posterior sample is taken from a stationary distribution. We assessed the convergence of the MCMC simulation by calculating the convergence diagnostic (CD) test statistic (Geweke 1992), which measures the equality of the means of the first and last parts of a Markov chain. If the samples are drawn from a stationary distribution, these two means are equal, and the CD test statistic has an asymptotically standard normal distribution. All the results we report in this paper are based on samples that passed Geweke's (1992) convergence test at a significance level of 5% for all parameters. Also, there was no evidence of a lack of convergence based on an examination of trace plots.

Table 1 reports the posterior means, the posterior standard deviations and the 95% posterior intervals. The inefficiency factors (Kim et al. 1998) and the CD test statistics of the MCMC algorithm are also reported. Using the posterior draws for each of the parameters, we calculated the posterior means, the posterior standard deviations and the 95% posterior intervals; the latter are estimated using the 2.5th and 97.5th percentiles of the posterior samples. The inefficiency factor is useful for measuring the efficiency of an MCMC sampling algorithm. It is defined as [pic], where [pic] is the sample autocorrelation at lag [pic] calculated from the sampled draws. We used 1,000 lags in the estimation of the inefficiency factors. It can be seen that the results of the proposed method for estimating the parameters are quite reasonable. For instance, the true model is estimated with reasonable accuracy, and the 95% posterior intervals include the true parameter values. Figures S1 and S2 in the Supplement show the estimated posterior densities of each of the model parameters from the proposed method and from the MCMC algorithm. As shown in our tables and figures below, similar results are obtained using each of the other methods.

Figure S3 in the Supplement shows the estimated predictive density based on DMC1. Because similar results are also obtained from DMC2 and MCMC, we show only the results from DMC1. Using the posterior samples [pic], the predictive density given in (4) can be approximated as

$\hat p(\tilde y \mid \tilde X, D) = \frac{1}{L}\sum_{l=1}^{L} p\left(\tilde y \mid \beta^{(l)}, \Sigma^{(l)}, \tilde X\right).$

The density is evaluated at the points [pic] and [pic]. Because the actual predictive density is not known, we have no exact benchmark; instead, we compare the estimated predictive density with the true sampling density of [pic] given [pic] and [pic]. For easy visual comparison, the results are also presented as contour plots, produced with the R software. The scales are the same for both plots, and the contours join points with equal probability density. As shown in the tables and figures, the estimated predictive density is very close to the true density. A short R sketch of this mixture-of-draws estimate is given below.
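The following hedged R sketch computes this approximation for the two-equation example, reusing the draws from the DMC sketch above; it assumes the mvtnorm package is available for the bivariate normal density, and X1_new, X2_new are illustrative regressor values for the future period.

## Mixture-of-draws (Rao-Blackwellized) estimate of the predictive density (4).
library(mvtnorm)
X1_new <- c(1, 0.5); X2_new <- c(1, 0.5)
pred_dens <- function(y_tilde) {
  mean(sapply(draws, function(d) {
    mu <- c(sum(X1_new * d[1:2]), sum(X2_new * d[3:4]))
    Sg <- matrix(c(d[6],        d[5] * d[6],
                   d[5] * d[6], d[7] + d[5]^2 * d[6]), 2, 2)
    dmvnorm(y_tilde, mean = mu, sigma = Sg)
  }))
}
pred_dens(c(3.0, 2.5))   # density at an illustrative point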

Given a large sample, the posterior density of [pic] can be approximated by a multivariate normal density with mean [pic] and variance [pic], where [pic] and [pic] are the maximum likelihood estimates. To investigate this property, we generated a set of [pic] samples. As shown below, the estimated posterior mode of [pic] and its covariance matrix [pic], based on the posterior sample [pic], are very similar. Compared with the variance matrix [pic], in which the true covariance [pic] is substituted, these two estimates are very close to the true value of [pic].

[pic]

We also repeated the same Monte Carlo experiment 100 times. Table 2 reports the means of the estimated parameters for each method, together with the variation of the computed means over the 100 replications. Based on 10,000 draws, we calculated the posterior means and the corresponding standard deviations. We also conducted the Monte Carlo experiments with [pic] 100 times. As the results in Table 2 show, the MCMC and DMC results are similar.

Table 3 reports the variation of the posterior means over 100 repeated runs. We repeated our calculations 100 times using a single set of Monte Carlo experimental data so as to investigate our procedures' properties in repeated trials on the same data; the aim is to compare the variation and computational time of each method. As shown in Table 3, the variation of the parameter estimates produced by our method is much smaller than that of the estimates produced by the MCMC method. This illustrates one of the main advantages of our method, namely better performance in repeated trials.

Table 4 reports the variation of the computational time for each method. For MCMC, the computational times are measured from the initialization of the parameters to the end of posterior sampling [pic]. The computational times for our method are measured from the first posterior draw to the end of posterior sampling [pic]. The results in Table 4 show that the mean computational times of our DMC methods are smaller than those of the MCMC approach.

When one wants to select the set of variables that contributes to the prediction of [pic], a model selection criterion is usually utilized. As is well known, with diffuse priors there are difficulties in deriving the marginal likelihood that is needed as an input to many model selection techniques. However, selection criteria have recently been developed that avoid this difficulty and can be employed here, e.g., the deviance information criterion (DIC) of Spiegelhalter et al. (2002) and the Bayesian predictive information criterion (BPIC) of Ando (2007). The general form of these model selection criteria is given by:

[pic]

where [pic] is the likelihood function, [pic] is the posterior density, and [pic] is the penalty term for model complexity. The model complexity term of DIC, [pic], is called the effective number of parameters, defined as the difference between the posterior mean of the deviance and the deviance evaluated at the posterior means of the parameters, i.e., [pic], where [pic] are the posterior means. As shown in Ando (2007), when we assume that the prior is dominated by the likelihood as [pic] increases and that the specified parametric models contain, or are close to, the true model, the model complexity measure of BPIC can be approximated by the number of model parameters [pic]. Therefore, as with DIC, we can easily compute the model selection score BPIC, defined as:

[pic]
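The following R lines sketch how DIC and the approximate BPIC can be computed from the DMC draws of the illustrative two-equation example above; the log-likelihood uses the equation-by-equation factorization of the transformed model, and the BPIC penalty is approximated by the number of parameters, as described in the text.

## DIC and (approximate) BPIC from posterior draws (illustrative sketch).
loglik <- function(d) {
  mu1 <- X1 %*% d[1:2]; u1 <- y1 - mu1
  mu2 <- X2 %*% d[3:4] + d[5] * u1
  sum(dnorm(y1, mu1, sqrt(d[6]), log = TRUE)) +
    sum(dnorm(y2, mu2, sqrt(d[7]), log = TRUE))
}
ll   <- sapply(draws, loglik)
thb  <- colMeans(do.call(rbind, draws))   # posterior means of the parameters
pD   <- 2 * (loglik(thb) - mean(ll))      # DIC effective number of parameters
DIC  <- -2 * loglik(thb) + 2 * pD
BPIC <- -2 * mean(ll) + 2 * length(thb)   # penalty approximated by n_theta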

Setting the coefficient vectors in the model to be [pic] and [pic], we conducted 100 repeated Monte Carlo trials. Considering the 9 combinations of variables shown in Table S1, we tried to identify the set of predictors that best explains the response. Table S2 compares the variable selection results; the same settings are used for the DMC algorithm. As shown in this table, the method is also applicable to the variable selection problem. DIC tends to select over-fitted models.

Note also that the proposed DMC sampling scheme is much easier to use because it is not subject to the several drawbacks of the MCMC sampler discussed above. We also point out that the acceptance rates of MCMC were about 51.6% and 89.2% for [pic] and [pic], respectively, not 100%. Also, we carefully selected the starting value, proposal density and burn-in period of the MCMC algorithm by hand. For example, following the suggestion for the MCMC method of Meyer et al. (2003), suppose we set the proposal density for [pic] to be normal with mean equal to the posterior mode and variance equal to [pic], so as to achieve a high acceptance rate. We assessed the convergence of the MCMC simulation by calculating the CD test statistics; the results are given in Table 8. As shown in Table 8, although the convergence check and trace plots were fine, the results differ from those in Table 1: the standard deviations of the elements of [pic] differ from those provided by our DMC method. Table 5 summarizes the comparison of the properties of our DMC approach with those of the MCMC approach. From the results in Table 5 we conclude that our method is a "user friendly" method that provides accurate results within a reasonable time and has strong practical advantages.

4.2. Real data application 1

In 2006, the size of the market for incense products in Japan was estimated to be about 30 billion yen. Although the market has been shrinking gradually (its 2006 size was only 88% of its 1980 size), the business of producing and selling incense products still provides an opportunity to earn a profit. The data analyzed here consist of daily sales figures for incense products from April 2006 to June 2006. The data were collected from two department stores (hereafter Store 1 and Store 2), both located in Tokyo. In both stores, incense manufacturers sell two main products: traditional incense and lifestyle incense. In Japan, traditional incense is used differently from lifestyle incense. In addition to the daily sales data, the following information was tabulated: the weekday and holiday effect [pic], the sales promotion effect [pic], the weather effect [pic], and the event effect [pic]. Definitions of each variable are as follows:

[pic][pic]

[pic]

[pic]

[pic]

for [pic]. Information on other variables, e.g., price levels, price discount percentages, features, displays and post-promotion dips, would also be important. Unfortunately, due to the limitations of the dataset, we employed only the variables mentioned above.

We fitted the SUR model,

[pic]

using our Bayesian SUR DMC approach. Table S3 reports the posterior means, posterior standard deviations and 95% posterior intervals obtained using DMC1. Based on 10,000 draws for each of the parameters, we calculated the posterior means, the posterior standard deviations and the 95% posterior intervals; the latter are estimated using the 2.5th and 97.5th percentiles of the drawn posterior samples.

As shown in Table S3, the event effect [pic] appears to have an impact on demand in both stores. One management strategy to reduce the volatility of total sales might be to organize several events in each time period. Acquisition of additional loyal customers might also contribute to reducing volatility. The estimated coefficient of the weekday effect [pic] for Store 2 indicates that working days have a negative effect on sales. If the company wants to increase the sales of Store 2, the introduction of a new product that attracts working people might be one way to do so. There seems to be a large difference between the coefficients that measure the promotion effects in Store 1 and Store 2. The posterior mean of [pic] is close to zero, while that of [pic] is far from zero. In fact, the 95% posterior interval for [pic] suggests that sales will increase if promotion is used for Store 2. Management could investigate why the promotion effect in Store 1 is relatively small and perhaps introduce policies that would increase it. We can also easily compute the posterior distributions of various functions of the parameters. For example, Figure S4(c) shows the posterior distribution of [pic]. As shown in Figure S4, the 95% posterior interval of [pic] includes 0, which suggests that the promotion effect may be approximately the same for both stores. However, the posterior variance of the effect for Store 1 is larger than that for Store 2. Perhaps some management strategies can be devised to reduce the volatility of the promotion effect in Store 1. We also obtained a positive posterior mean for the covariance parameter [pic], as shown in Table S3. Therefore, the sales figures of the two stores seem to move in the same direction. This brief review of our results for this illustrative problem indicates that our proposed DMC approach can produce very useful information from analyses of empirical data.

4.3. Real data application 2

In this section, we apply our method to forecast the growth rates of real sales of several sectors of the Japanese economy, namely the Agriculture, Automobile and Service sectors, using quarterly data, 1977-2004. The forecasts are one-quarter-ahead point predictions derived from our SUR model incorporating a set of predictor variables. Let [pic] be the growth rates of real sales for the Agriculture, Automobile and Service sectors in the [pic]th quarter and let [pic] be candidate predictors. We consider the following one-step-ahead regression model

[pic]

where the predictors are: [pic] = the first difference of the log of the real stock price index, [pic] = the growth rate of the real monetary base (M2), and [pic] = the first difference of the logarithm of quarterly real GDP, at time [pic]. The Tokyo stock price index is used for the stock return. Information on M2 and GDP is available from the Bank of Japan and the Cabinet Office, respectively. Using quarterly data from 1977 to the first quarter of 1999, [pic] quarters of data, the SUR model was estimated using our DMC1 algorithm. The predictive performance was evaluated by computing one-quarter-ahead forecast errors. The means of the predictive densities for the one-quarter-ahead output growth rates, [pic], which are optimal relative to quadratic predictive loss functions, are used to forecast one-quarter-ahead growth rates for each of the three sectors. These sector growth rate forecasts are then transformed into sales forecasts for each sector as follows: [pic], [pic], where [pic] is the actual output value at time [pic]. This procedure was implemented recursively. Forecasting started in the 2nd quarter of 1999 and ended in the 4th quarter of 2004.

Figure 1 compares the prediction results with the ex-post observed outcomes. Dashed lines in Figure 1 provide empirical values of quarterly real sales growth rates for three important sectors of the Japanese economy, from January 1993 through December 2006. Peaks and troughs in the plots occur at roughly four-quarter intervals. Thin lines in Figure 1 show the ex-post empirical distribution of sales. We computed predictive means of quarterly rates of growth of real sales for each sector and also 95% predictive intervals, based on the 2.5th and 97.5th percentiles. It is seen from Figure 1 that our 95% predictive intervals include the observed outcome data in all cases.

Our approach can easily be adapted to compute not only predictive densities for individual variables (see, e.g., Figure 1(a) for the predictive density of the Automobile sector's sales) but also predictive densities for functions of future values of variables, e.g., the ratio of the sales of the Agriculture and Automobile sectors in future years.

We have also computed the root mean squared errors (RMSEs) and mean absolute errors (MAEs) associated with our sector forecasts, which are given by:

$\mathrm{RMSE} = \left[\frac{1}{T_1 - T_0 + 1}\sum_{t=T_0}^{T_1}\left(y_t - \hat{y}_t\right)^2\right]^{1/2}, \qquad \mathrm{MAE} = \frac{1}{T_1 - T_0 + 1}\sum_{t=T_0}^{T_1}\left|y_t - \hat{y}_t\right|,$

where $y_t$ denotes realized sales, $\hat{y}_t$ its forecast, and $T_0$ and $T_1$ are the starting and ending points of the forecast period, respectively; in this paper, $T_0$ is the 2nd quarter of 1999 and $T_1$ the 4th quarter of 2004.

In Table 6, the RMSEs and MAEs associated with our quarterly forecasts are presented; both are transformed using log10. The results in Table 6 indicate that the sales forecasts for the Agriculture sector are the most accurate of those for the three sectors: the corresponding RMSE and MAE are 5.0991 and 5.0796, respectively. An autoregressive (AR) model was employed as a benchmark for each sector, with the lag length selected by an AIC score, which greatly improved the benchmark's forecast precision relative to a fixed-lag AR model. Even so, the corresponding RMSE and MAE for the AR model in the Agriculture sector, 5.7887 and 5.7836, are larger. As shown in Table 6, the use of the growth rates of real GDP, real M2 and stock returns as "leading indicator" input variables yields better prediction results. Figure 1 also shows that the estimated model forecasts turning points successfully.

5. SOME REMARKS ON DMC vs MCMC

In the previous section, we considered a relatively simple Bayesian SUR estimation problem. Here we demonstrate some clear advantages of our DMC methods developed in Section 3. Because the results from DMC1 and DMC2 described below are the same, we provide only the results from DMC1 and compare them to those provided by the standard Gibbs sampling algorithm. The details of this Gibbs sampling algorithm are described in Algorithm C in the Supplement. To save computational time, the initial values of the parameters are chosen to be feasible generalized least squares estimates.

We simulated a data set from the [pic]-dimensional SUR model used in Section 4.1. In this simulation, we set the number of observations to be [pic]. In this application, we generated 10,000 posterior samples using the direct Monte Carlo approaches. The total number of Gibbs sampling iterations was chosen to be 11,000, of which the first 1,000 were discarded. Note that the number of posterior samples, 10,000, is larger than commonly employed in practice. We assessed the convergence of the MCMC simulation by calculating the convergence diagnostic (CD) test statistic (Geweke 1992). The samples from the Gibbs sampler passed Geweke's (1992) convergence test at a significance level of 5% for all parameters.

Figure 2 plots the autocorrelation functions of the draws of the parameter [pic] obtained from the DMC1 and Gibbs sampling algorithms. The autocorrelation function was calculated using 10,000 samples. We can clearly see that the samples from the Gibbs sampling algorithm are autocorrelated, while those from DMC1 are not. For example, the values of the autocorrelation function of the MCMC output for this parameter at lags 1 and 2 are 0.285 and 0.106, respectively; those from our DMC approach are -0.004 and -0.008. The posterior mean (PM) and standard deviation (SD) of the parameter [pic] from our DMC are PM = 0.176 and SD = 0.309, while those from the Gibbs sampling algorithm are PM = 0.104 and SD = 0.102. Following usual practice, this estimate of the SD does not take account of the autocorrelation; i.e., it is based on an iid assumption. Thus we see that many quantities computed from the MCMC output depend on the autocorrelation properties of that output. In practice, however, it is usually implicitly or explicitly assumed that the MCMC draws are independent when calculating posterior moments of parameters, an erroneous assumption that can lead to erroneous estimates of posterior moments, as in the example above. This is one of the clear advantages of our DMC approach.

We also calculated the inefficiency factor (Kim et al. 1998), [pic], where [pic] is the sample autocorrelation at lag [pic]. If there is no autocorrelation, the inefficiency factor equals 1, which is the theoretical value for our DMC algorithm. Setting 5,000 lags as the maximum, we calculated the inefficiency factor for the parameter [pic]; its value is 2.564, which is much larger than 1. The effective sample size of the Gibbs output is therefore less than half the number of posterior samples, whereas there is no such concern with our DMC approach. These results indicate that the standard Gibbs sampling algorithm for SUR estimation is much less efficient than our DMC algorithm. A short R sketch of this computation is given below.
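The inefficiency factor is easy to compute from any chain of draws; the following sketch uses a truncated, unweighted sum of sample autocorrelations, a simplification of the Kim et al. (1998) estimator.

## Inefficiency factor 1 + 2 * sum_k rho_k, truncated at max_lag.
ineff_factor <- function(chain, max_lag = 1000) {
  rho <- acf(chain, lag.max = max_lag, plot = FALSE)$acf[-1]  # drop lag 0
  1 + 2 * sum(rho)
}
## For the iid DMC output, the factor should be close to its theoretical value of 1:
ineff_factor(sapply(draws, function(d) d[5]))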

Another important advantage of our DMC approach is that, because it decomposes the joint conditional density of the coefficient vector [pic] into a set of [pic]-dimensional conditional densities, say [pic], …, [pic], we can avoid large-scale matrix calculations. It is well known that computing the inverse of a large matrix is a very computationally intensive task. For example, on a PC with 1 GB of memory, R version 1.7 does not allow us to use the Gibbs sampling algorithm to analyze the following SUR model:

[pic]

where the number of equations is [pic], [pic] and [pic] are [pic] vectors (the number of observations is [pic]), the [pic] are [pic] matrices ([pic]), and [pic] is a 40-dimensional vector. This is because the PC system cannot hold the 40,000 × 4,000-dimensional design matrix [pic] that the standard Gibbs sampling algorithm requires. Although the standard Gibbs sampler for the SUR model looks very simple, we cannot make inferences about the SUR model under this setting. The Metropolis-Hastings algorithm used in our paper also cannot be implemented under this setting, and the standard maximum likelihood method for estimating SUR models cannot be implemented either, since the iterative maximum likelihood estimation technique also needs the 40,000 × 4,000-dimensional design matrix; instead, one has to use numerical optimization methods, e.g., grid search techniques. Such problems occur in many situations, i.e., for various combinations of [pic], creating situations in which the implementation of Gibbs sampling, the Metropolis-Hastings algorithm and iterative maximum likelihood estimation for the SUR model becomes infeasible.

In summary, in the above situation and many other similar situations, we cannot use Gibbs sampling, Metropolis-Hastings or iterative maximum likelihood techniques, but we can easily use our DMC approach to produce estimation, prediction and other results.

6. CONCLUSION

In a Bayesian modeling framework, a computationally efficient method for applying Bayesian inference techniques in analyses of seemingly unrelated regression models has been developed, applied and compared to a widely employed alternative computational approach, namely the MCMC approach. In particular, we developed a direct Monte Carlo (DMC) approach to compute various quantities of interest, e.g., posterior densities for parameters and associated intervals, regions and moments, as well as predictive densities for future observations and associated quantities, e.g., moments and intervals. In comparisons with the widely used MCMC approach for numerical analysis of the Bayesian SUR model, our DMC approach was shown to avoid several of the drawbacks of the MCMC approach and to perform well in Monte Carlo experiments and in applications using actual data. Thus we can recommend our DMC approach for the analysis and use of the SUR model.

As regards future work, there are many fruitful directions. First, we assumed a standard linear regression relationship in each equation of the SUR system. The DMC approach that we have developed can clearly be applied to more complex variants of the SUR model, for example those involving regression splines, B-splines, kernel bases, wavelet bases and so on, and it is straightforward to apply our DMC method in Bayesian analyses of these SUR model variants. In addition, it is probable that a DMC approach will be helpful in computing Bayesian posterior odds for alternative hypotheses relating to the parameters of SUR models. Further, our method can be applied to the widely used simultaneous equation model; see Zellner and Chen (2002, pp. 681-683 and pp. 695-696) for some DMC procedures for computing posterior densities of the parameters of a structural equation and predictive densities. As is well known, posterior distributions of structural parameters and predictive densities for future values of variables generated by simultaneous equation models are in most cases not analytically tractable. Therefore, the development of efficient and accurate DMC procedures for Bayesian analysis of structural equation systems is very important, and extensions of the results in Zellner and Chen (2002) will be a topic of our future research.

References

[1] T. Ando, Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models, Biometrika 94 (2007) 443-458.

[2] G.E.P. Box, G.C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, MA, 1973.

[3] S.P. Brooks, A. Gelman, General methods for monitoring convergence of iterative simulations, Journal of Computational and Graphical Statistics 7 (1997) 434-455.

[4] B. Carlin, T. Louis, Bayes and Empirical Bayes Methods for Data Analysis, Chapman and Hall, New York, 1996.

[5] R.J. Carroll, D. Midthune, L.S. Freedman, V. Kipnis, Seemingly unrelated measurement error models, with application to nutritional epidemiology, Biometrics 62 (2006) 75-84.

[6] S. Chib, E. Greenberg, Hierarchical analysis of SUR models with extensions to correlated serial errors and time-varying parameter models, Journal of Econometrics 68 (1995) 339-360.

[7] D.A.S. Fraser, M. Rekkas, A. Wong, Highly accurate likelihood analysis for the seemingly unrelated regression problem, Journal of Econometrics 127 (2005) 17-33.

[8] R. Gallant, Seemingly unrelated nonlinear regressions, Journal of Econometrics 3 (1975) 35-50.

[9] A. Gelman, D.B. Rubin, Inference from iterative simulation using multiple sequences, Statistical Science 7 (1992) 457-511.

[10] J. Geweke, Evaluating the accuracy of sampling-based approaches to calculating posterior moments, in: J.M. Bernardo, J.O. Berger, A.P. Dawid, A.F.M. Smith (Eds.), Bayesian Statistics 4, Clarendon Press, Oxford, 1992, pp. 169-193.

[11] J. Geweke, Contemporary Bayesian Econometrics and Statistics, Wiley, New York, 2005.

[12] W.H. Greene, Econometric Analysis, fifth ed., Prentice-Hall, Upper Saddle River, NJ, 2002.

[13] W.R. Gilks, S. Richardson, D.J. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman and Hall, New York, 1996.

[14] P. Heidelberger, P.D. Welch, Simulation run length control in the presence of an initial transient, Operations Research 31 (1983) 1109-1144.

[15] G. Judge, R. Hill, W. Griffiths, H. Lutkepohl, T. Lee, Introduction to the Theory and Practice of Econometrics, Wiley, New York, 1988.

[16] S. Kim, N. Shephard, S. Chib, Stochastic volatility: likelihood inference and comparison with ARCH models, Review of Economic Studies 65 (1998) 361-393.

[17] J. Kowalski, J.R. Mendoza-Blanco, X.M. Tu, L.J. Gleser, On the difference in inference and prediction between the joint and independent t-error models for seemingly unrelated regressions, Communications in Statistics, Part A - Theory and Methods 28 (1999) 2119-2140.

[18] H. Kurata, On the efficiencies of several generalized least squares estimators in a seemingly unrelated regression model and a heteroscedastic model, Journal of Multivariate Analysis 70 (1999) 86-94.

[19] H. Jeffreys, An invariant form for the prior probability in estimation problems, Proceedings of the Royal Society of London, Series A 186 (1946) 453-461.

[20] H. Jeffreys, Theory of Probability, third ed., Oxford University Press, Oxford, 1961.

[21] T. Lancaster, Introduction to Modern Bayesian Econometrics, Blackwell, Oxford, 2004.

[22] A. Liu, Efficient estimation of two seemingly unrelated regression equations, Journal of Multivariate Analysis 82 (2002) 445-456.

[23] D.M. Mandy, C. Martins-Filho, Seemingly unrelated regressions under additive heteroscedasticity: theory and share equation applications, Journal of Econometrics 58 (1993) 315-346.

[24] R. Meyer et al., Stochastic volatility: Bayesian computation using automatic differentiation and the extended Kalman filter, Econometrics Journal 6 (2003) 408-420.

[25] H. Neudecker, F.A.G. Windmeijer, R² in seemingly unrelated regression equations, Statistica Neerlandica 45 (1991) 405-411.

[26] V.M. Ng, Robust Bayesian inference for seemingly unrelated regressions with elliptical errors, Journal of Multivariate Analysis 83 (2002) 409-414.

[27] D.F. Percy, Predictions for seemingly unrelated regressions, Journal of the Royal Statistical Society, Series B 54 (1992) 243-252.

[28] D.F. Percy, Zellner's influence on multivariate linear models, in: D.A. Berry, K.M. Chaloner, J.K. Geweke (Eds.), Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner, Wiley, New York, 1996, pp. 203-214.

[29] S.J. Press, Applied Multivariate Analysis, Holt, Rinehart and Winston, New York, 1972.

[30] A.E. Raftery, S.M. Lewis, One long run with diagnostics: implementation strategies for Markov chain Monte Carlo, Statistical Science 7 (1992) 493-497.

[31] D.M. Rocke, Bootstrap Bartlett adjustment in seemingly unrelated regression, Journal of the American Statistical Association 84 (1989) 598-601.

[32] P.E. Rossi, G. Allenby, R. McCulloch, Bayesian Statistics and Marketing, Wiley, Hoboken, NJ, 2005.

[33] L.W. Schruben, Detecting initialization bias in simulation experiments, Operations Research 30 (1982) 569-590.

[34] M. Smith, R. Kohn, Nonparametric seemingly unrelated regression, Journal of Econometrics 98 (2000) 257-282.

[35] D.J. Spiegelhalter, N.G. Best, B.P. Carlin, A. van der Linde, Bayesian measures of model complexity and fit (with discussion), Journal of the Royal Statistical Society, Series B 64 (2002) 583-639.

[36] V.K. Srivastava, D.E.A. Giles, Seemingly Unrelated Regression Equations Models, Dekker, New York, 1987.

[37] L. Tierney, Markov chains for exploring posterior distributions (with discussion), Annals of Statistics 22 (1994) 1701-1762.

[38] A. van der Merwe, C. Viljoen, Bayesian analysis of the seemingly unrelated regression model, Manuscript, Department of Mathematical Statistics, University of the Free State, 1988.

[39] A. Zellner, An efficient method of estimating seemingly unrelated regression equations and tests for aggregation bias, Journal of the American Statistical Association 57 (1962) 348-368.

[40] A. Zellner, Estimators for seemingly unrelated regression equations: some exact finite sample results, Journal of the American Statistical Association 58 (1963) 977-992.

[41] A. Zellner, An Introduction to Bayesian Inference in Econometrics, Wiley, New York, 1971.

[42] A. Zellner, L. Bauwens, H.K. Van Dijk, Bayesian specification analysis and estimation of simultaneous equation models using Monte Carlo methods, Journal of Econometrics 38 (1988) 39-72.

[43] A. Zellner, B. Chen, Bayesian modeling of economies and data requirements, Macroeconomic Dynamics 5 (2002) 673-700.

[44] A. Zellner, C.K. Min, Gibbs sampler convergence criteria, Journal of the American Statistical Association 90 (1995) 921-927.

Table 1. Simulated data: Summary of the parameter estimates for (a) the proposed algorithm DMC1, which employs the transformed SUR model (6) with the corresponding Jeffreys prior (7); (b) the proposed algorithm DMC2, which estimates model (1) with Jeffreys's prior (2) by a direct Monte Carlo method; and (c) the MCMC algorithm. The posterior means, standard deviations, and 95% posterior intervals (95%PIs) are reported. The 95% posterior intervals are estimated using the 2.5th and 97.5th percentiles of the drawn posterior samples. For MCMC, the inefficiency factors (INEF) and Geweke's (1992) convergence diagnostic test statistics (CD) are also reported.

(a): DMC1

| | Mean | SD | 95%PI |
|[pic] | 3.0581 | 0.0597 | (2.9400, 3.1750) |
|[pic] | -1.9675 | 0.0543 | (-2.0747, -1.8613) |
|[pic] | 1.9887 | 0.0822 | (1.8279, 2.1476) |
|[pic] | 0.9418 | 0.0816 | (0.7784, 1.0994) |
|[pic] | 0.1109 | 0.0162 | (0.0834, 0.1476) |
|[pic] | -0.0345 | 0.0166 | (-0.0689, -0.0036) |
|[pic] | 0.2196 | 0.0323 | (0.1641, 0.2889) |

(b): DMC2

| | Mean | SD | 95%PI |
|[pic] | 3.0580 | 0.0597 | (2.9409, 3.1737) |
|[pic] | -1.9672 | 0.0549 | (-2.0728, -1.8595) |
|[pic] | 1.9899 | 0.0821 | (1.8263, 2.1494) |
|[pic] | 0.9438 | 0.0814 | (0.7818, 1.1011) |
|[pic] | 0.1111 | 0.0161 | (0.0840, 0.1473) |
|[pic] | -0.0347 | 0.0167 | (-0.0695, -0.0038) |
|[pic] | 0.2224 | 0.0328 | (0.1676, 0.2956) |

(c): MCMC

| | Mean | SD | 95%PI | INEF | CD |
|[pic] | 3.0387 | 0.0517 | (2.9377, 3.1418) | 1.4768 | 1.2326 |
|[pic] | -1.9790 | 0.0475 | (-2.0738, -1.8858) | 0.4317 | 0.8065 |
|[pic] | 1.9264 | 0.0711 | (1.7849, 2.0660) | 1.2698 | 1.7746 |
|[pic] | 0.9542 | 0.0704 | (0.8160, 1.0927) | 1.2881 | -1.4312 |
|[pic] | 0.1110 | 0.0160 | (0.0839, 0.1472) | 0.5902 | 1.1095 |
|[pic] | -0.0352 | 0.0165 | (-0.0701, -0.0051) | 0.6408 | 0.0255 |
|[pic] | 0.2176 | 0.0319 | (0.1636, 0.2861) | 0.2162 | 0.4975 |

Table 2. Simulated data: Means and standard deviations (SDs) of the posterior means of each parameter, based on 100 Monte Carlo trials.

|n=50 | DMC1 | | DMC2 | | MCMC | |
| | Mean | SDs | Mean | SDs | Mean | SDs |
|[pic] | 3.0011 | 0.0804 | 3.0010 | 0.0803 | 3.0009 | 0.0801 |
|[pic] | -1.9881 | 0.0782 | -1.9880 | 0.0783 | -1.9878 | 0.0780 |
|[pic] | 1.9945 | 0.1021 | 1.9944 | 0.1021 | 1.9946 | 0.1025 |
|[pic] | 0.9911 | 0.1113 | 0.9910 | 0.1115 | 0.9907 | 0.1116 |
|[pic] | 0.1053 | 0.0215 | 0.1052 | 0.0215 | 0.1054 | 0.0215 |
|[pic] | -0.0507 | 0.0239 | -0.0508 | 0.0239 | -0.0540 | 0.0251 |
|[pic] | 0.2186 | 0.0484 | 0.2166 | 0.0482 | 0.2149 | 0.0478 |

|n=100 | DMC1 | | DMC2 | | MCMC | |
| | Mean | SDs | Mean | SDs | Mean | SDs |
|[pic] | 2.9977 | 0.0484 | 2.9977 | 0.0485 | 2.9972 | 0.0483 |
|[pic] | -2.0045 | 0.0520 | -2.0045 | 0.0521 | -2.0048 | 0.0519 |
|[pic] | 1.9939 | 0.0749 | 1.9940 | 0.0749 | 1.9937 | 0.0748 |
|[pic] | 0.9919 | 0.0704 | 0.9920 | 0.0705 | 0.9918 | 0.0704 |
|[pic] | 0.1067 | 0.0155 | 0.1067 | 0.0155 | 0.1069 | 0.0155 |
|[pic] | -0.0527 | 0.0150 | -0.0526 | 0.0150 | -0.0544 | 0.0155 |
|[pic] | 0.2071 | 0.0253 | 0.2062 | 0.0252 | 0.2056 | 0.0252 |

Table 3. Simulated data: Variation of the posterior means over 100 repeated runs on the same [pic] data set generated from the true model. The aim is to compare the variation of each method.

|n=50 | DMC1 | | DMC2 | | MCMC | |
| | Mean | SDs | Mean | SDs | Mean | SDs |
|[pic] | 2.9886 | 0.00076 | 2.9876 | 0.00066 | 2.9926 | 0.00078 |
|[pic] | -1.9118 | 0.00076 | -1.9113 | 0.00071 | -1.9141 | 0.00079 |
|[pic] | 1.9800 | 0.00085 | 1.9800 | 0.00091 | 1.9810 | 0.00108 |
|[pic] | 0.9057 | 0.00078 | 0.9055 | 0.00071 | 0.9046 | 0.00107 |
|[pic] | 0.1091 | 0.00024 | 0.1092 | 0.00020 | 0.1087 | 0.00030 |
|[pic] | -0.0667 | 0.00022 | -0.0661 | 0.00021 | -0.0673 | 0.00025 |
|[pic] | 0.1934 | 0.00035 | 0.1930 | 0.00037 | 0.1925 | 0.00045 |

|n=100 | DMC1 | | DMC2 | | MCMC | |
| | Mean | SDs | Mean | SDs | Mean | SDs |
|[pic] | 3.0030 | 0.00056 | 3.0033 | 0.00057 | 3.0033 | 0.00067 |
|[pic] | -1.9578 | 0.00052 | -1.9571 | 0.00060 | -1.9573 | 0.00061 |
|[pic] | 2.0045 | 0.00066 | 2.0043 | 0.00064 | 2.0035 | 0.00085 |
|[pic] | 1.0263 | 0.00059 | 1.0264 | 0.00070 | 1.0260 | 0.00093 |
|[pic] | 0.1081 | 0.00014 | 0.1081 | 0.00014 | 0.1084 | 0.00020 |
|[pic] | -0.0352 | 0.00016 | -0.0350 | 0.00016 | -0.0344 | 0.00020 |
|[pic] | 0.1945 | 0.00026 | 0.1926 | 0.00027 | 0.1903 | 0.00033 |

Table 4. Simulated data: Computational times (sec) for each method. The times for DMC1 and DMC2 include generating 1,000 posterior samples from the transformed SUR model and transforming the produced samples into the original SUR model parameters. The time for MCMC includes the initialization of the parameters, a 1,000-iteration burn-in period, and the production of 1,000 posterior samples from the original SUR model.

| | DMC1 | | DMC2 | | MCMC | |
| | Mean | SDs | Mean | SDs | Mean | SDs |
|[pic] | 7.5881 | 0.0704 | 7.5736 | 0.0696 | 14.2223 | 0.2620 |
|[pic] | 12.3236 | 0.1266 | 12.3309 | 0.1214 | 15.1566 | 0.1609 |

Table 5. Comparison of DMC and MCMC

| | DMC | MCMC |
|Need to fix the number of samples drawn | Yes | Yes |
|100% acceptance of draws | Yes | No |
|Doesn't require initial parameter values | Yes | No |
|Doesn't need a burn-in period setting | Yes | No |
|Doesn't need to check for convergence | Yes | No |
|Doesn't need to select convergence check criteria | Yes | No |
|Doesn't require selection of a proposal density | Yes | No |
|Doesn't require use of a proposal density | Yes | Yes/No |

Table 6. Real data application 2: Summary of the RMSEs and MAEs. The SUR model with the DMC1 algorithm is used. For the benchmark, an AR model was employed for each sector. These errors are transformed using log10.

| | RMSE | | MAE | |
| | SUR with DMC | AR | SUR with DMC | AR |
|Automobile | 6.1469 | 7.0990 | 6.1201 | 7.0965 |
|Agriculture | 5.0991 | 5.7887 | 5.0796 | 5.7836 |
|Service | 6.6458 | 7.5656 | 6.6365 | 7.5636 |

[Figure 1: three panels, (a) Automobile, (b) Agriculture, (c) Service]

Figure 1. Estimated predictive densities for the sales of the (a) Automobile, (b) Agriculture and (c) Service sectors. The dashed lines are the 2.5th and 97.5th percentiles of the predictive density. The solid line shows the realized sales. The dotted lines are the quarterly predictive means.

[Figure 2: two panels, (a) MCMC, (b) DMC]

Figure 2. Simulated data: Autocorrelation function of the sampled draws of the parameter [pic] from (a) the MCMC and (b) the DMC approach. The autocorrelation function was calculated using 10,000 samples. The values of the autocorrelation function of MCMC at lags 1 and 2 are 0.285 and 0.106, respectively; those of DMC are -0.004 and -0.008.

-----------------------

[1] Arnold Zellner is H.G.B. Alexander Distinguished Service Professor Emeritus of Economics and Statistics, Graduate School of Business, University of Chicago, Chicago, IL 60637 (E-mail: arnold.zellner@chicagogsb.edu).

Tomohiro Ando is Associate Professor, Graduate School of Business Administration, Keio University, Yokohama, Kanagawa 223-8523, Japan (E-mail: andoh@kbs.keio.ac.jp).
