What Can Meta Analysis Tell Us about Hypothetical Bias in Stated Preference Valuation Studies?

John Loomis, Department of Agricultural and Resource Economics, Colorado State University, Fort Collins, CO 80523-1172

September 1, 2010

Abstract

Hypothetical bias arises in stated preference valuation studies (e.g., contingent valuation) when respondents report a willingness to pay (WTP) that exceeds what experiments indicate people would actually pay using their own money. While this bias is not found in all stated preference surveys, hypothetical WTP typically exceeds the actual value by a factor of two to three. Unfortunately, there is no widely accepted general theory of respondent behavior that explains hypothetical bias. Therefore, we review the current hypotheses about the causes of this overstatement of WTP, as some of these lead to recommendations to modify survey designs to mitigate the bias. Other hypotheses reviewed lead to ex-post calibration of WTP using certainty scales or simply using the average overstatement factor (e.g., dividing hypothetical WTP by two). We review two meta-analyses that have been performed to test which features of a stated preference survey or experiment lead to hypothetical bias. Meta-regression analysis (MRA) can be used to calculate calibration factors tailored to the specific valuation circumstances. The resulting 'calibration function' is likely to be a more accurate approach to calibration than simple across-the-board calibrations based on aggregate averages.

1. Importance of the Problem

Valuing private and public dimensions of environmental quality is central to environmental economics and often the key to effective public policy. Examples include setting the level of pollution taxes and estimating natural resource damages from mining and oil spills. In some environmental applications, economists can rely on revealed preference techniques such as travel cost and hedonic methods (Parsons, 2003; Taylor, 2003). For some public goods, however, there is little behavior upon which to base valuation. Protecting endangered species, removing a dam from a river, and preserving a remote natural environment are public goods whose benefits are obtained from just knowing these resources are protected (Krutilla, 1967). As formalized by Freeman (1993), revealed preference methods are of little help when there is no behavioral trail (Larson, 1993). In such cases, stated preference methods are needed, in which a survey is conducted that records a respondent's willingness to pay (WTP). While the economic theory underlying welfare measurement using stated preference methods is well developed (Freeman, 1993; Flores, 2003), there is no widely accepted theory of how people respond to questions about their WTP when payment is hypothetical (Murphy et al., 2005). In particular, it is unclear why people may state a WTP on a survey that differs from what the same people would actually pay in an experiment involving their own money. Such differences are frequently observed and are termed 'hypothetical bias.' Even recently, Mitani and Flores (2010: 3) state, "The underlying causes of hypothetical bias are not yet sufficiently understood, and the theoretical or systematic explanation remains as one of the major questions in the stated preference economic analysis."

There are, however, several competing and plausible hypotheses about how a person may respond when asked how much they would, hypothetically, be willing to pay for a particular public good. These hypotheses have been put forward to help explain the usual (but not universal) finding of an overstatement of hypothetical WTP relative to respondents' actual WTP (List and Gallet, 2001; Murphy et al., 2005).

Each of these hypotheses about how people respond to WTP questions suggests a different approach for countering hypothetical bias. Some suggest ex ante approaches that reduce hypothetical bias through survey design. Other approaches involve ex post recoding or calibration of WTP responses to correct stated WTP for hypothetical bias. Each approach is discussed below.

However, what is missing is a general theory that can explain: (a) the conditions under which hypothetical bias is likely to be found and (b) when bias is expected, the magnitude of the expected bias as a function of the particular survey design characteristics.

Cummings et al. (1986) provide an early attempt to develop such guidelines. They identify four conditions to minimize hypothetical bias: (a) subjects must be familiar with the commodity being valued; (b) subjects must have had prior choice experience with the good (e.g., visiting the area); (c) there must be little uncertainty in the survey's scenario, outcomes, and provision rules; and (d) WTP, not willingness to accept (WTA), must be elicited (Cummings et al., 1986: 104). These qualitative guidelines were consistent with the research literature of the early 1980s, and they have intuitive appeal. In the ensuing 20 years, more specific hypotheses about the causes and cures of hypothetical bias have been offered, but as discussed below, many of Cummings et al.'s (1986) overall recommendations are supported by the last quarter century of stated preference research.

2. Addressing Hypothetical Bias

2.1 Ex-Ante Approaches to Reducing Hypothetical Bias

Three ex-ante survey design approaches have been offered to mitigate hypothetical bias. The first is that of Carson and Groves (2007), who suggest that a hypothetical constructed market or simulated voter referendum must be consequential to the respondent to be potentially incentive compatible (i.e., truth revealing). That is, it must have some potential effect on their future utility, such as higher taxes in exchange for an increased probability that the public good will be supplied. One way to operationalize this design is to cast the survey as an 'advisory' referendum to public decision makers about a realistic policy. Carson and Groves (2007: 190-200) devote much of their paper to discussing why only a binary choice, or a choice structure equivalent to a binary choice, can lead to responses that are incentive compatible for truth revelation. When there are just two choices, the respondent has no incentive not to pick the one that maximizes their utility, as the advisory referendum provision rule implements the alternative with the largest number of votes and this is a one-time vote. However, with three or more alternatives, Gibbard (1973) and Satterthwaite (1975) show that it is not always in the respondent's interest to reveal their most preferred alternative. As Carson and Groves (2007: 199) note, in choice experiments with three or more alternatives, the respondent may want to signal the importance of the lowest price even if this is not their preferred overall bundle of program attributes. Econometrically, this type of behavior is sometimes detected when the estimated indirect utility function violates the Independence of Irrelevant Alternatives (IIA) assumption. Thus, the choice of the preferred alternative is less likely to be consistent when three or more alternatives are present.

The use of a 'cheap talk' script provides a second type of ex ante approach to reduce respondent hypothetical bias (Cummings and Taylor, 1999). A 'cheap talk' script tells the respondent that past surveys have shown that respondents overstate their WTP, and instructs them not to do so. Rather, they are reminded to report what they would actually do if this were, in fact, a real decision involving their own money. Cummings and Taylor (1999) and others (Aadland and Caplan, 2003) have found this approach successful in the lab at obtaining hypothetical WTP equal to actual WTP. Others have had less success, finding that cheap talk is effective only for certain types of respondents (Blumenschein et al., 2008; Aadland and Caplan, 2006; Champ et al., 2009).

A third approach has recently been proposed by Mitani and Flores (2010). They hypothesize that hypothetical bias arises because some surveys do not discuss the public good provision decision rule and the realistic likelihood of payment. This can make the respondent uncertain about whether they will need to pay the full bid amount and about the likelihood of provision of the public good. It is hypothesized that these uncertainties lead to hypothetical bias. This theory implies there will be no hypothetical bias if respondents are explicitly told that the good will be provided based on the results of the survey (with the probability of provision greater than zero) and that the probability that they will have to pay is exactly the same as the probability of the good's provision. Mitani and Flores (2010) confirm this result in an induced valuation experiment where probability pairs of provision and payment are made explicit. "Our results suggest implications for mitigation of hypothetical bias. First, it is essential to induce subjective probabilities so that the probability of payment equals the probability of provision. In the experimental or survey designs, it will be important to control both payment and provision sides in the same way" (Mitani and Flores, 2010: 38).
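The intuition can be seen in a stylized, risk-neutral sketch (ours, not Mitani and Flores' formal model). Let \( \pi_g \) denote the respondent's perceived probability of provision and \( \pi_t \) the perceived probability of actually having to pay. A respondent who equates expected payment with expected benefit states an amount \( s^{*} \) satisfying

\[ \pi_t \, s^{*} = \pi_g \, WTP_{true} \quad \Longrightarrow \quad s^{*} = \frac{\pi_g}{\pi_t}\, WTP_{true}, \]

so the stated amount overstates true WTP whenever \( \pi_t < \pi_g \), and the bias disappears when the two probabilities are equal, which is the condition Mitani and Flores emphasize.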

2.2 Ex-post Approaches to Reducing Hypothetical Bias

Among the ex-post approaches to reducing hypothetical bias, Champ et al. (1997) also view respondent uncertainty as the origin of such bias. In their ex-post approach, a respondent answers the usual dichotomous choice valuation question, and then indicates how certain they are of their answer, usually on a 1-10 scale. Based on the polling literature and several past calibrations of actual and hypothetical WTP, uncertain responses are coded as 'no' in the WTP statistical analyses of the dichotomous choice data (see Champ et al., 1997). Limited existing evidence suggests that recoding as 'no' the responses of those who report a level of certainty less than 'seven' on a 1-10 certainty scale, or less than 'definitely sure' on a narrative scale, yields a hypothetical WTP that matches actual cash WTP reasonably well (Ethier et al., 2000; Morrison and Brown, 2009; Blumenschein et al., 2008). Of course, this ex post empirical fix calls for a more general theory of what determines respondents' reported certainty levels, and why any particular level of certainty appears to yield valid statements of actual WTP. A minimal sketch of the recoding rule is given below.
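The recoding rule itself is mechanical. The sketch below is illustrative only; the variable names, the example data, and the cutoff of seven (one commonly examined value) are ours:

```python
CUTOFF = 7  # certainty levels below this value are treated as 'no'

def recode_response(answered_yes: bool, certainty: int, cutoff: int = CUTOFF) -> bool:
    """Recode a dichotomous choice 'yes' to 'no' when reported certainty is low."""
    return answered_yes and certainty >= cutoff

# Example: three respondents, all of whom said 'yes' at the offered bid amount.
responses = [(True, 9), (True, 5), (True, 7)]
recoded = [recode_response(ans, cert) for ans, cert in responses]
print(recoded)  # [True, False, True] -- the uncertain 'yes' becomes a 'no'
```

The recoded responses then enter the usual dichotomous choice WTP estimation in place of the raw answers.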

A second ex-post approach calibrates (or deflates) stated WTP to bring it down to actual WTP. Fox et al. (1998) propose what they call the CVM-X approach, which calculates the ratio of actual WTP to hypothetical WTP elicited from the same respondents in an experiment administered immediately prior to the CVM WTP question. The calibration ratio from the experiment is then used to deflate the hypothetical WTP estimates obtained from the CVM WTP question. However, Fox et al. (1998: 464) warn that calibration factors developed for the deliverable goods typically used in the lab may not be transferable to many of the public goods needing to be valued.
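In symbols (notation ours), the CVM-X adjustment amounts to

\[ r = \frac{WTP_{actual}^{exp}}{WTP_{hyp}^{exp}}, \qquad WTP_{calibrated} = r \times WTP_{hyp}^{CVM}, \]

where the superscript exp denotes values elicited in the experiment and the deflation factor \( r \) is applied to the hypothetical WTP from the CVM question.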

The National Oceanic and Atmospheric Administration (NOAA), in its draft Natural Resource Damage Assessment regulations, proposed a simple, ad hoc calibration of WTP (NOAA, 1994). NOAA suggested that hypothetical WTP should be divided by two to calibrate it down to actual WTP. The origin and empirical basis of this 50% deflation factor are not clear from the regulations. In contrast, we propose a more 'tailored calibration' approach based on meta-regression analysis (MRA). This alternative approach is in the spirit of Fox et al. and the NOAA regulations. However, the MRA approach may be better tailored to public goods than CVM-X, and, unlike the NOAA regulations, it is grounded in available empirical evidence.

3. An Assessment of the State of the Theory Regarding Hypothetical Bias

As can be seen in this variety of approaches, each is motivated by a different hypothesis about the source of hypothetical bias. These hypotheses are not necessarily incompatible with one another, but they do not arise from a general theory of respondent behavior in which each is a special case. In fact, some of the hypotheses share common elements. For example, Mitani and Flores (2010) and Champ et al. (1997) both hypothesize that a source of hypothetical bias is respondent uncertainty, although each suggests the uncertainty arises from different causes. Regardless of its source, respondent uncertainty might be modeled in an expected utility framework[1]. For example, one possibility might be to calculate the respondent's expected WTP as their hypothetical WTP times their perception of the probability of payment, and see if that better matches their actual WTP than their raw stated WTP.
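Formally, this test could be written as (notation ours)

\[ E[WTP] = \hat{\pi}_{pay} \times WTP_{hyp}, \]

where \( \hat{\pi}_{pay} \) is the respondent's self-reported perceived probability of payment; the empirical question is whether \( E[WTP] \) approximates actual cash WTP more closely than the raw \( WTP_{hyp} \) does.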

Of course, the goal is to develop a single broad theory of respondent behavior with respect to hypothetical bias that combines elements of the individual theories, with the individual hypotheses described above falling out as special cases of the broader theory. Unfortunately, at this time the field does not provide the desired unified or general theory of why respondents give stated WTP responses that usually (but not always) exceed their actual WTP as judged by cash payments in lab or field experiments. So what are policy analysts and practicing economists left with? Our proposed empirical approach utilizes meta-analysis equations to synthesize the empirical findings from all these approaches and provide a more systematic understanding of the empirical basis of the sources of hypothetical bias. If certain types of WTP questions are less prone to hypothetical bias, as Carson and Groves would suggest for the dichotomous choice format, this could guide survey design. Meta-analysis also appears promising for arriving at calibration factors that might be applied to rescale hypothetical WTP down to actual WTP, an approach to dealing with hypothetical bias advocated by some.

In the next section of this paper we review two meta-analyses that evaluate hypothetical bias in stated preference surveys and experiments in order to: (a) provide guidance on survey design; and (b) suggest a potential new use for meta-analysis: customizing the calibration factor for the specific types of goods to be valued (public versus private) and valuation questions. Thus, rather than using average or median point estimates of calibration factors from a heterogeneous set of field studies and lab experiments, the analyst could calculate from a well-specified meta-regression equation a calibration factor matched to the specific type of public good, the type of beneficiaries (users versus non-users), and the characteristics of the study site and surrounding population. This proposal is in the spirit of the transition in benefit transfer from point estimates to benefit function transfer using meta-analysis in non-market valuation for benefit-cost analysis.

4. What Meta-Regression Analyses Reveal About Hypothetical Bias and How It Might Be Mitigated

Hypothetical bias is primarily assessed through either lab experiments or field studies. Two past meta-analyses have focused on hypothetical bias as revealed in a mix of experiments and field studies. List and Gallet (2001) investigate 29 cash validity studies, split about equally between lab and field experiments and providing a total of 174 observations, while Murphy et al. (2005) utilize 28 studies with 83 observations. List and Gallet suggest that their analysis provides "important insights" for typical CVM studies, including various elicitation methods such as dichotomous choice, and factors influencing calibration factors between hypothetical and actual WTP (List and Gallet, 2001: 242). The difference in coverage between the two meta-analyses is due to Murphy et al. not including WTA studies in their analysis and a few other minor selection criteria.

With respect to hypothetical bias, List and Gallet (2001) find a mean calibration factor of three (hypothetical payment divided by actual payment). Murphy et al. (2005) report a simple mean ratio of 2.6, but a median of only 1.35. The reduced degree of hypothetical bias in Murphy et al. (2005) is largely a result of excluding WTA studies, which typically exhibit a higher degree of hypothetical bias. Nonetheless, Murphy et al.'s (2005) meta-regression estimate of the fitted mean calibration factor is 3.0. Thus, even though there are some differences in their methods, variables and data, these two meta-analyses find a fairly consistent degree of hypothetical bias.
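Throughout the remainder of the paper, the calibration factor (CF) is simply the ratio

\[ CF = \frac{WTP_{hyp}}{WTP_{actual}}, \]

so a CF of three means hypothetical WTP overstates actual WTP by a factor of three, and a CF of one means no hypothetical bias.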

To illustrate how meta-regression analysis might provide a short-term practical solution to this important issue of hypothetical bias, we turn to List and Gallet's meta-regression equation for the median calibration factor (CF) from each study. Table 1 reproduces Table III of List and Gallet (2001); the variables are defined in Table 1 along with their estimated coefficients, and the fitted equation is restated in symbols immediately below the table.

Table 1. List and Gallet Meta-Analysis Regression of the Absolute Value of the Natural Log of the Calibration Factor on the Independent Variables.

|Variable Name and Definition |Coefficient |
|Constant |1.84 |
|Laboratory = 1 if elicitation of hypothetical and actual values took place in a laboratory setting |-0.28 |
|Willingness to Pay (WTP) as measure of value = 1; Willingness to Accept = 0 |-0.56 |
|Private Good = 1 if hypothetical and actual values asked for a private good; 0 if public good |-0.56 |
|Within Group = 1 if both hypothetical and actual values obtained from the same respondents; 0 if from split samples |-0.06 |
|Open-Ended = 1 if OE elicitation of value used |0.12 |
|First Price Auction = 1 if this form of auction used |-0.93 |
|Provision Point elicitation of value used |0.54 |
|Smith Auction used |0.44 |
|Becker, DeGroot & Marschak (BDM) mechanism used |-0.31 |
|Dichotomous Choice elicitation of value used |-0.20 |
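Restated as an equation (our notation; the specification follows the table's title), the fitted model is

\[ \left| \ln CF_i \right| = \beta_0 + \sum_k \beta_k x_{ik} + \varepsilon_i, \]

where the \( x_{ik} \) are the indicator variables defined in Table 1. A predicted calibration factor for a given survey configuration is then obtained by exponentiating the fitted value: \( \widehat{CF} = \exp(\beta_0 + \sum_k \beta_k x_{ik}) \).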

Common findings of the two meta-analyses are presented and discussed first, followed by the variables unique to each meta-analysis. Direct comparison of the two meta-regressions is difficult because List and Gallet use the calibration factor (hypothetical divided by actual WTP) as their dependent variable, while Murphy et al. use actual value as the dependent variable and include hypothetical value as an independent variable. Thus the two studies can only be compared qualitatively. Below, we go into more detail on List and Gallet, as it better illustrates the potential to use meta-analysis to calculate calibration factors tailored to specific CVM surveys.

• Type of good: Public versus Private. The evidence from these two meta-analyses indicates that using a private good reduces hypothetical bias; i.e., it reduces the calibration factor in List and Gallet, and increases actual value in Murphy et al., which, holding hypothetical value constant, also reduces the calibration factor.

• Value Elicitation Question Format: All of List and Gallet's meta results indicate that using a dichotomous choice question format does not reduce the degree of hypothetical bias and has no statistical effect on the calibration factor. Murphy et al. group together closed-ended approaches such as dichotomous choice with conjoint and payment card (which they label 'Choice'). They find in all four of their meta-analyses that closed-ended value elicitation questions increase actual value, hence reducing hypothetical bias. It is not altogether clear why the two studies obtain different results here, but it may be due to differences in the way dichotomous choice is coded in the two meta-analyses. Empirical resolution of these disparate findings is an important avenue for future research. With respect to commonly used laboratory elicitation methods (e.g., Smith auction, random price auction), List and Gallet find no statistically significant effect on the calibration factor.

• WTP vs WTA: List and Gallet find that use of WTP rather than WTA reduces the calibration factor.

• Sample Pool: Murphy et al. find that use of students in lab experiments significantly reduces actual value, hence increasing hypothetical bias.

• Ex-ante and Ex-post Calibration Efforts: The use of cheap talk and uncertainty recoding has a significant effect on increasing actual WTP in two of Murphy et al.'s four meta-regressions. By increasing actual WTP, ceteris paribus, hypothetical bias is reduced and the calibration factor falls.

At present, due to differences in the two meta-analyses' definitions of the dependent variable and some of the independent variables, it is difficult to compare the magnitude of the effects across the two meta-analyses. We hope authors of future meta-analyses will provide at least one meta-regression specified similarly to one or both of these past models so that a greater degree of comparability can be obtained. This would aid in going beyond qualitative comparisons of signs and significance to include the influence on the magnitude of the calibration factors. As more meta-analyses are completed, the resulting findings may also lead to empirical consensus on the causes of hypothetical bias that can be used to aid survey designers. However, to illustrate our proposed use of a meta-regression equation to calculate calibration factors tailored to the specifics of a new or existing CVM study, we use the List and Gallet meta-analysis.

A Proposed Meta-Analysis Calibration Approach

What is appealing about using a meta-regression analysis (MRA) of calibration factors is that it can easily be employed to estimate a calibration factor for the specific circumstances relevant to the environmental application at hand. To illustrate how MRA could be used to calculate calibration factors tailored to specific types of CVM studies, we use List and Gallet's MRA in Table 1. We recognize that it is premature to treat these as concrete findings, given the insignificance of several variables in the List and Gallet model. To calculate specific calibration factors, we set the values of the independent variables to match a particular configuration of valuation question format and type of good. Table 2 presents the calculated calibration factors, and a worked calculation follows the table. Given the limitations noted above, it may be best to view the results in Table 2 as providing a ranking of calibration factors rather than precise numbers. With this caveat, the calibration factors do vary systematically with obvious characteristics of the valuation survey and the good. For private goods, the calibration factor of hypothetical WTP to actual WTP ranges between 1.68 and 2.32. For WTP for public goods, the calibration factors range from about 3 with dichotomous choice to about 4 with open-ended questions. If we accept the common wisdom that dichotomous choice and willingness to pay represent the 'industry standard' for the non-market valuation of public goods, then a calibration factor of three seems to be about correct, implying that two-thirds of stated-preference values are likely to be hypothetical bias.

Table 2: Illustrative Calibration Factors Calculated from List and Gallet (2001)

|Welfare Measure, Question Format & Type of Good |MRA Calibration Factor |
|WTP, DC*, Private Good |1.68 |
|WTP, OE**, Private Good |2.32 |
|WTP, DC, Public Good |2.95 |
|WTP, OE, Public Good |4.05 |
|WTA, DC, Private Good |2.90 |
|WTA, OE, Private Good |4.06 |
|WTA, DC, Public Good |6.29 |
|WTA, OE, Public Good |7.10 |

* DC = Dichotomous Choice elicitation question format.

** OE = Open-Ended elicitation question format.
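To show how the Table 2 entries follow from Table 1, the short sketch below (ours; any small differences from the published figures reflect rounding of the reported coefficients) exponentiates the fitted log calibration factor for the four WTP rows:

```python
import math

# Selected coefficients from List and Gallet (2001), Table 1 above.
COEF = {
    "constant": 1.84,
    "wtp": -0.56,           # WTP (rather than WTA) elicited
    "private_good": -0.56,  # private (rather than public) good
    "open_ended": 0.12,     # open-ended (OE) question format
    "dichotomous": -0.20,   # dichotomous choice (DC) question format
}

def calibration_factor(wtp: bool, private: bool, question_format: str) -> float:
    """Predicted calibration factor = exp(fitted ln CF) for one survey configuration."""
    ln_cf = COEF["constant"]
    if wtp:
        ln_cf += COEF["wtp"]
    if private:
        ln_cf += COEF["private_good"]
    ln_cf += COEF["open_ended"] if question_format == "OE" else COEF["dichotomous"]
    return math.exp(ln_cf)

# The four WTP rows of Table 2:
print(calibration_factor(True, True, "DC"))    # ~1.68: WTP, DC, private good
print(calibration_factor(True, True, "OE"))    # ~2.32: WTP, OE, private good
print(calibration_factor(True, False, "DC"))   # ~2.95: WTP, DC, public good
print(calibration_factor(True, False, "OE"))   # ~4.05: WTP, OE, public good
```

For example, the 'industry standard' configuration (WTP, dichotomous choice, public good) gives exp(1.84 - 0.56 - 0.20) = exp(1.08), or roughly 2.95, the calibration factor highlighted in the text.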

5. Needed Future Research

There are two major avenues for needed future research. In the near term, more empirical experiments comparing hypothetical and actual values would improve the ability of meta-analysis to: (a) provide better guidance on survey design; and (b) provide calibration factors. In particular, the meta-analyses are quite thin on public goods and on experiments using the provision point mechanism. Another short-term area for future research is empirical implementation, in a field experiment, of Mitani and Flores' (2010) ex ante approach of explicitly presenting respondents with the probabilities of provision and payment. Such an experiment would be quite informative regarding the potential of Mitani and Flores' approach to reduce hypothetical bias in CVM surveys. Related research on respondents' own perceived probabilities of payment and provision in a field experiment would allow exploration of whether an expected utility approach might explain the disparity between hypothetical and actual values.

However, in the long run, nothing replaces a well-developed theory of respondent hypothetical bias. Several possibilities emerge here: (a) the incrementalist approach of drawing on the 'best' features of the existing theories that have been put forward; or (b) an entirely new theory of respondent behavior that might draw from behavioral economics and psychology. There are clearly plenty of research opportunities for theorists and empiricists alike in providing an improved understanding of why there is a disparity between actual and hypothetical WTP and what can be done about it.

References

Aadland, D. and A. Caplan. 2003. Willingness to Pay for Curbside Recycling with Detection and Mitigation of Hypothetical Bias. American Journal of Agricultural Economics 85(2): 491-502.

Aadland, D. and A. Caplan. 2006. Cheap Talk Reconsidered: New Evidence from CVM. Journal of Economic Behavior and Organization 60(4): 562-578.

Blumenschein, K., G. Blomquist, M. Johanneson, N. Horn and P. Freeman. 2008. Eliciting Willingness to Pay without Bias: Evidence from a Field Experiment. Economic Journal 118(1): 114-137.

Brookshire, D., M. Thayer, W. Shulze and R. d’Arge. 1982. Valuing Public Goods: A Comparison of Survey and Hedonic Approaches. American Economic Review 72(1): 165-176.

Carson, R., N. Flores, K. Martin, and J. Wright. 1996. Contingent valuation and revealed preferences methodologies: Comparing the estimates for quasi-public goods. Land Economics 72(1):80-99.

Carson, R. and T. Groves. 2007. Incentive and Informational Properties of Preference Questions. Environmental and Resource Economics 37: 181-210.

Champ, P., R. Bishop, T. Brown and D. McCollum. 1997. Using Donation Mechanisms to Value Nonuse Benefits from Public Goods. Journal of Environmental Economics and Management 33: 151-162.

Champ, P., R. Moore and R. Bishop. 2009. A Comparison of Approaches to Mitigate Hypothetical Bias. Agricultural and Resource Economics Review 38(2): 166-180.

Cummings, R., D. Brookshire and W. Schulze. 1986. Valuing Environmental Goods: An Assessment of the Contingent Valuation Method. Rowman & Allenheld, Totowa NJ.

Cummings, R. and L. Taylor. 1999. Unbiased Value Estimates for Environmental Goods: A Cheap Talk Design for the Contingent Valuation Method. American Economic Review 89: 649-665.

Ethier, R., G. Poe, W. Schulze and J. Clark. 2000. A Comparison of Hypothetical Phone and Mail Contingent Valuation Responses for Green Pricing Electricity Programs. Land Economics 76(1): 54-67.

Fox, J., J. Shogren, D. Hayes and J. Kliebenstein. 1998. CVM-X: Calibrating Contingent Values with Experimental Auction Markets. American Journal of Agricultural Economics 80(3): 455-465.

Freeman, M. 1993. The Measurement of Environmental and Resource Values: Theory and Methods. Resources for the Future, Washington DC.

Gibbard, A. 1973. Manipulation of Voting Schemes: A General Result. Econometrica 41: 587-601.

Krutilla, J. 1967. Conservation Reconsidered. American Economic Review 57: 777-786.

Larson, D. 1993. On Measuring Existence Values. Land Economics 69(4): 377-388.

List, J. and C. Gallet. 2001. What Experimental Protocol Influences Disparities between Actual and Hypothetical Stated Values? Environmental and Resource Economics 20: 241-254.

Mitani, Y. and N. Flores. 2010. Hypothetical Bias Reconsidered: Payment and Provision Uncertainties in a Threshold Provision Mechanism. Paper presented at the World Congress on Environmental and Resource Economics, Montreal, Canada, July 1, 2010.

Morrison, M. and T. Brown. 2009. Testing the Effectiveness of Certainty Scales, Cheap Talk, and Dissonance Minimization in Reducing Hypothetical Bias in Contingent Valuation Studies. Environmental and Resource Economics 44(3): 307-326.

Murphy, J., P. Allen, T. Stevens and D. Weatherhead. 2005. A Meta-Analysis of Hypothetical Bias in Stated Preference Valuation. Environmental and Resource Economics 30: 313-325.

National Oceanic and Atmospheric Administration. 1994. Natural Resource Damage Assessment: Proposed Rules. Federal Register 59: 23098-23111. May 4, 1994.

National Oceanic and Atmospheric Administration. 1996. Natural Resource Damage Assessment: Final Rules. Federal Register 61: 439-510. January 5, 1996.

Parsons, G. 2003. The Travel Cost Model. In P. Champ, K. Boyle and T. Brown, eds. A Primer on Nonmarket Valuation. Kluwer Academic Publishers, The Netherlands.

Satterthwaite, M. 1975. Strategy-Proofness and Arrow's Conditions: Existence and Correspondence Theorems for Voting Procedures and Social Welfare Functions. Journal of Economic Theory 10: 187-217.

Taylor, L. 2003. The Hedonic Method. In P. Champ, K. Boyle and T. Brown, eds. A Primer on Nonmarket Valuation. Kluwer Academic Publishers, The Netherlands.

-----------------------

[1] The author wishes to thank Tom Stanley, the Associate Editor, for his suggestion that the expected utility approach to consumer decision making might provide some insights on the hypothetical versus actual WTP.
