Multiple Imputation for Missing Data via Sequential Regression Trees

Lane F. Burgette and Jerome P. Reiter

Abstract: Multiple imputation is particularly well suited to deal with missing data in large epidemiological studies, since typically these studies support a wide range of analyses by many data users. Some of these analyses may involve complex modeling, including interactions and non-linear relationships. Identifying such relationships and encoding them in imputation models, e.g., in the conditional regressions for multiple imputation via chained equations, can be daunting tasks with large numbers of categorical and continuous variables. We present a non-parametric approach for implementing multiple imputation via chained equations by using sequential regression trees as the conditional models. This has the potential to capture complex relationships with minimal tuning by the data imputer. Using simulations, we demonstrate that the method can result in more plausible imputations, and hence more reliable inferences, in complex settings than the naive application of standard sequential regression imputation techniques. We apply the approach to impute missing values in data on adverse birth outcomes with more than 100 clinical and survey variables. We evaluate the imputations using posterior predictive checks with several epidemiological analyses of interest.

Keywords: Missing data, Sequential imputation, Regression tree, Diagnostic check, Pregnancy outcome

In large epidemiological studies, data collection almost inevitably is plagued by missing data, for example due to item non-response.
One approach for handling missing data in such contexts is multiple imputation (MI) (25). MI is appealing because it allows a team of researchers to address the missing data, after which any number of analyses may be performed using standard complete-data techniques. To carry out MI, the team fills in the missing values with draws from some predictive model m times, resulting in m completed datasets that can be used for the analysis. The analyst computes point and variance estimates of interest with each dataset and combines these estimates using the formulas in (25).
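As an illustration of the combining step, the following Python sketch pools m completed-data estimates using the standard combining rules given by Rubin (25). The function name `pool_rubin` and the example numbers are hypothetical and are not taken from this paper. The degrees-of-freedom expression is the classical large-sample version, which is an assumption about the variant an analyst would choose.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data point estimates and their variance estimates.

    Returns (pooled estimate, total variance, degrees of freedom).
    Hypothetical helper for illustration only.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")

    q_bar = mean(estimates)   # pooled point estimate
    u_bar = mean(variances)   # average within-imputation variance
    b = variance(estimates)   # between-imputation variance (divisor m - 1)

    # Total variance: within-imputation part plus inflated between-imputation part.
    t = u_bar + (1 + 1 / m) * b

    # Classical degrees of freedom for the t reference distribution.
    if b > 0:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    else:
        df = math.inf  # no between-imputation variability observed
    return q_bar, t, df


if __name__ == "__main__":
    # Made-up estimates from m = 3 completed datasets.
    q, t, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(q, t, df)
```

The between-imputation term is what carries the extra uncertainty due to the missing values: if all m completed datasets gave the same estimate, the total variance would reduce to the average complete-data variance.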
These formulas serve to propagate the uncertainty introduced by missing values through analysts' inferences (23). For reviews of MI, see (2, 13, 16, 27-29). A popular approach for implementing MI is sequential regression modeling, also called multiple imputation by chained equations (MICE)

L1llYXI+PFJlY051bT41MDwvUmVjTnVtPjxJRFRleHQ+TXVsdGlwbGUgaW1wdXRhdGlvbiB3aXRo

IGRpYWdub3N0aWNzIChtaSkgaW4gUjogT3BlbmluZyB3aW5kb3duIGludG8gdGhlIGJsYWNrIGJv

eDwvSURUZXh0PjxNREwgUmVmX1R5cGU9IkpvdXJuYWwiPjxSZWZfVHlwZT5Kb3VybmFsPC9SZWZf

VHlwZT48UmVmX0lEPjUwPC9SZWZfSUQ+PFRpdGxlX1ByaW1hcnk+TXVsdGlwbGUgaW1wdXRhdGlv

biB3aXRoIGRpYWdub3N0aWNzIChtaSkgaW4gUjogT3BlbmluZyB3aW5kb3duIGludG8gdGhlIGJs

YWNrIGJveDwvVGl0bGVfUHJpbWFyeT48QXV0aG9yc19QcmltYXJ5PlN1LFkuPC9BdXRob3JzX1By

aW1hcnk+PEF1dGhvcnNfUHJpbWFyeT5HZWxtYW4sQS48L0F1dGhvcnNfUHJpbWFyeT48QXV0aG9y

c19QcmltYXJ5PkhpbGwsSi48L0F1dGhvcnNfUHJpbWFyeT48QXV0aG9yc19QcmltYXJ5Pllhamlt

YSxNLjwvQXV0aG9yc19QcmltYXJ5PjxEYXRlX1ByaW1hcnk+MjAwOTwvRGF0ZV9QcmltYXJ5PjxL

ZXl3b3Jkcz5pbXB1dGF0aW9uPC9LZXl3b3Jkcz48S2V5d29yZHM+bXVsdGlwbGUgaW1wdXRhdGlv

bjwvS2V5d29yZHM+PFJlcHJpbnQ+Tm90IGluIEZpbGU8L1JlcHJpbnQ+PFN0YXJ0X1BhZ2U+MTwv

U3RhcnRfUGFnZT48RW5kX1BhZ2U+Mjc8L0VuZF9QYWdlPjxQZXJpb2RpY2FsPkpvdXJuYWwgb2Yg

U3RhdGlzdGljYWwgU29mdHdhcmU8L1BlcmlvZGljYWw+PFZvbHVtZT4yMDwvVm9sdW1lPjxJc3N1

ZT4yPC9Jc3N1ZT48WlpfSm91cm5hbEZ1bGw+PGYgbmFtZT0iU3lzdGVtIj5Kb3VybmFsIG9mIFN0

YXRpc3RpY2FsIFNvZnR3YXJlPC9mPjwvWlpfSm91cm5hbEZ1bGw+PFpaX1dvcmtmb3JtSUQ+MTwv

WlpfV29ya2Zvcm1JRD48L01ETD48L0NpdGU+PC9SZWZtYW4+

(20, 30, 31). The basic idea is to impute missing values in Y1 from a regression of the observed elements of Y1 on Y2, Y3, etc., impute missing values in Y2 from a regression of Y2 on Y1, Y3, etc., impute missing values in Y3 from a regression of Y3 on Y1, Y2, etc., and so on. It is generally easier to specify these conditional models than a plausible joint distribution of all the data. However, in general, there need not exist a joint distribution that corresponds to the set of specified conditional distributions, so it is possible that this imputation method produces logically inconsistent imputation models (11). Despite this deficiency, the method is widely used because of its flexibility and relative ease of implementation. With MICE, the imputer has to specify conditional models for all variables with missing data.
With dozens or hundreds of variables, as is often the case in large epidemiological studies, specifying these models is no easy task. Relationships among the variables may be interactive and non-linear, and identifying these complexities can be a laborious task with no guarantee of success. Furthermore, often variables have distributions that are not easily captured with standard parametric models. Motivated by these challenges, we present a MICE approach that uses classification and regression trees (CART) (3, 14, 24) as the conditional models for imputation. CART has several features that suggest it can be a useful imputation engine. It is flexible enough to fit interactions, non-linear relationships, and complex distributions without parametric assumptions or data transformations. And, it does so automatically: there is little tuning needed by the imputer. Using simulation studies, we show that the CART imputation engine can result in more reliable inferences compared to naive applications of MICE based on main-effects generalized linear models. We also apply sequential CART to impute missing values in a study of adverse birth outcomes, which includes a wide array of psychological, health, and environmental variables. The study team expects that interactions among the variables in these domains, rather than main effects alone, are likely to be predictors of adverse birth outcomes. Yet, the nature of these interactions is not known a priori. Hence, the imputations of missing data must be flexible enough to capture the most important interactions in the data.
Finally, we check the plausibility of our imputation models using posterior predictive checks (15).

MICE AND CART

Multiple imputation through chained equations

Suppose that we have an n×p data matrix Y arranged so that Y=(YP,YC), where YP is composed of the p1 columns of Y that are partially observed, and YC is composed of the remaining columns that are completely observed. Let Yobs be the set of observed elements in Y and let Ymis be the set of missing elements in Y. Finally, we assume that the columns of YP are arranged such that, moving from left to right, the number of missing elements in each column is non-decreasing.

To implement MICE, the imputer specifies a set of conditional distributions p(Yi | Y-i), where Yi is the ith column of YP, and Y-i is the matrix Y with the column Yi removed.
The imputed values can be produced with a four-step strategy.

1. Fill in initial values for the missing values as follows:
   a. Define a matrix Z equal to YC.
   b. For i=1,…,p1, impute missing values in Yi with draws from the predictive distribution conditional on Z, and append the completed version of Yi to Z prior to incrementing i.
2. For i=1,…,p1, replace the originally missing values of Yi with draws from the predictive distribution conditional on Y-i.
3. Repeat step 2 so as to have performed it l times.
4. Repeat steps 1-3 m times, yielding m imputed sets.

We order the columns of YP to have increasing numbers of missing values so that we build the models in step 1b with as much information as possible. Although one can formally check stochastic convergence with a diagnostic tool such as the scale reduction factor (10), using l=10 typically yields satisfactory results (21). It is standard to use generalized linear models (GLMs) as the basis of the predictive draws in steps 1b and 2, but in this paper we adapt CART for this purpose.

Classification and regression trees

CART models seek to approximate the conditional distribution of a univariate outcome from multiple predictors. The CART algorithm partitions the predictor space so that subsets of units formed by the partitions have relatively homogeneous outcomes. The partitions are found by recursive binary splits of the predictors. The series of splits can be effectively represented by a tree structure, with leaves corresponding to the subsets of units. The values in each leaf represent the conditional distribution of the outcome for units in the data with predictors that satisfy the partitioning criteria that define the leaf.
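To make the idea of a single binary split concrete, the following is a minimal sketch for one continuous predictor and a continuous outcome. It assumes that homogeneity is measured by the within-group sum of squared deviations, which is one common criterion and not necessarily the one used by the authors' software; the function names `sum_sq` and `best_split` and the `min_leaf` argument are illustrative only. A full tree would apply the same search recursively within each resulting subset.

```python
def sum_sq(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y, min_leaf=1):
    """Find the threshold t that best separates y into 'x < t' and 'x >= t'.

    Returns (threshold, cost), where cost is the summed within-group
    sum of squares. Returns (None, sum_sq(y)) if no admissible split
    improves on leaving the units together.
    """
    best_threshold, best_cost = None, sum_sq(y)
    distinct = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi < t]
        right = [yi for xi, yi in zip(x, y) if xi >= t]
        # Hypothetical size control: require a minimum number of units per side.
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        cost = sum_sq(left) + sum_sq(right)
        if cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost
```

The `min_leaf` argument illustrates how a minimum number of observations per leaf limits which splits are considered.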
For further discussion of CART, see (3, 14). An example of a tree structure for a univariate outcome Y and two predictors, X1 and X2, is displayed in Figure 1. Units with X1≥2 fall in the leaf labeled L1, regardless of their value of X2. Units with X1<2 and X2≥0 fall in the leaf labeled L2, and units with X1<2 and X2<0 fall in the leaf labeled L3.
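The leaf assignment just described for the Figure 1 tree can be written as a short function. This is only an illustration of the partition rules stated in the text; the function names and the record layout (tuples of X1, X2, Y) are assumptions made for the example.

```python
def leaf_for(x1, x2):
    """Return the leaf label of the Figure 1 tree for predictors (x1, x2)."""
    if x1 >= 2:
        return "L1"  # X2 is not consulted on this branch
    if x2 >= 0:
        return "L2"
    return "L3"


def outcomes_by_leaf(records):
    """Group observed outcomes by leaf.

    `records` is an iterable of (x1, x2, y) tuples (a hypothetical layout).
    The values collected in a leaf approximate the conditional distribution
    of Y for units whose predictors fall in that leaf.
    """
    leaves = {"L1": [], "L2": [], "L3": []}
    for x1, x2, y in records:
        leaves[leaf_for(x1, x2)].append(y)
    return leaves
```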
Thus, if we wanted to approximate the distribution of Y for units with X1<2 and X2<0, we would use the values of Y in L3. Since CART provides distributions for units defined by various combinations of X, it effectively can result in models with many interaction effects. The primary disadvantages of CART relative to parametric models include decreased efficiency when the parametric models are adequate and discontinuities at partition boundaries (8, 22). Additionally, large trees can be difficult to interpret, but this is not a major concern when using CART for imputations. Categorical predictors with many levels can cause computational difficulties for CART, as it examines all possible partitions of predictors when selecting splits. For example, a categorical predictor with 32 levels, which is the hard-coded maximum number of levels in the “tree” routine for fitting CART in the software package R, results in over 2 billion potential partitions (24).

After growing a tree, it is possible to prune it by removing branches. When using trees as an analytical tool, pruning is desirable because smaller trees are easier to interpret, and they are less prone to over-fitting the data. When using trees as an imputation engine, interpretation is not a primary concern; we primarily seek plausible imputations.
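As a side check on the figure of over 2 billion partitions for a 32-level categorical predictor, the short sketch below counts the ways to divide k unordered levels into two non-empty groups. It assumes that this is what "potential partitions" refers to; the function name is illustrative.

```python
def n_binary_partitions(k):
    """Number of ways to split k unordered categories into two non-empty groups.

    Each level goes left or right (2**k assignments); dividing by 2 removes
    mirror-image duplicates, and subtracting 1 removes the split that leaves
    one side empty.
    """
    if k < 2:
        return 0
    return 2 ** (k - 1) - 1


print(n_binary_partitions(32))  # 2147483647
```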
Furthermore, it is generally advisable to use large imputation models, so as to minimize bias (25). Therefore, we recommend pruning weakly if at all. In our applications of the technique, we do not prune the trees. Rather, we modulate the size of trees by requiring a minimum number of observations in each leaf and by controlling the minimum heterogeneity in the values in the leaf in order to consider it for further splitting.

To implement sequential CART, we use steps 1-4 with CART models. In step 1b we use a CART of each Yi on Z, and in step 2 we use CARTs of each Yi on Y-i. We take draws from the predictive distribution by sampling elements from the leaf that corresponds to the covariate values of the record of interest. Using Figure 1 as an example, for a record with (X1<2, X2<0) and missing Y, we sample a value of Y from L3.
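The control flow of steps 1-4 can be sketched independently of the conditional model. In the sketch below, the conditional model is a pluggable `draw` function; the `hot_deck_draw` stand-in simply resamples observed values and ignores the predictors, so it is a placeholder and not a CART fit. In sequential CART, `draw` would instead fit a tree of the target on the predictors and sample from the matching leaf. The data layout (a dict of columns with `None` marking missing values) and all function names are assumptions made for illustration, and the sketch assumes every column has at least one observed value.

```python
import random


def hot_deck_draw(y_obs, x_obs, x_mis, rng):
    """Placeholder conditional model: ignore predictors, resample observed values."""
    return [rng.choice(y_obs) for _ in x_mis]


def sequential_impute(columns, draw, n_iter=10, rng=None):
    """Steps 1-3: return one completed copy of `columns` (name -> list, None = missing)."""
    rng = rng or random.Random()
    n_rows = len(next(iter(columns.values())))
    missing = {name: [i for i, v in enumerate(col) if v is None]
               for name, col in columns.items()}
    data = {name: list(col) for name, col in columns.items()}
    # Partially observed columns, fewest missing values first.
    partial = sorted((name for name in columns if missing[name]),
                     key=lambda name: len(missing[name]))

    def impute(target, predictors):
        mis_idx = missing[target]
        mis_set = set(mis_idx)
        obs_idx = [i for i in range(n_rows) if i not in mis_set]

        def rows(idx):
            return [tuple(data[p][i] for p in predictors) for i in idx]

        draws = draw([data[target][i] for i in obs_idx],
                     rows(obs_idx), rows(mis_idx), rng)
        for i, value in zip(mis_idx, draws):
            data[target][i] = value

    # Step 1: Z starts as the complete columns and grows as columns are filled.
    z = [name for name in columns if not missing[name]]
    for name in partial:
        impute(name, z)
        z.append(name)
    # Steps 2-3: n_iter sweeps, each column conditional on all the others.
    for _ in range(n_iter):
        for name in partial:
            impute(name, [p for p in columns if p != name])
    return data


def multiply_impute(columns, draw, m=5, n_iter=10, seed=0):
    """Step 4: repeat steps 1-3 m times, yielding m completed data sets."""
    rng = random.Random(seed)
    return [sequential_impute(columns, draw, n_iter, rng) for _ in range(m)]
```

Only the originally missing entries are ever overwritten, so observed values are identical across the m completed data sets.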
In order to reflect uncertainty about the population distributions in the leaves, we actually perform a Bayesian bootstrap (26) within each leaf before sampling. For continuously-valued variables, it is also possible to draw predictions from a smoothed distributional estimator. CART models can be used with continuous and categorical variables, both as dependent and independent variables. Users must specify nominal variables to ensure that they are not treated as continuous. Because CART imputations come from the observed values, certain restrictions, e.g., variables that must be between zero and one or that must be positive, are automatically enforced.
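The Bayesian bootstrap step within a leaf can be sketched as follows. The sketch assumes the standard construction in which n-1 sorted uniform draws cut the unit interval into n gaps, and the gap lengths serve as sampling probabilities for the n observed values in the leaf; how the authors' implementation carries this out in detail is not stated here, and the function name is illustrative.

```python
import random


def bayesian_bootstrap_draws(leaf_values, n_draws, rng=None):
    """Draw imputations from a leaf after a Bayesian bootstrap of its values."""
    if not leaf_values:
        raise ValueError("leaf must contain at least one observed value")
    rng = rng or random.Random()
    n = len(leaf_values)
    # n - 1 sorted uniforms cut [0, 1] into n gaps; the gap lengths are
    # random probabilities attached to the n observed values.
    cuts = sorted(rng.random() for _ in range(n - 1))
    edges = [0.0] + cuts + [1.0]
    weights = [b - a for a, b in zip(edges, edges[1:])]
    # Sample with replacement from the observed values using those weights,
    # so every imputation is a value that was actually observed in the leaf.
    return rng.choices(leaf_values, weights=weights, k=n_draws)
```

Because the weights are redrawn on each call, repeated imputations reflect uncertainty about the leaf's distribution rather than treating the observed frequencies as fixed.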
Skip patterns can be handled in ways akin to those for existing multiple imputation packages like IVEware (21).

CART has been suggested previously as the basis of imputation algorithms, somewhat outside of the standard MICE framework. It has been called “an ideal choice for this imputation ‘engine’” (14, p.333). Rather than filling in initial values and using l>1 iterations, these authors suggest using surrogate splits to deal with the issue of missing values in more than one column.
This was implemented by (7). Others have used trees as an imputation engine, but only to obtain a single imputation and without the multiple iterations (i.e., l>1) of typical MICE algorithms (6).
Our approach is most like that of Reiter (22), which uses a sequential CART approach to generate replacement values for observed confidential data.

APPLICATION TO SIMULATED DATA

To assess the performance of a CART-based MICE algorithm, we use simulation studies to compare it to a naive application of the GLM-based “mi” package in R (9, 30). The data-generating model is

$$y_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \beta_3 x_{3,i} + \beta_4 x_{8,i} + \beta_5 x_{9,i} + \beta_6 x_{3,i}^2 + \beta_7 x_{1,i} x_{2,i} + \beta_8 x_{8,i} x_{9,i} + \varepsilon_i,$$

where the true value of the regression parameters is $\beta = (0, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 1)$. The errors $\varepsilon_i$ have independent standard normal distributions. The explanatory variables are drawn from a multivariate normal distribution such that the first four columns have correlation 0.5 and the last six columns have correlation 0.3. We simulate 1000 observations from this design and delete values from Y and X1 through X8 via a missing at random mechanism that depends on X9 and X10, which are completely observed. On average, this results in around 17% missing values in every variable except X9 and X10; on average, fewer than 25% of the records are complete. 
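The simulation design described above can be sketched in a few lines. This is only an illustration under stated assumptions, written in Python with the standard library rather than in R: the function names are hypothetical; the last coefficient is assumed to equal 1 because the text lists eight values for nine coefficients; the two blocks of predictors (X1-X4 and X5-X10) are assumed to be independent of each other; and the logistic missingness mechanism and its coefficients are placeholders, since the exact mechanism is not given here.

```python
import math
import random

# Coefficients beta_0..beta_8. The text lists eight values for nine
# coefficients; setting the last one (beta_8) to 1 is an assumption.
BETA = [0.0, 0.5, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0]


def equicorrelated_block(rng, size, rho):
    """Draw `size` standard normals with common pairwise correlation rho."""
    shared = rng.gauss(0.0, 1.0)
    return [math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(size)]


def simulate_record(rng):
    """Return (y, [x1, ..., x10]) for one record."""
    # Assumption: X1-X4 form one block (correlation 0.5), X5-X10 another
    # (correlation 0.3), and the two blocks are independent of each other.
    x = equicorrelated_block(rng, 4, 0.5) + equicorrelated_block(rng, 6, 0.3)
    x1, x2, x3, x8, x9 = x[0], x[1], x[2], x[7], x[8]
    terms = [1.0, x1, x2, x3, x8, x9, x3 ** 2, x1 * x2, x8 * x9]
    y = sum(b * t for b, t in zip(BETA, terms)) + rng.gauss(0.0, 1.0)
    return y, x


def simulate_dataset(n, seed=0):
    """Simulate n complete records from the data-generating model."""
    rng = random.Random(seed)
    return [simulate_record(rng) for _ in range(n)]


def delete_mar(records, rng, intercept=-1.7, slope=0.4):
    """Blank out Y and X1-X8 with a probability that depends on X9 and X10.

    The logistic form and its coefficients are hypothetical placeholders,
    not the mechanism used in the paper. X9 and X10 stay fully observed.
    """
    out = []
    for y, x in records:
        p = 1.0 / (1.0 + math.exp(-(intercept + slope * (x[8] + x[9]))))
        values = [y] + x[:8]
        masked = [None if rng.random() < p else v for v in values]
        out.append((masked[0], masked[1:] + x[8:]))
    return out
```

A call such as `delete_mar(simulate_dataset(1000), random.Random(1))` would give one incomplete dataset on which competing imputation methods could be compared.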
We perform multiple imputation using the default settings of the “mi” package, including its adaptive choice of l (9, 30). 
We also carry out the CART-based method with l=5, 10, and 20, using the “tree” package in R to fit the CART models (24). A basic implementation of this procedure, along with the predictive diagnostic check described below, is available at stat.duke.edu/~jerry/software/CARTMI.html. We find that the performance of the sequential CART imputations is insensitive to the number of iterations l for this application; we present only the l=10 results. We use a minimum leaf size of 5, and a leaf is not considered for further splitting if the deviance of its values is less than 0.0001. This combination results in relatively large trees, which exhibit less bias than those grown with a larger leaf size or deviance criterion. We generate m=10 imputed sets for each method, although in some situations using m=20 or 40 may be warranted for added accuracy (12). We include Y as a predictor in the imputation models for each X (17, 19).

Using the rules in Rubin (25), we estimate the parameters in the model along with their standard errors using the correct model specification. We then evaluate the root mean squared errors and biases for these estimates of β. Table 1 displays averages of these quantities over 1000 generated sets. For each repetition, we also simulate 500 additional records and use the fitted models to predict Y for these new cases. We evaluate the root mean squared prediction error (RMSPE) on the evaluation sample using the parameter estimates from the fitted models. The averages of the RMSPEs are in the last line of Table 1.

For the quadratic and interaction terms, CART-based MICE results in notably lower mean squared errors and biases. Even the estimated main effects are somewhat closer to the truth. Together, these improvements make out-of-sample prediction much more accurate. The models fit on the CART imputations were uniformly better in this regard. Because the residual standard deviation equals 1.0, the excess prediction error from standard MICE is more than three times that of CART on average.

Both CART-based and standard MICE result in many intervals that do not cover the corresponding truths, because they are based on imperfect imputation models. 
For example, the 95% intervals from the CART imputations cover the true values of β7 and β8 (the interaction terms) in only approximately 42% and 9% of the simulated runs, respectively; these percentages are 0.2% and 0.0% for standard MICE. Across all elements of β, approximately 70% of the intervals cover the truth when using CART-based MICE, compared to 53% for standard MICE. We also compared CART-based and standard MICE using the complex data-generating model of Van der Laan et al. (32), in which the continuous outcome is a function of ten binary predictors including three- and four-way interactions. The relative performances of the two approaches are materially unchanged.

APPLICATION TO ADVERSE BIRTH OUTCOMES

We now apply the sequential CART imputation algorithm to a prospective study of adverse birth outcomes, e.g., low birth weight and pre-term birth. The data comprise 115 variables measured on 1054 non-Hispanic white and black mothers who gave singleton births in Durham, NC. 
The variables include mothers' demographics like age, race, education, and income; mothers' medical history variables like existence of chronic hypertension, anemia, and previous birth outcomes; mothers' environmental variables like levels of cadmium, nicotine, cotinine, mercury, and lead in the mothers' blood; mothers' psychological factors like ISEL (Interpersonal Support Evaluation List) measurements (5) and the NEO Personality Inventory (Psychological Assessment Resources, Inc., Lutz, Florida); and social factors like perceived racism and availability of social support. These variables are a mix of categorical and numerical data, many with irregular distributions. The study team was successful in recruiting and retaining mothers in the study; retention rates among eligible women exceed 95%. However, many variables have modest amounts of missing data. 
All but 21 of the variables have less than 10% missing values; 18 of the variables have between 10% and 45% missing; and three variables have between 58% and 61% missing. Although the missing rates are mostly modest, they are scattered among the variables such that only 7 mothers have complete data on all variables. There is weak evidence that low birth weights are associated with lower rates of missingness. We include the outcome variables in the imputation models to account for a missing at random mechanism consistent with such a pattern (17).

A large research team composed of social, environmental, and medical scientists plans to use the data for a variety of analyses, many of which will involve interactions among predictors of adverse birth outcomes. Hence, the team decided to create m=10 completed datasets using MICE via sequential CART. Imputations were done separately for black mothers and white mothers, since cross-racial comparisons are of primary interest to several team members. We order the variables from least to largest amount of missing data, and proceed as in steps 1-4 of the imputation algorithm. As in the simulated example, we use a minimum leaf size of 5 and the splitting criterion of a deviance greater than 0.0001. We use l=10 iterations of step 3; the results did not change systematically with l>10, and l=5 would have been acceptable. Some of the variables have logical constraints that we enforce in imputations. For instance, if Y1 records the number of previous pre-term pregnancies, and Y2 is the number of previous pregnancies, we require that 0≤Y1≤Y2. Whenever a constraint of equality exists among the columns (e.g., Y1+Y2=Y3), we exclude one of the algebraically dependent columns from the imputation process, and then determine its value from the other imputed values in the constraint. Before eliminating columns, we fill in any values that can be logically deduced through differing missing patterns in the relevant variables. In the case of constraints of inequality, of which we had only a few, we simply make a post-hoc adjustment to ensure that the inequality is satisfied. In datasets that are characterized by many such constrained relationships, it may be necessary to incorporate the restrictions explicitly into the conditional models of a chained equation imputation. 
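The constraint handling described above might look like the following minimal sketch. It is written in Python with hypothetical function names and uses `None` to mark a missing value. The text does not say which post-hoc adjustment is used for inequality constraints, so the clamping shown here is an assumption rather than the authors' rule.

```python
def fill_deducible(y1, y2, y3):
    """Fill one missing member of y1 + y2 = y3 when the other two are observed.

    This mirrors the step of filling in logically deducible values before
    a dependent column is dropped from the imputation process.
    """
    if y3 is None and y1 is not None and y2 is not None:
        y3 = y1 + y2
    elif y1 is None and y2 is not None and y3 is not None:
        y1 = y3 - y2
    elif y2 is None and y1 is not None and y3 is not None:
        y2 = y3 - y1
    return y1, y2, y3


def derive_dependent(y1, y2):
    """Recover the excluded column y3 from imputed y1 and y2 (y1 + y2 = y3)."""
    return y1 + y2


def enforce_inequality(y1, y2):
    """Post-hoc adjustment so that 0 <= y1 <= y2 holds.

    Clamping y1 into [0, y2] is an assumed adjustment, chosen for
    illustration; other adjustments would also satisfy the constraint.
    """
    y2 = max(y2, 0)
    y1 = min(max(y1, 0), y2)
    return y1, y2
```

For example, an imputed pair with 4 previous pre-term pregnancies but only 3 previous pregnancies would be adjusted by `enforce_inequality(4, 3)` to the pair (3, 3).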
Because CART draws values from the collection of observed values in a given column, marginal constraints (such as positivity) are satisfied automatically.

As suggested by Abayomi et al. (1), we check the appropriateness of the imputation models with graphical diagnostics that compare the marginal distributions of observed and imputed values. These did not raise red flags. 
However, these diagnostics may not tell us enough about joint distributions to identify problems in the imputation models (29). To illustrate, if one were to impute missing values in a column of Y by sampling at random from the observed elements in that column, associations involving that variable would be attenuated, but the univariate diagnostics would not raise any red flags. Therefore, we also examined posterior predictive checks (18) as suggested by He et al. 
(15). These are implemented as follows. First, we form 500 imputed sets using the imputation models under consideration. At the same time, we also use CART to create 500 datasets with $Y_P$ completely replaced (not merely completed) with approximate draws from the distribution of $Y_P \mid Y_C$. We call these the predicted sets. To obtain these sets, we create a copy of $Y$ (call it $Y_{\mathrm{new}}$), and consider all observed elements of $Y_P$ in the new copy to be missing. Then, using the fitted model that was used to impute the missing values in $Y_i$, we draw replacements for all elements in the ith column of $Y_{\mathrm{new}}$. We do this by tracing down the branches of the imputation tree using the other columns of $Y_{\mathrm{new}}$ as predictors. These draws are not used for the imputation; they are additional and used only for imputation diagnostics. Second, we identify some statistic with epidemiologic relevance, which we call $T$. For example, $T$ could be a regression coefficient of a particular interaction in a linear regression of birth weight on several covariates.
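The predicted-set draw described above, in which each record is traced down the branches of a fitted tree, might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the nested-dictionary tree, the predictor names (age, smoker), the donor values, and the choice of a uniform random draw from the observed values stored in a leaf. It is not the software used in the study, and the actual draws within a leaf may be generated differently.

```python
import random

# Hypothetical fitted tree for one column of Y. Internal nodes split on a
# predictor; leaves store the observed values (donors) that fell into them.
TREE = {
    "var": "age", "cut": 30.0,
    "left": {"donors": [3100.0, 3250.0, 2900.0]},
    "right": {
        "var": "smoker", "cut": 0.5,
        "left": {"donors": [3400.0, 3300.0]},
        "right": {"donors": [2700.0, 2850.0, 2600.0]},
    },
}


def find_leaf(node, record):
    """Trace a record down the branches until a leaf is reached."""
    while "donors" not in node:
        go_left = record[node["var"]] < node["cut"]
        node = node["left"] if go_left else node["right"]
    return node


def draw_predicted(tree, records, rng):
    """Replace every value in the column with a draw from its leaf's donors."""
    return [rng.choice(find_leaf(tree, r)["donors"]) for r in records]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
records = [
    {"age": 24.0, "smoker": 0.0},
    {"age": 35.0, "smoker": 0.0},
    {"age": 41.0, "smoker": 1.0},
]
predicted = draw_predicted(TREE, records, rng)
print(predicted)
```

Repeating this for every partially observed column, with the predictors taken from the copy of the data, would yield one predicted set under these assumptions.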
Let $T_{\mathrm{imp},i}$ be the value of the statistic computed with the ith imputed set, and let $T_{\mathrm{pred},i}$ be the statistic computed with the ith predicted set. We then compute a two-sided posterior predictive P-value,

$$P = \frac{2}{500}\,\min\left(\sum_{i=1}^{500} I\left(T_{\mathrm{imp},i} - T_{\mathrm{pred},i} > 0\right),\ \sum_{i=1}^{500} I\left(T_{\mathrm{pred},i} - T_{\mathrm{imp},i} > 0\right)\right),$$

where $I(\cdot)$ is the indicator function that equals one if the argument is true and zero otherwise (15). If $T_{\mathrm{imp},i}$ and $T_{\mathrm{pred},i}$ consistently deviate from each other in one direction (which would be indicated by a small P-value), the imputation model may be distorting the relationship implicit in the test statistic. To illustrate, suppose that a regression coefficient is consistently larger in the imputed sets than it is in the predicted sets. If this coefficient is estimated to be positive, the association involving this coefficient might be attenuated by the imputed values. Essentially, if the imputation models do not recreate important features in the observed data, they may not generate plausible values for the missing data.

From a practical standpoint, posterior predictive checks are well suited for use in large studies with many investigators.
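As a concrete illustration of the two-sided posterior predictive P-value defined above, the following Python sketch counts how often the imputed-set statistic exceeds the predicted-set statistic and vice versa, then doubles the smaller proportion. The function name posterior_predictive_p and the input values are hypothetical and chosen for illustration only; they are not study data, and this is not the software distributed with the imputed datasets.

```python
from typing import Sequence


def posterior_predictive_p(t_imp: Sequence[float], t_pred: Sequence[float]) -> float:
    """Two-sided posterior predictive P-value from paired statistics.

    t_imp[i] is the statistic from the ith imputed set and t_pred[i] is the
    statistic from the ith predicted set.
    """
    if len(t_imp) != len(t_pred) or len(t_imp) == 0:
        raise ValueError("need equally many imputed and predicted statistics")
    n = len(t_imp)
    # Number of pairs in which the imputed-set statistic is strictly larger.
    n_pos = sum(1 for a, b in zip(t_imp, t_pred) if a - b > 0)
    # Number of pairs in which the predicted-set statistic is strictly larger.
    n_neg = sum(1 for a, b in zip(t_imp, t_pred) if b - a > 0)
    return 2.0 * min(n_pos, n_neg) / n


# Synthetic inputs (made up): 56 negative and 444 positive differences.
t_pred = [0.0] * 500
t_imp = [-0.1] * 56 + [0.1] * 444
p_value = posterior_predictive_p(t_imp, t_pred)
print(p_value)  # 0.224
```

Pairs with a difference of exactly zero count toward neither sum, which matches the strict inequalities in the definition.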
The imputation team can create and store many imputed and predicted sets. Researchers interested in using the imputed datasets for their particular model can compute posterior predictive P-values for their model to check the suitability of the imputations for their analyses. This process takes only seconds of computer time (whereas generating the 500 predicted sets can take several days of computer time depending on the number of imputations), and it can be automated in software that is distributed with the imputed datasets. If evidence of serious imputation deficiencies arises, the analyst can inform the imputation team about the significant P-values, and the team can adjust the imputation procedure with the aim of remedying the problems if necessary. This might involve, for example, reducing the minimum leaf size or the minimum deviance value for splitting. It also might involve using different imputation models for the offending variables, for example parametric models based on an exhaustive search for complex interactions. If the imputation team cannot remedy the problems, analysts are left with the options of generating their own imputations in ways tailored to their specific models—which may not necessarily improve the quality of the imputations—or reporting potential sensitivity to the imputations in the analysis. In the adverse birth outcomes imputation project, we focus on posterior predictive checks of linear and logistic regression coefficients in models of interest to the scientific team, where T is the value of the maximum likelihood estimate of the regression coefficient. Each model includes a particular response (birth weight, low/normal birth weight, gestational age, pre-term/term birth, maternal hypertension), standard control variables for race, age, education and an indicator of the mother’s first pregnancy, and additional covariates selected from the remaining variables. 
For example, one of the regressions is a linear model of birth weight as a function of NEO Openness and Conscientiousness scores and their interaction, along with the standard control variables. Figure 2 displays the 500 values of $T_{\mathrm{imp},i} - T_{\mathrm{pred},i}$ for the interaction term in this model. Here, $T_{\mathrm{imp},i} - T_{\mathrm{pred},i}$ is less than zero for 56 out of the 500 cases, so that the estimated P = 0.224. Thus, for this interaction coefficient, we do not have strong evidence that the imputations seriously distort the relationships in the observed data. After screening many models, we do not find substantial evidence that the sequential tree imputations are implausible. Figure 3 displays the posterior predictive P-values for 99 regression coefficients for variables other than the standard controls; none of these P-values is below 0.10. We exclude the standard control variables from Figure 3 because each of these variables is missing in four or fewer (less than 0.4%) of the records, so that regression analyses are insensitive to any reasonable imputation model for these variables. The covariates related to the coefficients in Figure 3 are missing in 1.9% to 24.5% of the records. Among the standard control variables, the P-value for the indicator of mother’s first pregnancy is consistently small; in a few regressions, we even estimate P = 0 from the 500 pairs of datasets. The small P-value indicates that the CART imputation model did not accurately recreate the conditional distribution of first pregnancy for the entire dataset. However, because previous pregnancy data are missing for only three mothers, we are not particularly concerned with a potential misspecification of the imputation model for the first pregnancy indicator.

CONCLUSION

Researchers often avoid tree-based regressions because they can be difficult to interpret unless the trees are relatively small.
Interpretation also can be strained by the volatility of the fitting process: when small changes in the observed data would lead to different initial splits, the resulting trees could be very different from the original one. When trees serve as an imputation engine, however, neither of these issues is particularly consequential. We are not interested in interpreting the trees or making inferences related to them. Their ability to provide sensible imputations, and preserve complexity, is all that matters. With that in mind, one might consider using more exotic nonparametric modeling techniques like random forests, neural networks or Bayesian additive regression trees (4, 14). Such techniques generate results that can be even more difficult to interpret, but their predictive performance can be excellent. One drawback of these approaches compared to CART is the typically much slower speed of the fitting algorithms. This is especially important when using posterior predictive checks; for example, performing imputations along with the posterior predictive checks in the adverse birth outcome study conservatively requires half a million model fits. Nonetheless, we anticipate increased use of nonparametric methods to implement MICE as computing power continues to grow.

Acknowledgements: This work was supported by Environmental Protection Agency grant R833293. The authors thank Dr. Marie Lynn Miranda, Dr. Geeta Swamy and Dr. Redford Williams for suggesting the models used to check the imputations.

Affiliations: Duke University Department of Statistical Science (Lane F. Burgette, Jerome P. Reiter)

References
1. Abayomi K, Gelman A, Levy M. Diagnostics for multivariate imputations. Journal of the Royal Statistical Society Series C-Applied Statistics 2008;57(3):273-91.
2. Barnard J, Meng XL. Applications of multiple imputation in medical studies: from AIDS to NHANES. Statistical Methods in Medical Research 1999;8(1):17-36.
3. Breiman L, Friedman JH, Olshen RA, et al. Classification and Regression Trees. Boca Raton, FL: Chapman and Hall/CRC, 1984.
4. Chipman HA, George EI, McCulloch RE. BART: Bayesian additive regression trees. Annals of Applied Statistics 2010;4(1):266-98.
5. Cohen S, Mermelstein R, Kamarck T, et al. Measuring the functional components of social support. In: I.F.Sarason, B.R.Sarason, eds. Social support: Theory, research and application. The Hague, Holland: Martinus Nijhoff, 1985:74-94.
6. Conversano C, Cappelli C. Missing data incremental imputation through tree based methods.
Compstat: Proceedings in Computational Statistics: 15th Symposium Held in Berlin, Germany 2002;455-60.
7. Dai JY, Ruczinski I, LeBlanc M, et al. Imputation methods to improve inference in SNP association studies. Genetic Epidemiology 2006;30(8):690-702.
8. Friedman JH. Multivariate Adaptive Regression Splines. Annals of Statistics 1991;19(1):1-67.
9. Gelman A, Hill J, Yajima M, et al. mi: Missing Data Imputation. cran.r-, 2009.
10. Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Statistical Science 1992;7(4):457-72.
11. Gelman A, Speed TP. Characterizing A Joint Probability-Distribution by Conditionals. Journal of the Royal Statistical Society Series B-Methodological 1993;55(1):185-8.
12. Graham JW, Olchowski AE, Gilreath TD. How many imputations are really needed? - Some practical clarifications of multiple imputation theory. Prevention Science 2007;8(3):206-13.
13. Harel O, Zhou XH. Multiple imputation: Review of theory, implementation and software. Statistics in Medicine 2007;26(16):3057-77.
14. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer, 2009.
15. He Y, Zaslavsky AM, Landrum MB. Multiple Imputation in a large-scale complex survey: a guide. Statistical Methods in Medical Research 2009;1-18.
16. Klebanoff MA, Cole SR. Use of multiple imputation in the epidemiologic literature. American Journal of Epidemiology 2008;168(4):355-7.
17. Little RJA. Regression with Missing Xs - A Review. Journal of the American Statistical Association 1992;87(420):1227-37.
18. Meng XL. Posterior Predictive P-Values. Annals of Statistics 1994;22(3):1142-60.
19. Moons KGM, Donders RART, Stijnen T, et al. Using the outcome for imputation of missing predictor values was preferred. Journal of Clinical Epidemiology 2006;59(10):1092-101.
20. Raghunathan T, Solenberger P, Van Hoewyk J. A multivariate technique for multiply imputing missing values using a sequence of regression models.
Survey Methodology 2002;27(1):85-96.
21. Raghunathan T, Solenberger P, Van Hoewyk J. IVEware: Imputation and Variance Estimation Software. Ann Arbor, MI: Survey Methodology Program, Survey Research Center, Institute for Social Research, University of Michigan, 2002.
22. Reiter JP. Using CART to Generate Partially Synthetic Public Use Microdata. Journal of Official Statistics-Stockholm 2005;21(3):7-30.
23. Reiter JP, Raghunathan TE. The multiple adaptations of multiple imputation. Journal of the American Statistical Association 2007;102(480):1462-71.
24. Ripley B. tree: Classification and regression trees. cran.r-, 2009.
25. Rubin DB. Multiple imputation for nonresponse in surveys. Hoboken, NJ: Wiley-IEEE, 1987.
26. Rubin DB. The Bayesian Bootstrap. Annals of Statistics 1981;9(1):130-4.
27. Rubin DB. Multiple imputation after 18+ years. Journal of the American Statistical Association 1996;91(434):473-89.
28. Schafer JL. Multiple imputation: a primer. Statistical Methods in Medical Research 1999;8(1):3-15.
29. Stuart EA, Azur M, Frangakis C, et al. Multiple Imputation With Large Data Sets: A Case Study of the Children's Mental Health Initiative. American Journal of Epidemiology 2009;169(9):1133-9.
30. Su Y, Gelman A, Hill J, et al. Multiple imputation with diagnostics (mi) in R: Opening windows into the black box. Journal of Statistical Software 2009;20(1):1-27.
31. Van Buuren S, Oudshoorn K. Flexible multivariate imputation by MICE. Leiden, The Netherlands: TNO Prevention Center, 1999.
32. Van der Laan M, Polley E, Hubbard A. Super learner. Statistical Applications in Genetics and Molecular Biology 2007;6(1):1-21.

Figure 1. Example of a tree structure.

Figure 2. Density histogram of differences between regression coefficients calculated on imputed and predicted sets for NEO-Openness/NEO-Conscientiousness interaction.
56 out of 500 of the differences are negative, so we have the two-sided estimate P = 2 × 56/500 = 0.224, which does not indicate a deficiency in the imputation model for this parameter. If 12 or fewer of these differences were negative (or positive), however, we would have a two-sided P-value below 0.05, which would indicate a possible problem.

Figure 3. Histogram of the 99 two-sided posterior predictive P-values related to the coefficients of interest.

Table 1. Average root mean-squared error and bias for β estimates. The columns correspond to default “mi” package behavior, and CART-based MICE with l=10. The last row gives out-of-sample average root mean squared prediction error (ARMSPE) based on parameter estimates from the various imputed sets. All of the model fits use the true model.

                    Root mean squared error       Bias
          True β    CART-MICE   Default “mi”      CART-MICE   Default “mi”
β0        0.0       0.168       0.379              0.156       0.373
β1        0.5       0.061       0.077             -0.020      -0.018
β2        0.5       0.061       0.078             -0.015      -0.015
β3        0.5       0.059       0.076             -0.010      -0.018
β4        0.5       0.120       0.149             -0.108      -0.132
β5        0.5       0.054       0.067              0.006       0.016
β6        0.5       0.053       0.132             -0.035      -0.125
β7        1.0       0.144       0.315             -0.134      -0.310
β8        1.0       0.198       0.314             -0.190      -0.309
ARMSPE              1.106       1.348