Gelman (2006): What Multilevel Modeling Can and Cannot Do

This paper states clearly what multilevel modeling is useful for and where its limits lie. The author is Andrew Gelman, well known for Data Analysis Using Regression and Multilevel/Hierarchical Models.

Gelman, A. 2006. “Multilevel (Hierarchical) Modeling: What It Can and Cannot Do.” Technometrics 48(3): 432-435.


Multilevel (hierarchical) modeling is a generalization of linear and generalized linear modeling in which regression coefficients are themselves given a model, whose parameters are also estimated from data. We illustrate the strengths and limitations of multilevel modeling through an example of the prediction of home radon levels in U.S. counties. The multilevel model is highly effective for predictions at both levels of the model, but could easily be misinterpreted for causal inference.

{ \displaystyle y_{ij}\sim N(\alpha_j+\beta x_{ij}, \sigma_y^2) }
{ \displaystyle \alpha_j\sim N(\gamma_0+\gamma_1 u_{j}, \sigma_\alpha^2) }
The model above is fit to the Minnesota data (919 houses, 85 counties) using hierarchical Bayes methods.
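
To make the two-level structure concrete, here is a minimal Python sketch that draws synthetic data from the model above. It does not use the radon data: the parameter values, the function name simulate_two_level, the binary coding of x, and the way houses are assigned to counties are all assumptions made for illustration. Only the counts (919 houses, 85 counties) come from the text.

```python
import random

def simulate_two_level(n_counties=85, n_houses=919, gamma0=1.5, gamma1=0.7,
                       beta=-0.7, sigma_y=0.75, sigma_alpha=0.3, seed=0):
    """Draw synthetic data from the two-level model (all parameter values are made up)."""
    rng = random.Random(seed)
    # County level: predictor u_j, then intercept alpha_j ~ N(gamma0 + gamma1*u_j, sigma_alpha^2)
    u = [rng.gauss(0.0, 1.0) for _ in range(n_counties)]
    alpha = [rng.gauss(gamma0 + gamma1 * u_j, sigma_alpha) for u_j in u]
    rows = []
    for i in range(n_houses):
        # Give every county at least one house; assign the rest at random (an assumption)
        j = i if i < n_counties else rng.randrange(n_counties)
        x = rng.choice([0, 1])  # house-level predictor, coded as binary here (an assumption)
        # House level: y_ij ~ N(alpha_j + beta*x_ij, sigma_y^2)
        y = rng.gauss(alpha[j] + beta * x, sigma_y)
        rows.append((j, x, y))
    return u, alpha, rows

u, alpha, rows = simulate_two_level()
print(len(u), len(alpha), len(rows))
```

Note that random.gauss takes a standard deviation, not a variance, so sigma_y and sigma_alpha are passed unsquared.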

Data Reduction
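Here the multilevel model above is compared with complete pooling, { \displaystyle y=\alpha+\beta x}, and with no pooling, { \displaystyle y=\alpha_j+\beta x}. Fig. 1 makes it clear at a glance that the multilevel model outperforms the other two in terms of data reduction.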
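
The difference between the three approaches can be sketched in a few lines of Python. This is a simplified illustration, not the paper's analysis: it drops the predictor x, treats sigma_y and sigma_alpha as known (a real multilevel fit estimates them), and uses the complete-pooling mean as the center that county estimates are pulled toward. The function name pooling_estimates and the toy numbers are hypothetical.

```python
from statistics import fmean

def pooling_estimates(groups, sigma_y, sigma_alpha):
    """groups: dict county -> list of y values.
    Returns dict county -> (no_pooling, partial_pooling, complete_pooling)."""
    all_y = [y for ys in groups.values() for y in ys]
    complete = fmean(all_y)              # one intercept for everyone
    out = {}
    for county, ys in groups.items():
        no_pool = fmean(ys)              # each county on its own
        w_data = len(ys) / sigma_y ** 2  # precision of the county mean
        w_prior = 1.0 / sigma_alpha ** 2 # precision of the county-level distribution
        # Precision-weighted average: small counties are pulled harder toward the overall mean
        partial = (w_data * no_pool + w_prior * complete) / (w_data + w_prior)
        out[county] = (no_pool, partial, complete)
    return out

# Toy data (made up): county A has one house, county B has ten
groups = {"A": [2.0],
          "B": [0.5, 0.7, 0.6, 0.4, 0.8, 0.6, 0.5, 0.7, 0.6, 0.6]}
est = pooling_estimates(groups, sigma_y=0.8, sigma_alpha=0.3)
for county, (no_pool, partial, complete) in est.items():
    print(county, round(no_pool, 3), round(partial, 3), round(complete, 3))
```

The partial-pooling estimate always lies between the other two, and the county with a single observation moves most of the way toward the overall mean, while the county with ten observations stays closer to its own average.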


We can use cross-validation to formally demonstrate the benefits of multilevel modeling. We perform two cross-validation tests: first removing single data points and checking the prediction from the model fit to the rest of the data, then removing single counties and performing the same procedure. For each cross-validation step, we compare complete-pooling, no-pooling, and multilevel estimates. Other cross-validation tests for this example were performed by Price et al. (1996).
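
The first of the two tests (removing single data points) can be sketched as follows. This is an illustration on synthetic data under simplifying assumptions, not a reproduction of the paper's procedure: the model is intercept-only, sigma_y and sigma_alpha are treated as known, the multilevel prediction is a precision-weighted average of the county mean and the overall mean, and every county has at least two houses so the no-pooling prediction is always defined. The name loo_rmse and all numbers are hypothetical.

```python
import math
import random
from statistics import fmean

def loo_rmse(data, sigma_y, sigma_alpha):
    """data: list of (county, y). Returns leave-one-out RMSE per estimator."""
    sq = {"complete": 0.0, "no_pooling": 0.0, "multilevel": 0.0}
    for k, (county, y_held) in enumerate(data):
        train = data[:k] + data[k + 1:]                  # drop one house
        grand = fmean(y for _, y in train)               # complete pooling
        own = [y for c, y in train if c == county]       # same-county houses left
        county_mean = fmean(own)                         # no pooling
        w_data = len(own) / sigma_y ** 2
        w = w_data / (w_data + 1.0 / sigma_alpha ** 2)
        preds = {"complete": grand,
                 "no_pooling": county_mean,
                 "multilevel": w * county_mean + (1.0 - w) * grand}
        for name, p in preds.items():
            sq[name] += (y_held - p) ** 2
    return {name: math.sqrt(s / len(data)) for name, s in sq.items()}

# Synthetic data (made up): 30 counties with 2 to 8 houses each
rng = random.Random(1)
sigma_y, sigma_alpha = 0.8, 0.3
data = []
for county in range(30):
    alpha_j = rng.gauss(1.5, sigma_alpha)
    for _ in range(rng.randint(2, 8)):
        data.append((county, rng.gauss(alpha_j, sigma_y)))

result = loo_rmse(data, sigma_y, sigma_alpha)
print({name: round(v, 3) for name, v in result.items()})
```

The second test in the quoted passage, removing whole counties, would need a county-level predictor to say anything beyond the overall mean, so it is left out of this sketch.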


Causal Inference

In other settings, especially in social science, individual averages used as group-level predictors are often interpreted as “contextual effects.” For example, the presence of more basements in a county would somehow have a radon-lowering effect. This makes no sense here, but it serves as a warning that, with identical data of a social nature (e.g., consider substituting “income” for “radon level” and “ethnic minority” for “basement” in our study), it would be easy to leap to a misleading conclusion and find contextual effects where none necessarily exist.
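
A small simulation can show how such an apparent contextual effect arises without any causal effect of the group average. The setup is entirely hypothetical and is not taken from the paper: an unobserved county factor z drives both the share of houses with x = 1 and the county intercept, while the county average of x has no effect on y by construction. The contextual coefficient is computed as the between-county slope minus the within-county slope, which equals the coefficient on the county mean of x in a regression of y on x and that mean when all counties are the same size (as assumed here).

```python
import random
from statistics import fmean

def within_between_slopes(groups):
    """groups: list of lists of (x, y). Returns (within_slope, between_slope)."""
    num_w = den_w = 0.0
    xbars, ybars = [], []
    for rows in groups:
        xb = fmean(x for x, _ in rows)
        yb = fmean(y for _, y in rows)
        xbars.append(xb)
        ybars.append(yb)
        for x, y in rows:                    # deviations from the county means
            num_w += (x - xb) * (y - yb)
            den_w += (x - xb) ** 2
    mx, my = fmean(xbars), fmean(ybars)
    num_b = sum((a - mx) * (b - my) for a, b in zip(xbars, ybars))
    den_b = sum((a - mx) ** 2 for a in xbars)
    return num_w / den_w, num_b / den_b

rng = random.Random(0)
beta = -0.7                                   # individual-level effect of x (made up)
groups = []
for _ in range(200):                          # 200 counties of 50 houses (made up)
    z = rng.gauss(0.0, 1.0)                   # unobserved county factor
    p = min(0.95, max(0.05, 0.5 + 0.2 * z))   # share of houses with x = 1
    alpha_j = 1.0 * z + rng.gauss(0.0, 0.2)   # intercept depends on z, never on x
    rows = []
    for _ in range(50):
        x = 1 if rng.random() < p else 0
        rows.append((x, rng.gauss(alpha_j + beta * x, 0.8)))
    groups.append(rows)

within, between = within_between_slopes(groups)
contextual = between - within
print(round(within, 2), round(between, 2), round(contextual, 2))
```

The within-county slope recovers the individual-level effect, yet the contextual coefficient is far from zero even though the county average of x does nothing in the data-generating process. Reading that coefficient as a causal contextual effect would be exactly the leap the passage warns against.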
