Hubbard et al. (2010): Differences Between Mixed-Effects Models and GEE Models

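When individuals are nested within groups, sociology tends to use mixed-effects (multilevel) models, whereas my impression from attending epidemiology study groups is that GEE is used more often there. This paper gives a brief explanation of the differences between mixed-effects models and GEE. The highlight is that the two approaches target different quantities in the first place.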

Hubbard, Alan E., et al. “To GEE or Not to GEE: Comparing Population Average and Mixed Models for Estimating the Associations Between Neighborhood Risk Factors and Health.” Epidemiology, vol. 21, no. 4, 2010, pp. 467–474.

Abstract

Two modeling approaches are commonly used to estimate the associations between neighborhood characteristics and individual-level health outcomes in multilevel studies (subjects within neighborhoods). Random effects models (or mixed models) use maximum likelihood estimation. Population average models typically use a generalized estimating equation (GEE) approach. These methods are used in place of basic regression approaches because the health of residents in the same neighborhood may be correlated, thus violating independence assumptions made by traditional regression procedures. This violation is particularly relevant to estimates of the variability of estimates. Though the literature appears to favor the mixed-model approach, little theoretical guidance has been offered to justify this choice. In this paper, we review the assumptions behind the estimates and inference provided by these 2 approaches. We propose a perspective that treats regression models for what they are in most circumstances: reasonable approximations of some true underlying relationship. We argue in general that mixed models involve unverifiable assumptions on the data-generating distribution, which lead to potentially misleading estimates and biased inference. We conclude that the estimation-equation approach of population average models provides a more useful approximation of the truth.
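
The point that the two approaches target different quantities can be made concrete with a small numerical sketch. It is not taken from the paper. It assumes a random-intercept logistic model, logit P(Y=1 | x, b) = beta0 + beta1*x + b with b ~ N(0, sigma^2), and the values of beta0, beta1, and sigma are arbitrary illustrative choices. In that setting a mixed model targets the conditional (neighborhood-specific) log odds ratio beta1, while a population average model targets the marginal log odds ratio obtained after averaging over b. The helper names (`marginal_prob` and so on) are hypothetical.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds."""
    return math.log(p / (1.0 - p))


def marginal_prob(linear_predictor, sigma, n_grid=2001, width=8.0):
    """P(Y=1 | x) averaged over b ~ N(0, sigma^2), using the trapezoidal rule."""
    if sigma == 0:
        return expit(linear_predictor)
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        b = lo + i * step
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end points
        total += weight * expit(linear_predictor + b) * density * step
    return total


# Illustrative parameter values (assumptions, not estimates from the paper).
beta0, beta1, sigma = -1.0, 1.0, 2.0

# Conditional (subject-specific) effect: what the mixed model's coefficient means.
conditional_log_or = beta1

# Marginal (population-averaged) effect: what a population average model targets.
marginal_log_or = (
    logit(marginal_prob(beta0 + beta1, sigma))
    - logit(marginal_prob(beta0, sigma))
)

print("conditional log OR:", conditional_log_or)
print("marginal log OR:   ", round(marginal_log_or, 3))
```

With a nonzero random-intercept variance, the marginal log odds ratio is closer to zero than the conditional one, so the two coefficients answer different questions even when both models are correctly specified. With an identity link (linear models) the two coincide, which is why the distinction matters mainly for nonlinear links such as the logit.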