[ADMB Users] NLMM Model Selection

Sat Feb 5 13:55:02 PST 2011

On 11-02-05 01:02 PM, Chris Gast wrote:
> This isn't precisely an ADMB topic, but it seems as though ADMB users
> might be knowledgeable in this regard.
> 
> I've searched the archives and haven't found a lot of discussion
> regarding model selection in nonlinear mixed models. For a given
> dataset, I have a series of models which differ in combinations of
> structure, number of effects considered random, and assumed distribution
> of random effect components, and would like some (preferably
> likelihood-based) method to rank them. Burnham and Anderson (Model
> Selection and Multimodel Inference, 2002, page 310) describe a method
> based on shrinkage estimators where the penalty term is computed
> somewhere between 1 and the number of random components, but this
> appears to require both a single random effect and a fit of the model
> where each random component is considered a parameter; neither of these
> is feasible with my models (or, I suspect, many others). I can't simply
> use LRTs to decide between a mixed model and its fixed counterpart,
> because the value of interest for the sigma parameter lies on the
> boundary of its space, 0.

  Although you could, approximately, by halving the p value: for a
single random effect, the null distribution of the deviance is a 50/50
mixture of chi^2 with df=0 and df=1, so the tail area is half that of
the usual chi^2 with df=1, and the naive p value is about twice as
large as it should be. (See references in the Bolker et al. 2009 TREE
article.)
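
  As a rough illustration of that correction, here is a minimal Python
sketch using only the standard library. The function names and the
log-likelihood values are made up for the example; it assumes a single
variance parameter tested against 0 and that both fits are true
maximum-likelihood fits.

```python
import math


def chi2_df1_sf(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with df=1."""
    if x <= 0:
        return 1.0
    # For df=1, X = Z^2 with Z standard normal, so P(X > x) = erfc(sqrt(x/2)).
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalue(loglik_null, loglik_alt):
    """LRT for one variance parameter on the boundary (sigma = 0).

    Returns (deviance, naive_p, mixture_p), where mixture_p uses the
    50/50 mixture of chi^2 with df=0 and df=1 as the null distribution.
    """
    # Deviance cannot be negative in theory; clip small numerical noise.
    deviance = max(0.0, 2.0 * (loglik_alt - loglik_null))
    naive_p = chi2_df1_sf(deviance)
    # chi^2 with df=0 is a point mass at 0, so it contributes nothing
    # to the tail once the deviance is strictly positive.
    mixture_p = 1.0 if deviance == 0.0 else 0.5 * naive_p
    return deviance, naive_p, mixture_p


# Hypothetical log-likelihoods for the fixed-effects and mixed fits.
dev, naive_p, mixture_p = boundary_lrt_pvalue(-100.0, -98.08)
print(f"deviance={dev:.3f} naive p={naive_p:.4f} mixture p={mixture_p:.4f}")
```

  With these made-up values the deviance is about 3.84, so the naive p
value is close to 0.05 and the mixture-based one is half of that.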

> I have found some instances where the problem is basically ignored
> (Hall, D.B. and Clutter, M. 2004. Multivariate multilevel nonlinear
> mixed effects models for timber yield predictions. Biometrics, 60:16-24).
> To quote: "...the first-order approximate log likelihood is treated as
> the true log likelihood, and standard errors for parameter estimates,
> likelihood ratio tests for nested models, and model selection criteria
> such as AIC and BIC are formed in the usual way. Although the formal
> justification of this “approximately asymptotic” approach to inference
> is an open problem, it is commonly used in practice, and we adopt it for
> our purposes in this article."
> 
> One simple method would be to choose the model that best reconstructs
> the original data as measured by the chi-squared test statistic
> sum((O-E)^2/E), but again, it would be nice to have something
> likelihood-based such that the framework is cohesive, and the
> principle of parsimony is in effect.
> 
> One additional question: these models also may include covariates.
> Holding all other features of a mixed model constant, LRTs should
> be justified for model selection of covariates only, as they result from
> a mathematical restriction of some beta=0, correct? I see plenty of
> information about the LASSO for covariate selection in NLMMs, but
> haven't yet found the time to learn this technique.

  A quite technical but useful recent paper is:

  Greven, Sonja, and Thomas Kneib. 2010. On the Behaviour of Marginal
and Conditional Akaike Information Criteria in Linear Mixed Models.
Biometrika 97, no. 4: 773-789.
http://www.bepress.com/jhubiostat/paper202/.

  There is a fundamental distinction between the 'marginal AIC' (for
population-level predictions, i.e. where you want to predict future
values for a different set of random effects than those measured) and
the 'conditional AIC' (for group-level predictions where you want to
predict future values for the same random effects measured); see
<http://glmm.wikidot.com/faq> (recently updated) for more information.
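
  To make the distinction concrete, the two criteria are often written
roughly as follows for a linear mixed model. The notation is my own
shorthand rather than anything taken from the paper: y is the data,
beta the fixed effects, theta the variance parameters, b the random
effects, p and q the numbers of fixed-effect and variance parameters,
and rho an effective number of parameters.

```latex
\begin{align*}
\mathrm{mAIC} &= -2 \log f(y \mid \hat\beta, \hat\theta) + 2\,(p + q), \\
\mathrm{cAIC} &= -2 \log f(y \mid \hat\beta, \hat b, \hat\theta) + 2\,\rho .
\end{align*}
```

  The marginal likelihood integrates the random effects out, while the
conditional one plugs in their predicted values. How rho should be
computed when the variance parameters are themselves estimated is the
technical part, so treat this only as a sketch of the general form.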
