[ADMB Users] mildly robust methods in count models
Ben Bolker
bbolker at gmail.com
Fri Dec 14 18:55:23 PST 2012
Dave,
I don't disagree with you.
I think we *should* use robust methods more of the time, and I
certainly like the fact that ADMB makes it easier to do so; I would like
to make it easier still, by implementing this either in glmmADMB or in
Mollie and Steve Martell's "statslib" (in a convenient, overloaded,
vectorized way), or preferably in both.
Where I disagree mildly is that, looking at *this particular case*,
it seems to me that there are no important qualitative differences
between the fits. Yes, the type I error rate/coverage is definitely
better with the robust model (5.4% vs 7.7% of the time). In my own
personal universe, a 50% inflation of the type I error rate (i.e.
approx. 7.5% for a nominal rate of 5%) is fairly bad, but not terrible
-- in the context of all the other things that are always wrong with
the model that we can't control, I consider that a moderate but not an
earth-shattering problem.
All of this is of course my own opinion. I'm really not trying to
argue ADMB vs. R here; I'm just talking about priorities for getting
people to do better stats. Robust methods are definitely on the list,
but they might fall below (e.g.) getting people to stop using stepwise
approaches ...
cheers
Ben
On 12-12-14 11:09 AM, dave fournier wrote:
> On 12-12-13 08:43 PM, Hans J. Skaug wrote:
>
> Hi,
>
> BTW Hans, I think you misunderstood my remarks about robustness in
> count models. I haven't gotten any response, so I am kicking the wasp
> nest again.
> My feeling is that there appear to be two schools:
> those who blindly use standard non-robust methods, and those
> who use very robust methods. While there may be situations
> where very robust methods are a good idea, what I am advocating
> is to routinely use mildly robust methods. My reasoning is that
> mildly robust methods perform almost as well as the standard
> methods when the non-robust model's assumptions are satisfied, and
> they perform much better when just a small amount of
> contamination is introduced. I don't think Ben gets this point.
> He notes that the point estimates are nearly the same. This
> is just like the fact that, under normal theory, estimates of means
> are less sensitive to outliers than estimates of variances.
> However, it is the estimates of variances that matter
> for significance tests.
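>
> A toy illustration of that asymmetry in Python (my made-up numbers,
> not from the count-model setting):
>
>     import numpy as np
>
>     rng = np.random.default_rng(0)
>     x = rng.normal(size=100)
>     x_contam = np.append(x[:-1], 10.0)           # one gross outlier
>     print(x.mean(), x_contam.mean())             # mean shifts by only ~0.1
>     print(x.var(ddof=1), x_contam.var(ddof=1))   # variance roughly doubles
>
> The mean barely moves, while the variance roughly doubles -- and it
> is the variance that drives the test statistics.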
>
> To test out these ideas I took Mollie's negative binomial model for
> the fir data and added a covariate that was pure random noise. With
> the non-robust model this covariate was judged significant 7.7% of
> the time, while with the robust version it was significant 5.4% of
> the time, much closer to the nominal value of 5%.
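>
> A rough sketch of that kind of experiment in Python (not the ADMB/R
> code I actually ran -- the data-generating parameters, the 5%
> contamination scheme, and the statsmodels standard NB fit are all
> stand-ins):
>
>     import numpy as np
>     import statsmodels.api as sm
>
>     rng = np.random.default_rng(0)
>     n, nsim, alpha = 100, 1000, 0.05
>     mu, tau = 3.0, 4.0              # made-up NB mean and overdispersion
>     rejections = 0
>     for _ in range(nsim):
>         x_noise = rng.normal(size=n)         # pure-noise covariate
>         X = sm.add_constant(x_noise)
>         # mild contamination: 5% of observations get 9x overdispersion
>         tau_i = np.where(rng.random(n) < 0.05, 9.0 * tau, tau)
>         # NB with Var = tau*mu, in numpy's (n, p) parameterization
>         p_nb = 1.0 / tau_i
>         n_nb = mu * p_nb / (1.0 - p_nb)
>         y = rng.negative_binomial(n_nb, p_nb)
>         res = sm.NegativeBinomial(y, X).fit(disp=0)
>         rejections += res.pvalues[1] < alpha  # Wald test on noise slope
>     print("empirical type I error:", rejections / nsim)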
>
> Why do I think this is an ADMB thing? The reason that the R gurus
> avoid these ideas is that they don't fit into their simple exponential
> family methodology. With ADMB it is a trivial extension to the model.
>
> A major problem (and I know I have said this before) with promoting
> ADMB is that it always seems that to compare it to, say, R, you have
> to use methods that R deems legitimate. So these mildly robust methods
> never come up. Rather, the question is always posed as either standard
> non-robust methods or extremely robust methods like the Huber stuff.
> I think that ADMB can do a great service to the statistical community
> by making these mildly robust methods more widely known.
>
> For the record, the mildly robust method for the negative binomial
> with parameters mu and tau, where tau > 1 is the overdispersion, would
> be to use something like
>
> (1-p)*NB(mu,tau) + p*NB(mu,q*tau)
>
> where p and q may be estimated or held fixed. Typical values for p and
> q are 0.05 and 9.0. For Mollie's fir model these were estimated.
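>
> A minimal sketch of that mixture likelihood in Python/SciPy (this is
> not my ADMB code; here p and q are held fixed at the typical values
> above rather than estimated, and I assume a log link for mu):
>
>     import numpy as np
>     from scipy.optimize import minimize
>     from scipy.stats import nbinom
>
>     def nb_logpmf(y, mu, tau):
>         # NB with mean mu and overdispersion tau (> 1), so that
>         # Var(Y) = tau*mu; scipy's nbinom(r, p) has mean r*(1-p)/p
>         # and variance r*(1-p)/p**2.
>         p = 1.0 / tau
>         r = mu * p / (1.0 - p)       # = mu / (tau - 1)
>         return nbinom.logpmf(y, r, p)
>
>     def robust_nb_nll(theta, y, X, p_mix=0.05, q=9.0):
>         # Negative log-likelihood of the contamination mixture
>         # (1 - p_mix)*NB(mu, tau) + p_mix*NB(mu, q*tau), with a log
>         # link for mu and tau = 1 + exp(theta[-1]) to keep tau > 1.
>         beta, tau = theta[:-1], 1.0 + np.exp(theta[-1])
>         mu = np.exp(X @ beta)
>         ll = np.logaddexp(np.log1p(-p_mix) + nb_logpmf(y, mu, tau),
>                           np.log(p_mix) + nb_logpmf(y, mu, q * tau))
>         return -ll.sum()
>
>     # fit = minimize(robust_nb_nll, np.zeros(X.shape[1] + 1),
>     #                args=(y, X), method="BFGS")
>
> Letting the optimizer estimate p and q as well just means adding two
> more (suitably transformed) entries to theta.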
>
> This all came up first when I came across an interesting article on
> the web which stated (and seemed to demonstrate) that max like should
> never be used in NB count models for significance tests on the fixed
> effects, because this lack of robustness led to rejecting the null
> hypothesis too often. They advocated quasi-likelihood or GEE instead.
> As a fan of max like I wondered if it could be "fixed" for these
> models.
>
> Wish I could find that article again.
>
> Dave