[ADMB Users] mildly robust methods in count models

Fri Dec 14 09:35:48 PST 2012

On 12-12-14 09:04 AM, H. Skaug wrote:

After saying that my results for the robustified NB should be verified, I
realized that my simulation size of 1000 is a bit small, so I increased it
to 5000.
Nice for me is that the rate at which the noise covariate was judged
significant increased a bit for the non-robust model and decreased a bit
for the robust model.
Here are the numbers with +- 2 sd intervals.

non robust model

 > sum(as.numeric(x>3.84))/5000
[1] 0.0878
p-2*s = 0.07979544329772097874
p+2*s = 0.09580455670227902126

robust model

 > sum(as.numeric(x>3.84))/5000
[1] 0.0524
p-2*s = 0.04609735039844352446
p+2*s = 0.05870264960155647554
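
For anyone checking the intervals above: they look consistent with the
usual binomial standard error s = sqrt(p*(1-p)/n) with n = 5000. Here is a
minimal Python sketch under that assumption (the function names are made up
for illustration and are not from the original session). It also checks
that the cutoff 3.84 is roughly the 5% point of a chi-square with 1 degree
of freedom.

```python
import math

def two_sd_interval(p, n):
    """Return (p - 2*s, p + 2*s), assuming s = sqrt(p*(1-p)/n)."""
    s = math.sqrt(p * (1.0 - p) / n)
    return p - 2.0 * s, p + 2.0 * s

def chisq1_upper_tail(x):
    """P(X > x) for X ~ chi-square(1), using X = Z**2 with Z standard normal."""
    return 1.0 - math.erf(math.sqrt(x / 2.0))

if __name__ == "__main__":
    # Observed fractions of simulations with test statistic > 3.84.
    for label, p in (("non robust", 0.0878), ("robust", 0.0524)):
        lo, hi = two_sd_interval(p, 5000)
        print(label, lo, hi)
    # Nominal size of the test implied by the 3.84 cutoff.
    print("P(chisq_1 > 3.84) =", chisq1_upper_tail(3.84))
```

Under this assumption the nominal 5% lies inside the interval for the
robust model and outside the interval for the non-robust model.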

> Hi,
>
> I agree fully with you about the need for robust methods and that ADMB
> is a great tool for implementing them. However, "robustification" is a
> generic tool and should be documented separately from other modelling
> issues. I think we need a document (with small tpl examples), written in
> a pedagogical way, describing how all the standard probability
> distributions should be robustified. That would be a good resource for
> people who want to learn about robust methods.
>
> If somebody is willing to develop such a suite I will upload it
> immediately as an example on the webpage, but currently I am not able to
> develop this from scratch myself.
>
> Hans
>
> On Fri, Dec 14, 2012 at 5:09 PM, dave fournier <davef at otter-rsch.com> wrote:
>> On 12-12-13 08:43 PM, Hans J. Skaug wrote:
>>
>> Hi,
>>
>> BTW Hans, I think you misunderstood my remarks about robustness in
>> count models. I haven't got any response so I am kicking the wasp nest
>> again.
>>   My feeling is that there appear to be two schools:
>> those who blindly use standard non-robust methods and those
>> who use very robust methods.  While there may be situations
>> where very robust methods are a good idea, what I am advocating
>> is to routinely use mildly robust methods.  My reasoning is that
>> mildly robust methods perform almost as well as the standard
>> methods when the non-robust hypothesis is satisfied, and
>> they perform much better when just a small amount of
>> contamination is introduced.   I don't think Ben gets this point.
>> He notes that the point estimates are nearly the same.  This
>> is just like the fact that in normal theory estimates of means
>> are less sensitive to outliers than estimates of variances.
>> However, it is the estimates of variances that are important
>> for significance tests.
>>
>> To test out these ideas I took Mollie's negative binomial model with the
>> fir data and added a covariate which was random noise.  With the
>> non-robust model, including this covariate was considered significant
>> 7.7% of the time, while with the robust version it was 5.4% of the time,
>> much closer to the theoretical value of 5%.  Why do I think this is an
>> ADMB thing?
>>
>> The reason that the R gurus avoid these ideas is that they don't
>> fit into their simple exponential family methodology.  With ADMB it
>> is a trivial extension to the model.
>>
>> A major problem (and I know I have said this before) with promoting ADMB
>> is that it always seems that to compare it to, say, R you have to use
>> methods that R deems to be legitimate.  So these mildly robust methods
>> never come up. Rather the question is always posed as either standard
>> non-robust methods or extreme robust methods like Huber stuff.  I think
>> that ADMB can do a great service to the statistical community by making
>> these mildly robust methods more widely known.
>>
>> For the record, the mildly robust method for the negative binomial with
>> parameters mu and tau, where tau>1 is the overdispersion, would be to use
>> something like
>>
>>                    (1-p)*NB(mu,tau)+p*NB(mu,q*tau)
>>
>> where p and q may be estimated or held fixed.  Typical values for p and q
>> are .05 and 9.0.
>> For Mollie's fir model these were estimated.
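
To make the mixture above concrete, here is a rough Python sketch of its
log-likelihood (this is not ADMB code). It assumes NB(mu,tau) means a
negative binomial with mean mu and variance tau*mu. The function names and
the constant-mu interface are hypothetical, chosen only for illustration.

```python
import math

def nb_logpmf(k, mu, tau):
    """Log pmf of a negative binomial with mean mu and variance tau*mu (tau > 1)."""
    r = mu / (tau - 1.0)              # size parameter: var = mu + mu**2 / r
    log_p = -math.log(tau)            # success probability p = 1/tau
    log_1mp = math.log1p(-1.0 / tau)  # log(1 - p)
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1.0)
            + r * log_p + k * log_1mp)

def robust_nb_logpmf(k, mu, tau, p_mix=0.05, q=9.0):
    """Log pmf of (1-p_mix)*NB(mu,tau) + p_mix*NB(mu,q*tau), with 0 < p_mix < 1."""
    a = math.log1p(-p_mix) + nb_logpmf(k, mu, tau)
    b = math.log(p_mix) + nb_logpmf(k, mu, q * tau)
    m = max(a, b)                     # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def robust_nb_negloglik(counts, mu, tau, p_mix=0.05, q=9.0):
    """Negative log-likelihood of i.i.d. counts under the mixture."""
    return -sum(robust_nb_logpmf(k, mu, tau, p_mix, q) for k in counts)

if __name__ == "__main__":
    counts = [0, 3, 5, 2, 40, 7, 4]   # made-up data with one large count
    print(robust_nb_negloglik(counts, mu=5.0, tau=2.0))
```

In a real model mu would come from the linear predictor for each
observation, and p and q would be either estimated or held fixed at values
such as .05 and 9.0. Both components share the mean mu, so the mixture
only fattens the tails.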
>>
>> This all came up first when I came across an interesting article on the
>> web which stated (and seemed to demonstrate) that max like should never
>> be used in NB count models for significance tests on the fixed effects,
>> because this lack of robustness led to rejecting the null hypothesis
>> too often.  They advocated quasi-likelihood or GEE instead.   As a fan of
>> max like I wondered if it could be "fixed" for these models.
>>
>>   Wish I could find that article again.
>>
>>             Dave