[ADMB Users] mildly robust methods in count models

Fri Dec 14 18:49:25 PST 2012

On 12-12-14 03:00 PM, dave fournier wrote:
> On 12-12-14 11:53 AM, H. Skaug wrote:
> 
> What I was proposing was a compromise.  Use the R side of glmmadmb to
> write the model and produce the pin and dat files.

  Depending on what you're doing, you might not need glmmADMB: R2admb
has some less-encumbered functions to write pin and dat files ...

> Then just modify the glmmadmb.tpl file to produce various versions
> of the code that do what you want.  I don't really care much about the R
> side of the reports.  They are already missing some things anyway.

  I apologize if you've already told me, but can you remind me what's
missing ... ?

> If the results look good it may motivate others to complete the R side
> of the thing.  Personally I don't like having to rely on others to write
> code.  Other people are always busy.  God knows what they are doing.
> 
>> I think it is a good idea to check whether robustness changes the
>> conclusions of published empirical studies. That would be an
>> interesting study in itself.
>>
>> About your suggestion of using glmmADMB:
>> I think that could easily be done on a case-by-case basis.
>> What requires a lot of work, however, is to "robustify glmmADMB"
>> as an R package. I imagine that would involve changing all the
>> postprocessing stuff that is written in R (diagnostics, etc.).

   Again, can you be any more specific?

  I might move glmmADMB to github sometime soon -- the "issues lists"
there are prettier than the R-forge issue trackers, although I do also
have to get back to the issue tracker and see about fixing what's there:
http://r-forge.r-project.org/tracker/?group_id=847

  Ben

>>
>> Hans
>>
>> On Fri, Dec 14, 2012 at 6:13 PM, dave fournier <davef at otter-rsch.com>
>> wrote:
>>> On 12-12-14 09:04 AM, H. Skaug wrote:
>>>
>>> I think that there are a number of issues. Two that come to mind are
>>>
>>> First is to verify my claim for each type of example
>>> that these methods perform nearly as well as the nonrobust methods
>>> when the null hypothesis is satisfied.  This would show that there is
>>> (virtually) no risk in using them.
>>>
>>> Another issue is to investigate real examples to see how often one gets
>>> different results for various estimates and tests.  It would be nice to
>>> use the glmmadmb framework if possible.  So long as glmmadmb can create
>>> the correct dat and pin files it is simple to modify the glmmadmb.tpl
>>> to do the robust estimates.  This should enable us to use a lot of data
>>> sets that are available in R.  I guess that R users would also find the
>>> results more accessible conceptually.
>>>
>>>
>>>
>>>> Hi,
>>>>
>>>> I agree fully with you about the need for robust methods and that ADMB
>>>> is a great tool for implementing such. However, "robustification" is a
>>>> generic tool and should be documented separately from other modelling
>>>> issues.  I think we need a document (with small tpl examples), written
>>>> in a pedagogical way, describing how all the standard probability
>>>> distributions should be robustified. That would be a good resource for
>>>> people who want to learn about robust methods.
>>>>
>>>> If somebody is willing to develop such a suite I will upload it
>>>> immediately as an example on the webpage, but currently I am not able
>>>> to develop this from scratch myself.
>>>>
>>>> Hans
>>>>
>>>> On Fri, Dec 14, 2012 at 5:09 PM, dave fournier <davef at otter-rsch.com>
>>>> wrote:
>>>>> On 12-12-13 08:43 PM, Hans J. Skaug wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> BTW Hans, I think you misunderstood my remarks about robustness in
>>>>> count models. I haven't got any response so I am kicking the wasp nest
>>>>> again.
>>>>>    My feeling is that there appear to be two schools,
>>>>> those who blindly use standard non-robust methods and those
>>>>> who use very robust methods.  While there may be situations
>>>>> where very robust methods are a good idea, what I am advocating
>>>>> is to routinely use mildly robust methods.  My reasoning is that
>>>>> mildly robust methods perform almost as well as the standard
>>>>> methods when the non-robust hypothesis is satisfied and
>>>>> they perform much better when just a small amount of
>>>>> contamination is introduced.   I don't think Ben gets this point.
>>>>> He notes that the point estimates are nearly the same.  This
>>>>> is just like the fact that, in normal theory, estimates of means
>>>>> are less sensitive to outliers than estimates of variances.
>>>>> However it is the estimates of variances that are important
>>>>> for significance tests.
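
The point about means versus variances can be illustrated with a toy
calculation. This is only a sketch: the data values and the helper name
relative_change are made up for illustration and do not come from the fir
data or from any model discussed in this thread.

```python
import statistics

# Hypothetical, well-behaved sample plus one contaminating observation.
clean = [9.0, 10.0, 11.0, 10.0, 9.0, 11.0, 10.0, 10.0, 9.0, 11.0]
contaminated = clean + [20.0]


def relative_change(estimator, before, after):
    """Relative change in an estimate when the sample is contaminated."""
    base = estimator(before)
    return abs(estimator(after) - base) / base


# How much does each estimate move when the single outlier is added?
mean_shift = relative_change(statistics.mean, clean, contaminated)
var_shift = relative_change(statistics.variance, clean, contaminated)

print(f"relative change in mean:     {mean_shift:.3f}")
print(f"relative change in variance: {var_shift:.3f}")
```

In this toy sample the variance estimate moves far more than the mean,
which is why nearly identical point estimates say little about how the
significance tests will behave.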
>>>>>
>>>>> To test out these ideas I took Mollie's negative binomial model with
>>>>> the fir data and added a covariate which was random noise.  With the
>>>>> non-robust model this covariate was considered significant 7.7% of
>>>>> the time, while with the robust version it was 5.4% of the time, much
>>>>> closer to the theoretical value of 5%.  Why do I think this is an
>>>>> ADMB thing?
>>>>>
>>>>> The reason that the R gurus avoid these ideas is that they don't
>>>>> fit into their simple exponential family methodology.  With ADMB it
>>>>> is a trivial extension to the model.
>>>>>
>>>>> A major problem (and I know I have said this before) with promoting
>>>>> ADMB is that it always seems that to compare it to, say, R you have to
>>>>> use methods that R deems to be legitimate.  So these mildly robust
>>>>> methods never come up.  Rather the question is always posed as either
>>>>> standard non-robust methods or extreme robust methods like Huber stuff.
>>>>> I think that ADMB can do a great service to the statistical community
>>>>> by making these mildly robust methods more widely known.
>>>>>
>>>>> For the record the mildly robust method for the negative binomial with
>>>>> parameters mu and tau
>>>>> where tau>1 is the overdispersion would be to use something like
>>>>>
>>>>>                     (1-p)*NB(mu,tau)+p*NB(mu,q*tau)
>>>>>
>>>>> where p and q may be estimated or held fixed.  Typical values for p
>>>>> and q are .05 and 9.0.
>>>>> For Mollie's fir model these were estimated.
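
For readers who want to experiment outside ADMB, the mixture
(1-p)*NB(mu,tau)+p*NB(mu,q*tau) can be sketched in plain Python. This is
not the glmmadmb.tpl code. It assumes that NB(mu,tau) has mean mu and
variance mu*tau, which is one reading of "tau>1 is the overdispersion",
and the function names are invented for this sketch. The defaults p=0.05
and q=9.0 are the typical values quoted above.

```python
import math


def nb_logpmf(x, mu, tau):
    """Log pmf of a negative binomial with mean mu and variance mu*tau.

    Assumed parameterization: size r = mu/(tau-1), success prob 1/tau.
    Requires mu > 0 and tau > 1.
    """
    r = mu / (tau - 1.0)
    return (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1.0)
            - r * math.log(tau) + x * math.log1p(-1.0 / tau))


def robust_nb_logpmf(x, mu, tau, p=0.05, q=9.0):
    """Log pmf of the mixture (1-p)*NB(mu,tau) + p*NB(mu,q*tau), 0 < p < 1."""
    a = math.log1p(-p) + nb_logpmf(x, mu, tau)      # main component
    b = math.log(p) + nb_logpmf(x, mu, q * tau)     # wide component
    m = max(a, b)                                   # log-sum-exp for stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def robust_nb_negloglik(counts, mu, tau, p=0.05, q=9.0):
    """Negative log-likelihood of i.i.d. counts under the mixture."""
    return -sum(robust_nb_logpmf(x, mu, tau, p, q) for x in counts)


if __name__ == "__main__":
    counts = [0, 1, 3, 2, 4, 40]   # made-up data with one extreme count
    print(robust_nb_negloglik(counts, mu=3.0, tau=2.0))
```

Both components share the mean mu, so the mixture leaves the mean
unchanged and only fattens the tail. An extreme count is then penalized
much less heavily than under the plain NB(mu,tau) likelihood.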
>>>>>
>>>>> This all came up first when I came across an interesting article on
>>>>> the web which stated (and seemed to demonstrate) that max like should
>>>>> never be used in NB count models for significance tests on the fixed
>>>>> effects because this lack of robustness led to rejecting the null
>>>>> hypothesis too often.  They advocated quasi-likelihood or GEE instead.
>>>>> As a fan of max like I wondered if it could be "fixed" for these models.
>>>>>
>>>>>    Wish I could find that article again.
>>>>>
>>>>>              Dave
>>>>>
>>>
>