[ADMB Users] Problems estimating variance component(s)

H. Skaug hskaug at gmail.com
Sat May 1 22:40:29 PDT 2010


Chris,

What you are indirectly doing when using

 totL += (.5)*(-(nyears-1)*log(tau)-.5*norm2(e/tau));

is to place a penalty on tau, favoring large values of tau. (Somewhat
like a Bayesian prior on tau.) The penalty you have added is

totL += (.5)*(nyears-1)*log(tau);


But, your MLE of tau is 0. Since this is happending repeatedly
on simulated datasets where you used tau>0 to generate the
data, the conclusion is that tau is weakly identified under your
model (given parameters and amount of data). This is maybe not
good news for you, but you should appreciate ADMB telling you this.
To investigate my claim you can increase the information about
tau in the simulations, by increasing value of tau and the amount of data.


Hans




On Sat, May 1, 2010 at 9:44 PM, Chris Gast <cmgast at gmail.com> wrote:
> Hello,
> I'm still relatively new to ADMB--that information might be helpful in
> diagnosing the problem I seem to be encountering.
> The setup is this:  I have a relatively simple product-multinomial model
> that I can fit well (with simulated data) when all effects are considered
> fixed.  The data are age-specific harvest numbers of a fictional animal.  I
> am simulating stochasticity in both survival ( logistic transformation of
> \beta+e_i, e_i~N(0,tau^2)), and in my vulnerability parameter.  I have a
> single vulnerability coefficient c, such that the probability of harvest is
> 1-exp(-(c+e_i)*f_i), for hunter-effort level f_i in year i.  I am currently
> working with 12 simulated years of data and three age classes.  For the
> discussion below, I am limiting myself to fitting the single random effect
> for survival (rather than both random effects for survival and
> vulnerability).  (I am also simulating 4 years of binomial telemetry data to
> provide information on the probability of harvest.)
> The problem is this:  ADMB seems to have some trouble estimating the single
> variance component.  My resulting report either spits out my initial value
> (I'm using true simulated values as initial estimates), or ADMB forces the
> parameter estimates as close to zero as I will allow it (of course, it's
> constrained to be positive).  I have tried many avenues of attack from the
> manual and in other examples, and none of them seem to alleviate the
> problem.  Having run out of options, I tried replacing the "prior"
> distribution portion of the log-likelihood with
> totL += (.5)*(-(nyears-1)*log(tau)-.5*norm2(e/tau));
> instead of the usual
> totL += (-(nyears-1)*log(tau)-.5*norm2(e/tau));
> and the variance component estimates are actually OK--they are at least on
> the correct order of magnitude, and not precisely equal to my initial
> parameter estimate.  I don't really understand why this down-weighting is
> successful (or if it is truly successful or I have just gotten lucky
> somehow).  I would also really like to avoid this seemingly ad hoc
> procedure, since I think ADMB is powerful enough to fit this model without
> resorting to such things--it seems to fit more complex models relatively
> easily.
> For the record, I have tried the transformation procedure of using a
> standard N(0,1) density for the random effect, and then scaling up to match
> the true N(0,tau^2) distribution, and in this case, this "trick" seems to
> hasten ADMBs result of loglikelihood=nan, from which it does not recover.  I
> have also tried playing with boundary values on constrained parameters, and
> met with no success.
> I have pasted a stripped-down version of my .tpl file (the working version,
> with the down-weighting intact), with accompanying .dat and .pin files, in
> the hope that someone would be kind enough to examine them.  I am aware that
> there are probably some inefficiencies here, but I'm not yet concerned about
> optimizing the program for run-time.  Of course, I'm only posting one
> simulated dataset---this phenomenon occurs for hundreds of simulations in a
> row.
> Thanks very much,
> Chris Gast
>
>
> /********************* start tpl file ******************************/
> DATA_SECTION
>   init_int nyears;
>   init_int nages;
>   init_int m;
>   init_vector effdat(0,nyears-1);
>   init_int telnyears;
>   init_vector numtagged(0,telnyears-1);
>   init_vector numrecovered(0,telnyears-1);
>   init_imatrix dat(0,nyears-1,0,nages-1);
>   imatrix fcohort(0,nyears-1,0,nages-1);
>   imatrix pcohort(0,nages-2,0,nages-1);
>   imatrix fdat(0,nyears+nages-2,0,nages-1);
>   int i;
>   int j;
> PARAMETER_SECTION
>   init_vector logparms(0,m-1,1);
>   init_bounded_number beta(-2,1.7,1);
>   init_bounded_number logcmu(-6,1,1);
>   init_bounded_number logtau(-10,1.5,2);
>   random_effects_vector e(0,nyears-2,2);
>   number tau;
>   vector parms(0,m-1);
>   number prob;
>   number prevq;
>   number prod;
>   number datL;
>   number telL;
>   number auxL;
>   number extraL;
>   number s;
>   number c;
>   number sprod;
>   objective_function_value totL;
> PROCEDURE_SECTION
>   const double pi=3.14159265358979323846264338327950288;
>   double datsum;
>   int colrank;
>   datL=0.0; auxL=0.0; telL=0.0; tau=mfexp(logtau);
>   parms=mfexp(logparms);
>   c=mfexp(logcmu);
>   cout << "tau=" << tau << ", c=" << c << "\n";
>   for(i=0;i<(nyears+nages-1);i++){
>     datsum=0;
>     datL+=gammln(parms[i]+1);
>     prevq=1;
>     prod=0;
>     colrank=0;
>     sprod=1;
>     for(j=0;j<nages;j++){
>       if(fdat[i][j]!=-1){
>         if(i<nages) prob=1-exp(-c*effdat[colrank]);
>         else if(i>=nages) prob=1-exp(-c*effdat[i-nages+1+j]);
>         if(colrank==0) s=1;
>         else if(i<nages)
>  s=exp(beta+e[colrank-1])/(1+exp(beta+e[colrank-1]));
>         else if(i>=nages)
> s=exp(beta+e[i-nages+1+j-1])/(1+exp(beta+e[i-nages+1+j-1]));
>         sprod*=s;
>         datsum+=fdat[i][j];
>         datL-=gammln(fdat[i][j]+1);
>         if(colrank>0) prod+=prevq*sprod*prob;
>         else prod=prob;
>         //cout << "prod=" << prod << ", prob=" << prob << ", s=" << s << ",
> sprod=" << sprod << ", prevq=" << prevq << "\n";
>         datL+=fdat[i][j]*log(prevq*sprod*prob);
>         colrank++;
>         prevq*=(1-prob);
>       }
>     }
>     datL+=(parms[i]-datsum)*log(1-prod) - gammln(parms[i]-datsum+1);
>   }
>   //now compute radio-telemetry likelihood
>   for(i=0;i<telnyears;i++){
>     telL+=gammln(numtagged[i]+1) - gammln(numrecovered[i]+1) -
> gammln(numtagged[i]-numrecovered[i]+1) +
> numrecovered[i]*log((1-exp(-c*effdat[nyears-1]))) +
> (numtagged[i]-numrecovered[i])*log(exp(-c*effdat[nyears-1]));
>   }
>   for(i=0;i<(nyears-1);i++){
>     cout << e[i] << "\t";
>   }
>   cout << "\n\ndatL=" << datL << ", telL=" << telL << ", extraL=" << extraL
> << ", totL=" << -(datL+telL) <<  "\n";
>   cout << "cest=" << c << ", betaest=" << beta << ", tauest=" << tau <<
> "\n\n\n\n\n\n\n\n\n";
>   totL=(datL+telL);
>   totL += (.5)*(-(nyears-1)*log(tau)-.5*norm2(e/tau));
>   /*totL += -.5*norm2(u);
>   e=tau*u;*/
>   totL *= -1;
> REPORT_SECTION
>   report << "s\n" << exp(beta)/(1+exp(beta)) << "\n";
>   report << "tau\n" << tau << "\n";
>   report << "cmu\n" << c << "\n";
>   report << "parms\n" << parms << "\n";
> PRELIMINARY_CALCS_SECTION
>   for(i=0;i<nyears;i++){
>     effdat[i]/=1000;
>   }
>   for(i=0;i<(nages-1);i++){
>     for(j=0;j<nages;j++){
>       pcohort[i][j]=-1;
>     }
>   }
>   for(i=0;i<nyears;i++){
>     for(j=0;j<nages;j++){
>       fcohort[i][j]=-1;
>     }
>   }
>   //get the partial cohorts first
>   for(i=0;i<(nages-1);i++){
>     for(j=0;j<nages;j++){
>       if(j>i){
>         pcohort[nages-j-1+i][j] = dat[i][j];
>       }
>     }
>   }
>   //now get the "full" cohorts
>   for(i=0;i<nyears;i++){
>     for(j=0;j<nages;j++){
>       if(i<=(nyears-nages)) fcohort[i][j]=dat[i+j][j];
>         else if((j+i)<nyears) fcohort[i][j]=dat[i+j][j];
>       }
>   }
>   //now put all the cohort data together in a single matrix
>   for(i=0;i<(nyears+nages-1);i++){
>     for(j=0;j<nages;j++){
>       if(i<(nages-1))  fdat[i][j]=pcohort[i][j];
>       else             fdat[i][j]=fcohort[i-(nages-1)][j];
>     }
>   }
>
> /********************* end tpl file ******************************/
>
> /********************* start dat file ******************************/
> #nyears
> 12
> #nages
> 3
> #parameters
> 14
> #effort
> 379 233 829 1012 807 698 859 137 779 574 460 684
> #telnyears
> 4
> #numtagged
> 100 100 100 100
> #numrecovered
> 13 19 17 11
> #harvest data
> 28 13 9
> 15 8 4
> 49 24 13
> 53 27 13
> 42 24 12
> 36 16 9
> 33 18 8
> 5 3 1
> 29 13 8
> 18 12 5
> 16 7 4
> 23 12 5
>
> /********************* end dat file ******************************/
> /********************* start pin file ******************************/
> #logparms
> 5.45959 5.86647 5.92426 5.85793 5.81114 5.69709 5.5835 5.60212 5.39816
> 5.30827 5.34711 5.17615 5.32301 5.11799
> #beta
> 0.322773
> #logtau
> -1.20397
> #logcmu
> -1.60944
> #e
> 0 0 0 0 0 0 0 0 0 0 0
>
> /********************* end pin file ******************************/
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> -----------------------------
> Chris Gast
> cmgast at gmail.com
>
> _______________________________________________
> Users mailing list
> Users at admb-project.org
> http://lists.admb-project.org/mailman/listinfo/users
>
>



More information about the Users mailing list