[ADMB Users] Problems estimating variance component(s)

Sat May 1 12:44:50 PDT 2010

Hello,

I'm still relatively new to ADMB--that information might be helpful in
diagnosing the problem I seem to be encountering.

The setup is this:  I have a relatively simple product-multinomial model
that I can fit well (with simulated data) when all effects are considered
fixed.  The data are age-specific harvest numbers of a fictional animal.  I
am simulating stochasticity in both survival (logistic transformation of
\beta+e_i, e_i~N(0,tau^2)) and in my vulnerability parameter.  I have a
single vulnerability coefficient c, such that the probability of harvest is
1-exp(-(c+e_i)*f_i), for hunter-effort level f_i in year i.  I am currently
working with 12 simulated years of data and three age classes.  For the
discussion below, I am limiting myself to fitting the single random effect
for survival (rather than both random effects for survival and
vulnerability).  (I am also simulating 4 years of binomial telemetry data to
provide information on the probability of harvest.)
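
To make the two stochastic components concrete, here is a minimal Python
sketch of the transformations just described.  The function names
(survival, harvest_prob) and the numeric values are illustrative
assumptions and do not appear in the .tpl file.

```python
import math

def survival(beta, e_i):
    """Logistic transform of beta + e_i: a survival probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(beta + e_i)))

def harvest_prob(c, f_i, e_i=0.0):
    """Probability of harvest, 1 - exp(-(c + e_i) * f_i), at effort level f_i."""
    return 1.0 - math.exp(-(c + e_i) * f_i)

if __name__ == "__main__":
    # Illustrative values only (assumptions, not fitted estimates).
    print(survival(0.3, 0.0))
    print(harvest_prob(0.2, 0.379))
```

With e_i = 0 these reduce to the fixed-effects versions of the same
quantities.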

The problem is this:  ADMB seems to have trouble estimating the single
variance component.  The report either returns my initial value unchanged
(I'm using the true simulated values as initial estimates), or ADMB pushes
the estimate of tau as close to zero as the bounds allow (of course, it's
constrained to be positive).  I have tried many avenues of attack from the
manual and in other examples, and none of them seem to alleviate the
problem.  Having run out of options, I tried replacing the "prior"
distribution portion of the log-likelihood with

totL += (.5)*(-(nyears-1)*log(tau)-.5*norm2(e/tau));

instead of the usual

totL += (-(nyears-1)*log(tau)-.5*norm2(e/tau));

and the variance component estimates are actually OK--they are at least on
the correct order of magnitude, and not precisely equal to my initial
parameter estimate.  I don't really understand why this down-weighting is
successful (or if it is truly successful or I have just gotten lucky
somehow).  I would also really like to avoid this seemingly ad hoc
procedure, since I think ADMB is powerful enough to fit this model without
resorting to such things--it seems to fit more complex models relatively
easily.
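
As a check on the "usual" term, the following works out what it implements,
assuming the e_i are independent N(0,tau^2) and writing n for nyears
(notation introduced here, not taken from the .tpl file):

```latex
\begin{aligned}
\sum_{i=1}^{n-1} \log p(e_i \mid \tau)
  &= -(n-1)\log\tau - \frac{1}{2\tau^{2}}\sum_{i=1}^{n-1} e_i^{2} + \text{const}, \\
\frac{\partial}{\partial\tau}\left[-(n-1)\log\tau
  - \frac{1}{2\tau^{2}}\sum_{i=1}^{n-1} e_i^{2}\right] = 0
  &\iff \hat\tau^{2} = \frac{1}{n-1}\sum_{i=1}^{n-1} e_i^{2}.
\end{aligned}
```

Multiplying the whole term by 0.5 leaves this maximizer unchanged for fixed
e.  What changes is the weight of the term relative to datL and telL, and
the down-weighted term is no longer the log of a normalized N(0,tau^2)
density for e.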

For the record, I have tried the transformation procedure of using a
standard N(0,1) density for the random effect and then scaling up to match
the true N(0,tau^2) distribution.  In this case, the "trick" only seems to
hasten ADMB's arrival at loglikelihood=nan, from which it does not recover.
I have also tried adjusting the bounds on the constrained parameters, and
met with no success.
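
For reference, the relationship between the two parameterizations can be
checked numerically.  This is a minimal Python sketch, not ADMB code; the
function names and test values are assumptions made for illustration.  With
e = tau*u, the N(0,tau^2) log-density of e and the N(0,1) log-density of u
differ only by the log-Jacobian term -n*log(tau).

```python
import math

def centered_logdens(e, tau):
    """Sum of log N(0, tau^2) densities for effects e (additive constants dropped)."""
    return -len(e) * math.log(tau) - 0.5 * sum((x / tau) ** 2 for x in e)

def scaled_logdens(u):
    """Sum of log N(0, 1) densities for standardized effects u (constants dropped)."""
    return -0.5 * sum(x * x for x in u)

def log_jacobian(n, tau):
    """Log-Jacobian of the change of variables e = tau * u for n effects."""
    return -n * math.log(tau)

if __name__ == "__main__":
    tau = 0.3                      # illustrative value
    u = [0.5, -1.2, 0.1]           # illustrative standardized effects
    e = [tau * x for x in u]       # scaled-up effects used by the data model
    print(centered_logdens(e, tau))
    print(scaled_logdens(u) + log_jacobian(len(u), tau))
```

In the scaled form, e = tau*u has to be computed before e is used in the
data likelihood.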

I have pasted a stripped-down version of my .tpl file (the working version,
with the down-weighting intact), with accompanying .dat and .pin files, in
the hope that someone would be kind enough to examine them.  I am aware that
there are probably some inefficiencies here, but I'm not yet concerned about
optimizing the program for run-time.  Of course, I'm only posting one
simulated dataset---this phenomenon occurs for hundreds of simulations in a
row.

Thanks very much,

Chris Gast

/********************* start tpl file ******************************/

DATA_SECTION
  init_int nyears;
  init_int nages;
  init_int m;
  init_vector effdat(0,nyears-1);
  init_int telnyears;
  init_vector numtagged(0,telnyears-1);
  init_vector numrecovered(0,telnyears-1);
  init_imatrix dat(0,nyears-1,0,nages-1);

  imatrix fcohort(0,nyears-1,0,nages-1);
  imatrix pcohort(0,nages-2,0,nages-1);
  imatrix fdat(0,nyears+nages-2,0,nages-1);

  int i;
  int j;

PARAMETER_SECTION
  init_vector logparms(0,m-1,1);
  init_bounded_number beta(-2,1.7,1);
  init_bounded_number logcmu(-6,1,1);
  init_bounded_number logtau(-10,1.5,2);
  random_effects_vector e(0,nyears-2,2);
  number tau;
  vector parms(0,m-1);
  number prob;
  number prevq;
  number prod;
  number datL;
  number telL;
  number auxL;
  number extraL;
  number s;
  number c;
  number sprod;
  objective_function_value totL;

PROCEDURE_SECTION
  const double pi=3.14159265358979323846264338327950288;
  double datsum;
  int colrank;
  datL=0.0; auxL=0.0; telL=0.0; tau=mfexp(logtau);
  parms=mfexp(logparms);
  c=mfexp(logcmu);

  cout << "tau=" << tau << ", c=" << c << "\n";

  for(i=0;i<(nyears+nages-1);i++){

    datsum=0;
    datL+=gammln(parms[i]+1);

    prevq=1;
    prod=0;
    colrank=0;
    sprod=1;
    for(j=0;j<nages;j++){
      if(fdat[i][j]!=-1){

        if(i<nages) prob=1-exp(-c*effdat[colrank]);
        else if(i>=nages) prob=1-exp(-c*effdat[i-nages+1+j]);

        if(colrank==0) s=1;
        else if(i<nages)
          s=exp(beta+e[colrank-1])/(1+exp(beta+e[colrank-1]));
        else if(i>=nages)
          s=exp(beta+e[i-nages+1+j-1])/(1+exp(beta+e[i-nages+1+j-1]));

        sprod*=s;

        datsum+=fdat[i][j];

        datL-=gammln(fdat[i][j]+1);

        if(colrank>0) prod+=prevq*sprod*prob;
        else prod=prob;

        //cout << "prod=" << prod << ", prob=" << prob << ", s=" << s
        //     << ", sprod=" << sprod << ", prevq=" << prevq << "\n";

        datL+=fdat[i][j]*log(prevq*sprod*prob);

        colrank++;
        prevq*=(1-prob);
      }
    }
    datL+=(parms[i]-datsum)*log(1-prod) - gammln(parms[i]-datsum+1);
  }

  //now compute radio-telemetry likelihood

  for(i=0;i<telnyears;i++){
    telL+=gammln(numtagged[i]+1) - gammln(numrecovered[i]+1) -
          gammln(numtagged[i]-numrecovered[i]+1) +
          numrecovered[i]*log((1-exp(-c*effdat[nyears-1]))) +
          (numtagged[i]-numrecovered[i])*log(exp(-c*effdat[nyears-1]));
  }

  for(i=0;i<(nyears-1);i++){
    cout << e[i] << "\t";
  }

  cout << "\n\ndatL=" << datL << ", telL=" << telL << ", extraL=" << extraL
       << ", totL=" << -(datL+telL) <<  "\n";
  cout << "cest=" << c << ", betaest=" << beta << ", tauest=" << tau
       << "\n\n\n\n\n\n\n\n\n";

  totL=(datL+telL);

  totL += (.5)*(-(nyears-1)*log(tau)-.5*norm2(e/tau));

  /*totL += -.5*norm2(u);
  e=tau*u;*/

  totL *= -1;

REPORT_SECTION
  report << "s\n" << exp(beta)/(1+exp(beta)) << "\n";
  report << "tau\n" << tau << "\n";
  report << "cmu\n" << c << "\n";
  report << "parms\n" << parms << "\n";

PRELIMINARY_CALCS_SECTION

  for(i=0;i<nyears;i++){
    effdat[i]/=1000;
  }

  for(i=0;i<(nages-1);i++){
    for(j=0;j<nages;j++){
      pcohort[i][j]=-1;
    }
  }

  for(i=0;i<nyears;i++){
    for(j=0;j<nages;j++){
      fcohort[i][j]=-1;
    }
  }
  //get the partial cohorts first

  for(i=0;i<(nages-1);i++){
    for(j=0;j<nages;j++){
      if(j>i){
        pcohort[nages-j-1+i][j] = dat[i][j];
      }
    }
  }

  //now get the "full" cohorts
  for(i=0;i<nyears;i++){
    for(j=0;j<nages;j++){
      if(i<=(nyears-nages)) fcohort[i][j]=dat[i+j][j];
      else if((j+i)<nyears) fcohort[i][j]=dat[i+j][j];
    }
  }

  //now put all the cohort data together in a single matrix
  for(i=0;i<(nyears+nages-1);i++){
    for(j=0;j<nages;j++){
      if(i<(nages-1))  fdat[i][j]=pcohort[i][j];
      else             fdat[i][j]=fcohort[i-(nages-1)][j];
    }

  }

/********************* end tpl file ******************************/

/********************* start dat file ******************************/

#nyears
12

#nages
3

#parameters
14

#effort
379 233 829 1012 807 698 859 137 779 574 460 684

#telnyears
4

#numtagged
100 100 100 100

#numrecovered
13 19 17 11

#harvest data
28 13 9
15 8 4
49 24 13
53 27 13
42 24 12
36 16 9
33 18 8
5 3 1
29 13 8
18 12 5
16 7 4
23 12 5

/********************* end dat file ******************************/

/********************* start pin file ******************************/

#logparms
5.45959 5.86647 5.92426 5.85793 5.81114 5.69709 5.5835 5.60212 5.39816
5.30827 5.34711 5.17615 5.32301 5.11799

#beta
0.322773

#logcmu
-1.60944

#logtau
-1.20397

#e
0 0 0 0 0 0 0 0 0 0 0

/********************* end pin file ******************************/

-----------------------------
Chris Gast
cmgast at gmail.com