[ADMB Users] mcmc with a high N

Thu Jul 5 12:22:40 PDT 2012

Hi James and Ian:

I should certainly have discussed the burn-in period. According to the Raftery diagnostic I ran, I need 10 000 steps of burn-in followed by 6 000 000 steps, saving every 1 500 steps, to obtain the posterior distribution of my parameters. So I was thinking about running parallel MCMCs: 12 runs of 510 000 steps each (10 000 of burn-in plus 500 000), saving every 1 500 steps. I now see that I need to change the MCMC seed for each run. As the pin (starting point) I'm using the maximum likelihood vector previously obtained for my model. I'm going to run these MCMCs and see what I get. At the same time I'm going to try to reparameterize my model, although I'm not sure how to do this. Is there a book, paper or webpage where I can look for reparameterization tips?
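
The plan above (12 runs, each with its own seed) could be scripted roughly as follows. This is only a sketch: the executable name ./mymodel, the base seed 1001 and the chain_NN directory names are hypothetical placeholders, and it assumes the ADMB options -mcmc, -mcsave, -mcseed and -mcpin behave as discussed in this thread. It only builds the command lines and does not launch anything.

```python
def build_chain_commands(model="./mymodel", n_chains=12, n_steps=510_000,
                         save_every=1_500, base_seed=1001, pin_file=None):
    """Return (working_dir, argv) pairs, one per chain, each with its own seed.

    model, base_seed and the directory names are hypothetical placeholders.
    """
    jobs = []
    for i in range(n_chains):
        workdir = f"chain_{i + 1:02d}"          # one directory per chain
        argv = [model,
                "-mcmc", str(n_steps),          # total steps for this chain
                "-mcsave", str(save_every),     # keep every save_every-th step
                "-mcseed", str(base_seed + i)]  # distinct seed per chain
        if pin_file is not None:
            argv += ["-mcpin", pin_file]        # optional starting-point file
        jobs.append((workdir, argv))
    return jobs


if __name__ == "__main__":
    for workdir, argv in build_chain_commands():
        print(workdir, " ".join(argv))
```

Each command would presumably be run in its own directory (with a copy of the model files), since runs sharing a directory are likely to overwrite each other's output.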

Thanks a lot for your help.

Cheers, 

Edgar J. González
Ecology and Natural Resources Dept.
Science Faculty, UNAM

________________________________
 From: James R. Bence <bence at msu.edu>
To: Ian Taylor <ian.taylor at noaa.gov>; Edgar Gonzalez <edgarjgonzalez at ymail.com> 
CC: ADMB <users at admb-project.org> 
Sent: Thursday, July 5, 2012 13:06:01
Subject: Re: [ADMB Users] mcmc with a high N

Hi Ian and Edgar and all:

Just my two cents on convergence.  The MCMC literature is not
always clear about what is meant by "convergence".  Strictly
speaking, I think convergence is supposed to mean that the chain
has converged to a stationary distribution.  However, often when we
talk about needing a chain of a particular length based on convergence
diagnostics, we mean how long a chain is needed to have a reasonable
amount of information about the posterior.  Some things called convergence
diagnostics get at how long your burn-in should be, and others at how long
the chain should be after dropping the burn-in, and clearly it matters
which is meant.

So if you mean that you have to go 6 million steps to get to a stationary
distribution then I totally agree with Ian that running a bunch of
shorter chains is unlikely to help or get you what you want.  You
should be dropping the first 6 million steps as part of a burn-in from
every chain before you get to useful info.  On the other hand if you
mean the diagnostics are telling you need 6 million steps after the
burn-in, one approach would be to run a bunch of parallel chains with
different starting points on different processors (I think that is what
you are suggesting).  You could then combine these shorter chains to
provide an estimate of the posterior.  But you would need to drop an
appropriate burn-in from each of the shorter chains, and it is not clear
whether this would be much shorter than, say, 500,000.  So whether it
makes sense to run parallel chains depends critically on whether the
approach to a stationary distribution takes much less than your proposed
chain length, and on what this does to your computing time. In any case the
shorter parallel chains will cost more total computer time but could
potentially be done in less real time.
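
To make the trade-off concrete, here is a rough sketch of the arithmetic. It assumes every chain needs the same burn-in, that the post-burn-in steps are split evenly across chains, and that real time is proportional to the length of one chain when each chain has its own processor. The function name chain_budget is just illustrative.

```python
import math


def chain_budget(post_burnin_total, burn_in, n_chains):
    """Steps per chain, total steps computed, and steps discarded as burn-in."""
    per_chain = burn_in + math.ceil(post_burnin_total / n_chains)
    return {
        "per_chain": per_chain,            # proxy for real (wall-clock) time
        "total": per_chain * n_chains,     # proxy for total computer time
        "discarded": burn_in * n_chains,   # burn-in is paid once per chain
    }


if __name__ == "__main__":
    for burn_in in (10_000, 500_000):
        for n_chains in (1, 12):
            print(burn_in, n_chains, chain_budget(6_000_000, burn_in, n_chains))
```

With a burn-in of 10,000, twelve chains compute 6,120,000 steps in total instead of 6,010,000 for one chain, but each chain is only 510,000 steps long. With a burn-in of 500,000, twelve chains compute 12,000,000 steps instead of 6,500,000, and half of that is thrown away.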

None of the above is intended to disagree with Ian's final comment that
you may want to try to reduce correlations etc.  A chain of 6
million steps seems really long, and recasting the problem may be the
best bet.
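
As a starting point for finding which parameters to recast, something like the following could be run on a shorter MCMC. It is a minimal sketch that assumes the saved draws have already been read into plain Python lists, one list per parameter; the function names are illustrative, and how the draws are exported from ADMB is not covered here.

```python
from statistics import mean


def lag_autocorrelation(x, lag=1):
    """Sample autocorrelation of one parameter's saved draws at a given lag."""
    m = mean(x)
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return num / denom


def correlation(x, y):
    """Pearson correlation between the saved draws of two parameters."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5
```

Pairs of parameters with correlation near 1 or -1, and parameters whose autocorrelation stays high even between saved draws, would be the first candidates for reparameterization.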

At 01:05 PM 7/5/2012, Ian Taylor wrote:

>Hi Edgar,
>
>You should be able to set a different random number seed for MCMC using
>the -mcseed input. You can also theoretically change the starting point
>for the MCMC using -mcpin. However, multiple short MCMC chains are not a
>replacement for a single long chain, and if the convergence diagnostics
>suggest 6 000 000 samples, I think you are very unlikely to get a good
>approximation to the posterior with 12 runs of 500 000.
>
>I would say that a more fruitful path would be to look at which
>parameters were the most correlated with each other or had the most
>autocorrelation in the shorter MCMC and look at reparameterizing the
>model or reducing the number of parameters.
>-Ian
>
>On Thu, Jul 5, 2012 at 9:00 AM, Edgar Gonzalez
><edgarjgonzalez at ymail.com> wrote:
>
>Hi everyone:
>
>I'm trying to run mcmc on ADMB. I ran a pre-run to estimate the Total
>N required to obtain the posterior distribution of my parameters. I got a
>rounded N of 6 000 000. As this would take weeks (months?) to do, I'd
>like to know if there is a way to partition this into 12 runs with N =
>500 000 and to assemble them into a single run. As far as I know, the
>mcmc involves random walks, but when I ran two mcmc's I got the same walk
>(the traces look the same).
>
>Thanks in advance,
>
>Edgar J. González
>Ecology and Natural Resources Dept.
>Science Faculty, UNAM
>
>_______________________________________________
>
>Users mailing list
>
>Users at admb-project.org
>
>http://lists.admb-project.org/mailman/listinfo/users
Jim Bence
Dept. of Fisheries and Wildlife 
Michigan State University
http://www.msu.edu/user/bence/