[ADMB Users] mcmc with a high N

James R. Bence bence at msu.edu
Thu Jul 5 11:06:01 PDT 2012

Hi Ian and Edgar and all:

Just my two cents on convergence.  The MCMC literature is not always 
clear about what is meant by "convergence".  Strictly speaking, I think 
convergence means that the chain has reached its stationary distribution.  
Often, however, when we talk about needing a chain of a particular length 
based on convergence diagnostics, we mean how long a chain is needed to 
carry a reasonable amount of information about the posterior.  Some things 
called convergence diagnostics address how long your burn-in should be, 
and others how long the chain should be after the burn-in is dropped, and 
clearly it matters which is meant.
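
To make the distinction concrete, here is a minimal Python sketch (not ADMB 
code). It uses a toy AR(1) series as a stand-in for an MCMC trace; the 
starting value, the burn-in of 500 steps and all function names are 
assumptions made for illustration. The first few steps show the burn-in 
question (the starting value still dominates), while the effective sample 
size of the retained part speaks to how long the chain must be after burn-in.

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

def effective_sample_size(x, max_lag=200):
    """Crude ESS: n / (1 + 2 * sum of autocorrelations), with the sum
    truncated at the first non-positive autocorrelation."""
    n = len(x)
    total = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = autocorr(x, lag)
        if rho <= 0:
            break
        total += rho
    return n / (1.0 + 2.0 * total)

def ar1_chain(n, phi, start, seed):
    """Toy stand-in for an MCMC trace: AR(1) around 0, begun at `start`."""
    rng = random.Random(seed)
    x = [start]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

chain = ar1_chain(5000, phi=0.9, start=50.0, seed=1)
burn_in = 500                 # assumed burn-in length, chosen by eye
kept = chain[burn_in:]        # samples retained after dropping burn-in

early_mean = sum(chain[:20]) / 20      # still pulled toward the start value
kept_mean = sum(kept) / len(kept)      # should sit near the true mean of 0
ess = effective_sample_size(kept)      # far fewer than len(kept)

print(f"mean of first 20 steps: {early_mean:.2f}")
print(f"mean after burn-in:     {kept_mean:.2f}")
print(f"kept {len(kept)} samples, ESS roughly {ess:.0f}")
```

A diagnostic aimed at burn-in asks when the first number stops mattering; a 
diagnostic aimed at chain length asks whether the ESS is large enough.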

So if you mean that you have to go 6 million steps to get to a stationary 
distribution, then I totally agree with Ian that running a bunch of shorter 
chains is unlikely to help or get you what you want.  You would have to 
drop the first 6 million steps of every chain as burn-in before you got to 
any useful info.  On the other hand, if you mean the diagnostics are telling 
you that you need 6 million steps after the burn-in, one approach would be 
to run a bunch of parallel chains with different starting points on 
different processors (I think that is what you are suggesting).  You could 
then combine these shorter chains to provide an estimate of the posterior.  
But you would need to drop an appropriate burn-in from each of the shorter 
chains, and it is not clear whether that burn-in would be much shorter than, 
say, 500,000.  So whether it makes sense to run parallel chains depends 
critically on whether the burn-in needed to reach a stationary distribution 
is much shorter than your proposed chain length, and on what this does to 
your computing time.  In any case, the shorter parallel chains will cost 
more total computer time (each chain repeats the burn-in) but could 
potentially be done in less real time.
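
The trade-off can be put in numbers with a small Python sketch. The burn-in 
lengths below are purely hypothetical (the thread does not say what burn-in 
is needed), and the sketch assumes that run time is proportional to the 
number of steps and that all chains run at the same time.

```python
def chain_costs(total_kept, n_chains, burn_in):
    """Steps needed when `total_kept` post-burn-in samples are split evenly
    across `n_chains` chains, each of which discards `burn_in` steps."""
    kept_per_chain = -(-total_kept // n_chains)  # ceiling division
    steps_per_chain = burn_in + kept_per_chain
    return {
        "steps_per_chain": steps_per_chain,         # proxy for real time
        "total_steps": n_chains * steps_per_chain,  # proxy for computer time
    }

for burn_in in (100_000, 500_000):  # hypothetical burn-in lengths
    single = chain_costs(6_000_000, 1, burn_in)
    split = chain_costs(6_000_000, 12, burn_in)
    print(f"burn-in {burn_in}: single chain {single}, 12 chains {split}")
```

With a 100,000-step burn-in, the 12 chains take 7,200,000 steps in total 
against 6,100,000 for one chain, but each chain is only 600,000 steps long. 
With a 500,000-step burn-in, the total doubles to 12,000,000 steps against 
6,500,000, and each chain is 1,000,000 steps long.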

None of the above is intended to disagree with Ian's final comment that you 
may want to try to reduce correlations among parameters.  A chain of 
6 million seems really long, and recasting the problem may be the best bet.
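
As a rough illustration of how one might look for the correlations in 
question, the Python sketch below finds the most strongly correlated pair of 
parameters in a set of saved draws. The draws here are synthetic and the 
parameter names (log_q, log_R0, sigma) are hypothetical; in practice the 
lists would be filled with values exported from a short MCMC run.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def most_correlated(draws):
    """Return (pair, r) for the parameter pair with the largest |r|.
    `draws` maps a parameter name to its list of sampled values."""
    best = max(
        combinations(draws, 2),
        key=lambda p: abs(pearson(draws[p[0]], draws[p[1]])),
    )
    return best, pearson(draws[best[0]], draws[best[1]])

# Synthetic draws: log_R0 is built to trade off strongly against log_q,
# while sigma is independent of both.
rng = random.Random(7)
log_q = [rng.gauss(0.0, 1.0) for _ in range(2000)]
draws = {
    "log_q": log_q,
    "log_R0": [-0.95 * q + 0.2 * rng.gauss(0.0, 1.0) for q in log_q],
    "sigma": [rng.gauss(0.0, 1.0) for _ in range(2000)],
}

pair, r = most_correlated(draws)
print(pair, round(r, 2))
```

A pair with |r| close to 1 is a natural candidate for reparameterizing, or 
for fixing or dropping one of the two parameters.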

At 01:05 PM 7/5/2012, Ian Taylor wrote:
>Hi Edgar,
>You should be able to set a different random number seed for MCMC using 
>the -mcseed input. You can also theoretically change the starting point 
>for the MCMC using -mcpin. However, multiple short MCMC chains are not a 
>replacement for a single long chain, and if the convergence diagnostics 
>suggest 6 000 000 samples, I think you are very unlikely to get a good 
>approximation to the posterior with 12 runs of 500 000.
>I would say that a more fruitful path would be to look at which parameters 
>were the most correlated with each other or had the most autocorrelation 
>in the shorter MCMC and look at reparameterizing the model or reducing the 
>number of parameters.
>On Thu, Jul 5, 2012 at 9:00 AM, Edgar Gonzalez 
><edgarjgonzalez at ymail.com> wrote:
>Hi everyone:
>I'm trying to run mcmc on ADMB. I ran a pre-run to estimate the Total N 
>required to obtain the posterior distribution of my parameters. I got a 
>rounded N of 6 000 000. As this would take weeks (months?) to do, I'd like 
>to know if there is a way to partition this into 12 runs with N = 500 000 
>and to assemble them into a single run. As far as I know, the mcmc 
>involves random walks, but when I ran two mcmc's I got the same walk (the 
>traces look the same).
>Thanks in advance,
>Edgar J. González
>Ecology and Natural Resources Dept.
>Science Faculty, UNAM
>Users mailing list
>Users at admb-project.org

Jim Bence
Dept. of Fisheries and Wildlife
Michigan State University