[ADMB Users] mcmc with a high N
James R. Bence
bence at msu.edu
Thu Jul 5 11:06:01 PDT 2012
Hi Ian and Edgar and all:
Just my two cents on convergence. The MCMC literature is not always
clear about what is meant by "convergence". Strictly speaking, I think
convergence is supposed to mean that the chain has converged to its
stationary distribution. Often, however, when we talk about needing a chain
of a particular length based on convergence diagnostics, we mean how long a
chain is needed to get a reasonable amount of information about the
posterior. Anyway, some things called convergence diagnostics get at how
long your burn-in should be, and others at how long the chain should be
after dropping the burn-in, and clearly it matters which is meant.
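To make the two questions concrete, here is a minimal Python sketch that keeps them separate: one function drops a burn-in, and another gives a rough effective sample size for what is left. The AR(1) toy chain, the burn-in of 500, and the function names are all assumptions for illustration; nothing here is ADMB output, and the effective sample size estimator is a simple truncated-autocorrelation version, not any particular package's diagnostic.

```python
import random


def split_chain(chain, burn_in):
    """Return (discarded, retained) parts of a chain for a given burn-in."""
    return chain[:burn_in], chain[burn_in:]


def effective_sample_size(x, max_lag=1000):
    """Rough ESS: n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum is truncated at the first non-positive sample autocorrelation.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


def ar1_chain(n, rho, start, seed):
    """Toy autocorrelated chain (assumption: stands in for real MCMC draws)."""
    rng = random.Random(seed)
    x = start
    out = []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


# Question 1: how much to drop (here a hypothetical burn-in of 500 steps).
chain = ar1_chain(n=5000, rho=0.9, start=50.0, seed=1)
discarded, retained = split_chain(chain, burn_in=500)

# Question 2: how much information the retained part carries.
ess = effective_sample_size(retained)
print(len(retained), round(ess))
```

With strong autocorrelation the effective sample size comes out far below the number of retained steps, which is the sense in which a diagnostic can ask for a very long chain even after the burn-in is gone.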
So if you mean that you have to go 6 million steps to get to a stationary
distribution then I totally agree with Ian that running a bunch of shorter
chains is unlikely to help or get you what you want. You should be
dropping the first 6 million steps as part of a burn-in from every chain
before you get to useful info. On the other hand, if you mean the
diagnostics are telling you that you need 6 million steps after the burn-in, one
approach would be to run a bunch of parallel chains with different starting
points on different processors (I think that is what you are
suggesting). You could then combine these shorter chains to provide an
estimate of the posterior. But you would need to drop an appropriate
burn-in from each of the shorter chains and it is not clear whether this
would be much shorter than, say, 500,000 steps. So whether it makes sense to
run parallel chains depends critically on whether the number of steps needed
to reach the stationary distribution is much smaller than your proposed chain
length, and on what this does to your computing time. In any case the shorter
parallel chains will cost more total computer time but could potentially be
done in less real time.
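The trade-off in the paragraph above can be sketched in a few lines of Python. The burn-in of 100,000 steps is a made-up number purely for illustration (the real value would have to come from your own diagnostics), and the function names are hypothetical.

```python
def pool_chains(chains, burn_in):
    """Drop the first `burn_in` draws from every chain, then concatenate."""
    pooled = []
    for chain in chains:
        if len(chain) <= burn_in:
            raise ValueError("chain is no longer than its burn-in")
        pooled.extend(chain[burn_in:])
    return pooled


def chain_budget(n_keep, burn_in, n_chains):
    """Return (steps per chain, total steps over all chains).

    Every chain pays the full burn-in; the retained draws are split evenly.
    Steps per chain is roughly the real (wall-clock) cost if each chain has
    its own processor; total steps is the computer time.
    """
    keep_per_chain = -(-n_keep // n_chains)  # ceiling division
    per_chain = burn_in + keep_per_chain
    return per_chain, n_chains * per_chain


# Assumed numbers: 6,000,000 retained steps, hypothetical 100,000 burn-in.
print(chain_budget(6_000_000, 100_000, 1))   # one long chain
print(chain_budget(6_000_000, 100_000, 12))  # twelve parallel chains
```

Under these assumed numbers, twelve chains need 600,000 steps each rather than 500,000, and 7.2 million steps in total instead of 6.1 million, but the longest single run is about a tenth of the single-chain run. If the burn-in were itself in the millions, the per-chain length would barely shrink and the parallel approach would buy very little.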
None of the above is intended to disagree with Ian's final comment that you
may want to try to reduce correlations etc. A chain of 6 million seems
really long and recasting the problem may be the best bet.
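As a rough way to act on that suggestion, the following Python sketch flags pairs of parameters whose draws are strongly correlated in a short pilot chain. It assumes the saved draws have already been read into plain lists, one per parameter; the parameter names, the toy numbers, and the 0.9 threshold are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (neither constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(draws, threshold=0.9):
    """List (name_a, name_b, r) with |r| >= threshold, strongest first.

    `draws` maps a parameter name to its list of MCMC draws.
    """
    names = sorted(draws)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(draws[a], draws[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    flagged.sort(key=lambda t: -abs(t[2]))
    return flagged


# Toy draws with made-up parameter names, for illustration only.
pilot = {
    "log_q": [1.0, 2.0, 3.0, 4.0, 5.0],
    "log_r0": [2.0, 4.0, 6.0, 8.0, 10.5],
    "sigma": [3.0, 1.0, 4.0, 1.0, 5.0],
}
for a, b, r in correlated_pairs(pilot):
    print(a, b, round(r, 3))
```

Pairs that show up here are natural candidates for reparameterizing or for dropping one of the two parameters.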
At 01:05 PM 7/5/2012, Ian Taylor wrote:
>You should be able to set a different random number seed for MCMC using
>the -mcseed input. You can also theoretically change the starting point
>for the MCMC using -mcpin. However, multiple short MCMC chains are not a
>replacement for a single long chain, and if the convergence diagnostics
>suggest 6 000 000 samples, I think you are very unlikely to get a good
>approximation to the posterior with 12 runs of 500 000.
>I would say that a more fruitful path would be to look at which parameters
>were the most correlated with each other or had the most autocorrelation
>in the shorter MCMC and look at reparameterizing the model or reducing the
>number of parameters.
>On Thu, Jul 5, 2012 at 9:00 AM, Edgar Gonzalez
><edgarjgonzalez at ymail.com> wrote:
>I'm trying to run mcmc on ADMB. I ran a pre-run to estimate the Total N
>required to obtain the posterior distribution of my parameters. I got a
>rounded N of 6 000 000. As this would take weeks (months?) to do, I'd like
>to know if there is a way to partition this into 12 runs with N = 500 000
>and to assemble them into a single run. As far as I know, the mcmc
>involves random walks, but when I ran two mcmc's I got the same walk (the
>traces look the same).
>Thanks in advance,
>Edgar J. González
>Ecology and Natural Resources Dept.
>Science Faculty, UNAM
>Users mailing list
>Users at admb-project.org
Dept. of Fisheries and Wildlife
Michigan State University