<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 03/13/2015 05:04 PM, Shanae Allen -
      NOAA Affiliate wrote:<br>
      <br>
      Yes it is probably more natural to multiply by the scaling
      factor, but that day I decided to divide.<br>
      <br>
      <br>
    </div>
    <blockquote
cite="mid:CABj5QwGEpmzUXk=S2Kho1cDK6B9s8Uhy9SjvTARzuhj2meb9WA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div>Thank you Dave for the detailed response. I'm still
            digesting the first part, but I'm making progress. As for
            the set_scalefactor() function, I thought that's what it was
            doing but this page: <a moz-do-not-send="true"
href="http://www.admb-project.org/examples/function-minimization/parameter-scaling">http://www.admb-project.org/examples/function-minimization/parameter-scaling</a>
            suggests otherwise ( '...which makes the function minimizer
            work internally with b_internal = 0.001*b' and .001 is the
            scale factor).<br>
          </div>
          Thanks for taking the time to respond so thoroughly!<br>
        </div>
        Shanae<br>
        <br>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Fri, Mar 13, 2015 at 11:31 AM, dave
          fournier <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:davef@otter-rsch.com" target="_blank">davef@otter-rsch.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>On 03/12/2015 06:17 PM, Shanae Allen - NOAA Affiliate
                wrote:<br>
                <br>
                Hi,<br>
                <br>
                I posted this response to the list in case others are
                interested in<br>
                this stuff.  You are correct that scaling alone is not
                the final solution<br>
                to reparameterizing functions in such a way that they
                are easy to minimize.<br>
                <br>
                I think the gold standard for this, if you were
                omniscient, is the Morse<br>
                lemma. It says that for a "nice" function f of n
                variables ( parameters )<br>
                say f(x_1,x_2,...,x_n) which has a minimum, with the
                value of the <br>
                function at the minimum being b, then there exists a
                reparameterization<br>
                of the function given by the n functions g_i with<br>
                <br>
                       x_i = g_i(y_1,...,y_n),   i=1,...,n<br>
                <br>
                such that<br>
                <br>
                  f(g_1(y_1,...,y_n),...,g_n(y_1,...,y_n)) = b + y_1^2 +
                ... + y_n^2   (1)<br>
                <br>
                So the functions g_i provide a "perfect"
                reparameterization of f.<br>
                <br>
                Of course to find the g_i you already need to know where
                the minimum<br>
                is located so by itself the Morse lemma is not very
                useful for this<br>
                problem.  It does however give one an idea of what we
                are trying to<br>
                accomplish by reparameterizing. Rescaling is really the
                last step in that if<br>
                <br>
                       f(x_1,...,x_n) = b + a_1*x_1^2 + ... + a_n*x_n^2<br>
                <br>
                then setting  y_i^2 = a_i*x_i^2  or y_i=sqrt(a_i)*x_i<br>
                <br>
                provides the required reparameterization.<br>
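                <br>
                As a small numerical check of that last rescaling step,
                here is a sketch in Python rather than ADMB. The
                coefficients a_i, the constant b and the test point are
                made up for the illustration; the point is only that
                after setting y_i=sqrt(a_i)*x_i the badly scaled
                quadratic becomes b plus a plain sum of squares:<br>
<pre>
```python
import math

def f(x, a, b):
    """Badly scaled quadratic: b + sum of a_i * x_i^2."""
    return b + sum(ai * xi * xi for ai, xi in zip(a, x))

def to_y(x, a):
    """Rescale: y_i = sqrt(a_i) * x_i."""
    return [math.sqrt(ai) * xi for ai, xi in zip(a, x)]

def to_x(y, a):
    """Inverse rescaling: x_i = y_i / sqrt(a_i)."""
    return [yi / math.sqrt(ai) for ai, yi in zip(a, y)]

def f_rescaled(y, a, b):
    """f written in terms of y; should equal b + sum of y_i^2."""
    return f(to_x(y, a), a, b)

# Made-up example values (not from any real model).
a = [1.0e4, 1.0, 1.0e-4]
b = 3.0
x = [0.01, 2.0, 50.0]
y = to_y(x, a)
sum_of_squares = b + sum(yi * yi for yi in y)

# All three numbers should agree up to rounding.
print(f(x, a, b), f_rescaled(y, a, b), sum_of_squares)
```
</pre>
                In the y coordinates every direction has the same
                curvature, which is the situation equation (1)
                describes.<br>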
                <br>
                In general reparameterization is a bit of an art.  There
                are however<br>
                some general principles.  First is to "undimensionalize"
                the problem.<br>
                I'll give you a few examples.  Suppose the model is<br>
                <br>
                   L_i =  Lmin + (Lmax-Lmin)/(1+exp(-b*(t_i-tmid)))  (2)<br>
                <br>
                Then one should rescale and translate the L_i to go from
                -1 to 1<br>
                and the t_i to go from -1 to 1.  This removes the
                dependence on the<br>
                units used to measure the L's and t's.  Having solved
                the problem<br>
                one must transform the estimates for the Lmin, Lmax, b,
                and tmid<br>
                back to the "real" ones. (Left as an exercise for the
                student.)<br>
                Once you know the transformation you can set this up to
                be done<br>
                automatically in ADMB by making the real parameters
                sdreport variables.<br>
                <br>
                So say you have done that to the model.  You may still
                have trouble.<br>
                This is because Lmin and Lmax are the lower and upper
                asymptote so<br>
                that they are not actually observed.<br>
                <br>
                The next principle with parameterization is that one
                should use parameters<br>
                which are well identified.  A mental test of this is to
                ask yourself<br>
                if you already have a good idea what the values of these
                parameters are.<br>
                For the model above suppose that you have n ordered
                observations<br>
                <br>
                  t_1 < ... < t_n<br>
                <br>
                Let Lone be the true value for L at t=t_1 and Ln be the
                true value for <br>
                L at t=t_n.  Clearly we know a lot more about these
                parameters than<br>
                we do for Lmin and Lmax.  But there is also another
                advantage. <br>
                If we use Lone and Ln as the parameters, the predicted
                values for<br>
                L_1 and L_n are independent of the estimate for b.   In
                other words<br>
                b now just spaces the predicted values between Lone and
                Ln.  <br>
                Using the original parameters Lmin and Lmax you will
                find that<br>
                many different combinations of the parameters produce
                almost the same<br>
                predicted values for L_1 and L_n.  We say that there are
                interactions<br>
                or confounding between the parameters.  We want to
                remove the confounding.<br>
                <br>
                Reducing the problem to the form in equation (1)
                minimizes the confounding.<br>
                <br>
                To see that you really understand this stuff you should
                figure out<br>
                how to rewrite the equation (2) in terms of Lone,Ln,b,
                and tmid.<br>
                <br>
                Now, to see what the set_scalefactor() function does,
                if anything,<br>
                what you should do is to compile the source with
                debugging turned on.<br>
                This enables you to really see what it does, rather than
                relying on what <br>
                someone tells you. (Who knows what is really in the code
                any more?)<br>
                <br>
                Then make a simple example like this<br>
                <br>
                DATA_SECTION<br>
                PARAMETER_SECTION<br>
                  init_bounded_number x(-10,10)<br>
                 !! x.set_scalefactor(200.);<br>
                  init_number y<br>
                  objective_function_value f<br>
                PROCEDURE_SECTION<br>
                  f=0.5*(x*x+y*y);<br>
                <br>
                Bounded numbers are more complicated than unbounded ones
                so this<br>
                is a good example to look at.<br>
                <br>
                If you step through the program you should get to line
                172 in df1b2qnm.cpp<br>
                <br>
                  {<br>
                 172           dvariable vf=0.0;<br>
                 > 173          
                vf=initial_params::reset(dvar_vector(x));<br>
                 174           *objective_function_value::pobjfun=0.0;<br>
                 175           pre_userfunction();<br>
                 176           if ( no_stuff ==0 &&
                quadratic_prior::get_num_quadratic_prior()>0)<br>
                 177           {<br>
                <br>
                The reset function takes the x vector from the function
                minimizer <br>
                and puts the values into the model, rescaling if desired.<br>
                <br>
                Stepping into that code you should eventually get to the
                line<br>
                 538 in model.cpp<br>
                <br>
                 533   const int& ii, const dvariable& pen)<br>
                 534   {<br>
                 535     if (!scalefactor)<br>
                 536       ::set_value(*this,x,ii,minb,maxb,pen);<br>
                 537     else<br>
                 >538      
                ::set_value(*this,x,ii,minb,maxb,pen,scalefactor);<br>
                 539   }<br>
                <br>
                Note that the field scalefactor is non zero. It was set
                by set_scalefactor().<br>
                <br>
                Now step into that line and you get to line 56 in the
                file set.cpp.<br>
                <br>
                 54 void set_value(const prevariable& _x,const
                dvar_vector& v,const int& _ii, <br>
                  55   double fmin, double fmax,const dvariable&
                _fpen,double s)<br>
                  >56 {<br>
                  57   int& ii=(int&)_ii; <br>
                  58   prevariable& x=(prevariable&) _x;<br>
                  59   dvariable& fpen=(dvariable&) _fpen;<br>
                  60   x=boundp(v(ii++),fmin,fmax,fpen,s);<br>
                  61 }<br>
                <br>
                Now finally step into line 60 and you end up at line 76
                in boundfun.cpp.<br>
                <br>
                  75 dvariable boundp(const prevariable& x, double
                fmin, double fmax,const prevariable& _fpen,double s)<br>
                  76 {<br>
                  77   return boundp(x/s,fmin,fmax,_fpen);<br>
                  78 }<br>
                <br>
                You see that at line 77 x is divided by s before being
                sent to the bounding<br>
                function boundp.  So x has come out of the function
                minimizer and<br>
                gets divided by s before going into the model<br>
                <br>
                   minimizer -----> x   ----- x/s ---> y ---> 
                boundp -----> model<br>
                <br>
                To get the initial x value for the minimizer there must
                be a corresponding<br>
                sequence like<br>
                <br>
                   model ----> boundpin ----> y ----> y*s 
                ----> x -----> minimizer<br>
                 <br>
                where boundpin is the inverse function of boundp.<br>
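                <br>
                The two chains can be mimicked outside ADMB with a small
                Python sketch. The bounding map below is a tanh-based
                stand-in that I made up for the illustration; it is not
                the formula ADMB uses in boundp, and the function names
                are hypothetical. Only the order of the divide and
                multiply by s follows the code traced above:<br>
<pre>
```python
import math

def boundp_standin(y, lo, hi):
    """Stand-in map from the real line into (lo, hi); NOT ADMB's boundp."""
    return lo + (hi - lo) * 0.5 * (1.0 + math.tanh(y))

def boundpin_standin(v, lo, hi):
    """Inverse of boundp_standin."""
    return math.atanh(2.0 * (v - lo) / (hi - lo) - 1.0)

def minimizer_to_model(x, lo, hi, s):
    """minimizer -> x -> x/s -> y -> bounding map -> model."""
    return boundp_standin(x / s, lo, hi)

def model_to_minimizer(v, lo, hi, s):
    """model -> inverse bounding map -> y -> y*s -> x -> minimizer."""
    return boundpin_standin(v, lo, hi) * s

lo, hi, s = -10.0, 10.0, 200.0   # bounds and scale factor from the example above
v0 = 2.5                         # made-up starting value for the model parameter

x0 = model_to_minimizer(v0, lo, hi, s)      # what the minimizer would start from
v_back = minimizer_to_model(x0, lo, hi, s)  # what the model would see again
print(x0, v_back)
```
</pre>
                The round trip returns the starting value, and the
                minimizer's x is s times the unscaled y.<br>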
                <br>
                Note that one divides by s, so if you want to make the<br>
                gradient smaller by a factor of 100, one should use<br>
                <br>
                   set_scalefactor(100.);<br>
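                <br>
                The effect on the gradient follows from the chain rule:
                if the model sees y = x/s then df/dx = (df/dy)/s. A
                small finite-difference check in Python, using a made-up
                objective and an unbounded parameter so that the
                bounding function is left out of the picture:<br>
<pre>
```python
def f_model(y):
    """Objective in terms of the model parameter y (made up for the example)."""
    return 0.5 * y * y

def f_minimizer(x, s):
    """Same objective as the minimizer sees it: the model gets y = x/s."""
    return f_model(x / s)

def central_difference(g, x, h):
    """Plain central finite-difference derivative of g at x."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

s = 100.0
y0 = 3.0        # made-up model value
x0 = y0 * s     # corresponding value on the minimizer side

grad_model = central_difference(f_model, y0, 1.0e-4)
grad_minimizer = central_difference(lambda x: f_minimizer(x, s), x0, 1.0e-2)

# The ratio of the two gradients should be close to s.
print(grad_model, grad_minimizer, grad_model / grad_minimizer)
```
</pre>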
                <br>
                <br>
                  Does that make sense?<span class="HOEnZb"><font
                    color="#888888"><br>
                    <br>
                          Dave<br>
                    <br>
                    <br>
                    <br>
                    <br>
                  </font></span></div>
              <div>
                <div class="h5">
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div>Hi Dave, <br>
                        <br>
                        You helped me out a few years ago when I was
                        just starting to use ADMB, so I thought I'd try
                        again. I posted a related question to the ADMB
                        list, but I don't know if it went through. I'm
                        trying to understand when and how to scale
                        parameters. <br>
                        <br>
                        From what I gather there are three reasons to do
                        this: 1) when the likelihood function is highly
                        sensitive to a given parameter (Nocedal and
                        Wright (1999)) 2) when a parameter has a high
                        gradient (Hans), and 3) when the condition
                        number of the Hessian is very large (your post
                        here: <a moz-do-not-send="true"
href="http://www.admb-project.org/examples/function-minimization/parameter-scaling"
                          target="_blank">http://r.789695.n4.nabble.com/Complicated-nls-formula-giving-singular-gradient-message-td3085852.html</a>).

                        I understand these are all related issues,
                        however for my simulation exercise a single
                        'problem parameter' does not always satisfy all
                        of these (e.g., a parameter with a high gradient
                        does not have a relatively high/low eigenvalue).<br>
                        <br>
                      </div>
                      <div>So my question is how to apply scaling
                        factors in a structured way and is it ok to
                        scale many parameters? Also how do you do this
                        when you're fitting many simulated datasets (and
                        starting from many different starting points)?
                        Finally, I'd very much appreciate a reference or
                        code where I can find out what the
                        set_scalefactor function in ADMB does.<br>
                      </div>
                      <div><br>
                        Thank you Dave - any tips would be greatly
                        appreciated!<br>
                        <br>
                        Shanae Allen<br>
                        <br>
                        <br>
                        <br>
                        Nocedal, J., & Wright, S. (1999). Numerical
                        optimization. New York: Springer.
                        <br clear="all">
                        <div><br>
                          -- <br>
                          <div>
                            <div dir="ltr">Shanae Allen-Moran<br>
                              National Marine Fisheries Service<br>
                              110 Shaffer Rd.<br>
                              Santa Cruz, CA 95060<br>
                              Phone: <a moz-do-not-send="true"
                                href="tel:%28831%29%20420-3970"
                                value="+18314203970" target="_blank">(831)
                                420-3970</a><br>
                              Email: <a moz-do-not-send="true"
                                href="mailto:shanae.allen@noaa.gov"
                                target="_blank">shanae.allen@noaa.gov</a>
                              <br>
                              Website: <a moz-do-not-send="true"
                                href="http://swfsc.noaa.gov/SalmonAssessment/"
                                target="_blank">http://swfsc.noaa.gov/SalmonAssessment/</a></div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>