[Developers] Big improvement in the function minimizer with GPU.

Tue May 8 09:59:03 PDT 2012

To get a proof of concept for any programming technique, it is nice to
get a big result fairly easily.  Almost all ADMB users rely on the
function minimizer fmin in the file newfmin.cpp, so improving the
performance of this function in a more or less transparent way
would immediately help a lot of users.

I hacked the newfmin.cpp file to add the BFGS quasi-Newton update,
with the (sort of) inverse Hessian kept on the GPU and the main
calculations done on the GPU.

I tested this with a modified Rosenbrock function with 6144 parameters.
The new setup is both much faster and more stable than the old code
in newfmin. It appears that newfmin uses a different quasi-Newton update,
which is not as efficient for a large number of parameters.

This is the tpl file for the example.

DATA_SECTION
   int n
  !! n=4096+2048;
PARAMETER_SECTION
   init_vector x(1,n);
   objective_function_value f
PROCEDURE_SECTION
   for (int i=1;i<=n/2;i++)
   {
      f+=100.*square(square(x(2*i-1))-x(2*i))+square(x(2*i-1)-1.0);
   }
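
For anyone who wants to try the same test problem outside ADMB, the
objective above can be written as a plain C++ function. This is only a
sketch: ADMB gets the gradient by automatic differentiation, whereas the
gradient here is coded by hand, and the function name modified_rosenbrock
is made up for illustration. Indices are 0-based, so x[k] and x[k+1]
correspond to x(2*i-1) and x(2*i) in the tpl file.

```cpp
#include <cstddef>
#include <vector>

// Modified Rosenbrock function: sum over consecutive pairs (a, b) of
//   100*(a^2 - b)^2 + (a - 1)^2
// Returns f and fills g with the gradient. A trailing unpaired element,
// if x has odd length, is ignored (its gradient entry stays 0).
double modified_rosenbrock(const std::vector<double>& x, std::vector<double>& g) {
    g.assign(x.size(), 0.0);
    double f = 0.0;
    for (std::size_t k = 0; k + 1 < x.size(); k += 2) {
        const double a = x[k];      // x(2*i-1) in the tpl file
        const double b = x[k + 1];  // x(2*i) in the tpl file
        const double t = a * a - b;
        f += 100.0 * t * t + (a - 1.0) * (a - 1.0);
        g[k] = 400.0 * a * t + 2.0 * (a - 1.0);  // df/da
        g[k + 1] = -200.0 * t;                   // df/db
    }
    return f;
}
```

The minimum is at x = (1, 1, ..., 1) with f = 0, which is consistent with
the final function values of order 1e-21 reported below.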

The new GPU version took about 35 seconds and 477 function evaluations
to converge:
  - final statistics:
6144 variables; iteration 277; function evaluation 477
Function value   3.2531e-21; maximum gradient component mag   9.7979e-11
Exit code = 1;  converg criter   1.0000e-10

real    0m35.414s
user   0m4.417s <--- most time waiting for the GPU calcs
sys     0m0.616s

The old version took 288 seconds to do 477 function evaluations,
and it was nowhere near converged at that point:

6144 variables; iteration 300; function evaluation 485; phase 1
Function value   6.6252316e+00; maximum gradient component mag  -8.4966e+00

The old version converged in about 19 min 36 seconds,
so the new version with the BFGS update on the GPU
is about 33 times faster than the old version in wall-clock time
(1176 s versus 35 s) and probably more stable.

Here is the final output of the old version:
  - final statistics:
6144 variables; iteration 1212; function evaluation 2119
Function value   1.7758e-21; maximum gradient component mag   9.7086e-11
Exit code = 1;  converg criter   1.0000e-10

real    19m36.357s
user    19m35.848s
sys    0m0.093s

Yawn.