[Developers] Big improvement in the function minimizer with GPU.
dave fournier
davef at otter-rsch.com
Tue May 8 09:59:03 PDT 2012
To get a proof of concept for any programming technique it is nice to
get a big result fairly easily. Almost all ADMB users rely on the
function minimizer fmin in the file newfmin.cpp, so improving the
performance of this function in a more or less transparent way
would immediately help a lot of users.
I hacked the newfmin.cpp file to add the BFGS quasi-Newton update,
with the (sort of) inverse Hessian kept on the GPU and the main
calculations done on the GPU.
I tested this with a modified Rosenbrock function with 6144 parameters.
The new setup is both much faster and more stable than the old setup
in newfmin. It appears that newfmin uses a different quasi-Newton
update, which is not as efficient for a large number of parameters.
This is the tpl file for the example.
DATA_SECTION
  int n
 !! n=4096+2048;
PARAMETER_SECTION
  init_vector x(1,n);
  objective_function_value f
PROCEDURE_SECTION
  for (int i=1;i<=n/2;i++)
  {
    f+=100.*square(square(x(2*i-1))-x(2*i))+square(x(2*i-1)-1.0);
  }
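The same objective can be written as a standalone C++ function, which
may help anyone who wants to try it outside ADMB. This is only a
sketch: ADMB gets the gradient by automatic differentiation, while
here it is written out by hand. The function name rosenbrock_pairs and
the 0-based indexing are assumptions and are not part of the tpl file.

```cpp
#include <cstddef>
#include <vector>

// Modified Rosenbrock function from the tpl file, summed over
// consecutive pairs (a, b) = (x[2k], x[2k+1]) in 0-based indexing:
//   f = sum_k 100*(a^2 - b)^2 + (a - 1)^2
// Fills grad with the analytic gradient and returns f.
// A trailing unpaired element, if any, is ignored (its gradient is 0).
double rosenbrock_pairs(const std::vector<double>& x,
                        std::vector<double>& grad)
{
  grad.assign(x.size(), 0.0);
  double f = 0.0;
  for (std::size_t k = 0; k + 1 < x.size(); k += 2)
  {
    const double a = x[k];
    const double b = x[k + 1];
    const double t = a * a - b;
    f += 100.0 * t * t + (a - 1.0) * (a - 1.0);
    grad[k]     = 400.0 * a * t + 2.0 * (a - 1.0);  // df/da
    grad[k + 1] = -200.0 * t;                       // df/db
  }
  return f;
}
```

The minimum is f = 0 with every component of x equal to 1, which
matches the final function values of order 1e-21 reported below.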
The new GPU version took about 35 seconds of wall-clock time and
477 function evaluations to converge:
- final statistics:
6144 variables; iteration 277; function evaluation 477
Function value 3.2531e-21; maximum gradient component mag 9.7979e-11
Exit code = 1; converg criter 1.0000e-10
real 0m35.414s
user 0m4.417s <--- most of the time is spent waiting for the GPU calcs
sys 0m0.616s
The old version took 288 seconds to do 477 function evaluations,
and is nowhere near as close to the minimum at this point:
6144 variables; iteration 300; function evaluation 485; phase 1
Function value 6.6252316e+00; maximum gradient component mag -8.4966e+00
The old version converged in about 19 minutes 36 seconds,
so the new version with the BFGS update on the GPU
is about 33 times faster than the old version
(roughly 1176 s versus 35.4 s of wall-clock time)
and probably more stable.
Here is the final output of the old version:
- final statistics:
6144 variables; iteration 1212; function evaluation 2119
Function value 1.7758e-21; maximum gradient component mag 9.7086e-11
Exit code = 1; converg criter 1.0000e-10
real 19m36.357s
user 19m35.848s
sys 0m0.093s
Yawn.