[ADMB Users] Differences in speed of ADMB versions?

Derek Seiple dseiple84 at gmail.com
Fri Apr 29 10:15:44 PDT 2011


As of revision 10 I made some changes to one of the solve functions.
Based on my tests of the 'vol' example, it is about 25% faster, but
there is still some room for improvement. I will look into it some
more.

Thanks, Dave, for your help in identifying areas of code to improve.

Derek

On Tue, Apr 26, 2011 at 3:14 AM, Arni Magnusson <arnima at hafro.is> wrote:
> Attachment.
>
>
>
> On Mon, April 25, 2011 9:51 pm, Arni Magnusson wrote:
>
>> Hi Derek,
>>
>> The full line is:
>>
>> solve(dvar_matrix const&, dvar_vector const&, prevariable&, prevariable
>> const&)
>>
>> Attached are the full 'gprof' reports for your reference.
>>
>> Looking forward to meeting you in Santa Barbara in June!
>>
>> Arni
>>
>>
>>
>> On Mon, April 25, 2011 1:30 pm, Derek Seiple wrote:
>>
>>> The algorithms for matrix inverse and solving a linear system were
>>> changed between versions 9 and 10.
>>>
>>> In version 9 the algorithms were essentially straight from Numerical
>>> Recipes in C. In version 10 they were rewritten to use a new class,
>>> cltudecomp_for_adjoint, which does the LU decomposition.
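>>>
>>> To give a rough idea of what that class is doing, a solve of A*x = b by
>>> LU decomposition with partial pivoting looks schematically like the
>>> sketch below. This is plain C++ with std::vector rather than the actual
>>> ADMB types, so only the structure of the algorithm carries over:
>>>
>>> #include <algorithm>
>>> #include <cmath>
>>> #include <stdexcept>
>>> #include <vector>
>>>
>>> // Schematic LU solve of A*x = b with partial pivoting; not the
>>> // cltudecomp_for_adjoint code, just the shape of the computation.
>>> std::vector<double> lu_solve(std::vector<std::vector<double> > A,
>>>                              std::vector<double> b)
>>> {
>>>   const int n = static_cast<int>(b.size());
>>>   std::vector<int> piv(n);
>>>
>>>   // Factor A into L*U in place, recording the row swaps in piv.
>>>   for (int k = 0; k < n; ++k)
>>>   {
>>>     int p = k;
>>>     for (int i = k + 1; i < n; ++i)
>>>       if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
>>>     piv[k] = p;
>>>     std::swap(A[k], A[p]);
>>>     if (A[k][k] == 0.0) throw std::runtime_error("singular matrix");
>>>     for (int i = k + 1; i < n; ++i)
>>>     {
>>>       A[i][k] /= A[k][k];
>>>       for (int j = k + 1; j < n; ++j)
>>>         A[i][j] -= A[i][k] * A[k][j];
>>>     }
>>>   }
>>>
>>>   // Apply the same row swaps to b, then forward- and back-substitute.
>>>   for (int k = 0; k < n; ++k) std::swap(b[k], b[piv[k]]);
>>>   for (int i = 1; i < n; ++i)
>>>     for (int j = 0; j < i; ++j)
>>>       b[i] -= A[i][j] * b[j];
>>>   for (int i = n - 1; i >= 0; --i)
>>>   {
>>>     for (int j = i + 1; j < n; ++j)
>>>       b[i] -= A[i][j] * b[j];
>>>     b[i] /= A[i][i];
>>>   }
>>>   return b;
>>> }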
>>>
>>> In version 9 you have
>>>
>>>>  2.20  26.82  1.56  solve(dvar_matrix const&, dvar_vector const&, ...
>>>
>>> as one of the top functions called. Can you tell me exactly which one
>>> that is? There are a couple of solve functions. If it is the one I am
>>> thinking of, there is a chance that the new code is making more use of
>>> the LU-decomposition classes than it should (at least that is what your
>>> profiles and other people's comments suggest to me).
>>>
>>> Derek
>>>
>>>
>>>
>>> On Sun, Apr 24, 2011 at 8:41 PM, Arni Magnusson <arnima at hafro.is> wrote:
>>>> Whenever a new version of ADMB comes out, I do a quick benchmark with the
>>>> 'catage' example. Recent versions of ADMB have shown very similar
>>>> performance:
>>>>
>>>>          catage -mcmc 100000 -mcsave 100
>>>>  9.1-440  18 sec
>>>> 10.0-450  18 sec
>>>> 10.1-450  18 sec
>>>>
>>>> But like Tim and Allan, I do see a worrisome pattern in the 'vol'
>>>> example,
>>>>
>>>>           vol  -nohess
>>>>  9.1-440  172       84
>>>> 10.0-450  488      254
>>>> 10.1-450  498      254
>>>>
>>>> where 9.1-440 means ADMB 9.1 on MinGW GCC 4.4.0, and the columns show how
>>>> many seconds it takes to run n2mvol -gbs 500000000 and -nohess. Looks like
>>>> things got almost 3 times slower between ADMB 9 and 10. Or is this just a
>>>> matter of running the model with the right -options?
>>>>
>>>> I've used the 'gprof' tool that comes with GCC to see how much time is
>>>> spent on each function call. These are the top 10 calls in the "healthy"
>>>> 9.1-440 profile,
>>>>
>>>>     %    cumu   self  call
>>>> 12.23    8.67   8.67  DF_FILE::fread(void*, unsigned int)
>>>> 10.68   16.24   7.57  DF_FILE::fwrite(void*, unsigned int)
>>>>  4.06   19.12   2.88  dmdv_solve()
>>>>  3.15   21.35   2.23  dfinvpret()
>>>>  3.13   23.57   2.22  dvector::allocate(int, int)
>>>>  2.38   25.26   1.69  dfpool::free(void*)
>>>>  2.20   26.82   1.56  solve(dvar_matrix const&, dvar_vector const&, ...
>>>>  2.19   28.37   1.55  dvector::operator=(dvector const&)
>>>>  2.09   29.85   1.48  vector_shapex::operator new(unsigned int)
>>>>  1.88   31.18   1.33  operator new(unsigned int)
>>>>  ...
>>>>  0.00   70.88   0.00
>>>>
>>>> and this is the "slow" 10.1-450 profile:
>>>>
>>>>     %    cumu   self  call
>>>>  5.82    9.42   9.42  DF_FILE::fread(void*, unsigned int)
>>>>  5.26   17.94   8.52  DF_FILE::fwrite(void const*, unsigned int)
>>>>  4.20   24.74   6.80  dmatrix::deallocate()
>>>>  3.32   30.12   5.38  cltudecomp_for_adjoint::ludecomp_pivot_for_...
>>>>  2.90   34.81   4.69  dvector::~dvector()
>>>>  2.79   39.32   4.51  grad_stack::set_gradient_stack1(void (*)(), ...
>>>>  2.69   43.67   4.35  cltudecomp_for_adjoint::ludecomp_pivot_for_...
>>>>  2.54   47.79   4.12  dvector::allocate(int, int)
>>>>  2.52   51.87   4.08  operator new(unsigned int)
>>>>  2.47   55.87   4.00  ivector::~ivector()
>>>>  ...
>>>>  0.00  161.91   0.00
>>>>
>>>> The first two read/write calls are not that different, but then we see a
>>>> lot of time spent on low-level constructors and destructors for matrices
>>>> and vectors. Is it possible that some matrix operators were doing things
>>>> more economically in ADMB 9 than in 10?
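>>>>
>>>> One pattern that produces a profile like this is a routine that builds
>>>> and tears down temporary matrices and vectors inside its inner loop.
>>>> Just as an illustration (plain C++, not taken from the ADMB sources):
>>>>
>>>> #include <vector>
>>>>
>>>> // Each pass through the loop constructs and destroys a fresh matrix
>>>> // and vector, so allocation, constructor and destructor calls start
>>>> // to dominate the profile even though the arithmetic itself is cheap.
>>>> double accumulate_with_temporaries(int n, int iters)
>>>> {
>>>>   double total = 0.0;
>>>>   for (int it = 0; it < iters; ++it)
>>>>   {
>>>>     std::vector<std::vector<double> > m(n, std::vector<double>(n, 1.0));
>>>>     std::vector<double> v(n, 1.0);
>>>>     for (int i = 0; i < n; ++i)
>>>>       for (int j = 0; j < n; ++j)
>>>>         total += m[i][j] * v[j];
>>>>   }  // m and v are destroyed here, on every iteration
>>>>   return total;
>>>> }
>>>>
>>>> Hoisting the temporaries out of the loop, or reusing preallocated work
>>>> space, removes that overhead without changing the result.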
>>>>
>>>> Arni
>>>
>>
>


