[Developers] trying to compare autodif with cppad

dave fournier davef at otter-rsch.com
Fri Aug 15 20:26:06 PDT 2014


On 08/14/2014 07:27 AM, Matthew Supernaw wrote:

One other thing is that the code executed on the gradient stack takes about
the same amount of time as the function itself, so it makes little sense to
multi-thread the function without also multi-threading the gradient stack
calculations, even if the latter could be done in a single thread.
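
(Put another way: if the forward pass and the gradient-stack replay each take
roughly time T, then parallelizing only the forward pass over p threads still
leaves about T/p + T of work, so the overall speedup can never exceed 2x no
matter how many threads are used.)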

       Dave
> Hi Dave,
>
> I'd be surprised if you find that cppad (running sequentially) is 
> superior to autodiff. Autodiff is really fast. If I recall 
> correctly, cppad was used in TMB because it plays well with OpenMP. 
> And given that most ADMB users know little about C++ development, it may 
> be a desirable feature to have autodiff work with something like OpenMP.
>
> A while ago I started reviewing the autodiff code, focusing on the 
> operations that connect prevariables and the gradient stack. What I 
> quickly came to realize is that almost everything is accessed via pointer, 
> which is good for atomic operations. Without fully understanding the 
> subtle details of these interactions, or whether order in the 
> gradient stack matters, it appears there are a lot of places that 
> could benefit from atomic operations, which would allow ADMB to 
> run concurrently without doing any locking (except when absolutely 
> necessary).
>
> For instance:
>
>    if (++gradient_structure::RETURN_PTR > gradient_structure::MAX_RETURN) {
>       gradient_structure::RETURN_PTR = gradient_structure::MIN_RETURN;
>    }
>
> may work as:
>
>
>    if (atomic_increment(gradient_structure::RETURN_PTR) >
>        gradient_structure::MAX_RETURN) {
>       compare_and_swap(gradient_structure::RETURN_PTR,
>                        gradient_structure::MIN_RETURN);
>    }
>
> Note: this code is repeated at every prevariable operation/function and 
> could probably just be put into a single helper function.
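>
> As a rough, untested illustration only -- assuming a C++11 compiler and
> treating RETURN_PTR as a std::atomic pointer, which is not how autodiff
> currently declares it -- such a helper might look roughly like this
> (all names here are placeholders, not the real autodiff types):
>
>    #include <atomic>
>
>    // Placeholder for the real autodiff return-value slot type.
>    struct return_slot { double v; };
>
>    struct return_ring {
>       std::atomic<return_slot*> ptr;   // shared "RETURN_PTR"
>       return_slot* min_return;         // first slot in the ring
>       return_slot* max_return;         // last slot in the ring
>
>       // Advance the shared pointer by one slot, wrapping at the end,
>       // and return the slot this thread claimed.
>       return_slot* next()
>       {
>          return_slot* old = ptr.load(std::memory_order_relaxed);
>          return_slot* nxt;
>          do {
>             nxt = (old >= max_return) ? min_return : old + 1;
>          } while (!ptr.compare_exchange_weak(old, nxt,
>                       std::memory_order_acq_rel,
>                       std::memory_order_relaxed));
>          return nxt;
>       }
>    };
>
> Each successful compare-exchange hands a distinct slot to the calling
> thread, but a slot can still be reused once the ring wraps around,
> exactly as in the existing single-threaded code.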
>
> And this:
>
>
> /**
>  * Description not yet available.
>  * \param
>  */
> inline void grad_stack::set_gradient_stack4(void (*func) (void),
>    double *dep_addr,
>    double *ind_addr1,
>    double *ind_addr2)
> {
> #ifdef NO_DERIVS
>    if (!gradient_structure::no_derivatives)
>    {
> #endif
>       if (ptr > ptr_last)
>       {
>          // current buffer is full -- write it to disk and reset pointer
>          // and counter
>          this->write_grad_stack_buffer();
>       }
>       ptr->func = func;
>       ptr->dep_addr = dep_addr;
>       ptr->ind_addr1 = ind_addr1;
>       ptr->ind_addr2 = ind_addr2;
>       ptr++;
> #ifdef NO_DERIVS
>    }
> #endif
> }
>
> Might work as:
>
> /**
>  * Description not yet available.
>  * \param
>  */
> inline void grad_stack::set_gradient_stack4(void (*func) (void),
>    double *dep_addr,
>    double *ind_addr1,
>    double *ind_addr2)
> {
> #ifdef NO_DERIVS
>    if (!gradient_structure::no_derivatives)
>    {
> #endif
>       grad_stack_entry* ptr_l = ptr;
>       atomic_increment(ptr);
>
>       if (ptr_l > ptr_last)
>       {
>          lock();
>          // current buffer is full -- write it to disk and reset pointer
>          // and counter
>          this->write_grad_stack_buffer();
>          unlock();
>       }
>
>       ptr_l->func = func;
>       ptr_l->dep_addr = dep_addr;
>       ptr_l->ind_addr1 = ind_addr1;
>       ptr_l->ind_addr2 = ind_addr2;
> #ifdef NO_DERIVS
>    }
> #endif
> }
> The above code is untested and not well thought out. It may also 
> require some memory barriers.
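>
> For what it is worth, the same slot-claiming idea written against the
> standard C++11 primitives might look roughly like the sketch below.
> Everything here is illustrative: the buffer size, the grad_stack_entry
> layout and the class name are placeholders, not the real ADMB code.
>
>    #include <atomic>
>    #include <cstddef>
>    #include <mutex>
>
>    struct grad_stack_entry {
>       void (*func)(void);
>       double *dep_addr, *ind_addr1, *ind_addr2;
>    };
>
>    class concurrent_grad_stack {
>       static const std::size_t N = 65536;    // placeholder buffer size
>       grad_stack_entry buffer[N];
>       std::atomic<std::size_t> next;         // index of the next free slot
>       std::mutex flush_mutex;
>
>       void write_grad_stack_buffer() { /* write buffer[0..N) to disk */ }
>
>    public:
>       concurrent_grad_stack() : next(0) {}
>
>       void set_gradient_stack4(void (*func)(void), double *dep,
>                                double *ind1, double *ind2)
>       {
>          for (;;) {
>             // Atomically claim a slot index.
>             std::size_t i = next.fetch_add(1, std::memory_order_acq_rel);
>             if (i < N) {
>                buffer[i].func = func;
>                buffer[i].dep_addr = dep;
>                buffer[i].ind_addr1 = ind1;
>                buffer[i].ind_addr2 = ind2;
>                return;
>             }
>             // Buffer full: one thread flushes, the others wait, then retry.
>             std::lock_guard<std::mutex> guard(flush_mutex);
>             if (next.load(std::memory_order_acquire) >= N) {
>                write_grad_stack_buffer();
>                next.store(0, std::memory_order_release);
>             }
>          }
>       }
>    };
>
> As with the version above, this still needs a way to guarantee that every
> claimed slot has actually been written before the flush runs (for example
> a separate count of completed writes), so it is a starting point rather
> than a drop-in replacement.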
>
> Anyway, I may be on the wrong track here, but at first glance this 
> seems like a real possibility as far as simplifying concurrent operations 
> for the average user goes. Do you think this would work?
>
> Matthew
>
> Matthew Supernaw
> Scientific Programmer
> National Oceanic and Atmospheric Administration
> National Marine Fisheries Service
> Sustainable Fisheries Division
> St. Petersburg, FL, 33701
> Office 727-551-5606
> Fax 727-824-5300
>
> On Aug 13, 2014, at 12:26 PM, developers-request at admb-project.org wrote:
>
>>
>> Today's Topics:
>>
>>   1. trying to compare autodif with cppad (dave fournier)
>>   2. Re: trying to compare autodif with cppad (dave fournier)
>>   3. Re: trying to compare autodif with cppad (Steve Martell)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Tue, 12 Aug 2014 20:26:23 -0700
>> From: dave fournier <davef at otter-rsch.com>
>> To: "developers at admb-project.org" <developers at admb-project.org>
>> Subject: [Developers] trying to compare autodif with cppad
>> Message-ID: <53EADADF.8080002 at otter-rsch.com>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>
>>
>>    There has been a lot of material about TMB lately.  I think that TMB
>> uses cppad as its underlying AD engine.  I am interested in trying to
>> understand whether cppad is superior to autodif and, if so, whether
>> ADMB could be modified to use cppad.
>>
>> As a first attempt I have been working at reproducing the LU
>> decomposition to calculate the log of (the absolute value of)
>> the determinant of a matrix.  The code is attached.  myreverse.cpp
>> calculates the log determinant and the gradient via reverse mode AD
>> using cppad.  myreverse_admb.cpp does the same thing using autodif.
>>
>> For a 300x300 matrix the time required for these calculations is
>> approximately 0.25 seconds for autodif and 19 seconds for cppad, so
>> autodif is about 75 times faster.  Obviously there may be techniques
>> which can speed up cppad, or I may have made some beginner's error.
>> Perhaps the experts among us could comment.
>>
>> I could not compare matrices larger than 300x300 because the cppad code
>> crashed.  The autodif version could do a 500x500 matrix in 1.23 seconds
>> and a 1000x1000 matrix in 11 seconds.
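>>
>> (For orientation: the quantity both versions compute is log|det(A)|;
>> after an LU factorization with partial pivoting this is just the sum of
>> log|U[k][k]| over the diagonal of U.  A plain double-precision sketch of
>> that calculation -- not the attached AD code -- looks like this:)
>>
>>    #include <cmath>
>>    #include <vector>
>>
>>    // log(|det(A)|) of an n x n matrix via LU decomposition with
>>    // partial pivoting; A is overwritten.  Reference version only.
>>    double log_abs_det(std::vector< std::vector<double> >& A)
>>    {
>>       int n = (int)A.size();
>>       double logdet = 0.0;
>>       for (int k = 0; k < n; ++k) {
>>          int piv = k;                          // find the pivot row
>>          for (int i = k + 1; i < n; ++i)
>>             if (std::fabs(A[i][k]) > std::fabs(A[piv][k])) piv = i;
>>          if (piv != k) A[k].swap(A[piv]);      // row interchange
>>          logdet += std::log(std::fabs(A[k][k]));
>>          for (int i = k + 1; i < n; ++i) {     // eliminate below pivot
>>             double m = A[i][k] / A[k][k];
>>             for (int j = k + 1; j < n; ++j) A[i][j] -= m * A[k][j];
>>          }
>>       }
>>       return logdet;
>>    }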
>>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myreverse_admb.cpp
>> Type: text/x-c++src
>> Size: 1243 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140812/649cb355/attachment-0002.cpp>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myreverse.cpp
>> Type: text/x-c++src
>> Size: 3464 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140812/649cb355/attachment-0003.cpp>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Wed, 13 Aug 2014 06:57:31 -0700
>> From: dave fournier <davef at otter-rsch.com>
>> To: Kasper Kristensen <kaskr at dtu.dk>
>> Cc: "developers at admb-project.org" <developers at admb-project.org>
>> Subject: Re: [Developers] trying to compare autodif with cppad
>> Message-ID: <53EB6ECB.1000604 at otter-rsch.com>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>
>> On 08/12/2014 10:01 PM, Kasper Kristensen wrote:
>>
>> Sorry about forgetting the hpp file. It is now attached.  The cppad version
>> is now much faster with the -DNDEBUG option.  However, when I increase the
>> matrix size to 500x500 (I'm aiming for a fast 2,000x2,000) the cppad version
>> produces NaNs.  Also note that the autodif version produces the numbers and
>> stores them in a file named "vector" for comparison with the cppad version.
>>
>>       Dave
>>
>>
>>
>>> Dave,
>>>
>>> I could not run your test because "myldet.hpp" was not attached.
>>> Did you try setting the "-DNDEBUG" flag for the cppad compilation? If I 
>>> recall correctly this could make a big difference.
>>>
>>> Kasper
>>>
>>>
>>>
>>> ________________________________________
>>> From: developers-bounces at admb-project.org 
>>> [developers-bounces at admb-project.org] on behalf of dave fournier 
>>> [davef at otter-rsch.com]
>>> Sent: Wednesday, August 13, 2014 5:26 AM
>>> To: developers at admb-project.org
>>> Subject: [Developers] trying to compare autodif with cppad
>>>
>>>     There has been a lot of material about TMB lately.  I think that TMB
>>> uses cppad as its underlying AD engine.   I am interested in
>>> trying to understand if cppad is superior to autodif and if so whether
>>> ADMB could be modified to use cppad.
>>>
>>> As a first attempt I have been working at reproducing the LU
>>> decomposition to calculate the log of
>>> (the absolute value of) the determinant of a matrix.  The code is
>>> attached.  myreverse.cpp calculates the log det and
>>> the gradient via reverse mode AD using cppad.  myreverse_admb.cpp does
>>> the same thing using autodif.
>>>
>>> For a 300x300 matrix the time required for these calculations is
>>> approximately 0.25 seconds for autodif and 19 seconds for cppad so that
>>> autodif is about 75 times faster.  Obviously there may be techniques
>>> which can speed up cppad or I may have made
>>> some beginner's error.  Perhaps the experts among us could comment.
>>>
>>> I could not compare matrices larger than 300x300 because the cppad code
>>> crashed.  The autodif version
>>> could do a 500x500 matrix in 1.23 seconds and a 1000x1000 matrix in 11
>>> seconds.
>>>
>>>
>>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myldet.hpp
>> Type: text/x-c++hdr
>> Size: 4652 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140813/33736199/attachment-0001.hpp>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myreverse.cpp
>> Type: text/x-c++src
>> Size: 3464 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140813/33736199/attachment-0002.cpp>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: reverse_one.cpp
>> Type: text/x-c++src
>> Size: 2858 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140813/33736199/attachment-0003.cpp>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Wed, 13 Aug 2014 16:26:23 +0000
>> From: Steve Martell <SteveM at iphc.int>
>> To: dave fournier <davef at otter-rsch.com>
>> Cc: "developers at admb-project.org" <developers at admb-project.org>,
>> Kasper Kristensen <kaskr at dtu.dk>
>> Subject: Re: [Developers] trying to compare autodif with cppad
>> Message-ID: <29A1581F-2ADE-412F-85C2-6D2904CEA97D at iphc.int>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Dave, I was able to compile and run your examples.
>> ---------------------------------------------------------------------
>> With n=300 here are the run times.
>> myreverse_admb (safe mode):
>> real    0m0.643s
>> user    0m0.615s
>> sys     0m0.015s
>>
>>
>> myreverse_admb (optimize):
>> real    0m0.368s
>> user    0m0.337s
>> sys     0m0.014s
>>
>> Using cppad:
>> myreverse:
>> real    0m17.875s
>> user    0m17.010s
>> sys     0m0.847s
>>
>>
>> myreverse with -DNDEBUG flag:
>> real    0m5.287s
>> user    0m4.894s
>> sys     0m0.378s
>>
>> ---------------------------------------------------------------------
>> With n=500
>> myreverse_admb (safe mode):
>> real    0m2.414s
>> user    0m2.341s
>> sys     0m0.035s
>>
>> myreverse_admb (optimize):
>> real    0m1.450s
>> user    0m1.378s
>> sys     0m0.035s
>>
>> Using cppad:
>> myreverse:
>> n = 500
>> cppad-20140530 error from a known source:
>> dw = f.Reverse(q, w): has a nan,
>> but none of its Taylor coefficents are nan.
>> Error detected by false result for
>>    ! ( hasnan(value) && check_for_nan_ )
>> at line 202 in the file
>>    /usr/include/cppad/local/reverse.hpp
>> Assertion failed: (false), function Default, file 
>> /usr/include/cppad/error_handler.hpp, line 210.
>> Abort trap: 6
>>
>> real    1m19.457s
>> user    1m15.951s
>> sys     0m3.180s
>> bash-3.2$
>>
>> myreverse with -DNDEBUG flag:
>> n=500
>> output is NaNs
>> real    0m23.766s
>> user    0m22.090s
>> sys     0m1.643s
>> ---------------------------------------------------------------------
>> Steve
>>
>>
>> On Aug 13, 2014, at 6:58 AM, dave fournier <davef at otter-rsch.com> wrote:
>>
>>> On 08/12/2014 10:01 PM, Kasper Kristensen wrote:
>>>
>>> Sorry about forgetting the hpp file. It is now attached.  The cppad 
>>> version is now much faster with the -DNDEBUG option.  However, when I 
>>> increase the matrix size to 500x500 (I'm aiming for a fast 2,000x2,000) 
>>> the cppad version produces NaNs.  Also note that the autodif version 
>>> produces the numbers and stores them in a file named "vector" for 
>>> comparison with the cppad version.
>>>
>>>     Dave
>>>
>>>
>>>
>>>> Dave,
>>>>
>>>> I could not run your test because "myldet.hpp" was not attached.
>>>> Did you try setting the "-DNDEBUG" flag for the cppad compilation? If 
>>>> I recall correctly this could make a big difference.
>>>>
>>>> Kasper
>>>>
>>>>
>>>>
>>>> ________________________________________
>>>> From: developers-bounces at admb-project.org 
>>>> [developers-bounces at admb-project.org] on behalf of dave fournier 
>>>> [davef at otter-rsch.com]
>>>> Sent: Wednesday, August 13, 2014 5:26 AM
>>>> To: developers at admb-project.org
>>>> Subject: [Developers] trying to compare autodif with cppad
>>>>
>>>>    There has been a lot of material about TMB lately.  I think that TMB
>>>> uses cppad as its underlying AD engine.   I am interested in
>>>> trying to understand if cppad is superior to autodif and if so whether
>>>> ADMB could be modified to use cppad.
>>>>
>>>> As a first attempt I have been working at reproducing the LU
>>>> decomposition to calculate the log of
>>>> (the absolute value of) the determinant of a matrix.  The code is
>>>> attached.  myreverse.cpp calculates the log det and
>>>> the gradient via reverse mode AD using cppad.  myreverse_admb.cpp does
>>>> the same thing using autodif.
>>>>
>>>> For a 300x300 matrix the time required for these calculations is
>>>> approximately 0.25 seconds for autodif and 19 seconds for cppad so that
>>>> autodif is about 75 times faster.  Obviously there may be techniques
>>>> which can speed up cppad or I may have made
>>>> some beginner's error.  Perhaps the experts among us could comment.
>>>>
>>>> I could not compare matrices larger than 300x300 because the cppad code
>>>> crashed.  The autodif version
>>>> could do a 500x500 matrix in 1.23 seconds and a 1000x1000 matrix in 11
>>>> seconds.
>>>>
>>>>
>>>
>>> <myldet.hpp> <myreverse.cpp> <reverse_one.cpp>
>>> _______________________________________________
>>> Developers mailing list
>>> Developers at admb-project.org
>>> http://lists.admb-project.org/mailman/listinfo/developers
>>
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> Developers mailing list
>> Developers at admb-project.org
>> http://lists.admb-project.org/mailman/listinfo/developers
>>
>>
>> End of Developers Digest, Vol 64, Issue 12
>> ******************************************
>
>
>
> _______________________________________________
> Developers mailing list
> Developers at admb-project.org
> http://lists.admb-project.org/mailman/listinfo/developers
