[Developers] trying to compare autodif with cppad

dave fournier davef at otter-rsch.com
Thu Aug 14 12:10:55 PDT 2014


On 08/14/2014 07:27 AM, Matthew Supernaw wrote:

Hi,


Actually, I was impressed by how fast cppad appears to be without adjoint
code.  I'm more worried about how it seems to choke on a matrix a bit
bigger than 300x300.  The question is whether this is just due to my
naive implementation.  So far the cppad gurus are silent.

The order of the gradient stack is probably crucial.  When I developed
the multi-threaded autodif version I made the gradient stacks
thread-local, so each thread has its own gradient stack.  The problem is
that the performance gain is not as much as one might expect, so I
suspect cache contention.  I tried Intel's tools for this but could not
identify where the problem was.
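
A minimal sketch of that thread-local arrangement, using C++11
thread_local storage and hypothetical names (Entry and GradStack stand
in for autodif's actual types), would be something like:

   #include <vector>

   // Hypothetical stand-ins for autodif's grad_stack_entry / grad_stack.
   struct Entry {
       void (*func)(void);
       double *dep_addr, *ind_addr1, *ind_addr2;
   };

   struct GradStack {
       std::vector<Entry> entries;
       void record(const Entry &e) { entries.push_back(e); }
   };

   // One independent gradient stack per thread; recording a derivative
   // entry then needs no locking at all.
   thread_local GradStack gradient_stack;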

Presumably the cppad people have solved the gradient stack problem for
multi-threading.  That would be the thing they refer to as the tape.

         Dave




> Hi Dave,
>
> I'd be surprised if you find that cppad (running sequentially) is 
> superior to autodif. Autodif is really fast. If I recall 
> correctly, cppad was used in TMB because it plays well with OpenMP. 
> And given that most ADMB users know little about C++ development, it may 
> be a desirable feature to have autodif work with something like OpenMP.
>
> A while ago I started reviewing the autodif code, focusing on the 
> operations that interface prevariables and the gradient stack. What I 
> quickly came to realize is that almost everything is accessed via pointer, 
> which is good for atomic operations. Without fully understanding the 
> subtle details of these interactions, or whether order in the 
> gradient stack matters, it appears there are a lot of areas that 
> could benefit from atomic ops, which would allow ADMB to 
> run concurrently without doing any locking (except when absolutely 
> necessary).
>
> For instance:
>
>    if (++gradient_structure::RETURN_PTR > gradient_structure::MAX_RETURN) {
>       gradient_structure::RETURN_PTR = gradient_structure::MIN_RETURN;
>    }
>
> may work as:
>
>    if (atomic_increment(gradient_structure::RETURN_PTR) > gradient_structure::MAX_RETURN) {
>       compare_and_swap(gradient_structure::RETURN_PTR, gradient_structure::MIN_RETURN);
>    }
>
> (This code is repeated at every prevariable op/function and could
> probably just be put into a single helper function.)
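>
> As a minimal self-contained sketch of that wrap-around update, using
> C++11 std::atomic and a compare-exchange loop (the names below are
> placeholders rather than autodif's actual members):
>
>    #include <atomic>
>
>    // Placeholder bounds for the return-value ring buffer.
>    constexpr int MIN_RETURN = 0;
>    constexpr int MAX_RETURN = 1024;
>
>    std::atomic<int> return_index{MIN_RETURN};
>
>    // Atomically advance the index, wrapping back to MIN_RETURN once it
>    // passes MAX_RETURN.  The compare-exchange loop makes the combined
>    // "increment and wrap" step appear atomic to other threads.
>    int next_return_index()
>    {
>       int old_value = return_index.load(std::memory_order_relaxed);
>       int new_value;
>       do {
>          new_value = (old_value >= MAX_RETURN) ? MIN_RETURN : old_value + 1;
>       } while (!return_index.compare_exchange_weak(old_value, new_value));
>       return new_value;
>    }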
>
> And this:
>
>
> /**
>  * Description not yet available.
>  * \param
>  */
> inline void grad_stack::set_gradient_stack4(void (*func) (void),
>   double *dep_addr,
>   double *ind_addr1,
>   double *ind_addr2)
> {
> #ifdef NO_DERIVS
>    if (!gradient_structure::no_derivatives)
>    {
> #endif
>       if (ptr > ptr_last)
>       {
>          // current buffer is full -- write it to disk and reset pointer
>          // and counter
>          this->write_grad_stack_buffer();
>       }
>       ptr->func = func;
>       ptr->dep_addr = dep_addr;
>       ptr->ind_addr1 = ind_addr1;
>       ptr->ind_addr2 = ind_addr2;
>       ptr++;
> #ifdef NO_DERIVS
>    }
> #endif
> }
>
> Might work as:
>
> /**
>  * Description not yet available.
>  * \param
>  */
> inline void grad_stack::set_gradient_stack4(void (*func) (void),
>   double *dep_addr,
>   double *ind_addr1,
>   double *ind_addr2)
> {
> #ifdef NO_DERIVS
>    if (!gradient_structure::no_derivatives)
>    {
> #endif
>       grad_stack_entry *ptr_l = ptr;
>       atomic_increment(ptr);
>
>       if (ptr_l > ptr_last)
>       {
>          lock();
>          // current buffer is full -- write it to disk and reset pointer
>          // and counter
>          this->write_grad_stack_buffer();
>          unlock();
>       }
>
>       ptr_l->func = func;
>       ptr_l->dep_addr = dep_addr;
>       ptr_l->ind_addr1 = ind_addr1;
>       ptr_l->ind_addr2 = ind_addr2;
> #ifdef NO_DERIVS
>    }
> #endif
> }
> The above code is untested and not well thought out.  It may also
> require some memory barriers.
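>
> As a rough, self-contained sketch of the same idea with C++11 atomics,
> written independently of grad_stack's actual internals (every name here
> is illustrative, and a real version would also need to drain in-flight
> writers before flushing):
>
>    #include <atomic>
>    #include <cstddef>
>    #include <mutex>
>
>    struct Entry {                         // stands in for grad_stack_entry
>       void (*func)(void);
>       double *dep_addr, *ind_addr1, *ind_addr2;
>    };
>
>    constexpr std::size_t BUFFER_SIZE = 1 << 16;
>    Entry buffer[BUFFER_SIZE];
>    std::atomic<std::size_t> next_slot{0};
>    std::mutex flush_mutex;
>
>    void flush_buffer_to_disk() { /* write buffer out; omitted */ }
>
>    // Reserve one slot with fetch_add; only the rare "buffer full" path
>    // takes a lock.
>    void push_entry(void (*func)(void), double *dep, double *ind1, double *ind2)
>    {
>       for (;;) {
>          std::size_t slot = next_slot.fetch_add(1);
>          if (slot < BUFFER_SIZE) {          // common, lock-free path
>             buffer[slot] = Entry{func, dep, ind1, ind2};
>             return;
>          }
>          // Buffer full: one thread flushes under the lock, then all retry.
>          std::lock_guard<std::mutex> guard(flush_mutex);
>          if (next_slot.load() >= BUFFER_SIZE) {
>             flush_buffer_to_disk();
>             next_slot.store(0);
>          }
>       }
>    }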
>
> Anyway, I may be on the wrong track here, but at first glance it
> seems like a real possibility as far as simplifying concurrent operations
> for the average user. Do you think this would work?
>
> Matthew
>
> Matthew Supernaw
> Scientific Programmer
> National Oceanic and Atmospheric Administration
> National Marine Fisheries Service
> Sustainable Fisheries Division
> St. Petersburg, FL, 33701
> Office 727-551-5606
> Fax 727-824-5300
>
> On Aug 13, 2014, at 12:26 PM, developers-request at admb-project.org 
> <mailto:developers-request at admb-project.org> wrote:
>
>>
>> Today's Topics:
>>
>>   1. trying to compare autodif with cppad (dave fournier)
>>   2. Re: trying to compare autodif with cppad (dave fournier)
>>   3. Re: trying to compare autodif with cppad (Steve Martell)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Tue, 12 Aug 2014 20:26:23 -0700
>> From: dave fournier <davef at otter-rsch.com>
>> To: "developers at admb-project.org" <developers at admb-project.org>
>> Subject: [Developers] trying to compare autodif with cppad
>> Message-ID: <53EADADF.8080002 at otter-rsch.com>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>
>>
>>    There has been a lot of material about TMB lately.  I think that TMB
>> uses cppad as its underlying AD engine.  I am interested in trying to
>> understand whether cppad is superior to autodif and, if so, whether
>> ADMB could be modified to use cppad.
>>
>> As a first attempt I have been working at reproducing the LU
>> decomposition to calculate the log of (the absolute value of) the
>> determinant of a matrix.  The code is attached.  myreverse.cpp
>> calculates the log det and the gradient via reverse mode AD using
>> cppad.  myreverse_admb.cpp does the same thing using autodif.
>>
>> For a 300x300 matrix the time required for these calculations is
>> approximately 0.25 seconds for autodif and 19 seconds for cppad, so
>> autodif is about 75 times faster.  Obviously there may be techniques
>> which can speed up cppad, or I may have made some beginner's error.
>> Perhaps the experts among us could comment.
>>
>> I could not compare matrices larger than 300x300 because the cppad code
>> crashed.  The autodif version could do a 500x500 matrix in 1.23 seconds
>> and a 1000x1000 matrix in 11 seconds.
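>>
>> A bare-bones, double-precision sketch of the computation being
>> benchmarked (log|det| via LU decomposition with partial pivoting); the
>> attached AD versions differentiate essentially this recurrence, with
>> autodif or cppad variable types in place of double:
>>
>>    #include <algorithm>
>>    #include <cmath>
>>    #include <vector>
>>
>>    // log(|det(A)|) for a dense n x n matrix stored row-major,
>>    // via in-place LU decomposition with partial pivoting.
>>    double ldet(std::vector<double> A, int n)
>>    {
>>       double logdet = 0.0;
>>       for (int k = 0; k < n; ++k) {
>>          int piv = k;                                // find pivot row
>>          for (int i = k + 1; i < n; ++i)
>>             if (std::fabs(A[i*n + k]) > std::fabs(A[piv*n + k])) piv = i;
>>          if (piv != k)
>>             for (int j = 0; j < n; ++j) std::swap(A[k*n + j], A[piv*n + j]);
>>          logdet += std::log(std::fabs(A[k*n + k]));  // accumulate log|U_kk|
>>          for (int i = k + 1; i < n; ++i) {           // eliminate below pivot
>>             double m = A[i*n + k] / A[k*n + k];
>>             for (int j = k + 1; j < n; ++j) A[i*n + j] -= m * A[k*n + j];
>>          }
>>       }
>>       return logdet;                                 // = log|det(A)|
>>    }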
>>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myreverse_admb.cpp
>> Type: text/x-c++src
>> Size: 1243 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140812/649cb355/attachment-0002.cpp>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myreverse.cpp
>> Type: text/x-c++src
>> Size: 3464 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140812/649cb355/attachment-0003.cpp>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Wed, 13 Aug 2014 06:57:31 -0700
>> From: dave fournier <davef at otter-rsch.com>
>> To: Kasper Kristensen <kaskr at dtu.dk>
>> Cc: "developers at admb-project.org" <developers at admb-project.org>
>> Subject: Re: [Developers] trying to compare autodif with cppad
>> Message-ID: <53EB6ECB.1000604 at otter-rsch.com>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>
>> On 08/12/2014 10:01 PM, Kasper Kristensen wrote:
>>
>> Sorry about forgetting the hpp file.  It is now attached.  The cppad
>> version is now much faster with the -DNDEBUG option.  However, when I
>> increase the matrix size to 500x500 (I'm aiming for a fast 2,000x2,000)
>> the cppad version produces NaNs.  Also note that the autodif version
>> produces the numbers and stores them in a file named vector for the
>> cppad version.
>>
>>       Dave
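>>
>> For reference, -DNDEBUG is just a preprocessor define passed at compile
>> time; it disables the assert-based checks.  With g++ the cppad build
>> would be invoked along the lines of the following, with the include
>> path and other options depending on the local setup:
>>
>>    g++ -O3 -DNDEBUG -I/path/to/cppad myreverse.cpp -o myreverse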
>>
>>
>>
>>> Dave,
>>>
>>> I could not run your test because "myldet.hpp" was not attached.
>>> Did you try setting the "-DNDEBUG" flag with the cppad compilation? If I 
>>> recall correctly this could make a big difference.
>>>
>>> Kasper
>>>
>>>
>>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myldet.hpp
>> Type: text/x-c++hdr
>> Size: 4652 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140813/33736199/attachment-0001.hpp>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: myreverse.cpp
>> Type: text/x-c++src
>> Size: 3464 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140813/33736199/attachment-0002.cpp>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: reverse_one.cpp
>> Type: text/x-c++src
>> Size: 2858 bytes
>> Desc: not available
>> URL: 
>> <http://lists.admb-project.org/pipermail/developers/attachments/20140813/33736199/attachment-0003.cpp>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Wed, 13 Aug 2014 16:26:23 +0000
>> From: Steve Martell <SteveM at iphc.int>
>> To: dave fournier <davef at otter-rsch.com>
>> Cc: "developers at admb-project.org" <developers at admb-project.org>,
>> Kasper Kristensen <kaskr at dtu.dk>
>> Subject: Re: [Developers] trying to compare autodif with cppad
>> Message-ID: <29A1581F-2ADE-412F-85C2-6D2904CEA97D at iphc.int>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Dave, I was able to compile and run your examples.
>> ---------------------------------------------------------------------
>> With n=300 here are the run times.
>> myreverse_admb (safe mode):
>> real    0m0.643s
>> user    0m0.615s
>> sys     0m0.015s
>>
>>
>> myreverse_admb (optimize):
>> real    0m0.368s
>> user    0m0.337s
>> sys     0m0.014s
>>
>> Using the cppad
>> myreverse:
>> real    0m17.875s
>> user    0m17.010s
>> sys     0m0.847s
>>
>>
>> myreverse with -DNDEBUG flag:
>> real    0m5.287s
>> user    0m4.894s
>> sys     0m0.378s
>>
>> ---------------------------------------------------------------------
>> With n=500
>> myreverse_admb (safe mode):
>> real    0m2.414s
>> user    0m2.341s
>> sys     0m0.035s
>>
>> myreverse_admb (optimize):
>> real    0m1.450s
>> user    0m1.378s
>> sys     0m0.035s
>>
>> Using the cppad
>> myreverse:
>> n = 500
>> cppad-20140530 error from a known source:
>> dw = f.Reverse(q, w): has a nan,
>> but none of its Taylor coefficents are nan.
>> Error detected by false result for
>>    ! ( hasnan(value) && check_for_nan_ )
>> at line 202 in the file
>>    /usr/include/cppad/local/reverse.hpp
>> Assertion failed: (false), function Default, file 
>> /usr/include/cppad/error_handler.hpp, line 210.
>> Abort trap: 6
>>
>> real    1m19.457s
>> user    1m15.951s
>> sys     0m3.180s
>> bash-3.2$
>>
>> myreverse with -DNDEBUG flag:
>> n=500
>> output is nan's
>> real    0m23.766s
>> user    0m22.090s
>> sys     0m1.643s
>> ---------------------------------------------------------------------
>> Steve
>>
>> ------------------------------
>>
>>
>>
>> End of Developers Digest, Vol 64, Issue 12
>> ******************************************
>
>
>
> _______________________________________________
> Developers mailing list
> Developers at admb-project.org
> http://lists.admb-project.org/mailman/listinfo/developers


