[Developers] faster ludecomp in C++ for creating adjoint code
dave fournier
davef at otter-rsch.com
Mon Sep 15 12:38:22 PDT 2014
On 09/15/2014 12:23 PM, Johnoel Ancheta wrote:
To compile it I used something like
CXX=g++
ADMB_HOME=~/admodel
echo "!!! Note using the version found in directory "
echo ${ADMB_HOME}
${CXX} -march=native -ggdb -Ofast -funroll-loops -DOPT_LIB \
-ffast-math -pthread -DUSE_PTHREADS -W -fpermissive -DUSE_LAPLACE -Dlinux \
avx_vecdot.o \
-L/home/dave/opt/OpenBLAS/lib \
-I/home/dave/opt/OpenBLAS/include \
-D__GNUDOS__ -o$1 $1.cpp \
-I. -I${ADMB_HOME}/include \
-I/home/dave/include \
-L${ADMB_HOME}/lib \
-L/home/dave/lib \
-ladmbo \
-lopenblas \
-lgfortran
Near the top of main you set the matrix size n and the block size m. At
present n must be a multiple of m.
main()
{
ad_set_new_handler();
ad_exit=&ad_boundf;
int n=2000; // we will work with an nxn symmetric matrix
const int m=100; // block size for blocked LU code
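
To illustrate why n currently has to be a multiple of m, here is a minimal sketch of a right-looking blocked LU factorization. It is not the code from the attached yblocklu10.cpp: the names `Matrix`, `blocked_lu` and `multiply_lu` are hypothetical, the matrix is a flat row-major `std::vector<double>`, there is no pivoting (so it is only safe for matrices such as diagonally dominant ones), and it does not use the block storage layout mentioned elsewhere in this thread. The outer loop steps through the diagonal in strides of m, which is where the divisibility assumption comes from.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Row-major n x n matrix in a flat vector: element (i,j) is a[i*n + j].
using Matrix = std::vector<double>;

// In-place blocked LU without pivoting (illustrative sketch only).
// On return the strict lower triangle holds L (unit diagonal implied)
// and the upper triangle, including the diagonal, holds U.
void blocked_lu(Matrix& a, std::size_t n, std::size_t m)
{
    if (m == 0 || n % m != 0)
        throw std::invalid_argument("n must be a multiple of the block size m");
    if (a.size() != n * n)
        throw std::invalid_argument("matrix must hold n*n elements");

    for (std::size_t k = 0; k < n; k += m) {
        const std::size_t ke = k + m;  // one past the end of the diagonal block

        // 1. Unblocked LU of the m x m diagonal block.
        for (std::size_t p = k; p < ke; ++p)
            for (std::size_t i = p + 1; i < ke; ++i) {
                a[i * n + p] /= a[p * n + p];
                for (std::size_t j = p + 1; j < ke; ++j)
                    a[i * n + j] -= a[i * n + p] * a[p * n + j];
            }

        // 2. Block row: U12 = L11^{-1} A12 (forward substitution).
        for (std::size_t p = k; p < ke; ++p)
            for (std::size_t i = p + 1; i < ke; ++i)
                for (std::size_t j = ke; j < n; ++j)
                    a[i * n + j] -= a[i * n + p] * a[p * n + j];

        // 3. Block column: L21 = A21 U11^{-1}.
        for (std::size_t i = ke; i < n; ++i)
            for (std::size_t p = k; p < ke; ++p) {
                a[i * n + p] /= a[p * n + p];
                for (std::size_t j = p + 1; j < ke; ++j)
                    a[i * n + j] -= a[i * n + p] * a[p * n + j];
            }

        // 4. Trailing update: A22 -= L21 * U12.
        for (std::size_t i = ke; i < n; ++i)
            for (std::size_t p = k; p < ke; ++p) {
                const double lip = a[i * n + p];
                for (std::size_t j = ke; j < n; ++j)
                    a[i * n + j] -= lip * a[p * n + j];
            }
    }
}

// Rebuild L*U from the packed factors, to check the factorization.
Matrix multiply_lu(const Matrix& lu, std::size_t n)
{
    Matrix prod(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            const std::size_t last = std::min(i, j);
            double s = 0.0;
            for (std::size_t p = 0; p <= last; ++p) {
                const double l = (p == i) ? 1.0 : lu[i * n + p];
                s += l * lu[p * n + j];
            }
            prod[i * n + j] = s;
        }
    return prod;
}
```

Calling `blocked_lu(a, n, m)` on a copy of the matrix and comparing `multiply_lu` of the result against the original is a quick sanity check. Step 4 is a plain matrix-matrix product, which is the part an optimized BLAS would normally take over.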
To restrict OpenBLAS to one thread for the comparison, set the
environment variable
export OPENBLAS_NUM_THREADS=1
> Sure, it would be nice to do the comparison.
>
> Thanks in advance,
>
> Johnoel
>
> On Mon, Sep 15, 2014 at 9:19 AM, dave fournier <davef at otter-rsch.com
> <mailto:davef at otter-rsch.com>> wrote:
>
> On 09/15/2014 12:14 PM, Johnoel Ancheta wrote:
>
> If you want, I have code to call the OpenBLAS version as well as a
> blocked version which stores the relevant matrices in
> blocks as recommended by Dongarra et al. (but it does not seem to
> be worth the effort).
>
>
>> Thanks Dave! I'll provide feedback after testing...
>>
>>
>>
>> On Mon, Sep 15, 2014 at 9:03 AM, otter <otter at otter-rsch.com
>> <mailto:otter at otter-rsch.com>> wrote:
>>
>> After much pain I have produced a C++ version of the LU decomposition
>> which is suitable for producing adjoint code for ADMB and perhaps
>> cppad. (Don't know what adjoint code for cppad is as yet!) This code
>> is about 25 times faster than the current ADMB code for a
>> 2,000 x 2,000 matrix and about 4 times slower than the Openblas code,
>> which contains optimized assembler and Fortrash. Any suggestions for
>> improvement would be welcome. Per usual I will hold my breath.
>>
>>
>>
>> _______________________________________________
>> Developers mailing list
>> Developers at admb-project.org <mailto:Developers at admb-project.org>
>> http://lists.admb-project.org/mailman/listinfo/developers
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: yblocklu10.cpp
Type: text/x-c++src
Size: 18081 bytes
Desc: not available
URL: <http://lists.admb-project.org/pipermail/developers/attachments/20140915/d7f25670/attachment-0001.cpp>