[Developers] A possible GPU project - first go

Sat Apr 21 13:48:39 PDT 2012

On 12-04-18 08:07 AM, dave fournier wrote:

There may also be problems with ff 32, 34 and 44, if not others as well.

> On second thought, if we are to have anything at all before next year's
> meeting we had better get moving.  The code from newfmin.cpp that I
> worked on looks like this:
>
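>  // Compute w1(2..N) in order; each w1(i) depends on w1(1..i-1).
>  for (i=2; i<=N; i++)
>  {
>    int i1=i-1;
>    double z=-g(i);
>    for (int j=1; j<=i1; j++)
>    {
>      int j1=j-1;
>      // number of packed entries stored ahead of column j
>      int offset=((2*N-j1)*j1+j1)/2;
>      z-=h(i-j+1+offset)*w1(j);
>    }
>    w1(i)=z;
>  }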
>
> Here N is the dimension of a symmetric NxN matrix; g and w1 are
> N-dimensional vectors.
>
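
To make the indexing in the loop above concrete, here is a minimal
standalone sketch. It assumes h holds one triangle of the symmetric
matrix packed column by column (which is what the offset formula
implies), and that w1(1) has already been set by the caller, since the
loop starts at i=2. The names packed_index and forward_sweep and the
use of std::vector are my own for illustration; they are not from
newfmin.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 0-based position of the 1-based entry (i, j), i >= j, in a triangle
// packed column by column. Same arithmetic as the quoted offset formula.
std::size_t packed_index(std::size_t N, std::size_t i, std::size_t j) {
    assert(j >= 1 && j <= i && i <= N);
    const std::size_t j1 = j - 1;
    const std::size_t offset = ((2 * N - j1) * j1 + j1) / 2;  // entries in columns 1..j-1
    return (i - j + 1 + offset) - 1;  // quoted 1-based index, shifted to 0-based
}

// Serial sweep: w1(i) = -g(i) - sum over j < i of h(i,j) * w1(j).
// Assumption: w1[0], i.e. w1(1), is already filled in by the caller.
void forward_sweep(const std::vector<double>& h,
                   const std::vector<double>& g,
                   std::vector<double>& w1) {
    const std::size_t N = g.size();
    assert(w1.size() == N && h.size() == N * (N + 1) / 2);
    for (std::size_t i = 2; i <= N; ++i) {
        double z = -g[i - 1];
        for (std::size_t j = 1; j < i; ++j) {
            z -= h[packed_index(N, i, j)] * w1[j - 1];
        }
        w1[i - 1] = z;
    }
}
```

For N=3 the packed order is (1,1), (2,1), (3,1), (2,2), (3,2), (3,3),
so entry (3,2) sits at 0-based position 4.
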
> I used N=5120.  The increase in speed on the GPU was approximately
> 20 to 1.  Code is attached.  It would be interesting if other people
> with OpenCL-capable hardware tried this out.  I have no idea yet what
> the portability issues might be.
>
> The GPU code consists of 3 kernels.  There is a fairly steep learning
> curve in figuring out how to do this stuff, since most of the examples
> one finds are trivial.  The difficulty is that to calculate w1(i) you
> must already know all the previous w1 values.  However, you can
> partially calculate w1(i) if you know w1(1) up to w1(m) for some m<i.
> This is what has been used to parallelize the code.  The matrix is
> split up into vertical strips, indexed in the code by k.  Enjoy!
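
The strip idea described above can be sketched on the CPU as follows.
This is not the attached OpenCL code and makes no claim to match its 3
kernels; it is only a serial illustration of how partial sums for w1(i)
can be accumulated one column strip at a time. The function name
strip_sweep, the strip width parameter, and the scratch vector z are
hypothetical. As in the quoted loop, w1(1) is assumed to be set by the
caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 0-based position of the 1-based entry (i, j), i >= j, in the packed triangle.
std::size_t packed_index(std::size_t N, std::size_t i, std::size_t j) {
    const std::size_t j1 = j - 1;
    return (i - j) + ((2 * N - j1) * j1 + j1) / 2;
}

// Strip-wise sweep; produces the same w1 as the plain double loop.
void strip_sweep(const std::vector<double>& h, const std::vector<double>& g,
                 std::vector<double>& w1, std::size_t strip) {
    const std::size_t N = g.size();
    assert(strip >= 1 && N >= 1);
    assert(w1.size() == N && h.size() == N * (N + 1) / 2);

    // z(i) is the partially computed w1(i); it starts at -g(i).
    std::vector<double> z(N);
    for (std::size_t i = 1; i <= N; ++i) z[i - 1] = -g[i - 1];
    z[0] = w1[0];  // w1(1) is supplied by the caller

    for (std::size_t lo = 1; lo <= N; lo += strip) {  // one strip: columns lo..hi
        const std::size_t hi = std::min(lo + strip - 1, N);

        // Step 1 (sequential): finish rows lo..hi using columns inside the strip.
        for (std::size_t i = lo; i <= hi; ++i) {
            for (std::size_t j = lo; j < i; ++j)
                z[i - 1] -= h[packed_index(N, i, j)] * w1[j - 1];
            w1[i - 1] = z[i - 1];
        }

        // Step 2 (rows are independent, so this is the parallelizable part):
        // fold the strip's finished w1 values into every row below it.
        for (std::size_t i = hi + 1; i <= N; ++i)
            for (std::size_t j = lo; j <= hi; ++j)
                z[i - 1] -= h[packed_index(N, i, j)] * w1[j - 1];
    }
}
```

With a strip width of 1 this degenerates to a column-by-column version
of the original loop; wider strips leave more independent work in step
2 between the short sequential solves in step 1.
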
