[Developers] A possible GPU project
dave fournier
davef at otter-rsch.com
Wed Apr 11 13:53:57 PDT 2012
On 12-04-11 12:39 PM, Matthew Supernaw wrote:
Well, there are some interesting design questions to sort out before anyone
needs a computer with a suitable GPU (although I do have one).
As I understand it an important part of GPU
programming is to access contiguous memory as much as possible.
So if we are doing a dot product
z += x[i]*y[i]
and you have 1000 threads operating at the same time, they are accessing
x[1],x[2],...,x[1000] and y[1],y[2],...,y[1000], i.e. contiguous memory
for x and y, so this is good.
Let's consider the first part of
void hcalcs1(int n,int n1,int np,int is,
  dfsdmat & h,dvector & g,dvector & w)
{
  int i,j,i1;
  double z;
  for (i=2; i<=n; i++)
  {
    i1=i-1;
    z=-g.elem(i);
    double * pd=&(h.elem(i,1));
    double * pw=&(w.elem(1));
    for (j=1; j<=i1; j++)
    {
      z-=*pd++ * *pw++;
    }
    w.elem(i)=z;
  }
this can be written as
w(2) = -( g(2) + h(2,1)*w(1) )
w(3) = -( g(3) + h(3,1)*w(1)+ h(3,2)*w(2) )
w(4) = -( g(4) + h(4,1)*w(1)+ h(4,2)*w(2)+ h(4,3)*w(3) )
......
w(k) = -( g(k) + h(k,1)*w(1)+ h(k,2)*w(2)+ ... + h(k,k-1)*w(k-1) )
....
There are two ways to parallelize this. The first is to give each row its
own thread and have all the threads start with the w(1) term:
g(2) + h(2,1)*w(1)
g(3) + h(3,1)*w(1)
g(4) + h(4,1)*w(1)
......
g(k) + h(k,1)*w(1)
then wait until
w(2) = -( g(2) + h(2,1)*w(1) )
is available, so you need to synchronize here. The first thread is now
finished (what to do with it?). The next step is
g(3) + h(3,1)*w(1)+ h(3,2)*w(2)
g(4) + h(4,1)*w(1)+ h(4,2)*w(2)
......
g(k) + h(k,1)*w(1)+ h(k,2)*w(2)
then wait until
w(3) = -( g(3) + h(3,1)*w(1)+ h(3,2)*w(2) )
etc.
Note that for this to be good one wants h(i,j) to be near h(i+1,j) in
memory, i.e. store h by column.
The other way is to parallelize the dot product
w(k) = -( g(k) + h(k,1)*w(1)+ h(k,2)*w(2)+ ... + h(k,k-1)*w(k-1) )
or
w(k) = -( g(k) + dot(&(h(k,1)),&(w(1)),k-1) )
where dot(double * x,double * y,int n) is the dot product of two vectors
of length n.
Note: for this you want h(i,j) to be close to h(i,j+1) i.e. store by row.
This looks simpler and we can include g(k) into the dot product with a
kludge.
> I did some GPU computing a few years ago for the park service. The big
> problem was that not all GPUs/GPU drivers were created equal.
> Stability was a major issue. I haven't looked into it for a while, but
> maybe they have all the bugs worked out? We mostly used Nvidia FX
> 3800, which have 192 cores. Two FX 3800 per computer in a linux
> cluster, that was an awesome amount of power! I currently don't have
> access to any OpenCL/CUDA capable GPUs. I'll be happy to look into it
> when my development machine arrives!
>
> On Wed, Apr 11, 2012 at 3:00 PM,<developers-request at admb-project.org> wrote:
>> Today's Topics:
>>
>> 1. ADMB version control maintenance (Johnoel Ancheta)
>> 2. Re: A possible GPU project (dave fournier)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Tue, 10 Apr 2012 21:35:15 -1000
>> From: Johnoel Ancheta<johnoel at hawaii.edu>
>> To: ADMB Users<users at admb-project.org>, developers at admb-project.org
>> Subject: [Developers] ADMB version control maintenance
>> Message-ID:
>> <CAJMx2XUmDYt8L0jR8LXK1R_=6hxdqDe8JXna99OsyrxsTco60g at mail.gmail.com>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Hi all,
>>
>> The ADMB version control will be offline April 11, 2012 for maintenance.
>>
>> Johnoel
>> ------------------------------
>>
>> Message: 2
>> Date: Wed, 11 Apr 2012 10:39:16 -0700
>> From: dave fournier<davef at otter-rsch.com>
>> To: developers at admb-project.org
>> Subject: Re: [Developers] A possible GPU project
>> Message-ID:<4F85C1C4.60500 at otter-rsch.com>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>
>> On 12-04-07 07:15 PM, Matthew Supernaw wrote:
>>
>>
>> To get back to your original question "I" am not going to do anything.
>> At our meeting over a year ago I heard (between discussions on organizing
>> organizing) talk about how wonderful GPU programming is. So I studied it
>> a bit and did a few examples to understand what it was like and posted
>> the examples. The usual lack of interest in any real development ensued.
>> But that involved AD so extra technical stuff besides simple GPU coding.
>>
>> So here is an example which involves really standard matrix-vector
>> calculations. I have isolated that into 3 functions which are in the
>> attached hcalcs. (One change is that the matrix h should be stored as a
>> vector.) This is archetypical code for adapting to GPU calculations. The
>> rest is up to whomever is so convinced that GPU calculations are great.
>> Of course I am happy to collaborate on the details.
>>
>>
>>> Dave,
>>> Great idea! Would you use OpenCL or CUDA? I believe double precision is an add-on for OpenCL; not sure about CUDA.
>>> Matthew
>>>
>>>
>>>
>>> On Apr 6, 2012, at 3:00 PM, developers-request at admb-project.org wrote:
>>>
>>>> A possible GPU project
>> -------------- next part --------------
>> An embedded and charset-unspecified text was scrubbed...
>> Name: hcalcs
>> URL:<http://lists.admb-project.org/pipermail/developers/attachments/20120411/f8810b6b/attachment-0001.ksh>
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: newfmin.zip
>> Type: application/zip
>> Size: 5111 bytes
>> Desc: not available
>> URL:<http://lists.admb-project.org/pipermail/developers/attachments/20120411/f8810b6b/attachment-0001.zip>
>>