Ian Taylor ian.taylor at noaa.gov
Mon May 14 14:45:03 PDT 2012

Hi Dave,
I'd like to say I had some success recreating your example, but instead
I've had a series of failures that probably only indicate two things: your
code doesn't yet work well on Windows and/or I have no idea what I'm doing.

I started on a Linux virtual machine, but didn't get past the stage of
installing OpenCL because GPU calcs are apparently not supported on virtual
machines. Then I tried a Linux cluster, but it doesn't have a supported
GPU. Then I tried using the BFGS update on the CPU on the Linux virtual
machine. However, it's not as easy as just setting USE_GPU_FLAG=0 because
without being able to include OpenCL and CL, you get lots of errors like
 "newfmin.cpp:247:3: error: ‘cl_int’ does not name a type"
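
One way a CPU-only build could avoid the 'cl_int' errors is to guard the
OpenCL include and supply a stand-in type when the GPU path is off. This is
only a sketch: it assumes USE_GPU_FLAG is (or can be made) a compile-time
macro, which may not be how newfmin.cpp uses it, and gpu_enabled() is a
hypothetical helper that does not exist in the ADMB source.

```cpp
#include <cstdint>

// Assumption: USE_GPU_FLAG is a preprocessor macro set to 0 or 1.
#ifndef USE_GPU_FLAG
#define USE_GPU_FLAG 0
#endif

#if USE_GPU_FLAG
#include <CL/cl.h>            // real OpenCL header: defines cl_int etc.
#else
// Stand-in so declarations that mention cl_int still compile without
// OpenCL installed. OpenCL defines cl_int as a signed 32-bit integer.
typedef std::int32_t cl_int;
#endif

// Hypothetical helper (not in newfmin.cpp): reports which path was built.
inline bool gpu_enabled() { return USE_GPU_FLAG != 0; }
```

Any code that actually calls OpenCL functions would still need its own
#if USE_GPU_FLAG guard; the typedef only covers declarations.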

I then tried compiling on some Windows computers (one with an Nvidia GPU and
MS Visual C++, and another with an AMD GPU and MinGW). Those efforts
didn't bring any more luck. With both Visual C++ and MinGW, I got similar
errors in newfmin.cpp (pasted below). I don't know enough to make the
errors go away, so I think my next step will be to wait until I'm in front
of a non-virtual Linux computer.

#### some errors while compiling the new newfmin.cpp in Windows ####

VC error snippet:

..\..\..\..\src\linad99\newfmin.cpp(214) : error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
..\..\..\..\src\linad99\newfmin.cpp(214) : error C2146: syntax error : missing ';' before identifier 'MAX_SOURCE_SIZE'
..\..\..\..\src\linad99\newfmin.cpp(258) : warning C4512: 'opencl_manager' : assignment operator could not be generated

MinGW error snippet:

newfmin.cpp:214:9: error: 'uint' does not name a type
newfmin.cpp: In constructor 'opencl_manager::opencl_manager()':
newfmin.cpp:282:40: error: class 'opencl_manager' does not have any field
newfmin.cpp: In member function 'cl_int opencl_manager::LoadKernelSource(const c
newfmin.cpp:298:31: error: 'MAX_SOURCE_SIZE' was not declared in this scope
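
The line-214 errors from both compilers are consistent with 'uint' being
used as the type of MAX_SOURCE_SIZE. 'uint' is not a standard C++ type;
glibc's headers typedef it, which is presumably why it is accepted on
Linux. A portable sketch, assuming that is what line 214 looks like, is to
spell the type out. The value below is a placeholder, not the one in
newfmin.cpp, and source_fits() is a hypothetical helper added only to show
the constant in use.

```cpp
#include <cstddef>

// Assumption: line 214 declares MAX_SOURCE_SIZE with the non-standard
// type 'uint'. Spelling it as 'unsigned int' works on MSVC, MinGW and GCC.
// Placeholder value; the real one in newfmin.cpp may differ.
const unsigned int MAX_SOURCE_SIZE = 0x100000;

// Hypothetical helper: true if a kernel source of 'len' bytes would fit
// in a buffer of MAX_SOURCE_SIZE bytes.
inline bool source_fits(std::size_t len) {
  return len <= MAX_SOURCE_SIZE;
}
```

If this guess is right, the later "'MAX_SOURCE_SIZE' was not declared"
error should go away with the same change, since it follows from the
failed declaration.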

On Sat, May 12, 2012 at 8:31 AM, dave fournier <davef at otter-rsch.com> wrote:

> Has anyone else actually got this example to work?
> Some advice. Older GPUs (whatever that is) probably
> do not support double precision.
> WRT using the BFGS update on the CPU. It does not seem
> to perform as well as doing it on the GPU. I think this is
> due to roundoff error.  The CPU is carrying out additions in a different
> way. It may be that with say 4K or more parameters and this
> (artificial) example roundoff error becomes important.
> I stored the matrix by rows. It now appears that it should be stored
> by columns for the fastest matrix * vector multiplication.
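
To make the row/column distinction concrete, here is a plain CPU sketch of
y = A*x for both layouts. The function names are hypothetical and this is
not the kernel from the example; it only shows how the index arithmetic
changes. One possible reason for the GPU preference (my assumption, not
something stated above) is that with one work-item per output row,
column storage makes neighbouring work-items read neighbouring memory.

```cpp
#include <cstddef>
#include <vector>

// A is n x m, stored by rows: element (i,j) lives at a[i*m + j].
std::vector<double> matvec_row_major(const std::vector<double>& a,
                                     const std::vector<double>& x,
                                     std::size_t n, std::size_t m) {
  std::vector<double> y(n, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < m; ++j)
      y[i] += a[i * m + j] * x[j];
  return y;
}

// A is n x m, stored by columns: element (i,j) lives at a[j*n + i].
std::vector<double> matvec_col_major(const std::vector<double>& a,
                                     const std::vector<double>& x,
                                     std::size_t n, std::size_t m) {
  std::vector<double> y(n, 0.0);
  for (std::size_t j = 0; j < m; ++j)
    for (std::size_t i = 0; i < n; ++i)
      y[i] += a[j * n + i] * x[j];
  return y;
}
```

Both functions return the same product when each is given the matrix in
its own layout; only the memory access pattern differs.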
