[SOLVED] Problem with running a kernel via JME3 OpenCL, kernel arguments issues

Hello. I'm trying to use the JME3 OpenCL binding with LWJGL3. A kernel without any arguments runs, but a kernel with arguments doesn't. Have I misunderstood how to pass arguments to a kernel?

You can see from the output that test_no_args runs, but the other doesn’t.

    @Test
    void "build compile kernel"() {
        def anlKernel = injector.getInstance(AnlKernelFactory).create(context)
        anlKernel.buildLib()
        anlKernel.compileKernel("""
#include <opencl_utils.h>
#include <noise_gen.h>
#include <kernel.h>

kernel void test_no_args(
) {
    int id0 = get_global_id(0);
    printf("[test_no_args] id0=%d\\n", id0);
}

kernel void value_noise2D_noInterp(
global float2 *input,
global float *output
) {
    int id0 = get_global_id(0);
    printf("[value_noise2D_noInterp] id0=%d %f/%f\\n", id0, input[id0].x, input[id0].y);
    output[id0] = value_noise2D(input[id0], 200, noInterp);
}
""")
        anlKernel.createKernel("test_no_args")
        def workSize = new WorkSize(1)
        def event = anlKernel.run1(queue, workSize)
        event.waitForFinished()

        anlKernel.createKernel("value_noise2D_noInterp")
        def err = MemoryStack.stackMallocInt(1)
        long size = 1 * 4 // one float, 4 bytes
        def outputb = new LwjglBuffer(clCreateBuffer(clcontext, CL_MEM_WRITE_ONLY, size, err))
        checkCLError(err.get(0))
        workSize = new WorkSize(1)
        event = anlKernel.run1(queue, workSize, new Vector2f(0, 0), outputb)
        event.waitForFinished()
    }

    public Event run1(CommandQueue queue, WorkSize globalWorkSize, Object... args) {
        return kernel.Run1(queue, globalWorkSize, args);
    }
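For context, a varargs runner like run1 above ultimately has to turn each Java argument into something clSetKernelArg accepts: buffer objects become a cl_mem handle, while by-value arguments become a small block of raw bytes of the right size. This is a stdlib-only sketch of that type dispatch, with purely hypothetical names; it is not the actual jME3/LWJGL implementation:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of how a varargs kernel runner could map Java
// argument types to the raw bytes handed to clSetKernelArg. Buffer
// objects are not shown: they are passed as a cl_mem handle instead
// of inline bytes.
public class ArgSketch {
    static ByteBuffer toArgBytes(Object arg) {
        ByteBuffer buf;
        if (arg instanceof Float f) {
            buf = ByteBuffer.allocate(4).order(ByteOrder.nativeOrder());
            buf.putFloat(f);
        } else if (arg instanceof Integer i) {
            buf = ByteBuffer.allocate(4).order(ByteOrder.nativeOrder());
            buf.putInt(i);
        } else if (arg instanceof float[] v && v.length == 2) { // e.g. a float2 by value
            buf = ByteBuffer.allocate(8).order(ByteOrder.nativeOrder());
            buf.putFloat(v[0]).putFloat(v[1]);
        } else {
            throw new IllegalArgumentException("unsupported arg type: " + arg.getClass());
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        System.out.println(toArgBytes(1.5f).remaining());                // 4 bytes for a float
        System.out.println(toArgBytes(new float[]{0f, 0f}).remaining()); // 8 bytes for a float2
    }
}
```

The key point: a by-value argument and a pointer argument need completely different handling, which is exactly where my call went wrong below.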

Output:

Picked up _JAVA_OPTIONS: -Dawt.useSystemAAFontSettings=on -Dswing.aatext=true
[DEBUG] 626  [main] c.a.a.j.o.AnlKernel - Program created: 140643293669440 
[INFO ] 715  [main] c.a.a.j.o.LwjglProgramEx - Program compiled LwjglProgramEx[context=Context ([NVIDIA GeForce GTX 1050]),program=140643293669440] 
[DEBUG] 733  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'opencl_utils.h': 140643295491616 {} 
[DEBUG] 733  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'qsort.h': 140643295698368 {} 
[DEBUG] 734  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'utility.h': 140643295698560 {} 
[DEBUG] 734  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'hashing.h': 140643295698784 {} 
[DEBUG] 735  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'noise_lut.h': 140643295500288 {} 
[DEBUG] 735  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'noise_gen.h': 140643295500480 {} 
[DEBUG] 736  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'imaging.h': 140643300221920 {} 
[DEBUG] 737  [main] c.a.a.j.o.HeaderProgramsBuilder - Program created 'kernel.h': 140643295500672 {} 
[DEBUG] 738  [main] c.a.a.j.o.AnlKernel - Kernel program created: 140643294896192 
[INFO ] 755  [main] c.a.a.j.o.LwjglProgramEx - Program compiled LwjglProgramEx[context=Context ([NVIDIA GeForce GTX 1050]),program=140643294896192] 
[DEBUG] 755  [main] c.a.a.j.o.AnlKernel - Kernel program compiled: LwjglProgramEx[context=Context ([NVIDIA GeForce GTX 1050]),program=140643294896192] 
[INFO ] 1033 [main] c.a.a.j.o.LwjglProgramEx - Program linked LwjglProgramEx[context=Context ([NVIDIA GeForce GTX 1050]),program=140643293669440] 
[DEBUG] 1033 [main] c.a.a.j.o.AnlKernel - Kernel program linked: LwjglProgramEx[context=Context ([NVIDIA GeForce GTX 1050]),program=140643294415824] 
[DEBUG] 1035 [main] c.a.a.j.o.AnlKernel - Kernel created Kernel (test_no_args) 
[test_no_args] id0=0
[DEBUG] 1047 [main] c.a.a.j.o.AnlKernel - Kernel created Kernel (value_noise2D_noInterp) 

It must be something with the arguments, because the kernel below does run. I still pass the arguments, I just never access them.

kernel void value_noise2D_noInterp(
global float2 *input,
global float *output
) {
    int id0 = get_global_id(0);
    printf("[value_noise2D_noInterp] id0=%d\\n", id0);
//    printf("[value_noise2D_noInterp] id0=%d %f/%f\\n", id0, input[id0].x, input[id0].y);
//    output[id0] = value_noise2D(input[id0], 200, noInterp);
}

I was so stupid. The parameter is a float2* (a pointer to float2), but I was passing a plain float2 value (the Vector2f). Of course I need to pass a buffer instead. This works now.
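For reference, a global float2* parameter expects a buffer of interleaved x,y floats on the host side, so its size is N * 2 * Float.BYTES. A stdlib-only Java sketch of that layout (the point values are purely illustrative):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Stdlib-only sketch of the host-side layout behind a `global float2 *input`
// argument: N points packed as interleaved x,y floats, so the buffer must
// hold N * 2 * Float.BYTES bytes (here N = 3, values illustrative).
public class Float2Layout {
    public static void main(String[] args) {
        float[][] points = {{0f, 0f}, {1f, 2f}, {3f, 4f}};
        ByteBuffer bytes = ByteBuffer
                .allocateDirect(points.length * 2 * Float.BYTES)
                .order(ByteOrder.nativeOrder());
        FloatBuffer input = bytes.asFloatBuffer();
        for (float[] p : points) {
            input.put(p[0]).put(p[1]); // x, then y: matches float2 layout
        }
        input.flip();
        // input[1].y in the kernel corresponds to float index 1*2 + 1 here
        System.out.println(input.get(1 * 2 + 1)); // 2.0
        System.out.println(bytes.capacity());     // 24 bytes for 3 float2s
    }
}
```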

        anlKernel.createKernel("value_noise2D_noInterp")
        def err = stackMallocInt(1)
        long size = 1 * 4 // one float, 4 bytes
        def input = stackMallocFloat(2)
        input.put(0)
        input.put(0)
        input.flip()
        // the kernel only reads input, so CL_MEM_READ_ONLY, not CL_MEM_WRITE_ONLY
        def inputb = new LwjglBuffer(clCreateBuffer(clcontext, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, input, err))
        checkCLError(err.get(0))
        def outputb = new LwjglBuffer(clCreateBuffer(clcontext, CL_MEM_WRITE_ONLY, size, err))
        checkCLError(err.get(0))
        workSize = new WorkSize(1)
        event = anlKernel.run1(queue, workSize, inputb, outputb)
        event.waitForFinished()
        def out = stackMallocFloat(1)
        outputb.read(queue, MemoryUtil.memByteBuffer(out), size)
        println out.get(0)
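One detail in the read-back step: the device fills in raw bytes, which the host then views as floats, so the byte count and the float count must agree (1 float = Float.BYTES = 4 bytes) and use the same byte order. A stdlib-only sketch of that view, with a stand-in value in place of what outputb.read() would fill in:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch (stdlib only) of the read-back step: `size` raw bytes are
// filled in by the device and then viewed as floats on the host.
// The two views must agree on size and byte order.
public class ReadBack {
    public static void main(String[] args) {
        long size = 1 * Float.BYTES; // same arithmetic as `long size = 1 * 4`
        ByteBuffer raw = ByteBuffer.allocate((int) size).order(ByteOrder.nativeOrder());
        raw.putFloat(0, 0.4375f);    // stand-in for the device-written result
        System.out.println(raw.getFloat(0)); // read back through the float view
        System.out.println(size);
    }
}
```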