I am aware of two projects:
https://github.com/lsegal/barracuda
which hasn't been updated since 01/11, and
http://rubyforge.org/projects/ruby-opencl/
which hasn't been updated since 03/10.
Are these projects dead, or have they simply not changed because they work and neither OpenCL nor Ruby has changed much since then? Is anybody using them? Any luck?
If not, can you recommend another OpenCL gem for Ruby? Or how is this sort of call usually done? Do you just call raw C from Ruby?
You can try opencl_ruby_ffi. It is actively developed (by a colleague of mine) and works well with OpenCL 1.2. OpenCL 2.0 support should also be available soon.
sudo gem install opencl_ruby_ffi
The Khronos forum has a quick example that shows how it works:
require 'opencl_ruby_ffi'
# select the first platform/device available
# refine this if you have multiple GPUs on your machine
platform = OpenCL::platforms.first
device = platform.devices.first
# prepare the source of GPU kernel
# this is not Ruby but OpenCL C
source = <<EOF
__kernel void addition(float2 alpha, __global const float *x, __global float *y) {
  size_t ig = get_global_id(0);
  y[ig] = (alpha.s0 + alpha.s1 + x[ig])*0.3333333333333333333f;
}
EOF
# configure OpenCL environment, refer to OCL API if necessary
context = OpenCL::create_context(device)
queue = context.create_command_queue(device, :properties => OpenCL::CommandQueue::PROFILING_ENABLE)
# create and compile the OpenCL C source code
prog = context.create_program_with_source(source)
prog.build
# allocate CPU (=RAM) buffers and
# fill the input one with random values
a_in = NArray.sfloat(65536).random(1.0)
a_out = NArray.sfloat(65536)
# allocate GPU buffers matching the CPU ones
b_in = context.create_buffer(a_in.size * a_in.element_size, :flags => OpenCL::Mem::COPY_HOST_PTR, :host_ptr => a_in)
b_out = context.create_buffer(a_out.size * a_out.element_size)
# create a constant pair of floats
f = OpenCL::Float2::new(3.0,2.0)
# enqueue kernel 'addition' over 65536 work-items, in work-groups of 128
event = prog.addition(queue, [65536], f, b_in, b_out,
                      :local_work_size => [128])
# Or, if you want to be more OpenCL-like:
# k = prog.create_kernel("addition")
# k.set_arg(0, f)
# k.set_arg(1, b_in)
# k.set_arg(2, b_out)
# event = queue.enqueue_NDrange_kernel(k, [65536],:local_work_size => [128])
# tell OCL to transfer the content of GPU buffer b_out
# to the CPU memory (a_out), but only after `event` (= kernel execution)
# has completed
queue.enqueue_read_buffer(b_out, a_out, :event_wait_list => [event])
# wait for everything in the command queue to finish
queue.finish
# now a_out contains the result of the addition performed on the GPU
# add some cleanup here ...
# verify that the computation went well
diff = (a_in - a_out*3.0)
65536.times { |i|
  raise "Computation error #{i} : #{diff[i]+f.s0+f.s1}" if (diff[i]+f.s0+f.s1).abs > 0.00001
}
puts "Success!"
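To make the final check easier to follow, here is a minimal plain-Ruby sketch of the same arithmetic the kernel performs, with no OpenCL or NArray involved. The helper name `addition_reference` and the 8-element input are assumptions made for illustration; they are not part of opencl_ruby_ffi. Ruby computes in double precision while the kernel uses single-precision floats, so GPU results will only match to within the tolerance used above.

```ruby
# CPU reference for the 'addition' kernel: y[i] = (alpha.s0 + alpha.s1 + x[i]) / 3
# addition_reference is a hypothetical helper, not an opencl_ruby_ffi call.
def addition_reference(alpha, xs)
  s0, s1 = alpha
  xs.map { |x| (s0 + s1 + x) / 3.0 }
end

alpha = [3.0, 2.0]           # same values as OpenCL::Float2::new(3.0, 2.0)
xs = Array.new(8) { rand }   # small stand-in for the 65536-element input
ys = addition_reference(alpha, xs)

# Same test as the GPU script: x - 3*y + s0 + s1 should be close to 0.
xs.each_with_index do |x, i|
  residual = x - ys[i] * 3.0 + alpha[0] + alpha[1]
  raise "Computation error #{i} : #{residual}" if residual.abs > 0.00001
end
puts "Reference check passed"
```

Comparing `a_out` against such a reference on a small input is a quick way to tell a kernel bug from a buffer-transfer problem.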
You may want to package whatever C functionality you need as a gem. This is pretty straightforward, and this way you can wrap all your C logic in a specific namespace that you can reuse in other projects.
http://guides.rubygems.org/c-extensions/
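As a rough sketch of what that packaging looks like, a gem declares its C extension through the `extensions` field of its gemspec. The gem name `my_ocl` and the file layout below are hypothetical, chosen only for illustration; the guide linked above covers the full workflow.

```ruby
# my_ocl.gemspec: hypothetical gem name and file layout, for illustration only.
require 'rubygems'

spec = Gem::Specification.new do |s|
  s.name    = 'my_ocl'
  s.version = '0.1.0'
  s.summary = 'Hypothetical wrapper around some C/OpenCL code'
  s.authors = ['Your Name']
  s.files   = ['lib/my_ocl.rb', 'ext/my_ocl/extconf.rb', 'ext/my_ocl/my_ocl.c']
  # Listing an extconf.rb here makes `gem install` compile the C sources.
  s.extensions = ['ext/my_ocl/extconf.rb']
end

# ext/my_ocl/extconf.rb would typically contain just:
#   require 'mkmf'
#   create_makefile('my_ocl/my_ocl')
puts spec.full_name
```

The C source itself would define an `Init_my_ocl` function that registers the module and methods under your chosen namespace.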