Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

tex object access always returns zero -- any ideas?

Tags:

cuda

I'm running CUDA 5.0, with compute_30,sm_30 set using a 670.

I create a mipmapped array via:

cudaExtent size;
size.width = window_width;      // 600
size.height = window_height;    // 600
size.depth = 1;
int levels = getMipMapLevels(size);
levels = MIN(levels, 9);        // 9
cudaChannelFormatDesc fp32;
fp32.f = cudaChannelFormatKindFloat;
fp32.x = fp32.y = fp32.z = fp32.w = 32;
cudaMipmappedArray_t A;
checkCuda(cudaMallocMipmappedArray(&A, &fp32, size, levels, cudaArraySurfaceLoadStore));

I load the first level of A with surf2Dwrites. I know that works since I copy that array to the host and dump it to an image file. I now wish to fill the other miplevels of A with the mipmaps. One iteration through that loop looks like:

width >>= 1; width = MAX(1, width);
height >>= 1; height = MAX(1, height);

cudaArray_t from, to;
checkCuda(cudaGetMipmappedArrayLevel(&from, A, newlevel-1));
checkCuda(cudaGetMipmappedArrayLevel(&to, A, newlevel));

cudaTextureObject_t from_texture;
create_texture_object(from, true, &from_texture);
cudaSurfaceObject_t to_surface;
create_surface_object(to, &to_surface);

dim3 blocksize(16, 16, 1);
dim3 gridsize((width+blocksize.x-1)/blocksize.x,(height+blocksize.y-1)/blocksize.y, 1);
d_mipmap<<<gridsize, blocksize>>>(to_surface, from_texture, width, height);

checkCuda(cudaDeviceSynchronize());
checkCuda(cudaGetLastError());
uncreate_texture_object(&from_texture);
uncreate_surface_object(&to_surface);

The create_surface_object() code is known to work. Just in case, here's the create_texture_object() code:

static void create_texture_object(cudaArray_t tarray, bool filter_linear, cudaTextureObject_t *tobject)
{
    assert(tarray && tobject);
    // build the resource
    cudaResourceDesc    color_res;
    memset(&color_res, 0, sizeof(cudaResourceDesc));
    color_res.resType = cudaResourceTypeArray;
    color_res.res.array.array = tarray;

    // the texture descriptor
    cudaTextureDesc     texdesc;
    memset(&texdesc, 0, sizeof(cudaTextureDesc));
    texdesc.addressMode[0] = cudaAddressModeClamp;
    texdesc.addressMode[1] = cudaAddressModeClamp;
    texdesc.addressMode[2] = cudaAddressModeClamp;
    texdesc.filterMode = filter_linear ? cudaFilterModeLinear : cudaFilterModePoint;
    texdesc.normalizedCoords = 1;

    checkCuda(cudaCreateTextureObject(tobject, &color_res, &texdesc, NULL));
}

The d_mipmap device function is the following:

__global__ void
d_mipmap(cudaSurfaceObject_t out, cudaTextureObject_t in, int w, int h)
{
    float x = blockIdx.x * blockDim.x + threadIdx.x;
    float y = blockIdx.y * blockDim.y + threadIdx.y;

    float dx = 1.0/float(w);
    float dy = 1.0/float(h);

    if ((x < w) && (y < h))
    {
#if 0
        float4 color = 
            (tex2D<float4>(in, (x + .25f) * dx, (y + .25f) * dy)) +
            (tex2D<float4>(in, (x + .75f) * dx, (y + .25f) * dy)) +
            (tex2D<float4>(in, (x + .25f) * dx, (y + .75f) * dy)) +
            (tex2D<float4>(in, (x + .75f) * dx, (y + .75f) * dy));
        color /= 4.0f;
        surf2Dwrite(color, mipOutput, x * sizeof(float4), y);
#endif
     float4 color0 = tex2D<float4>(in, (x + .25f) * dx, (y + .25f) * dy);
     surf2Dwrite(color0, out, x * sizeof(float4), y);
    }
}

That contains both the mipmap sampling code (if'd out) plus debugging code.

The problem is, color0 is always uniformly zero, and I've been unable to understand why. I've changed the filtering to point (from linear) with no success. I've checked for errors. Nothing.

I am using CUDA/OpenGL interop here, but the mipmap generation is being done on CUDA arrays only.

I really really do not want to have to use texture references.

Any suggestions on where to look?

like image 763
Walt Donovan Avatar asked Feb 28 '26 00:02

Walt Donovan


1 Answers

The bug turns out to be the use of cudaMipmappedArrays (either the array or the texture object -- I'm unable to tell which is broken.)

When I modify the code to use cudaArrays only, the texture reference starts working again.

Since the bindless texture program sample works, the bug appears to be limited to float32 channel mipmapped textures only. (I have a test program that shows the bug occurs with both 1 and 4 channel float32 mipmapped textures.)

I've reported the bug to Nvidia.

like image 80
Walt Donovan Avatar answered Mar 01 '26 22:03

Walt Donovan



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!