In an effort to teach myself OpenGL, I am working my way through the 5th edition of the Superbible.
I am currently trying to figure out how to combine HDR and MSAA (as described in chapter 9).
For HDR, the book suggests a method for adaptive tone mapping that is based on calculating the average luminance for a 5x5 convolution filter for each fragment.
For MSAA, the method used averages all samples by weights calculated from the sample distance.
My attempt at combining both, shown below, applies tone mapping to each sample and then averages the samples to compute the final fragment color.
Performance is (as perhaps should have been expected) terrible: at 25 lookups per sample, times 4 for 4xMSAA, that is 100 fetches per fragment, so I'm guessing the GPU spends much of its time looking up my FBO texture. Switching to the code path controlled by the use_HDR uniform drops performance from 400+ fps to under 10 fps for a simple scene.
My question is twofold:
1. Is this a sane method of performing tone mapping? If not, what would you suggest?
2. How should MSAA and convolution-based filters be combined? I'm guessing I'll have this problem again for any filter that needs to look up neighboring texels, i.e. pretty much anything like bloom, blur, etc.
Code:
#version 330

in Data
{
    vec4 position;
    vec4 normal;
    vec4 color;
    vec2 texCoord;
    mat4 mvp;
    mat4 mv;
} gdata;

out vec4 outputColor;

uniform sampler2DMS tex;
uniform sampler1D lum_to_exposure;
uniform samplerBuffer weights;
uniform int samplecount;
uniform bool use_HDR;

vec4 tone_map(vec4 color, float exp)
{
    return 1.0f - exp2(-color * exp);
}

// offsets of the 5x5 neighborhood, row by row
const ivec2 tc_offset[25] = ivec2[](
    ivec2(-2, -2), ivec2(-1, -2), ivec2(0, -2), ivec2(1, -2), ivec2(2, -2),
    ivec2(-2, -1), ivec2(-1, -1), ivec2(0, -1), ivec2(1, -1), ivec2(2, -1),
    ivec2(-2,  0), ivec2(-1,  0), ivec2(0,  0), ivec2(1,  0), ivec2(2,  0),
    ivec2(-2,  1), ivec2(-1,  1), ivec2(0,  1), ivec2(1,  1), ivec2(2,  1),
    ivec2(-2,  2), ivec2(-1,  2), ivec2(0,  2), ivec2(1,  2), ivec2(2,  2));

void main()
{
    ivec2 tex_size = textureSize(tex);
    ivec2 itexcoords = ivec2(floor(vec2(tex_size) * gdata.texCoord));
    outputColor = vec4(0.0f, 0.0f, 0.0f, 1.0f);

    // for each sample in the multi sample buffer...
    for (int i = 0; i < samplecount; i++)
    {
        // ... calculate exposure based on the corresponding sample of nearby texels
        vec4 sample;
        if (use_HDR)
        {
            sample = texelFetch(tex, itexcoords, i);

            // look up a 5x5 area around the current texel
            vec4 hdr_samples[25];
            for (int j = 0; j < 25; ++j)
            {
                // clamp to the last valid texel (size - 1); a fetch at "size" is out of range
                ivec2 coords = clamp(itexcoords + tc_offset[j], ivec2(0), tex_size - 1);
                hdr_samples[j] = texelFetch(tex, coords, i);
            }

            // weighted average of the surrounding texels (the weights sum to 273)
            vec4 area_color = (
                ( 1.0f * (hdr_samples[0] + hdr_samples[4] + hdr_samples[20] + hdr_samples[24])) +
                ( 4.0f * (hdr_samples[1] + hdr_samples[3] + hdr_samples[5] + hdr_samples[9]
                        + hdr_samples[15] + hdr_samples[19] + hdr_samples[21] + hdr_samples[23])) +
                ( 7.0f * (hdr_samples[2] + hdr_samples[10] + hdr_samples[14] + hdr_samples[22])) +
                (16.0f * (hdr_samples[6] + hdr_samples[8] + hdr_samples[16] + hdr_samples[18])) +
                (26.0f * (hdr_samples[7] + hdr_samples[11] + hdr_samples[13] + hdr_samples[17])) +
                (41.0f * (hdr_samples[12]))
                ) / 273.0f;

            // RGB to luminance formula : lum = 0.3R + 0.59G + 0.11B
            float area_luminance = dot(area_color.rgb, vec3(0.3, 0.59, 0.11));
            float exposure = texture(lum_to_exposure, area_luminance / 2.0).r;
            exposure = clamp(exposure, 0.02f, 20.0f);
            sample = tone_map(sample, exposure);
        }
        else
            sample = texelFetch(tex, itexcoords, i);

        // weight the sample based on its position
        float weight = texelFetch(weights, i).r;
        outputColor += sample * weight;
    }
}
I don't have a copy of the Superbible, so I don't know exactly what it proposes, but this approach seems very inefficient and imprecise: your 5x5 filter only accesses the i-th sample of each texel and ignores the other samples entirely.
For the filtering phase I'd go, as kvark already suggested, for a resolve into another texture using glBlitFramebuffer, so that all samples are accumulated in HDR. After that, run the filter into another HDR texture, probably as a separable filter to gain performance. You can push performance further by letting the GPU's bilinear filtering hardware blend neighboring texels in a single lookup.
This would give you a blurred texture that you could then sample in your tone mapping shader. This should vastly improve performance, at the cost of more memory.
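To make the separable-filter idea concrete, here is a small CPU-side sketch in Python (not shader code). It assumes a binomial kernel [1, 4, 6, 4, 1] / 16 rather than the 273-normalised kernel from the question, which is only approximately separable, and all function names are made up for illustration. It checks that a horizontal pass followed by a vertical pass matches the full 5x5 filter while needing 10 taps per texel instead of 25.

```python
# CPU-side illustration only; a real implementation would be two shader passes.
K1 = [1, 4, 6, 4, 1]  # assumed 1D binomial kernel, weights sum to 16


def clamp(v, lo, hi):
    return max(lo, min(v, hi))


def blur_1d(img, horizontal):
    """One pass: 5 taps per texel along a single axis, clamped at the edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for k, wgt in enumerate(K1):
                o = k - 2
                sx = clamp(x + o, 0, w - 1) if horizontal else x
                sy = y if horizontal else clamp(y + o, 0, h - 1)
                acc += wgt * img[sy][sx]
            out[y][x] = acc / 16.0
    return out


def blur_separable(img):
    """Horizontal pass followed by vertical pass: 5 + 5 taps per texel."""
    return blur_1d(blur_1d(img, True), False)


def blur_full_5x5(img):
    """Reference: 25 taps per texel using the outer product of K1 with itself."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j, wy in enumerate(K1):
                for i, wx in enumerate(K1):
                    sx = clamp(x + i - 2, 0, w - 1)
                    sy = clamp(y + j - 2, 0, h - 1)
                    acc += wx * wy * img[sy][sx]
            out[y][x] = acc / 256.0
    return out
```

On the GPU the two passes would each render into their own HDR texture, which is where the extra memory comes from.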
Note that other tone mapping operators exist, and that there is no 'ground truth' in this domain. You could choose a more performant approach by not using such a fine-grained luminosity estimate.
You could look at Matt Pettineo's recent blog post about tone mapping; it could give you hints about how to improve things, perhaps by using glGenerateMipmap to create the luminosity texture.
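The mipmap idea can be illustrated on the CPU. The sketch below is Python, not OpenGL, and it assumes a square power-of-two image and a plain 2x2 box filter at every level (drivers are free to filter differently, so treat it as an approximation of what mipmap generation produces). Under those assumptions the 1x1 top level is the mean luminance of the whole frame, which could stand in for a per-fragment neighborhood estimate. The function names are hypothetical.

```python
def luminance(rgb):
    """Same RGB-to-luminance weights as the shader in the question."""
    r, g, b = rgb
    return 0.3 * r + 0.59 * g + 0.11 * b


def downsample(level):
    """Halve a square image by averaging each 2x2 block (box filter)."""
    n = len(level) // 2
    return [[(level[2 * y][2 * x] + level[2 * y][2 * x + 1] +
              level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(n)] for y in range(n)]


def average_luminance(image):
    """Reduce a power-of-two square RGB image to its 1x1 mip level."""
    level = [[luminance(px) for px in row] for row in image]
    while len(level) > 1:
        level = downsample(level)
    return level[0][0]
```

A single global average gives one exposure for the whole frame; sampling a lower mip level instead of the top one would give a coarse local estimate.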
Regarding the specific issues about tone mapping with MSAA, the only thing I'm aware of is that it's recommended to tone map individual samples before the MSAA resolve, to prevent aliasing artifacts from appearing.
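A tiny numeric sketch shows why the order matters. It is plain Python with made-up sample values, a fixed exposure of 1.0, and equal sample weights (all assumptions), reusing the question's operator 1 - 2^(-color * exposure) on scalars.

```python
def tone_map(value, exposure):
    """Scalar version of the question's operator: 1 - 2^(-value * exposure)."""
    return 1.0 - 2.0 ** (-value * exposure)


def resolve(samples):
    """Equal-weight MSAA resolve (box filter over the samples)."""
    return sum(samples) / len(samples)


# Hypothetical edge pixel: two samples on a dark surface, two on a bright light.
samples = [0.0, 0.0, 16.0, 16.0]
exposure = 1.0

map_then_resolve = resolve([tone_map(s, exposure) for s in samples])
resolve_then_map = tone_map(resolve(samples), exposure)

print(map_then_resolve)  # close to 0.5: the half-covered edge ends up mid-grey
print(resolve_then_map)  # close to 1.0: the edge saturates and the AA gradient is lost
```

Because the operator is nonlinear, averaging first lets the bright samples dominate, so the resolved edge pixel looks almost as bright as a fully covered one.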
As far as I can see from your GLSL code, the weights for all samples of a pixel are equal. From that I conclude that the code is interested in the sum of those samples for each pixel, and that sum is just the average multiplied by the number of samples. This suggests at least two optimization techniques. Both use an intermediate single-sampled texture, which your code would sample instead of the original multisampled one:
1. (Matching exactly what you are doing.) Produce the intermediate texture with a shader that writes the average of the samples for each pixel.
2. (A quick approximation.) Let the intermediate texture be just the resolved original one. This can be done efficiently by calling glBlitFramebuffer(). It will produce a slightly different result (because the sample locations are not on a grid), but for your task, HDR, it shouldn't matter, as it's all pretty much an approximation :)
Good luck!