The built-in Unity shaders support a technique for encoding and decoding a 32-bit RGBA value into a 32-bit float. This is done by multiplying each channel by the highest possible value of the channel before it. Some loss of precision is expected since the result is stored in a float.
The shader clearly applies some optimization that I am trying to understand.
The shader code in UnityCG.cginc looks like this:
// Encoding/decoding [0..1) floats into 8 bit/channel RGBA. Note that 1.0 will not be encoded properly.
inline float4 EncodeFloatRGBA( float v )
{
    float4 kEncodeMul = float4(1.0, 255.0, 65025.0, 16581375.0);
    float kEncodeBit = 1.0/255.0;
    float4 enc = kEncodeMul * v;
    enc = frac (enc);
    enc -= enc.yzww * kEncodeBit;
    return enc;
}
inline float DecodeFloatRGBA( float4 enc )
{
    float4 kDecodeDot = float4(1.0, 1/255.0, 1/65025.0, 1/16581375.0);
    return dot( enc, kDecodeDot );
}
So my question:
f = R + 255 * G + 65025 * B + 16581375 * A
would not give a compatible result. Why this choice?

From inspection, the Unity code looks like it wants to convert float values in the range [0.0, 1.0) (not including 1.0) into four float values, also in [0.0, 1.0), such that those values can be converted into integer values from 0 to 255 by multiplying by 255.
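That intent can be sketched in Python as a direct translation of the shader math (this is an illustration of the arithmetic, not Unity's actual API; `frac` becomes `math.modf`, and the `enc.yzww` swizzle becomes an index lookup):

```python
import math

def encode_float_rgba(v):
    """Translation of Unity's frac-based EncodeFloatRGBA into plain Python."""
    k_encode_mul = (1.0, 255.0, 65025.0, 16581375.0)
    k_encode_bit = 1.0 / 255.0
    # frac(kEncodeMul * v): keep only the fractional part of each scaled value
    enc = [math.modf(k * v)[0] for k in k_encode_mul]
    # enc -= enc.yzww * kEncodeBit: subtract the next channel's contribution
    # (the RHS reads the pre-update values, matching HLSL vector semantics)
    return [enc[i] - enc[j] * k_encode_bit for i, j in zip(range(4), (1, 2, 3, 3))]

def decode_float_rgba(enc):
    """Translation of DecodeFloatRGBA: a dot product with reciprocal weights."""
    k_decode_dot = (1.0, 1 / 255.0, 1 / 65025.0, 1 / 16581375.0)
    return sum(e * k for e, k in zip(enc, k_decode_dot))
```

In double precision this round-trips almost exactly (the algebra works out to `v - frac(16581375*v)/255^4`, an error below 2.4e-10); the trouble described below appears once the channels are quantized to 8 bits.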
But, dang, you are really correct to be skeptical about this code. It has many flaws (but usually produces results close enough to be mostly usable).
The reason they multiply by 255 instead of 256 is that they have the erroneous belief that they can get reasonable results by keeping values as floats (and plan to convert the floats to 0-255 valued integers at a later time, as others have mentioned in the comments). But then they use that frac() call. You need to recognize floating-point code that looks like this as having bad code smell™.
Correct code would look something like this:
inline float4 EncodeFloatRGBA(float v)
{
    var vi = (uint)(v * (256.0f * 256.0f * 256.0f * 256.0f));
    var ex = (int)(vi / (256 * 256 * 256) % 256);
    var ey = (int)((vi / (256 * 256)) % 256);
    var ez = (int)((vi / (256)) % 256);
    var ew = (int)(vi % 256);
    var e = float4(ex / 255.0f, ey / 255.0f, ez / 255.0f, ew / 255.0f);
    return e;
}
and
inline float DecodeFloatRGBA(float4 enc)
{
    var ex = (uint)(enc.x * 255);
    var ey = (uint)(enc.y * 255);
    var ez = (uint)(enc.z * 255);
    var ew = (uint)(enc.w * 255);
    var v = (ex << 24) + (ey << 16) + (ez << 8) + ew;
    return v / (256.0f * 256.0f * 256.0f * 256.0f);
}
The Unity code fails to accurately do a round trip about 23% of the time given random input (it fails about 90% of the time if you don't use extra processing like rounding the encoded values after multiplying by 255). The code above works 100% of the time.
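For illustration, the same integer-based scheme can be sketched in Python (bit-shifts and masks stand in for the divisions and moduli above; `round()` on decode is my addition to guard against tiny float error when recovering each byte):

```python
def encode_exact(v):
    """Scale v in [0,1) to a 32-bit integer and split it into four bytes."""
    vi = int(v * 2**32)                  # truncate, as the (uint) cast does
    ex = (vi >> 24) & 0xFF               # highest byte
    ey = (vi >> 16) & 0xFF
    ez = (vi >> 8) & 0xFF
    ew = vi & 0xFF                       # lowest byte
    return (ex / 255.0, ey / 255.0, ez / 255.0, ew / 255.0)

def decode_exact(enc):
    """Recover the bytes, reassemble the 32-bit integer, rescale to [0,1)."""
    ex, ey, ez, ew = (round(c * 255) for c in enc)
    vi = (ex << 24) | (ey << 16) | (ez << 8) | ew
    return vi / 2**32
```

Any value with at most 32 fractional bits (which covers every normalized float32 in [0,1) whose bits fit) survives the round trip bit-exactly, e.g. `decode_exact(encode_exact(0.5)) == 0.5`.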
Note that 32-bit floats only have 23 bits of precision, so the 32-bit RGBA values will have leading or trailing 0 bits. The cases where you care about the trailing bits when you have 0s at the start are probably few and far between, so you could probably simplify the code to not use the ew values at all and encode as RGB instead of RGBA.
<rant>
All in all, I find the Unity code disturbing because it tries to reinvent something we already have. We have a nice IEEE 754 standard for encoding floats into 32-bit values, and RGBA is usually at least 32 bits (the Unity code certainly assumes it is). I'm not sure why they don't just plop the float into the RGBA (you could still use an intermediate float4 as the code below does if you want). If you just put the float into the RGBA, you don't have to worry about the 23 bits of precision and you are not limited to values between 0.0 and 1.0. You can even encode infinities and NaNs. That code looks like:
inline float4 EncodeFloatRGBA(float v)
{
    byte[] eb = BitConverter.GetBytes(v);
    if (BitConverter.IsLittleEndian)
    {
        return float4(eb[3] / 255.0f, eb[2] / 255.0f, eb[1] / 255.0f, eb[0] / 255.0f);
    }
    return float4(eb[0] / 255.0f, eb[1] / 255.0f, eb[2] / 255.0f, eb[3] / 255.0f);
}
and
inline float DecodeFloatRGBA(float4 enc)
{
    var eb = BitConverter.IsLittleEndian ?
        new[] { (byte)(enc.w * 255), (byte)(enc.z * 255),
                (byte)(enc.y * 255), (byte)(enc.x * 255) } :
        new[] { (byte)(enc.x * 255), (byte)(enc.y * 255),
                (byte)(enc.z * 255), (byte)(enc.w * 255) };
    return BitConverter.ToSingle(eb, 0);
}
</rant>