Convert/Quantize Float Range to Integer Range

Tags: integer, floating-point, rounding, quantization

Say I have a float in the range of [0, 1] and I want to quantize and store it in an unsigned byte. Sounds like a no-brainer, but in fact it's quite complicated:

The obvious solution looks like this:

unsigned char QuantizeFloat(float a)
{
  return (unsigned char)(a * 255.0f);
}

This works insofar as I get all numbers from 0 to 255, but the distribution of the integers is not even: the function returns 255 only if a is exactly 1.0f. Not a good solution.

If I do proper rounding I just shift the problem:

unsigned char QuantizeFloat(float a)
{
  return (unsigned char)(a * 255.0f + 0.5f);
}

Here the result 0 covers only half as much of the float range as any other number.

How do I do a quantization with an equal distribution over the floating-point range? Ideally I would like to get an equal distribution of integers if I quantize equally distributed random floats.
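To make the unevenness of both attempts concrete, here is a small counting sketch. The names QuantizeTruncate, QuantizeRound and count_hits are hypothetical and not from the question; the helper feeds evenly spaced inputs k/steps into a quantizer and counts how often a given byte comes out. It assumes steps is small enough that neighbouring inputs stay distinct as floats.

```c
#include <assert.h>
#include <stddef.h>

/* The two attempts from the question, renamed so they can coexist. */
unsigned char QuantizeTruncate(float a) { return (unsigned char)(a * 255.0f); }
unsigned char QuantizeRound(float a)    { return (unsigned char)(a * 255.0f + 0.5f); }

/* Hypothetical helper: counts how many of the inputs k/steps, for
   k = 0..steps, are mapped to `target` by the quantizer `q`. */
size_t count_hits(unsigned char (*q)(float), unsigned char target, size_t steps)
{
    size_t hits = 0;
    assert(steps > 0);  /* avoid dividing by zero below */
    for (size_t k = 0; k <= steps; ++k) {
        float a = (float)k / (float)steps;  /* evenly spaced in [0, 1] */
        if (q(a) == target)
            ++hits;
    }
    return hits;
}
```

With steps = 255000, count_hits(QuantizeTruncate, 255, steps) should see only the single input 1.0f, while count_hits(QuantizeRound, 0, steps) should come out at roughly half of what a middle byte such as 128 receives. By the same arithmetic, the rounding version also gives byte 255 only a half-width share.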

Any ideas?

Btw: Although my code is in C, the problem is language-agnostic. For the non-C people: just assume that float to int conversion truncates the float.

EDIT: Since we had some confusion here: I need a mapping that maps the smallest input float (0) to the smallest unsigned char, and the highest float of my range (1.0f) to the highest unsigned byte (255).

542

asked Mar 01 '09 15:03

Nils Pipenbrinck

2 Answers

How about a * 256.0f with a check to reduce 256 to 255? So something like:

return (unsigned char) (min(255, (int) (a * 256.0f)));

(For a suitable min function on your platform - I can't remember the C function for it.)

Basically you want to divide the range into 256 equal portions, which is what that should do. The edge case for 1.0 going to 256 and requiring rounding down is just because the domain is inclusive at both ends.
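As a minimal sketch of this idea in plain C, the version below clamps the input before scaling instead of calling min on the result, since standard C has no integer min function. The handling of out-of-range and NaN inputs is my own assumption and is not part of the answer.

```c
#include <assert.h>

/* Sketch of the 256-bucket mapping: byte i receives [i/256, (i+1)/256),
   and the closed upper end 1.0f is folded into byte 255. */
unsigned char QuantizeFloat(float a)
{
    assert(a == a);             /* assumption: NaN is not a valid input */
    if (a <= 0.0f) return 0;    /* assumption: clamp inputs below the range */
    if (a >= 1.0f) return 255;  /* 1.0f would scale to 256, so pin it to 255 */
    return (unsigned char)(a * 256.0f);  /* truncation selects the bucket, 0..255 */
}
```

For example, QuantizeFloat(0.5f) lands in bucket 128 and QuantizeFloat(1.0f) in bucket 255. Clamping first also keeps the float-to-integer conversion within the representable range.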

141

answered Sep 21 '22 20:09

Jon Skeet

I think what you are looking for is this:

unsigned char QuantizeFloat (float a)
{
  return (unsigned char) (a * 256.0f);
}

This will map uniform float values in [0, 1[ to uniform byte values in [0, 255]. All values in [i/256, (i+1)/256[ (that is, excluding (i+1)/256), for i in 0..255, are mapped to i. What might be undesirable is that 1.0f is scaled to 256.0f, which does not fit in an unsigned char: in C the conversion is then undefined behavior, and in practice it often wraps around to 0.
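The interval claim can be spot-checked at the bucket edges. The sketch below uses hypothetical helper names (bucket_edges_hold, all_buckets_hold) and only ever passes inputs strictly below 1.0f, so the problematic 1.0f case is never reached. The probe just below each upper edge is (i+1)/256 minus 2^-18, chosen because it is exactly representable as a float.

```c
#include <assert.h>

/* Truncating quantizer from the answer; only called here with a < 1.0f. */
unsigned char QuantizeFloat(float a)
{
    return (unsigned char)(a * 256.0f);
}

/* Returns 1 if byte i receives both the lower edge i/256 and a value
   just below the upper edge (i+1)/256, and 0 otherwise. */
int bucket_edges_hold(int i)
{
    assert(i >= 0 && i < 256);
    float lower = (float)i / 256.0f;
    /* (i+1)/256 - 2^-18, written as an exact ratio of small integers */
    float just_below_upper = (float)((i + 1) * 1024 - 1) / 262144.0f;
    return QuantizeFloat(lower) == i && QuantizeFloat(just_below_upper) == i;
}

/* Returns 1 if the edge check passes for every byte 0..255. */
int all_buckets_hold(void)
{
    for (int i = 0; i < 256; ++i)
        if (!bucket_edges_hold(i))
            return 0;
    return 1;
}
```

This checks only two points per bucket, so it illustrates the mapping rather than proving it.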

answered Sep 22 '22 20:09

cr333
