My use case is writing numbers to a JSON document in which size minimisation is more important than the precision of very small or very large numbers. The numbers typically represent everyday units such as milliseconds or metres, which tend to fall in the range [0.001, 1000].
Essentially I'd like to set a maximum character length. For example, if the limit were five characters, then:
from to
1234567 123e4
12345.6 12346
1234.56 1235
123.456 123.5
12.3456 12.35
1.23456 1.235
1.23450 1.235
1.23400 1.234
1.23000 1.23
1.20000 1.2
1.00000 1
0.11111 0.111
0.01111 0.011
0.00111 0.001
0.00011 11e-4
0.00001 1e-5
0.11111 0.111
0.01111 0.011
0.00111 0.001
0.00011 11e-4
0.00001 1e-5
This test case seems to convey the most information within a length constraint.
It does fail for numbers whose exponents fall outside the range [-99, 999], and that range will vary with the imposed limit. Perhaps the fallback in these rare cases is simply to write a longer string.
This is the ideal, though I could live without implementing it myself if another solution is relatively close, perhaps truncating instead of rounding, and not taking advantage of scientific/exponentiated notation.
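For illustration, here is a minimal sketch in C of one way the behaviour described above could be approximated. It is not a definitive implementation: the names format_limited, exp_form and strip_fraction_zeros are hypothetical, and the selection rule (try every fixed and exponent form that fits, keep the one closest in value to the input, prefer the shorter string on ties) is an assumption. Because it minimises the error, it does not reproduce every row of the table: 0.00111 comes out as 11e-4, and 0.00011 comes out as 11e-5, since 11e-4 denotes 0.0011.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Drop trailing zeros after a decimal point, and the point itself if bare. */
static void strip_fraction_zeros(char *s)
{
    size_t n = strlen(s);
    if (!strchr(s, '.')) return;
    while (n > 0 && s[n - 1] == '0') s[--n] = '\0';
    if (n > 0 && s[n - 1] == '.') s[--n] = '\0';
}

/* Turn printf's "%.*e" output (e.g. "1.23e+06") into an integer-mantissa
   form (e.g. "123e4"), which spends no characters on '.', '+' or padding. */
static void exp_form(double x, int prec, char *out, size_t out_size)
{
    char buf[64], digits[64];
    size_t n = 0;
    snprintf(buf, sizeof buf, "%.*e", prec, x);
    char *e = strchr(buf, 'e');
    int exponent = atoi(e + 1) - prec;   /* move the point past all digits */
    *e = '\0';
    for (char *p = buf; *p; ++p)
        if (*p != '.') digits[n++] = *p; /* a leading '-' is kept */
    digits[n] = '\0';
    /* Fold trailing zeros of the mantissa into the exponent. */
    while (n > 1 && digits[n - 1] == '0' && digits[n - 2] != '-') {
        digits[--n] = '\0';
        ++exponent;
    }
    snprintf(out, out_size, "%se%d", digits, exponent);
}

/* Writes the closest candidate of at most max_len characters into out.
   Returns 1 on success, 0 if nothing fits (the caller can then fall back
   to a longer string). Assumes a finite x. */
int format_limited(double x, int max_len, char *out, size_t out_size)
{
    char cand[64];
    double best_err = 0.0;
    int found = 0;
    assert(x - x == 0.0);                       /* rejects NaN and infinity */
    assert(max_len > 0 && max_len <= 32 && (size_t)max_len < out_size);
    for (int form = 0; form < 2; ++form) {      /* fixed forms first, then exponent */
        for (int prec = 0; prec <= max_len; ++prec) {
            if (form == 0) {
                snprintf(cand, sizeof cand, "%.*f", prec, x);
                strip_fraction_zeros(cand);
            } else {
                exp_form(x, prec, cand, sizeof cand);
            }
            size_t len = strlen(cand);
            if (len > (size_t)max_len) continue;
            double err = strtod(cand, NULL) - x;
            if (err < 0) err = -err;
            if (!found || err < best_err ||
                (err == best_err && len < strlen(out))) {
                strcpy(out, cand);
                best_err = err;
                found = 1;
            }
        }
    }
    return found;
}
```

Calling format_limited(1234567.0, 5, buf, sizeof buf) with a 16-byte buf fills it with 123e4. Rounding comes from printf itself, so a value such as 1.2345, which is stored slightly below 1.2345 in binary, may still round down.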
EDIT: here's what printf with %.3f, %.3g, and %.4g produce by comparison (code here):
printf("%.3f");
match 0 - 1.23457e+06 -> 1234567.000 expected 12e5
match 0 - 12345.6 -> 12345.600 expected 12346
match 0 - 1234.56 -> 1234.560 expected 1235
match 0 - 123.456 -> 123.456 expected 123.5
match 0 - 12.3456 -> 12.346 expected 12.35
match 1 - 1.23456 -> 1.235
match 0 - 1.2345 -> 1.234 expected 1.235
match 1 - 1.234 -> 1.234
match 0 - 1.23 -> 1.230 expected 1.23
match 0 - 1.2 -> 1.200 expected 1.2
match 0 - 1 -> 1.000 expected 1
match 1 - 0.11111 -> 0.111
match 1 - 0.01111 -> 0.011
match 1 - 0.00111 -> 0.001
match 0 - 0.00011 -> 0.000 expected 11e-4
match 0 - 1e-05 -> 0.000 expected 1e-5
match 1 - 0.11111 -> 0.111
match 1 - 0.01111 -> 0.011
match 1 - 0.00111 -> 0.001
match 0 - 0.00011 -> 0.000 expected 11e-4
match 0 - 1e-05 -> 0.000 expected 1e-5
printf("%.3g");
match 0 - 1.23457e+06 -> 1.23e+06 expected 12e5
match 0 - 12345.6 -> 1.23e+04 expected 12346
match 0 - 1234.56 -> 1.23e+03 expected 1235
match 0 - 123.456 -> 123 expected 123.5
match 0 - 12.3456 -> 12.3 expected 12.35
match 0 - 1.23456 -> 1.23 expected 1.235
match 0 - 1.2345 -> 1.23 expected 1.235
match 0 - 1.234 -> 1.23 expected 1.234
match 1 - 1.23 -> 1.23
match 1 - 1.2 -> 1.2
match 1 - 1 -> 1
match 1 - 0.11111 -> 0.111
match 0 - 0.01111 -> 0.0111 expected 0.011
match 0 - 0.00111 -> 0.00111 expected 0.001
match 0 - 0.00011 -> 0.00011 expected 11e-4
match 0 - 1e-05 -> 1e-05 expected 1e-5
match 1 - 0.11111 -> 0.111
match 0 - 0.01111 -> 0.0111 expected 0.011
match 0 - 0.00111 -> 0.00111 expected 0.001
match 0 - 0.00011 -> 0.00011 expected 11e-4
match 0 - 1e-05 -> 1e-05 expected 1e-5
printf("%.4g");
match 0 -> 1.23457e+06 -> 1.235e+06 expected 12e5
match 0 -> 12345.6 -> 1.235e+04 expected 12346
match 1 -> 1234.56 -> 1235
match 1 -> 123.456 -> 123.5
match 1 -> 12.3456 -> 12.35
match 1 -> 1.23456 -> 1.235
match 0 -> 1.2345 -> 1.234 expected 1.235
match 1 -> 1.234 -> 1.234
match 1 -> 1.23 -> 1.23
match 1 -> 1.2 -> 1.2
match 1 -> 1 -> 1
match 0 -> 0.11111 -> 0.1111 expected 0.111
match 0 -> 0.01111 -> 0.01111 expected 0.011
match 0 -> 0.00111 -> 0.00111 expected 0.001
match 0 -> 0.00011 -> 0.00011 expected 11e-4
match 0 -> 1e-05 -> 1e-05 expected 1e-5
match 0 -> 0.11111 -> 0.1111 expected 0.111
match 0 -> 0.01111 -> 0.01111 expected 0.011
match 0 -> 0.00111 -> 0.00111 expected 0.001
match 0 -> 0.00011 -> 0.00011 expected 11e-4
match 0 -> 1e-05 -> 1e-05 expected 1e-5
For packing numbers within a certain range into the smallest unsigned integer:
1) Subtract the smallest possible value. For example, if your numbers may range from 0.001 to 100000 and a specific number is 123.456, then subtract 0.001 to get 123.455.
2) Divide by the precision you care about. For example, if you care about thousandths then divide by 0.001. In this case the number 123.455 becomes 123455.
Once you've done this and have the smallest width unsigned integer, convert it to hexadecimal digits (or maybe "base 32 digits"). For the example above, 0.001 would become 0x00000000, 123.456 would become 0x0001E23F and 100000 would become 0x05F5E0FF.
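The two steps above can be sketched in C as follows. This is only an illustration: the function names pack_fixed and unpack_fixed are hypothetical, and adding 0.5 before the cast is an assumption made so that floating-point error in the division cannot drop the result by one.

```c
#include <assert.h>
#include <stdint.h>

/* Step 1: subtract the smallest possible value.
   Step 2: divide by the precision, giving a count of precision-sized units. */
uint32_t pack_fixed(double x, double min_value, double precision)
{
    assert(precision > 0.0 && x >= min_value);
    double units = (x - min_value) / precision;
    assert(units < 4294967295.0);     /* must fit in 32 bits */
    return (uint32_t)(units + 0.5);   /* round to nearest instead of truncating */
}

/* Inverse mapping: multiply by the precision and add the minimum back. */
double unpack_fixed(uint32_t packed, double min_value, double precision)
{
    return min_value + (double)packed * precision;
}
```

With a minimum and precision of 0.001, pack_fixed(123.456, 0.001, 0.001) yields 0x0001E23F, which can then be written as hexadecimal digits with printf and the PRIX32 macro from <inttypes.h>.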
If you want "variable precision", you can add a third step that splits the unsigned integer value into "value and shift count" form. For example:
shift_count = 0;
while(value > 0xFFF) {
value = value >> 1;
shift_count++;
}
Then you can concatenate with something like value = (value << 4) | shift_count.
In that way, you could compress your numbers down to 4 hexadecimal digits. For the examples above, 0.001 would become 0x0000 (exactly representing 0.001), 123.456 would become 0xF115 (actually representing 123.425) and 100000 would become 0xBEBF (actually representing 99975.169).
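As a sketch of the "value and shift count" form together with its inverse, the loop above can be wrapped in a pair of C functions. The names encode_shifted and decode_shifted are hypothetical, and the limit of 2^27 is an assumption that follows from storing the shift count in 4 bits: anything larger would need more than 15 shifts to fit in 12 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a value into 16 bits: 12 bits of mantissa, 4 bits of shift count. */
uint16_t encode_shifted(uint32_t value)
{
    unsigned shift_count = 0;
    assert(value < (UINT32_C(1) << 27));  /* keeps shift_count within 0..15 */
    while (value > 0xFFF) {
        value >>= 1;                      /* discard the lowest bit */
        shift_count++;
    }
    return (uint16_t)((value << 4) | shift_count);
}

/* Recover the approximate value: the discarded low bits come back as zeros. */
uint32_t decode_shifted(uint16_t code)
{
    return (uint32_t)(code >> 4) << (code & 0xF);
}
```

For the packed value 123455 this gives 0xF115, which decodes to 123424; multiplying by 0.001 and adding back the minimum of 0.001 gives the 123.425 quoted above.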