To explain: this is a way to shrink floating-point vector data into 8-bit or 16-bit signed or unsigned integers that share a single common unsigned exponent (the most common being bs16 for precision, with a common exponent of 11).
I'm not sure what this pseudo-float method is called; all I know is to get the resulting float, you need to do this:
float_result = int_value / ( 2.0 ** exponent )
What I'd like to do is match this data by basically guessing the exponent by attempting to re-calculate it from the given floats. (if done properly, it should be able to be re-calculated in other formats as well)
So if all I'm given is a large group of 1140 floats to work with, how can I find the common exponent and convert these floats into this shrunken bu8, bs8, bu16, or bs16 (specified) format?
EDIT: samples
>>> for value in array('h','\x28\xC0\x04\xC0\xF5\x00\x31\x60\x0D\xA0\xEB\x80'):
...     print( value / ( 2. ** 11 ) )
-7.98046875
-7.998046875
0.11962890625
12.0239257812
-11.9936523438
-15.8852539062
EDIT2: I wouldn't exactly call this "compression", as all it really is, is an extracted mantissa to be re-computed via the shared exponent.
Maybe something like this:
def validExponent(x,e,a,b):
    """checks if x*2.0**e is an integer in range [a,b]"""
    y = x*2.0**e
    return a <= y <= b and y == int(y)

def allValid(xs,e,a,b):
    return all(validExponent(x,e,a,b) for x in xs)

def firstValid(xs,a,b,maxE = 100):
    # try exponents 0..maxE and return the smallest one that works
    for e in range(1+maxE):
        if allValid(xs,e,a,b):
            return e
    return "None found"

#test:
xs = [x / ( 2. ** 11 ) for x in [-12,14,-5,16,28]]
print(xs)
print(firstValid(xs,-2**15,2**15-1))
Output:
[-0.005859375, 0.0068359375, -0.00244140625, 0.0078125, 0.013671875]
11
You could of course write a wrapper function which will take a string argument such as 'bs16' and automatically compute the bounds a, b.
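A minimal sketch of such a wrapper, assuming the format names follow the pattern in the question ('b', then 'u' or 's' for unsigned/signed, then the bit width). The names bounds_for and first_valid_exponent are made up for this illustration, and the search loop is repeated here only so the snippet runs on its own:

```python
def bounds_for(fmt):
    """Return (a, b), the integer range for a name like 'bu8' or 'bs16'.

    Assumption: names are 'b' + 'u'/'s' + bit width, as in the question.
    """
    if len(fmt) < 3 or fmt[0] != 'b' or fmt[1] not in 'us' or not fmt[2:].isdigit():
        raise ValueError("unrecognised format name: %r" % (fmt,))
    bits = int(fmt[2:])
    if fmt[1] == 's':
        return -2 ** (bits - 1), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def first_valid_exponent(xs, fmt, max_e=100):
    """Smallest e in 0..max_e for which every x*2**e is an in-range integer."""
    a, b = bounds_for(fmt)
    for e in range(max_e + 1):
        scaled = [x * 2.0 ** e for x in xs]
        if all(a <= y <= b and y == int(y) for y in scaled):
            return e
    return None  # nothing fits this format


xs = [x / (2. ** 11) for x in [-12, 14, -5, 16, 28]]
print(first_valid_exponent(xs, 'bs16'))
```

An unsigned format such as 'bu8' gives None for this data, since the negative values can never land in [0, 255].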
On Edit:
1) If you have the exact values of the floats the above should work. If anything has introduced any round-off error you might want to replace y == int(y) by abs(y-round(y)) < 0.00001 (or something similar).
2) The first valid exponent will be the exponent you want unless all of the integers in the original integer list are even. If you have 1140 values and they are in some sense random, the chance of this happening is vanishingly small.
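Both points can be sketched together. This is only an illustration: the function name first_valid_tolerant and the default tolerance are assumptions, and a tolerance this loose would wrongly accept floats that are themselves smaller than the tolerance.

```python
def first_valid_tolerant(xs, a, b, max_e=100, tol=1e-5):
    """Like the exact search, but accepts x*2**e within tol of an integer."""
    for e in range(max_e + 1):
        scaled = [x * 2.0 ** e for x in xs]
        if all(a <= y <= b and abs(y - round(y)) < tol for y in scaled):
            return e
    return None


# Point 1: values carrying a little round-off noise still resolve to 11.
noisy = [v / 2.0 ** 11 + 1e-9 for v in [-12, 14, -5, 16, 28]]
print(first_valid_tolerant(noisy, -2 ** 15, 2 ** 15 - 1))

# Point 2: if every original integer is a multiple of 4, the search
# stops two exponents early (9 instead of 11).
all_even = [v / 2.0 ** 11 for v in [4, -8, 12]]
print(first_valid_tolerant(all_even, -2 ** 15, 2 ** 15 - 1))
```

In the second case exponent 9 is still a valid encoding of the same floats (as 1, -2, 3); it just isn't the exponent the data was originally written with.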
On Further Edit: If the floats in question are not generated by this process but you want to find an optimal exponent which allows for (lossy) compression to ints of a given size you can do something like this (not thoroughly tested):
import math

def maxExp(x,a,b):
    """returns largest nonnegative integer exponent e with
    a <= x*2**e <= b, where a, b are integers with a <= 0 and b > 0
    Throws an error if no such e exists"""
    if x == 0.0:
        # 0 fits for every exponent, so there is no largest e; this raises below
        e = -1
    elif x < 0.0:
        e = -1 if a == 0 else math.floor(math.log(a/float(x),2))
    else:
        e = math.floor(math.log(b/float(x),2))
    if e >= 0:
        return int(e)
    else:
        raise ValueError()

def bestExponent(floats,a,b):
    # only the extremes limit how large the exponent can be
    m = min(floats)
    M = max(floats)
    e1 = maxExp(m,a,b)
    e2 = maxExp(M,a,b)
    # mean squared rounding error for each candidate exponent
    MSE = []
    for e in range(1+min(e1,e2)):
        MSE.append(sum((x - round(x*2.0**e)/2.0**e)**2 for x in floats)/float(len(floats)))
    minMSE = min(MSE)
    for e,error in enumerate(MSE):
        if error == minMSE:
            return e
To test it:
>>> import random
>>> xs = [random.uniform(-10,10) for i in range(1000)]
>>> bestExponent(xs,-2**15,2**15-1)
11
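It seems like the common exponent 11 is chosen for a reason: for values with magnitude around 10, it is the largest exponent that still fits a signed 16-bit integer (10 * 2**11 = 20480, while 10 * 2**12 = 40960 exceeds 32767).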
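Once an exponent is known, the conversion itself is just scale-and-round. A minimal sketch using the standard array module, with the hypothetical helper names encode and decode; the mapping from the question's format names to array typecodes is an assumption, and byte order is not handled here (array uses the machine's native order):

```python
from array import array

# Assumed mapping from the question's format names to array typecodes.
TYPECODES = {'bs8': 'b', 'bu8': 'B', 'bs16': 'h', 'bu16': 'H'}


def encode(floats, exponent, fmt='bs16'):
    """Scale by 2**exponent and round to the nearest integer.

    array() raises OverflowError if a scaled value does not fit the type.
    """
    scale = 2.0 ** exponent
    return array(TYPECODES[fmt], [int(round(x * scale)) for x in floats])


def decode(ints, exponent):
    """Inverse step from the question: int_value / (2.0 ** exponent)."""
    scale = 2.0 ** exponent
    return [v / scale for v in ints]


floats = [-7.98046875, -7.998046875, 0.11962890625]  # from the samples
packed = encode(floats, 11)
print(packed.tolist())
print(decode(packed, 11) == floats)
```

For floats that did not come from this format in the first place, decode(encode(...)) only returns an approximation, which is the lossy case described above.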
If you've got the original integer values and the corresponding float results, you can use a logarithm to find the exponent. The math module has a log function you can use: take the base-2 log of int_value / float_result (for any pair where both are nonzero).
E.g.:
import math
x = int_value / float_result
math.log(x, 2)
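As a quick check against the first sample from the question (the integer -16344 decodes to -7.98046875), here is a small sketch. The function name exponent_from_pair is made up for this example, and the result is rounded because math.log(x, 2) is not guaranteed to return an exact integer:

```python
import math


def exponent_from_pair(int_value, float_result):
    """Recover the shared exponent from one known (integer, float) pair."""
    if int_value == 0 or float_result == 0:
        raise ValueError("a zero pair carries no information about the exponent")
    ratio = int_value / float(float_result)  # equals 2**exponent
    return int(round(math.log(ratio, 2)))


print(exponent_from_pair(-16344, -7.98046875))
```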