Ok, I know float16 is not a real primitive type, but it is simulated by Python/numpy. However, the question is: if it exists and Python/numpy allows using it in array multiplication through the numpy.dot() function, why doesn't OpenBLAS (or ATLAS) work properly with it? I mean, the multiplication works, but the parallel computation doesn't. Or, to put it another way (better, in my opinion): why does Python/numpy allow the use of float16 at all if we then cannot exploit the advanced functionality offered by OpenBLAS/ATLAS?
Numpy float16 is a strange and possibly evil beast. It is an IEEE 754 half-precision floating-point number with 1 sign bit, 5 bits of exponent and 10 bits of mantissa.
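To see that layout concretely, here is a minimal sketch (plain numpy, not from the original answer) that reinterprets the 16 bits of a float16 as an unsigned integer and splits out the three fields:
import numpy as np

# View the raw 16 bits of a float16 as a uint16 and split out the fields.
x = np.array([-1.5], dtype=np.float16)
bits = int(x.view(np.uint16)[0])    # bit pattern 0b1011111000000000

sign     = (bits >> 15) & 0x1       # 1 bit
exponent = (bits >> 10) & 0x1F      # 5 bits, biased by 15
mantissa = bits & 0x3FF             # 10 bits

# -1.5 == (-1)**1 * 2**(15 - 15) * (1 + 512/1024)
print(sign, exponent, mantissa)     # 1 15 512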
While it is a standard floating-point format, it is a newcomer and not in wide use. Some GPUs support it, but hardware support is not common in CPUs. Newer processors have instructions to convert between 16-bit and 32-bit floats, but no support for using it directly in arithmetic. Because of this, and because common lower-level languages lack a suitable data type, the 16-bit float is slower to use than its 32-bit counterpart.
Only a few tools support it. Usually, the 16-bit float is regarded as a storage format that is converted into a 32-bit float before use.
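A minimal sketch of that pattern, assuming nothing beyond plain numpy: keep the data compact as float16, but upcast to float32 before any real arithmetic.
import numpy as np

# Keep the bulk data compact as float16 (half the memory of float32)...
samples = np.random.random(1_000_000).astype(np.float16)

# ...but upcast before heavy arithmetic, so the CPU (and BLAS) can work
# on a natively supported type.
work = samples.astype(np.float32)
result = work * work                # fast float32 path

# Downcast again only if the lost precision is acceptable.
stored = result.astype(np.float16)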
Some benchmarks:
In [60]: r=random.random(1000000).astype('float32')
In [61]: %timeit r*r
1000 loops, best of 3: 435 us per loop
In [62]: r=random.random(1000000).astype('float16')
In [63]: %timeit r*r
100 loops, best of 3: 10.9 ms per loop
As a general rule, do not use it for anything other than compressed storage, and even then be aware of the compromise:
In [72]: array([3001], dtype='float16') - array([3000], dtype='float16')
Out[72]: array([ 0.], dtype=float16)
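The reason is the coarse spacing of float16 at that magnitude: around 3000 the adjacent representable values are 2 apart, so 3001 cannot be stored exactly and rounds to 3000. A quick check (plain numpy):
import numpy as np

print(np.finfo(np.float16).precision)   # ~3 decimal digits of precision
print(np.spacing(np.float16(3000)))     # 2.0 -> neighbouring values differ by 2
print(np.float16(3001))                 # 3000.0, so the subtraction above gives 0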