
Half-precision: Difference between __float2half vs __float2half_rn

Tags:

cuda

There seems to be no documentation about these two functions.

What is the difference between __float2half and __float2half_rn?

user2970139 asked Mar 14 '23 09:03


1 Answer

It seems the CUDA documentation is indeed a bit inadequate here.

The function unsigned short __float2half_rn(float), together with its counterpart float __half2float(unsigned short x), was already present in CUDA before the new half datatype was introduced in CUDA 7.5. It is defined in device_functions.h, where the comment reads:

Convert the single-precision float value x to a half-precision floating point value represented in unsigned short format, in round-to-nearest-even mode.

The function half __float2half(float) is defined in cuda_fp16.h and apparently does the same thing, but returns a half. Its comment reads:

Converts float number a to half precision in round-to-nearest mode.
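
To make the rounding mode in these comments concrete, here is a host-side C++ sketch of a float-to-half conversion with round-to-nearest-even. It is a software illustration only, not NVIDIA's implementation; float_to_half_rne is a hypothetical name, and it returns the raw bit pattern as a 16-bit unsigned integer, in the style of the old __float2half_rn():

```cpp
#include <cstdint>
#include <cstring>

// Illustrative software conversion: float -> IEEE 754 binary16 bit pattern,
// rounding to nearest with ties to even. Hypothetical helper, not a CUDA API.
std::uint16_t float_to_half_rne(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);            // reinterpret the float's bits

    const std::uint32_t sign = (bits >> 16) & 0x8000u;  // sign moved to bit 15
    const std::uint32_t mag  = bits & 0x7FFFFFFFu;      // magnitude bits

    if (mag > 0x7F800000u)  return static_cast<std::uint16_t>(sign | 0x7E00u); // NaN -> quiet NaN
    if (mag >= 0x47800000u) return static_cast<std::uint16_t>(sign | 0x7C00u); // >= 65536 or Inf -> Inf
    if (mag < 0x33000000u)  return static_cast<std::uint16_t>(sign);           // < 2^-25 -> signed zero

    std::uint32_t q, rem, halfway;
    if (mag < 0x38800000u) {
        // Result is a half subnormal: shift the 24-bit significand right.
        const std::uint32_t shift = 126u - (mag >> 23);  // 14..24
        const std::uint32_t sig   = (mag & 0x007FFFFFu) | 0x00800000u;
        q       = sig >> shift;
        rem     = sig & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    } else {
        // Normal half: rebias the exponent (127 -> 15) and drop 13 mantissa bits.
        q       = (mag - 0x38000000u) >> 13;
        rem     = mag & 0x1FFFu;
        halfway = 0x1000u;
    }
    // Round to nearest; on an exact tie, pick the neighbour with an even last bit.
    // A carry out of the mantissa bumps the exponent (and may reach Inf).
    if (rem > halfway || (rem == halfway && (q & 1u))) ++q;
    return static_cast<std::uint16_t>(sign | q);
}
```

With this sketch, 1 + 2^-11 (exactly halfway between two half values) rounds down to 0x3C00, while 1 + 3*2^-11 rounds up to 0x3C02: in both cases the tie goes to the neighbour whose last mantissa bit is even.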


Since half is a typedef to unsigned short, I checked whether the two functions really do the same thing, using the following code:

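#include <stdio.h>
#include "cuda_fp16.h"
#include "device_functions.h"

// Swap the commented line in to compare the code generated for each function.
__global__ void test()
{
//  auto h = __float2half( 1.4232f );
    auto h = __float2half_rn( 1.4232f );
    printf( "%hu\n", h );      // print the raw 16-bit pattern
}

int main()
{
    test<<<1,1>>>();
    cudaDeviceSynchronize();   // wait for the kernel so its printf output is flushed
    return 0;
}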
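
The test program prints the result as a raw 16-bit pattern, which is hard to read by eye. As a reading aid, here is a small host-side C++ sketch that decodes such a pattern back to a float. The name half_bits_to_float is hypothetical and this is not part of the CUDA headers. Under round-to-nearest-even, 1.4232 should encode as 0x3DB1 (15793); that value is derived by hand, not captured from a run of the program.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decode an IEEE 754 binary16 bit pattern into a float.
// Hypothetical helper for inspecting printed values, not a CUDA API.
float half_bits_to_float(std::uint16_t h)
{
    const int sign     = (h >> 15) & 0x1;
    const int exponent = (h >> 10) & 0x1F;   // 5-bit biased exponent (bias 15)
    const int mantissa = h & 0x3FF;          // 10-bit fraction

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: mantissa * 2^-24
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        // All-ones exponent: infinity or NaN
        magnitude = mantissa ? std::numeric_limits<float>::quiet_NaN()
                             : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1 + mantissa/1024) * 2^(exponent - 15)
        magnitude = std::ldexp(static_cast<float>(1024 + mantissa), exponent - 25);
    }
    return sign ? -magnitude : magnitude;
}
```

Decoding 0x3DB1 this way gives 1.4228515625, the closest half-precision value to 1.4232.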

I found that (for sm_20) the old __float2half_rn() has an additional 16-bit to 32-bit integer conversion and does a 32-bit store. On the other hand, __float2half() does not have this conversion and does a 16-bit store.

Relevant SASS code for __float2half_rn():

/*0040*/         I2I.U32.U16 R0, R0;
/*0050*/         STL [R2], R0;

For __float2half():

/*0048*/         STL.U16 [R2], R0;
havogt answered Apr 07 '23 00:04