I have a pretrained PyTorch model that I want to run inference on in FP16 instead of FP32. This already works on the GPU, but when I try it on the CPU I get:

RuntimeError: "sum_cpu" not implemented for 'Half'

Any fixes?
As far as I know, many CPU operations in PyTorch simply have no FP16 (Half) kernels, which is exactly what the "not implemented for 'Half'" error is telling you. Hardware acceleration for FP16 is primarily an NVIDIA GPU feature (e.g. the Tensor Cores introduced with the Volta architecture and also present in Turing), and PyTorch's half-precision support has largely followed the CUDA stack. To accelerate inference on CPU with reduced precision, you may want to try the torch.bfloat16 dtype instead, which has much broader CPU support (see https://github.com/pytorch/pytorch/issues/23509).
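Here is a minimal sketch of two ways to do that. The toy model and input shapes are placeholders for illustration; substitute your own pretrained model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for your pretrained model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.eval()

x = torch.randn(1, 16)

# Option 1: cast the weights to bfloat16 once, then feed bfloat16 inputs.
model_bf16 = model.to(torch.bfloat16)
with torch.no_grad():
    out = model_bf16(x.to(torch.bfloat16))
print(out.dtype)  # torch.bfloat16

# Option 2: keep FP32 weights and let autocast choose bfloat16 per op.
# Ops without a bfloat16 CPU kernel fall back to FP32 automatically.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)
```

Option 2 (autocast) is usually the safer starting point, since it avoids the "not implemented" errors for ops that still lack reduced-precision CPU kernels.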