 

FP16 inference on CPU with PyTorch

I have a pretrained PyTorch model that I want to run inference with in FP16 instead of FP32. This already works on the GPU, but when I try it on the CPU I get: RuntimeError: "sum_cpu" not implemented for 'Half'. Any fixes?
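For reference, a minimal sketch of the failing setup (the model here is a placeholder, not the asker's actual network, and whether a given op raises this error depends on the PyTorch version):

    import torch
    import torch.nn as nn

    # Placeholder model standing in for the pretrained network.
    model = nn.Sequential(nn.Linear(8, 4), nn.ReLU())
    model.half()                 # cast weights to float16
    x = torch.randn(1, 8).half()

    with torch.no_grad():
        out = model(x)           # on CPU, some Half kernels may be missing...
        print(out.sum())         # ...e.g. RuntimeError: "sum_cpu" not implemented for 'Half'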

asked Nov 06 '25 by user123

1 Answer

As far as I know, many CPU operations in PyTorch are not implemented for FP16; it is NVIDIA GPUs that have hardware support for FP16 (e.g., the tensor cores in Turing-architecture GPUs), and PyTorch followed up since CUDA 7.0(ish). To accelerate inference on CPU with reduced precision, you may want to try the torch.bfloat16 dtype instead (https://github.com/pytorch/pytorch/issues/23509).
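A minimal sketch of that workaround (the model and shapes are placeholders; torch.autocast with device_type="cpu" is available in recent PyTorch releases):

    import torch
    import torch.nn as nn

    # Placeholder model; substitute your pretrained network.
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
    model.eval()
    x = torch.randn(1, 128)

    # Option 1: cast weights and inputs to bfloat16 explicitly.
    bf16_model = model.to(torch.bfloat16)
    with torch.no_grad():
        y = bf16_model(x.to(torch.bfloat16))
    print(y.dtype)  # torch.bfloat16

    # Option 2: keep FP32 weights and let autocast select bfloat16 kernels.
    with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        y = model(x)

Whether this actually speeds things up depends on the CPU; chips with native bfloat16 support (e.g., AVX-512 BF16 or AMX on recent Intel parts) see the biggest gains.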

answered Nov 09 '25 by Jerry Xie


