fastai error predicting with exported/reloaded model: "Input type and weight type should be the same"

Whenever I export a fastai model and reload it, I get this error (or a very similar one) when I try to use the reloaded model to generate predictions on a new test set:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

A minimal reproducible code example is below; you just need to update the FILES_DIR variable to wherever the MNIST data gets deposited on your system:

from fastai import *
from fastai.vision import *

# download data for reproducible example
untar_data(URLs.MNIST_SAMPLE)
FILES_DIR = '/home/mepstein/.fastai/data/mnist_sample'  # this is where command above deposits the MNIST data for me


# Create FastAI databunch for model training
tfms = get_transforms()
tr_val_databunch = ImageDataBunch.from_folder(path=FILES_DIR,  # location of downloaded data shown in log of prev command
                                train = 'train',
                                valid_pct = 0.2,
                                ds_tfms = tfms).normalize()

# Create Model
conv_learner = cnn_learner(tr_val_databunch, 
                           models.resnet34, 
                           metrics=[error_rate]).to_fp16()

# Train Model
conv_learner.fit_one_cycle(4)

# Export Model
conv_learner.export()  # saves model as 'export.pkl' in path associated with the learner

# Reload Model and use it for inference on new hold-out set
reloaded_model = load_learner(path = FILES_DIR,
                              test = ImageList.from_folder(path = f'{FILES_DIR}/valid'))

preds = reloaded_model.get_preds(ds_type=DatasetType.Test)

Output:

"RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same"

Stepping through the code statement by statement, everything works fine until the last line, preds = ..., which is where the torch error above pops up.
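As far as I can tell, the error itself is generic PyTorch rather than fastai: a layer's weights and its input must share the same dtype (and device). A minimal, hedged illustration in plain PyTorch (on CPU; on a GPU the types would be the torch.cuda.* variants, as in the traceback above):

import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3).half()  # weights: torch.HalfTensor
x = torch.randn(1, 3, 8, 8)                         # input:   torch.FloatTensor

try:
    conv(x)
except RuntimeError as e:
    # prints "Input type (torch.FloatTensor) and weight type
    # (torch.HalfTensor) should be the same" (or a similar message)
    print(e)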

Relevant software versions:

Python 3.7.3
fastai 1.0.57
torch 1.2.0
torchvision 0.4.0

asked Aug 23 '19 by Max Power


2 Answers

So the answer to this ended up being relatively simple:

1) As noted in my comment, training in mixed precision mode (calling to_fp16() on conv_learner) caused the error with the exported/reloaded model.

2) To train in mixed precision mode (which is faster than regular training) and enable export/reload of the model without errors, simply set the model back to default precision before exporting.

...In code, this means changing the export step in the example above from:

# Export Model
conv_learner.export()

to:

# Export Model (after converting back to default precision for safe export/reload)
conv_learner = conv_learner.to_fp32()
conv_learner.export()

...and now the full (reproducible) code example above runs without errors, including the prediction after model reload.
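As a quick sanity check after the fix (a hedged sketch, reusing the names from the question), you can confirm that every parameter of the reloaded model is back in full precision:

# hedged sanity check: after exporting with to_fp32() and reloading,
# all model parameters should be torch.float32
reloaded_model = load_learner(path = FILES_DIR,
                              test = ImageList.from_folder(path = f'{FILES_DIR}/valid'))
print({p.dtype for p in reloaded_model.model.parameters()})  # expected: {torch.float32}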

answered by Max Power


Your model is in half precision if you called .to_fp16(), which is equivalent to calling model.half() in PyTorch.

In fact, if you trace the code, .to_fp16() calls model.half(). But there is a problem: if you also convert the batch norm layers to half precision, you may run into convergence problems.

This is why you would typically do this in PyTorch:

model.half()  # convert all parameters and buffers to half precision
for layer in model.modules():
    if isinstance(layer, torch.nn.modules.batchnorm._BatchNorm):
        layer.float()  # restore batch norm layers to full precision

This converts every layer to half precision except the batch norm layers.

Note that the code from the PyTorch forum also works, but only handles nn.BatchNorm2d.
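To double-check the selective conversion, here is a hedged sketch using a plain torchvision resnet34 as a stand-in for the fastai model above: the batch norm parameters should stay in float32 while the conv weights end up in float16.

import torch
import torchvision

model = torchvision.models.resnet34()
model.half()  # convert everything to half precision
for layer in model.modules():
    if isinstance(layer, torch.nn.modules.batchnorm._BatchNorm):
        layer.float()  # restore batch norm to full precision

print(model.conv1.weight.dtype)  # torch.float16
print(model.bn1.weight.dtype)    # torch.float32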

Then make sure your input is also in half precision, using to() like this:

import torch
t = torch.tensor(10.)
print(t)
print(t.dtype)
t = t.to(dtype=torch.float16)
print(t)
print(t.dtype)
# tensor(10.)
# torch.float32
# tensor(10., dtype=torch.float16)
# torch.float16
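In the error from the question both tensors live on the GPU (torch.cuda.FloatTensor vs torch.cuda.HalfTensor), so the same rule applies there: input and weights must agree on both device and dtype. A hedged sketch:

import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3)
x = torch.randn(1, 3, 8, 8)

if torch.cuda.is_available():
    conv = conv.cuda().half()  # weights: torch.cuda.HalfTensor
    x = x.cuda().half()        # input:   torch.cuda.HalfTensor
    print(conv(x).dtype)       # torch.float16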
answered by prosti