I'm using the automatic mixed precision (AMP) feature of PyTorch to train a network with a smaller memory footprint at reduced precision.
At a certain point some embeddings produced by the network contain NaNs, so I'd like to replace those with 0s in order to perform online hard negative sample mining.
However, after replacing the NaNs in the tensor like this:
tensor[torch.isnan(tensor)] = 0
I get the following error during the next scaler step (scaler.step(optimizer)):
assert len(optimizer_state["found_inf_per_device"]) > 0, "No inf checks were recorded for this optimizer."
AssertionError: No inf checks were recorded for this optimizer.
What's the correct way to zero out the NaNs while avoiding this error?
Could you show us your full code? Generally it is advisable to just skip the step (batch) if it has NaNs.
Also take a look at torch.nan_to_num.
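Here is a minimal sketch combining both suggestions. The model, optimizer, and data below are toy stand-ins for your actual setup, not your code, and it assumes a CUDA device since GradScaler targets CUDA training:

import torch
import torch.nn as nn

model = nn.Linear(8, 4).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
scaler = torch.cuda.amp.GradScaler()

for step in range(10):
    inputs = torch.randn(16, 8, device="cuda")
    targets = torch.randn(16, 4, device="cuda")

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        embeddings = model(inputs)
        # torch.nan_to_num replaces NaNs with 0.0 (and +/-inf with
        # large finite values) in one call, instead of the
        # boolean-mask assignment from the question.
        embeddings = torch.nan_to_num(embeddings)
        loss = criterion(embeddings, targets)

    # If the loss is still NaN, skip the whole batch: calling
    # scaler.step(optimizer) without a matching backward pass for this
    # optimizer is what triggers the assertion in the question.
    if torch.isnan(loss):
        continue

    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()

The key point is that backward, step, and update are skipped together, so scaler.step() is never reached without the inf checks that scaler.scale(loss).backward() records.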