Say that I have a CNN model in Pytorch and 2 inputs of the following sizes:
Notes:
My question is: how does the CNN process the images in both inputs? I.e., does the CNN process every image in the batch sequentially? Or does it concatenate all of the images in the batch and then perform convolutions as usual?
The reason I ask is because:
That is, the difference in batch_size results in slightly different CNN outputs for both inputs for the same positions.
CNN is a general term for convolutional neural networks, and the answer depends on the particular architecture. The main building blocks of CNNs are convolutions, which do not cause any "crosstalk" between items in a batch, and pointwise nonlinearities like ReLU, which do not either. However, most architectures also involve other operations, such as normalization layers. Arguably the most popular is batch norm, which does introduce crosstalk in train mode because it normalizes with statistics computed over the whole batch. Many models also use dropout, which behaves stochastically outside of eval mode (by default, models are in train mode). Either effect could lead to the outcome you observe, as could any custom operation that mixes information across the batch.
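To make the batch norm point concrete, here is a minimal sketch. The toy model and the tensor shapes are assumptions chosen for illustration (they are not your actual network or inputs); input_1 is taken to be the first two images of input_2. In train mode the two outputs differ noticeably because the batch statistics differ; in eval mode the stored running statistics are used and the difference shrinks to rounding level at most.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model (an assumption, not the asker's network):
# conv -> batch norm -> ReLU.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

# Assumed shapes: input_1 is the first two images of input_2.
input_2 = torch.randn(5, 3, 16, 16)
input_1 = input_2[:2]


def max_diff(m):
    """Largest absolute difference between m(input_1) and m(input_2)[:2]."""
    with torch.no_grad():
        return (m(input_1) - m(input_2)[:2]).abs().max().item()


model.train()                 # batch norm uses per-batch statistics
train_diff = max_diff(model)

model.eval()                  # batch norm uses stored running statistics
eval_diff = max_diff(model)

print(f"train mode max diff: {train_diff:.3e}")
print(f"eval mode max diff:  {eval_diff:.3e}")
```

If your model contains batch norm or dropout, calling model.eval() before comparing outputs is usually the first thing to try.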
Aside from that, because of numeric precision issues, your code may not give exactly the same results even if it doesn't feature any cross-batch operations. This error is very minor, but it is sufficient to show up when checking with CNN(input_1) == CNN(input_2)[:2]. It is better to use allclose instead, with a suitable epsilon.
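As a rough illustration of the comparison, the sketch below does not run a real CNN. It simulates rounding-level noise by perturbing a tensor slightly, which is an assumption made so that the example is deterministic. The rtol and atol values are also assumptions; a suitable tolerance depends on your model and dtype.

```python
import torch

torch.manual_seed(0)

# Stand-in for CNN(input_1): any float32 tensor works for the illustration.
out_small_batch = torch.randn(2, 8, 16, 16)

# Stand-in for CNN(input_2)[:2]: the same values plus simulated
# rounding-level noise (an assumption, not output of a real model).
out_large_batch = out_small_batch + 1e-7 * torch.randn_like(out_small_batch)

# Exact comparison fails as soon as a single element differs.
exact = torch.equal(out_small_batch, out_large_batch)

# Tolerance-based comparison: |a - b| <= atol + rtol * |b| elementwise.
close = torch.allclose(out_small_batch, out_large_batch, rtol=1e-5, atol=1e-6)

print("exactly equal:", exact)
print("allclose:     ", close)
```

If allclose still fails with a reasonable tolerance, the difference is probably not numerical noise but a cross-batch operation such as batch norm in train mode.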