I am trying to compute a per-channel gradient image in PyTorch. To do this, I want to perform a standard 2D convolution with a Sobel filter on each channel of an image. I am using the torch.nn.functional.conv2d function for this.
In my minimal working example below, I get an error:
import torch
import torch.nn.functional as F
filters = torch.autograd.Variable(torch.randn(1, 1, 3, 3))   # 1 filter, 1 channel, 3x3 kernel
inputs = torch.autograd.Variable(torch.randn(1, 3, 10, 10))  # batch of 1, 3 channels, 10x10
out = F.conv2d(inputs, filters, padding=1)
RuntimeError: Given groups=1, weight[1, 1, 3, 3], so expected input[1, 3, 10, 10] to have 1 channels, but got 3 channels instead
This suggests that groups needs to be 3. However, when I set groups=3, I get a different error:
import torch
import torch.nn.functional as F
filters = torch.autograd.Variable(torch.randn(1, 1, 3, 3))
inputs = torch.autograd.Variable(torch.randn(1, 3, 10, 10))
out = F.conv2d(inputs, filters, padding=1, groups=3)  # same shapes, now with groups=3
RuntimeError: invalid argument 4: out of range at /usr/local/src/pytorch/torch/lib/TH/generic/THTensor.c:440
When I check that line in the THTensor source, it sits among a number of dimension checks, but I don't know where I'm going wrong.
What does this error mean? How can I perform my intended convolution with the conv2d function? I believe I am misunderstanding the groups parameter.
The PyTorch documentation for conv2d explains the relevant constraint: groups controls the connections between inputs and outputs, and in_channels and out_channels must both be divisible by groups.
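As a quick illustration of that rule, here is a minimal sketch with shapes chosen purely for demonstration (recent PyTorch versions no longer need the Variable wrapper, so plain tensors are used):

import torch
import torch.nn.functional as F

# groups=2 splits the 4 input channels into two groups of 2.
# The weight shape is (out_channels, in_channels // groups, kH, kW),
# so each of the 4 filters sees only 4 // 2 = 2 input channels.
inputs = torch.randn(1, 4, 10, 10)
filters = torch.randn(4, 2, 3, 3)
out = F.conv2d(inputs, filters, padding=1, groups=2)
print(out.shape)  # torch.Size([1, 4, 10, 10])

Both in_channels=4 and out_channels=4 are divisible by groups=2, so this runs without error.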
If you want to apply a per-channel convolution, then your out_channels should be the same as your in_channels. This is expected, since each input channel creates a separate output channel that corresponds to it.
In short, this will work:
import torch
import torch.nn.functional as F
filters = torch.autograd.Variable(torch.randn(3, 1, 3, 3))   # 3 filters, 1 channel each
inputs = torch.autograd.Variable(torch.randn(1, 3, 10, 10))
out = F.conv2d(inputs, filters, padding=1, groups=3)         # out has shape (1, 3, 10, 10)
whereas filters of size (2, 1, 3, 3) or (1, 1, 3, 3) will not work.
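Applied to your original goal, a per-channel Sobel gradient then looks like the following sketch (the horizontal Sobel kernel here is my assumption, since the question does not give one):

import torch
import torch.nn.functional as F

# Horizontal Sobel kernel, shape (1, 1, 3, 3)
sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)

# Repeat it once per input channel -> shape (3, 1, 3, 3),
# so with groups=3 each filter convolves exactly one channel.
filters = sobel_x.repeat(3, 1, 1, 1)

image = torch.randn(1, 3, 10, 10)
grad_x = F.conv2d(image, filters, padding=1, groups=3)
print(grad_x.shape)  # torch.Size([1, 3, 10, 10]), one gradient map per channel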
Additionally, you can also make your out_channels a multiple of in_channels. This works for instances where you want multiple convolution filters for each input channel, as sketched below.
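For example (a sketch with shapes I picked for illustration), six filters over a 3-channel input give two output maps per channel:

import torch
import torch.nn.functional as F

# 6 filters, 1 input channel each: with groups=3, filters 0-1 read
# channel 0, filters 2-3 read channel 1, and filters 4-5 read channel 2.
filters = torch.randn(6, 1, 3, 3)
inputs = torch.randn(1, 3, 10, 10)
out = F.conv2d(inputs, filters, padding=1, groups=3)
print(out.shape)  # torch.Size([1, 6, 10, 10])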
However, this only makes sense if out_channels is an exact multiple of in_channels. If it is not, PyTorch falls back to the closest multiple, a number less than what you specified. This is once again expected behavior. For example, a filter of size (4, 1, 3, 3) or (5, 1, 3, 3) will result in out_channels of size 3.