
TensorFlow ConcatOp Error with Object Detection API

I'm following the TensorFlow Object Detection API instructions and trying to train an existing object-detection model ("faster_rcnn_resnet101_coco") on my own dataset, which has 50 classes.

So, based on my own dataset, I created:

  1. TFRecords (separate files for training, evaluation, and testing)
  2. labelmap.pbtxt

Next, I edited model.config, changing only the following fields:

  • model.faster_rcnn.num_classes: 90 -> 50 (the number of classes in my dataset)
  • train_config.batch_size: 1 -> 10
  • train_config.num_steps: 200000 -> 100
  • train_input_reader.tf_record_input_reader.input_path: the path where the TFRecord resides
  • train_input_reader.label_map_path: the path where labelmap.pbtxt resides
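For reference, the relevant parts of the pipeline config after those edits look roughly like this (the quoted paths are placeholders):

model {
  faster_rcnn {
    num_classes: 50
    ...
  }
}
train_config {
  batch_size: 10
  num_steps: 100
  ...
}
train_input_reader {
  tf_record_input_reader {
    input_path: "PATH WHERE THE TRAINING TFRecord RESIDES"
  }
  label_map_path: "PATH WHERE labelmap.pbtxt RESIDES"
}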

Finally, I ran the command:

python train.py \
--logtostderr \
--pipeline_config_path="PATH WHERE CONFIG FILE RESIDES" \
--train_dir="PATH WHERE MODEL DIRECTORY RESIDES"

And I ran into the error below:

InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,890,600,3] vs. shape[1] = [1,766,600,3] [[Node: concat_1 = ConcatV2[N=10, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Preprocessor/sub, Preprocessor_1/sub, Preprocessor_2/sub, Preprocessor_3/sub, Preprocessor_4/sub, Preprocessor_5/sub, Preprocessor_6/sub, Preprocessor_7/sub, Preprocessor_8/sub, Preprocessor_9/sub, concat_1/axis)]]

It seems to be about the dimensions of the input images, so it may be caused by the raw image data not being resized.

But as far as I know, the model resizes the input images automatically during training (doesn't it?)

So I'm stuck on this issue.

If there is a solution, I'd appreciate your answer. Thanks.

UPDATE

When I changed the batch_size field back from 10 to 1 (the original value), it seems to train without any problem... but I don't understand why...

LKM asked Sep 08 '17


2 Answers

TaeWoo is right: you have to set batch_size to 1 in order to train Faster RCNN.

This is because FRCNN uses a keep_aspect_ratio_resizer, which means that if your images have different sizes, they will also have different sizes after preprocessing. That makes batching practically impossible, since a batch tensor has shape [num_batch, height, width, channels], and that only works when (height, width) are the same for every example in the batch.
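To see why, here is a minimal standalone sketch (not the API's actual batching code) that reproduces the same kind of failure:

import tensorflow as tf

a = tf.zeros([1, 890, 600, 3])     # first preprocessed image in the batch
b = tf.zeros([1, 766, 600, 3])     # second image: same width, different height
batch = tf.concat([a, b], axis=0)  # fails: every dimension except the concat axis must match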

This is in contrast to the SSD model, which uses a "normal" resizer, i.e. regardless of the input image size, all preprocessed examples end up with the same dimensions, which allows them to be batched together.
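For comparison, the image_resizer blocks in the sample configs look roughly like this (these are the default values shipped with the sample configs, so yours may differ):

# faster_rcnn_*: output size depends on the input aspect ratio
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}

# ssd_*: every image is resized to the same fixed shape
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}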

Now, if you have images of different sizes, you practically have two ways of using batching:

  • use Faster RCNN and pad your images beforehand, either once before training or on the fly as a preprocessing step. I'd suggest the former, since padding on the fly seems to slow down training a lot (a sketch of the one-time approach follows this list)
  • use SSD, but make sure your objects are not affected too much by the distortion. This shouldn't be a very big problem; it even acts as a form of data augmentation.
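A minimal sketch of the one-time padding approach, assuming PIL is available and a 1024x1024 canvas (pick one at least as large as your largest image; the file paths are just examples). Note that if your TFRecords store normalized box coordinates, they need to be regenerated from the padded images:

from PIL import Image

def pad_to_canvas(in_path, out_path, canvas=(1024, 1024)):
    # Paste the original image onto a fixed-size black canvas (top-left corner),
    # so every training image ends up with the same dimensions.
    img = Image.open(in_path).convert("RGB")
    background = Image.new("RGB", canvas)
    background.paste(img, (0, 0))
    background.save(out_path)

pad_to_canvas("raw/img_0001.jpg", "padded/img_0001.jpg")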
Ciprian Tomoiagă answered Oct 02 '22


I had the same problem. Setting batch_size=1 does indeed seem to solve it, but I am not sure whether this will have any effect on the accuracy of the model. Would love to get the TF team's answer to this.

TaeWoo answered Oct 02 '22