I'm following the TensorFlow Object Detection API instructions and trying to train an existing object detection model ("faster_rcnn_resnet101_coco") on my own dataset, which has 50 classes.
So, based on my own dataset, I created the TFRecord file and the labelmap.pbtxt file.
Next, I edited model.config, changing only the following fields (a sketch of the result is below):

- model.faster_rcnn.num_classes: 90 -> 50 (the number of classes in my dataset)
- train_config.batch_size: 1 -> 10
- train_config.num_steps: 200000 -> 100
- train_input_reader.tf_record_input_reader.input_path: the path where the TFRecord resides
- train_input_reader.label_map_path: the path where labelmap.pbtxt resides
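For reference, a rough sketch of what those edits look like in the config (the paths are placeholders, just as in the command below):

model {
  faster_rcnn {
    num_classes: 50
    ...
  }
}
train_config {
  batch_size: 10
  num_steps: 100
  ...
}
train_input_reader {
  tf_record_input_reader {
    input_path: "PATH WHERE TFRECORD RESIDES"
  }
  label_map_path: "PATH WHERE labelmap.pbtxt RESIDES"
}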
Finally, I ran the command:
python train.py \
--logtostderr \
--pipeline_config_path="PATH WHERE CONFIG FILE RESIDES" \
--train_dir="PATH WHERE MODEL DIRECTORY RESIDES"
And I met the error below:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,890,600,3] vs. shape[1] = [1,766,600,3] [[Node: concat_1 = ConcatV2[N=10, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Preprocessor/sub, Preprocessor_1/sub, Preprocessor_2/sub, Preprocessor_3/sub, Preprocessor_4/sub, Preprocessor_5/sub, Preprocessor_6/sub, Preprocessor_7/sub, Preprocessor_8/sub, Preprocessor_9/sub, concat_1/axis)]]
The error seems related to the dimensions of the input images, so it may be caused by the raw image data not being resized.
But as far as I know, the model automatically resizes the input images for training, doesn't it?
I'm stuck on this issue.
If there is a solution, I'd appreciate your answer. Thanks.
UPDATE
When I changed the batch_size field from 10 back to 1 (the original value), it seems to train without any problem... but I don't understand why.
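In other words, the only thing I reverted was this part of train_config (everything else stayed as described above):

train_config {
  batch_size: 1
  num_steps: 100
  ...
}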
TaeWoo is right, you have to set batch_size to 1 in order to train Faster RCNN.
This is because FRCNN uses a keep_aspect_ratio_resizer, which in turn means that if you have images of different sizes, they will also have different sizes after preprocessing. This makes batching practically impossible, since a batch tensor has shape [num_batch, height, width, channels]; you can see why this is a problem when (height, width) differ from one example to the next.
This is in contrast to the SSD model, which uses a "normal" resizer, i.e. regardless of the input image, all preprocessed examples end up with the same size, which allows them to be batched together.
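Concretely, the two resizers look roughly like this in the pipeline configs (the values are the defaults from the released sample configs, so treat them as illustrative):

# Faster R-CNN sample config: output size depends on the input aspect ratio
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}

# SSD sample config: every image comes out 300x300, so examples can be batched
image_resizer {
  fixed_shape_resizer {
    height: 300
    width: 300
  }
}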
Now, if you have images of different sizes, you practically have two ways of using batching:

- resize everything to a fixed shape (e.g. switch to a fixed_shape_resizer, as SSD does), so that all preprocessed examples have identical dimensions, or
- keep the aspect-ratio-preserving resizer but pad every preprocessed image up to a common maximum size.
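For the padding route, newer versions of the API expose a pad_to_max_dimension flag on keep_aspect_ratio_resizer (check the image_resizer.proto in your checkout to confirm your version has it); a sketch:

image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
    pad_to_max_dimension: true  # pads each resized image to 1024x1024
  }
}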
I had the same problem. Setting batch_size=1 does indeed seem to solve it, but I am not sure if this will have any effect on the accuracy of the model. Would love to get the TF team's answer to this.