
Ground-truth data collection and evaluation for computer vision

I am starting to develop a computer vision application that involves tracking humans. I want to build ground-truth metadata for the videos that will be recorded in this project. The metadata will probably need to be hand-labeled and will mainly consist of the locations of the humans in each image. I would like to use the metadata to evaluate the performance of my algorithms.

I could of course build a labeling tool using, e.g., Qt and/or OpenCV, but I was wondering whether there is some kind of de facto standard for this. I came across ViPER, but it seems dead and it doesn't work as easily as I had hoped. Other than that, I haven't found much.

Does anybody here have recommendations as to which software, standard, or method to use for both the labeling and the evaluation? My main preference is for something C++-oriented, but this is not a hard constraint.

Kind regards and thanks in advance! Tom

asked May 16 '12 12:05 by Goosebumps



1 Answer

I had another look at vatic and got it to work. It is an online video annotation tool designed for crowdsourcing through a commercial service, and it runs on Linux. However, it also has an offline mode, in which that commercial service is not required and the software runs standalone.

The installation is described in detail in the included README file. It involves, among other things, setting up an Apache server and a MySQL server, some Python packages, and ffmpeg. It is not that difficult if you follow the README. (I mentioned that I had some issues with my proxy, but these were not related to this software package.)

You can try the online demo. The default text output looks like this:

0 302 113 319 183 0 1 0 0 "person"
0 300 112 318 182 1 1 0 1 "person"
0 298 111 318 182 2 1 0 1 "person"
0 296 110 318 181 3 1 0 1 "person"
0 294 110 318 181 4 1 0 1 "person"
0 292 109 318 180 5 1 0 1 "person"
0 290 108 318 180 6 1 0 1 "person"
0 288 108 318 179 7 1 0 1 "person"
0 286 107 317 179 8 1 0 1 "person"
0 284 106 317 178 9 1 0 1 "person"

Each line contains 10 or more columns, separated by spaces. The columns are defined as follows:

1   Track ID. All rows with the same ID belong to the same path.
2   xmin. The top left x-coordinate of the bounding box.
3   ymin. The top left y-coordinate of the bounding box.
4   xmax. The bottom right x-coordinate of the bounding box.
5   ymax. The bottom right y-coordinate of the bounding box.
6   frame. The frame that this annotation represents.
7   lost. If 1, the annotation is outside of the view screen.
8   occluded. If 1, the annotation is occluded.
9   generated. If 1, the annotation was automatically interpolated.
10  label. The label for this annotation, enclosed in quotation marks.
11+ attributes. Each column after this is an attribute.
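
As a rough illustration of the column layout above, here is a minimal Python sketch that reads this text output into records. The `Annotation` class and the `parse_line` / `parse_annotations` helpers are names made up for this example and are not part of vatic. The sketch also assumes that the nine numeric columns are always integers, as they are in the sample output.

```python
import shlex
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """One row of the text output (hypothetical container, not a vatic type)."""
    track_id: int
    xmin: int
    ymin: int
    xmax: int
    ymax: int
    frame: int
    lost: bool
    occluded: bool
    generated: bool
    label: str
    attributes: List[str] = field(default_factory=list)


def parse_line(line: str) -> Annotation:
    # shlex.split respects the quotation marks around the label (and any
    # quoted attributes), so a quoted value containing spaces stays one column.
    cols = shlex.split(line)
    if len(cols) < 10:
        raise ValueError(f"expected at least 10 columns, got {len(cols)}")
    # Assumption: columns 1-9 are integers, as in the sample output.
    nums = [int(c) for c in cols[:9]]
    return Annotation(
        track_id=nums[0],
        xmin=nums[1],
        ymin=nums[2],
        xmax=nums[3],
        ymax=nums[4],
        frame=nums[5],
        lost=bool(nums[6]),
        occluded=bool(nums[7]),
        generated=bool(nums[8]),
        label=cols[9],
        attributes=cols[10:],
    )


def parse_annotations(text: str) -> List[Annotation]:
    # Skip blank lines so trailing newlines in the file do no harm.
    return [parse_line(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = '0 302 113 319 183 0 1 0 0 "person"\n0 300 112 318 182 1 1 0 1 "person"\n'
    for ann in parse_annotations(sample):
        print(ann.frame, ann.label, (ann.xmin, ann.ymin, ann.xmax, ann.ymax))
```

For evaluation, it may be useful to filter on the `generated` flag, since rows with `generated` set to 1 were interpolated rather than hand-labeled.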

Vatic can also produce output in XML, JSON, pickle, LabelMe, and PASCAL VOC formats.

So, all in all, this does pretty much what I wanted, and it is also rather easy to use. I am still interested in other options, though!

answered Oct 13 '22 06:10 by Goosebumps