I want to compare two vector images (say SVG) and see how close they are. Basically, I want to test the correctness of a tracing algorithm which converts raster images to vector format.
The way I am planning to test this algorithm is:
- Take some vector images.
- Rasterize each vector image to PNG.
- Feed the resulting PNG to the tracing algorithm.
- Compare the output of the tracing program (which is SVG) with the original.
While I know there are some metrics for raster images, like RMSE (in ImageMagick), I do not know whether there are standard metrics for vector formats. I can think of some simple ones, like the number of arcs, lines, curves etc., but these cannot detect deviations in geometry and colors. Could someone suggest a good standard metric or some other approach to this problem?
I am not aware of standard metrics for this, but I do have a pointer that I hope will be helpful.
The Batik project uses a set of tools to test that its rendering of SVG documents does not diverge excessively from a set of reference images. My understanding is that it essentially rasterises the SVG and performs a pixel-based diff of the two images to see how they differ. It ought to be smart enough to overlook unavoidable differences that may stem for instance from subtle differences in antialiasing.
You can read more about it (especially the SVGRenderingAccuracyTest section) at: http://jpfop.sourceforge.net/jaxml-batik/html-docs/test.html.
That, of course, means that you'll be doing raster comparisons and not vector comparisons. Vector comparisons in your case will be fiendishly difficult because entirely different curves may produce extremely similar renderings, something which I assume is fine. What's more, the input may have a shape that is hidden behind another, so the tracer has no way of recovering it from the raster. The output will therefore show up as entirely wrong in a vector comparison even though it may produce a pixel-perfect equivalent rendering.
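To make the raster comparison idea concrete, here is a minimal sketch in Python. It is not Batik's actual algorithm, just an illustration of a pixel diff with a tolerance. It assumes both SVGs have already been rasterised at the same size by some tool of your choice and loaded as lists of rows of (r, g, b) tuples. The function names and the default thresholds (channel_tolerance=16, max_fraction=0.01) are made up for the example and would need tuning.

```python
import math


def rmse(img_a, img_b):
    """Root-mean-square error over every channel of two same-sized rasters.

    Each image is a list of rows; each row is a list of (r, g, b) tuples
    with channel values in 0..255 (an assumption of this sketch).
    """
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            for ca, cb in zip(px_a, px_b):
                total += (ca - cb) ** 2
                count += 1
    return math.sqrt(total / count) if count else 0.0


def fraction_differing(img_a, img_b, channel_tolerance=16):
    """Share of pixels whose largest per-channel difference exceeds the tolerance.

    Small differences (for example from antialiasing) are ignored.
    """
    differing = 0
    pixels = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            pixels += 1
            if max(abs(ca - cb) for ca, cb in zip(px_a, px_b)) > channel_tolerance:
                differing += 1
    return differing / pixels if pixels else 0.0


def renders_match(img_a, img_b, channel_tolerance=16, max_fraction=0.01):
    """True if at most max_fraction of the pixels differ noticeably."""
    return fraction_differing(img_a, img_b, channel_tolerance) <= max_fraction
```

RMSE gives a single overall score, while the tolerance-based fraction is less sensitive to many tiny antialiasing differences and more sensitive to a few pixels that are plainly wrong. Which one suits a tracing test better probably depends on the images.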
If, however, you do wish to perform vector comparisons (perhaps your data is constrained in a manner that makes this more viable), the simplest approach may be to first normalise both SVGs: convert all shapes to paths, eliminate all metadata, apply inheritance of all properties and normalise their values, normalise path data to always use the same form, etc. You can then use the normalised documents for two purposes. First, look at the diffs in the normalised tree structure; that should already give you some useful information. Second, if you feel brave, measure the area of the difference between individual curves. I would think twice about embarking on the latter though, because it is likely to give you lots of false negatives.
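As a rough illustration of the normalise-then-diff idea, here is a small Python sketch using only the standard library. It is deliberately incomplete and rests on several assumptions: lengths are unitless numbers, only rect is converted to a path (rounded corners are ignored), only the fill and stroke presentation attributes are inherited (no CSS or style attributes), and relative path commands are not converted to absolute ones. The helper names (normalise, structure_diff, etc.) are invented for the example.

```python
import difflib
import re
import xml.etree.ElementTree as ET

DROPPED = {"metadata", "title", "desc"}           # elements with no visual effect
INITIAL_STYLE = {"fill": "black", "stroke": "none"}  # SVG initial values
TOKEN = re.compile(
    r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?|[MmZzLlHhVvCcSsQqTtAa]"
)


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def canonical_path(d):
    """Re-tokenise path data so that spacing and number formatting are uniform."""
    tokens = TOKEN.findall(d)
    return " ".join(t if t.isalpha() else f"{float(t):g}" for t in tokens)


def rect_to_path(attrib):
    """Path data for a plain rectangle (assumes unitless numbers, no rx/ry)."""
    x = float(attrib.get("x", 0))
    y = float(attrib.get("y", 0))
    w = float(attrib["width"])
    h = float(attrib["height"])
    return f"M {x:g} {y:g} H {x + w:g} V {y + h:g} H {x:g} Z"


def normalise(svg_text):
    """Return one text line per element, with shapes as paths and styles resolved."""
    lines = []

    def walk(elem, inherited, depth):
        name = local_name(elem.tag)
        if name in DROPPED:
            return
        style = dict(inherited)
        for key in INITIAL_STYLE:
            if key in elem.attrib:
                style[key] = elem.attrib[key].strip().lower()
        indent = "  " * depth
        if name in ("rect", "path"):
            d = rect_to_path(elem.attrib) if name == "rect" else elem.attrib.get("d", "")
            lines.append(
                f"{indent}path fill={style['fill']} stroke={style['stroke']} "
                f"d={canonical_path(d)}"
            )
        else:
            lines.append(f"{indent}{name}")
        for child in elem:
            walk(child, style, depth + 1)

    walk(ET.fromstring(svg_text), INITIAL_STYLE, 0)
    return lines


def structure_diff(svg_a, svg_b):
    """Unified diff of the two normalised documents; empty if they match."""
    return list(difflib.unified_diff(normalise(svg_a), normalise(svg_b), lineterm=""))
```

An empty diff only says that the two documents are structurally the same after this limited normalisation. A non-empty diff does not by itself mean that the renderings differ, which is exactly the weakness of vector comparison discussed above.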