Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recognition of handwritten circles, diamonds and rectangles

I looking for some advices about recognition of three handwritten shapes - circles, diamonds and rectangles. I tried diffrent aproaches but they failed so maybe you could point me in another, better direction.

What I tried:

1) Simple algorithm based on dot product between points of handwritten shape and ideal shape. It works not so bad at recognition of rectangle, but failed on circles and diamonds. The problem is that dot product of the circle and diamond is quite similiar even for ideal shapes.

2) Same aproach but using Dynamic Time Warping as measure of simililarity. Similiar problems.

3) Neural networks. I tried few aproaches - giving points data to neural networks (Feedforward and Kohonen) or giving rasterized image. For Kohonen it allways classified all the data (event the sample used to train) into the same category. Feedforward with points was better (but on the same level as aproach 1 and 2) and with rasterized image it was very slow (I needs at least size^2 input neurons and for small sized of raster circle is indistinguishable even for me ;) ) and also without success. I think is because all of this shapes are closed figures? I am not big specialist of ANN (had 1 semester course of them) so maybe I am using them wrong?

4) Saving the shape as Freeman Chain Code and using some algorithms for computing similarity. I though that in FCC the shapes will be realy diffrent from each other. No success here (but I havent explorer this path very deeply).

I am building app for Android with this but I think the language is irrelevant here.

like image 412
Pax0r Avatar asked Nov 27 '13 22:11

Pax0r


3 Answers

Here's some working code for a shape classifier. http://jsfiddle.net/R3ns3/ I pulled the threshold numbers (*Threshold variables in the code) out of the ether, so of course they can be tweaked for better results.

I use the bounding box, average point in a sub-section, angle between points, polar angle from bounding box center, and corner recognition. It can classify hand drawn rectangles, diamonds, and circles. The code records points while the mouse button is down and tries to classify when you stop drawing.

HTML

<canvas id="draw" width="300" height="300" style="position:absolute; top:0px; left:0p; margin:0; padding:0; width:300px; height:300px; border:2px solid blue;"></canvas>

JS

var state = {
    width: 300,
    height: 300,
    pointRadius: 2,
    cornerThreshold: 125,
    circleThreshold: 145,
    rectangleThreshold: 45,
    diamondThreshold: 135,
    canvas: document.getElementById("draw"),
    ctx: document.getElementById("draw").getContext("2d"),
    drawing: false,
    points: [],
    getCorners: function(angles, pts) {
        var list = pts || this.points;
        var corners = [];
        for(var i=0; i<angles.length; i++) {
            if(angles[i] <= this.cornerThreshold) {
                corners.push(list[(i + 1) % list.length]);
            }
        }
        return corners;
    },
    draw: function(color, pts) {
        var list = pts||this.points;
        this.ctx.fillStyle = color;
        for(var i=0; i<list.length; i++) {
            this.ctx.beginPath();
            this.ctx.arc(list[i].x, list[i].y, this.pointRadius, 0, Math.PI * 2, false);
            this.ctx.fill();
        }
    },
    classify: function() {
        // get bounding box
        var left = this.width, right = 0, 
            top = this.height, bottom = 0;
        for(var i=0; i<this.points.length; i++) {
            var pt = this.points[i];
            if(left > pt.x) left = pt.x;
            if(right < pt.x) right = pt.x;
            if(top > pt.y) top = pt.y;
            if(bottom < pt.y) bottom = pt.y;
        }
        var center = {x: (left+right)/2, y: (top+bottom)/2};
        this.draw("#00f", [
            {x: left, y: top},
            {x: right, y: top},
            {x: left, y: bottom},
            {x: right, y: bottom},
            ]);
        // find average point in each sector (9 sectors)
        var sects = [
            {x:0,y:0,c:0},{x:0,y:0,c:0},{x:0,y:0,c:0},
            {x:0,y:0,c:0},{x:0,y:0,c:0},{x:0,y:0,c:0},
            {x:0,y:0,c:0},{x:0,y:0,c:0},{x:0,y:0,c:0}
            ];
        var x3 = (right + (1/(right-left)) - left) / 3;
        var y3 = (bottom + (1/(bottom-top)) - top) / 3;
        for(var i=0; i<this.points.length; i++) {
            var pt = this.points[i];
            var sx = Math.floor((pt.x - left) / x3);
            var sy = Math.floor((pt.y - top) / y3);
            var idx = sy * 3 + sx;
            sects[idx].x += pt.x;
            sects[idx].y += pt.y;
            sects[idx].c ++;
            if(sx == 1 && sy == 1) {
                return "UNKNOWN";
            }
        }
        // get the significant points (clockwise)
        var sigPts = [];
        var clk = [0, 1, 2, 5, 8, 7, 6, 3]
        for(var i=0; i<clk.length; i++) {
            var pt = sects[clk[i]];
            if(pt.c > 0) {
                sigPts.push({x: pt.x / pt.c, y: pt.y / pt.c});
            } else {
                return "UNKNOWN";
            }
        }
        this.draw("#0f0", sigPts);
        // find angle between consecutive 3 points
        var angles = [];
        for(var i=0; i<sigPts.length; i++) {
            var a = sigPts[i],
                b = sigPts[(i + 1) % sigPts.length],
                c = sigPts[(i + 2) % sigPts.length],
                ab = Math.sqrt(Math.pow(b.x-a.x,2)+Math.pow(b.y-a.y,2)),
                bc = Math.sqrt(Math.pow(b.x-c.x,2)+ Math.pow(b.y-c.y,2)),
                ac = Math.sqrt(Math.pow(c.x-a.x,2)+ Math.pow(c.y-a.y,2)),
                deg = Math.floor(Math.acos((bc*bc+ab*ab-ac*ac)/(2*bc*ab)) * 180 / Math.PI);
            angles.push(deg);                
        }
        console.log(angles);
        var corners = this.getCorners(angles, sigPts);
        // get polar angle of corners
        for(var i=0; i<corners.length; i++) {
            corners[i].t = Math.floor(Math.atan2(corners[i].y - center.y, corners[i].x - center.x) * 180 / Math.PI);
        }
        console.log(corners);
        // whats the shape ?
        if(corners.length <= 1) { // circle
            return "CIRCLE";
        } else if(corners.length == 2) { // circle || diamond
            // difference of polar angles
            var diff = Math.abs((corners[0].t - corners[1].t + 180) % 360 - 180);
            console.log(diff);
            if(diff <= this.circleThreshold) {
                return "CIRCLE";
            } else {
                return "DIAMOND";
            }
        } else if(corners.length == 4) { // rectangle || diamond
            // sum of polar angles of corners
            var sum = Math.abs(corners[0].t + corners[1].t + corners[2].t + corners[3].t); 
            console.log(sum);
            if(sum <= this.rectangleThreshold) {
                return "RECTANGLE";
            } else if(sum >= this.diamondThreshold) {
                return "DIAMOND";
            } else {
                return "UNKNOWN";
            }
        } else {
            alert("draw neater please");
            return "UNKNOWN";
        }
    }
};
state.canvas.addEventListener("mousedown", (function(e) {
    if(!this.drawing) {
        this.ctx.clearRect(0, 0, 300, 300);
        this.points = [];
        this.drawing = true;
        console.log("drawing start");
    }
}).bind(state), false);
state.canvas.addEventListener("mouseup", (function(e) {
    this.drawing = false;
    console.log("drawing stop");
    this.draw("#f00");
    alert(this.classify());
}).bind(state), false);
state.canvas.addEventListener("mousemove", (function(e) {
    if(this.drawing) {
        var x = e.pageX, y = e.pageY;
        this.points.push({"x": x, "y": y});
        this.ctx.fillStyle = "#000";
        this.ctx.fillRect(x-2, y-2, 4, 4);
    }
}).bind(state), false);
like image 157
Louis Ricci Avatar answered Nov 18 '22 18:11

Louis Ricci


Given the possible variation in handwritten inputs I would suggest that a neural network approach is the way to go; you will find it difficult or impossible to accurately model these classes by hand. LastCoder's attempt works to a degree, but it does not cope with much variation or have promise for high accuracy if worked on further - this kind of hand-engineered approach was abandoned a very long time ago.

State-of-the-art results in handwritten character classification these days is typically achieved with convolutional neural networks (CNNs). Given that you have only 3 classes the problem should be easier than digit or character classification, although from experience with the MNIST handwritten digit dataset, I expect that your circles, squares and diamonds may occasionally end up being difficult for even humans to distinguish.

So, if it were up to me I would use a CNN. I would input binary images taken from the drawing area to the first layer of the network. These may require some preprocessing. If the drawn shapes cover a very small area of the input space you may benefit from bulking them up (i.e. increasing line thickness) so as to make the shapes more invariant to small differences. It may also be beneficial to centre the shape in the image, although the pooling step might alleviate the need for this.

I would also point out that the more training data the better. One is often faced with a trade-off between increasing the size of one's dataset and improving one's model. Synthesising more examples (e.g. by skewing, rotating, shifting, stretching, etc) or spending a few hours drawing shapes may provide more of a benefit than you could get in the same time attempting to improve your model.

Good luck with your app!

like image 33
Ninjakannon Avatar answered Nov 18 '22 18:11

Ninjakannon


A linear Hough transform of the square or the diamond ought to be easy to recognize. They will both produce four point masses. The square's will be in pairs at zero and 90 degrees with the same y-coordinates for both pairs; in other words, a rectangle. The diamond will be at two other angles corresponding to how skinny the diamond is, e.g. 45 and 135 or else 60 and 120.

For the circle you need a circular Hough transform, and it will produce a single bright point cluster in 3d (x,y,r) Hough space.

Both linear and circular Hough transforms are implemented in OpenCV, and it's possible to run OpenCV on Android. These implementations include thresholding to identify lines and circles. See pg. 329 and pg. 331 of the documentation here.

If you are not familiar with Hough transforms, the Wikipedia page is not bad.

Another algorithm you may find interesting and perhaps useful is given in this paper about polygon similarity. I implemented it many years ago, and it's still around here. If you can convert the figures to loops of vectors, this algorithm could compare them against patterns, and the similarity metric would show goodness of match. The algorithm ignores rotational orientation, so if your definition of square and diamond is with respect to the axes of the drawing surface, you will have to modify the algorithm a bit to differentiate these cases.

like image 2
Gene Avatar answered Nov 18 '22 19:11

Gene