Compare source image against a bank of known images

Tags: ios, image, image-processing, opencv, image-compression

I am building an app for the parent of a friend of mine who sadly had a stroke and can no longer talk, read or spell. He can, however, draw rather detailed drawings.

I have built an app that can process an image of a drawing and detect basic shapes (lines, squares and triangles). The app counts how many of each shape have been drawn, so it can tell an image with two squares from an image with just one square.

This places a large cognitive load on the user, who has to remember every combination of shapes and what each one means. I am currently detecting the contours in the image via

findContours(maskMat, contours, hierarchy, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);

What I would like to achieve is this: the user draws a shape and adds it to a bank of known drawings. Then, each time he draws an image, the app compares it against every known image and records a similarity value for each. Provided the highest similarity value is above a threshold, the corresponding known image is taken to be the one that was drawn.
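The selection step described above could be sketched roughly as follows. This is only an illustration: `bestMatchIndex` is a hypothetical name, and the scores are assumed to have been computed already by whatever similarity measure ends up being used (higher meaning more alike).

```cpp
#include <cstddef>
#include <vector>

// scores[i] is assumed to hold the similarity between the new drawing and
// known drawing i (higher means more alike).
// Returns the index of the best score, or -1 when the bank is empty or the
// best score is below the threshold.
int bestMatchIndex(const std::vector<double>& scores, double threshold) {
    if (scores.empty()) return -1;
    std::size_t best = 0;
    for (std::size_t i = 1; i < scores.size(); ++i) {
        if (scores[i] > scores[best]) best = i;
    }
    if (scores[best] < threshold) return -1;  // nothing is similar enough
    return static_cast<int>(best);
}
```

A return value of -1 would mean "no known drawing recognised", which the app could treat as a prompt to add the drawing to the bank.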

I have looked into OpenCV pattern matching as well as template matching, but both gave unreliable results.

I'm asking for advice on the best approach to get the result I'm hoping for.

I built a promotional video for my university lecture to illustrate what the app does. If you are interested you can view it here: https://youtu.be/ngMUUIsLHoc

Thanks in advance.

asked May 07 '15 16:05

Ste Prescott

2 Answers

First of all, this looks like a great app. And for a fantastic purpose. Good work!

To the specifics of your question: having watched the video, it seems one approach would be as follows:

1. Divide the drawing area into (say) a 3x3 grid and allow each region to contain a primitive, say a vertical line, horizontal line, square, circle, triangle or nothing at all. (This depends somewhat on the motor control of your friend's parent.)
2. When an image is complete, detect these primitives and encode a (say) 9-character key which can be used to retrieve the appropriate image. For example, if triangle is T, square is S and empty is underscore, then the code for 'I'm going home' as per the video would be "_T__S____".
3. When a new image is started, you can detect each primitive as it's drawn and use it to construct a search key where the key has '?' for unknown characters. You can then quickly retrieve all possible matches from your database.

For example, if the user draws a triangle in the top middle region, this would be encoded as '?T???????', which would match '_T__S____' as well as '_TT______'.
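A minimal sketch of the key lookup, assuming the grid cells are numbered row by row from the top left so that the top middle region is character 1 of the key. The function names (`gridCell`, `matchesKey`, `findCandidates`) are made up for illustration, and the bank is just a list of stored keys here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Maps a point to a cell index 0..8 of a 3x3 grid laid over a
// width x height drawing area, counted row by row from the top left.
int gridCell(double x, double y, double width, double height) {
    int col = static_cast<int>(x * 3.0 / width);
    int row = static_cast<int>(y * 3.0 / height);
    if (col < 0) col = 0;
    if (row < 0) row = 0;
    if (col > 2) col = 2;  // clamp points lying on the right or bottom edge
    if (row > 2) row = 2;
    return row * 3 + col;
}

// True when every character of the query that is not '?' equals the
// character at the same position in the stored key.
bool matchesKey(const std::string& query, const std::string& key) {
    if (query.size() != key.size()) return false;
    for (std::size_t i = 0; i < query.size(); ++i) {
        if (query[i] != '?' && query[i] != key[i]) return false;
    }
    return true;
}

// Returns every stored key that is still compatible with the partial query.
std::vector<std::string> findCandidates(const std::string& query,
                                        const std::vector<std::string>& bank) {
    std::vector<std::string> hits;
    for (std::size_t i = 0; i < bank.size(); ++i) {
        if (matchesKey(query, bank[i])) hits.push_back(bank[i]);
    }
    return hits;
}
```

As each new primitive is detected, the matching '?' in the query would be replaced and `findCandidates` run again, so the list of possible meanings shrinks while the user draws.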

If constraining the user to draw into a smaller region of the screen is not feasible then you can still store an encoding key representing the relative positions of each primitive.

To do this you could calculate the centre of mass of each primitive, sort them left to right, top to bottom and then store some representation of their relative positions, e.g. a triangle above a square might be TVS where the V means that S is below T, a triangle to the left of a square might be T
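The centre and ordering part of this could look something like the sketch below. It makes several assumptions: the centre is approximated by the mean of the outline points (OpenCV's `moments` could give a true centre of mass), "left to right, top to bottom" is read as sorting by x first and by y on ties, and the names `Primitive`, `centroid`, `sortPrimitives` and `isBelow` are hypothetical. It only covers the "below" relation that the V in the example stands for.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Point { double x; double y; };

struct Primitive {
    char code;                   // e.g. 'T' for triangle, 'S' for square
    std::vector<Point> outline;  // contour points of the detected shape
};

// Mean of the outline points, used as a simple stand-in for the centre of mass.
Point centroid(const std::vector<Point>& outline) {
    Point c = {0.0, 0.0};
    if (outline.empty()) return c;
    for (std::size_t i = 0; i < outline.size(); ++i) {
        c.x += outline[i].x;
        c.y += outline[i].y;
    }
    c.x /= static_cast<double>(outline.size());
    c.y /= static_cast<double>(outline.size());
    return c;
}

// Orders by centre x (left to right), then by centre y (top to bottom).
// Image coordinates are assumed, so y grows downwards.
bool leftToRightTopToBottom(const Primitive& a, const Primitive& b) {
    Point ca = centroid(a.outline);
    Point cb = centroid(b.outline);
    if (ca.x != cb.x) return ca.x < cb.x;
    return ca.y < cb.y;
}

void sortPrimitives(std::vector<Primitive>& prims) {
    std::sort(prims.begin(), prims.end(), leftToRightTopToBottom);
}

// True when the centre of `lower` lies below the centre of `upper`.
bool isBelow(const Primitive& upper, const Primitive& lower) {
    return centroid(lower.outline).y > centroid(upper.outline).y;
}
```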

Hope this helps.

Good luck!

answered Sep 21 '22 11:09

Dave Durbin

1. Sketch-based image retrieval. There is quite an extensive literature on looking up real images using sketches as the query. Not quite what you want, but some of the same techniques can probably be adapted to look up sketches using a sketch as the query. They may even work without modification.
2. Automatic recognition of handwritten Chinese characters (or similar writing systems). There is also quite a lot of literature on that; the problem is similar, as the writing system evolved from image sketches but is much simplified and stylized. Try to apply some of the same techniques.
3. The number, order and location of individual lines is probably more informative than the finished sketch as an image. Is there a way you can capture this? If your user is drawing using a stylus, you can probably record stylus trajectories for each line. This will have much, much more information content than the image itself. Think about someone drawing (e.g.) a car with their eyes closed. Going by the trajectories, you can easily figure out it's a car. From the picture it may be much harder.
4. If you can capture lines as described, then the matching problem can be, to some approximation, reduced to the problem of matching some of the lines in image A to the most similar lines in image B (possibly deformed, offset, etc.). They should also have similar relationships with other lines: if (for example) two lines cross in image A, they should cross in image B, at a similar angle and at a similar location along the length of each. To be more robust, this should ideally deal with things like two lines in one image corresponding to a single (merged) line in the other.
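To make the line-matching idea a bit more concrete, here is a rough sketch under some assumptions of my own: each recorded trajectory is a list of points, strokes are resampled to a fixed number of evenly spaced points, and the distance between two strokes is the mean distance between corresponding points. The names (`StrokePoint`, `resample`, `strokeDistance`, `matchStrokes`) are hypothetical. It does not handle offsets, deformation, reversed drawing direction, crossing relationships or merged lines, all of which a robust version would need.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct StrokePoint { double x; double y; };
typedef std::vector<StrokePoint> Stroke;  // one recorded stylus trajectory

double pointDistance(const StrokePoint& a, const StrokePoint& b) {
    return std::sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
}

double pathLength(const Stroke& s) {
    double total = 0.0;
    for (std::size_t i = 1; i < s.size(); ++i) total += pointDistance(s[i - 1], s[i]);
    return total;
}

// Resamples a stroke to n points spaced evenly along its length.
Stroke resample(const Stroke& s, std::size_t n) {
    Stroke out;
    if (s.empty() || n == 0) return out;
    double total = pathLength(s);
    if (total == 0.0 || n == 1) { out.assign(n, s.front()); return out; }
    double step = total / static_cast<double>(n - 1);
    double walked = 0.0;  // path length covered before segment i starts
    std::size_t i = 1;    // current segment runs from s[i - 1] to s[i]
    out.push_back(s.front());
    for (std::size_t k = 1; k + 1 < n; ++k) {
        double target = step * static_cast<double>(k);
        while (i + 1 < s.size() && walked + pointDistance(s[i - 1], s[i]) < target) {
            walked += pointDistance(s[i - 1], s[i]);
            ++i;
        }
        double len = pointDistance(s[i - 1], s[i]);
        double t = len > 0.0 ? (target - walked) / len : 0.0;
        if (t > 1.0) t = 1.0;  // guard against rounding error
        StrokePoint p = { s[i - 1].x + t * (s[i].x - s[i - 1].x),
                          s[i - 1].y + t * (s[i].y - s[i - 1].y) };
        out.push_back(p);
    }
    out.push_back(s.back());
    return out;
}

// Mean distance between corresponding resampled points (lower is more alike).
double strokeDistance(const Stroke& a, const Stroke& b, std::size_t n = 16) {
    Stroke ra = resample(a, n);
    Stroke rb = resample(b, n);
    if (ra.empty() || rb.empty()) return std::numeric_limits<double>::infinity();
    double sum = 0.0;
    for (std::size_t k = 0; k < n; ++k) sum += pointDistance(ra[k], rb[k]);
    return sum / static_cast<double>(n);
}

// For each stroke in a, the index of the closest stroke in b (-1 if none).
// Greedy: several strokes in a may map to the same stroke in b.
std::vector<int> matchStrokes(const std::vector<Stroke>& a, const std::vector<Stroke>& b) {
    std::vector<int> result(a.size(), -1);
    for (std::size_t i = 0; i < a.size(); ++i) {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t j = 0; j < b.size(); ++j) {
            double d = strokeDistance(a[i], b[j]);
            if (d < best) { best = d; result[i] = static_cast<int>(j); }
        }
    }
    return result;
}
```

Summing or averaging the per-stroke distances of the matched pairs would then give one number per known drawing, which could serve as the (inverse) similarity value the question asks for.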

answered Sep 19 '22 11:09

Alex I
