Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does having variations of gestures in gesture library improve recognition?

I'm working on implementing gesture recognition in my app, using the Gestures Builder to create a library of gestures. I'm wondering if having multiple variations of a gesture will help or hinder recognition (or performance). For example, I want to recognize a circular gesture. I'm going to have at least two variations - one for a clockwise circle, and one for counterclockwise, with the same semantic meaning so that the user doesn't have to think about it. However, I'm wondering if it would be desirable to save several gestures for each direction, for example, of various radii, or with different shapes that are "close enough" - like egg shapes, ellipses, etc., including different angular rotations of each. Anybody have experience with this?

like image 243
Andy Dennie Avatar asked Oct 12 '11 16:10

Andy Dennie


People also ask

What are the design issues in gesture recognition?

There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition, there are limitations on the equipment used and image noise. Images or video may not be under consistent lighting, or in the same location.

What algorithms are used in gesture recognition?

In this study, we propose the gesture recognition algorithm using support vector machines (SVM) and histogram of oriented gradient (HOG). Besides, we also use the CNN model to classify gestures. We approach and select techniques of applying problem controlling for the robotic system.

What can gesture recognition be used for?

Gesture recognition is technology that uses sensors to read and interpret hand movements as commands. In the automotive industry, this capability allows drivers and passengers to interact with the vehicle — usually to control the infotainment system without touching any buttons or screens.

What are the classification of gestures?

McNeill (1992) proposes a general classification of four types of hand gestures: beat, deictic, iconic and metaphoric. Beat gestures reflect the tempo of speech or emphasise aspects of speech.


1 Answers

OK, after some experimenation and reading of the android source, I've learned a little... First, it appears that I don't necessarily have to worry about creating different gestures in my gesture library to cover different angular rotations or directions (clockwise/counterclockwise) of my circular gesture. By default, a GestureStore uses a sequence type of SEQUENCE_SENSITIVE (meaning that the starting point and ending points matter), and an orientation style of ORIENTATION_SENSITIVE (meaning that the rotational angle matters). However, these defaults can be overridden with 'setOrientationStyle(ORIENTATION_INVARIANT)' and setSequenceType(SEQUENCE_INVARIANT).

Furthermore, to quote from the comments in the source... "when SEQUENCE_SENSITIVE is used, only single stroke gestures are currently allowed" and "ORIENTATION_SENSITIVE and ORIENTATION_INVARIANT are only for SEQUENCE_SENSITIVE gestures".

Interestingly, ORIENTATION_SENSITIVE appears to mean more than just "orientation matters". It's value is 2, and the comments associated with it and some related (undocumented) constants imply that you can request different levels of sensitivity.

// at most 2 directions can be recognized
public static final int ORIENTATION_SENSITIVE = 2;
// at most 4 directions can be recognized
static final int ORIENTATION_SENSITIVE_4 = 4;
// at most 8 directions can be recognized
static final int ORIENTATION_SENSITIVE_8 = 8;

During a call to GestureLibary.recognize(), the orientation type value (1, 2, 4, or 8) is passed through to GestureUtils.minimumCosineDistance() as the parameter numOrientations, whereupon some calculations are performed that are above my pay grade (see below). If someone can explain this, I'm interested. I get that it is calculating the angular difference between two gestures, but I don't understand the way it's using the numOrientations parameter. My expectation is that if I specify a value of 2, it finds the minimum distance between gesture A and two variations of gesture B -- one variation being "normal B", and the other being B spun around 180 degrees. Thus, I would expect a value of 8 would consider 8 variations of B, spaced 45 degrees apart. However, even though I don't fully understand the math below, it doesn't look to me like a numOrientations value of 4 or 8 is used directly in any calculations, although values greater than 2 do result in a distinct code path. Maybe that's why those other values are undocumented.

/**
 * Calculates the "minimum" cosine distance between two instances.
 * 
 * @param vector1
 * @param vector2
 * @param numOrientations the maximum number of orientation allowed
 * @return the distance between the two instances (between 0 and Math.PI)
 */
static float minimumCosineDistance(float[] vector1, float[] vector2, int numOrientations) {
    final int len = vector1.length;
    float a = 0;
    float b = 0;
    for (int i = 0; i < len; i += 2) {
        a += vector1[i] * vector2[i] + vector1[i + 1] * vector2[i + 1];
        b += vector1[i] * vector2[i + 1] - vector1[i + 1] * vector2[i];
    }
    if (a != 0) {
        final float tan = b/a;
        final double angle = Math.atan(tan);
        if (numOrientations > 2 && Math.abs(angle) >= Math.PI / numOrientations) {
            return (float) Math.acos(a);
        } else {
            final double cosine = Math.cos(angle);
            final double sine = cosine * tan; 
            return (float) Math.acos(a * cosine + b * sine);
        }
    } else {
        return (float) Math.PI / 2;
    }
}

Based on my reading, I theorized that the simplest and best approach would be to have one stored circular gesture, setting the sequence type and orientation to invariant. That way, anything circular should match pretty well, regardless of direction or orientation. So I tried that, and it did return high scores (in the range of about 25 to 70) for pretty much anything remotely resembling a circle. However, it also returned scores of 20 or so for gestures that were not even close to circular (horizontal lines, V shapes, etc.). So, I didn't feel good about the separation between what should be matches and what should not. What seems to be working best is to have two stored gestures, one in each direction, and using SEQUENCE_SENSITIVE in conjunction with ORIENTATION_INVARIANT. That's giving me scores of 2.5 or higher for anything vaguely circular, but scores below 1 (or no matches at all) for gestures that are not circular.

like image 139
Andy Dennie Avatar answered Oct 13 '22 15:10

Andy Dennie