Mouse Gestures Recognition

Sample Image

Environment: VC++ 6.0 SP5, Win2k, WinXP, WinMe, Win9x

Introduction

Recently I installed Opera 5 and was impressed by its gesture UI. IMHO, a neural network is the most suitable tool for this purpose. Since I know a little about neural networks, I tried to implement such a feature myself.

Neural Network

What is a neural network? Hm, it's not easy to say. Here is a rephrased definition from Zurada, J.M.:

"Neural network software is a software which can acquire, store, and utilize experiential knowledge."

For the theory, I would rather point anyone interested directly to the neural network resources available on the web.

Implementation

Let's return to mouse gestures. After some research I chose a multilayer perceptron trained with the standard back-propagation algorithm. The main problem was how to represent the input data for the neural network. The best results came from transforming the mouse path into a vector of cosines and sines.

For example:

path   {170:82 172:83 175:85 177:86 ...} 
transformed into 
vector {0.45 0.55 0.45 0.71 0.89 0.83 0.89 0.71 ...}
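The transformation can be sketched roughly as follows. This is only an illustration, not the GestureApp source: the names `Point` and `pathToFeatures` are made up here, and the sines-first layout of the result is an assumption (any fixed order works as long as training and recognition agree on it). Each pair of neighbouring points gives a direction, and the direction is stored as a sine and a cosine.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type: screen coordinates of one recorded mouse position.
struct Point { double x; double y; };

// Converts a path into the sines of all segment directions followed by
// the cosines. The sines-first layout is an assumption of this sketch.
std::vector<double> pathToFeatures(const std::vector<Point>& path) {
    std::vector<double> sines;
    std::vector<double> cosines;
    for (std::size_t i = 1; i < path.size(); ++i) {
        const double dx = path[i].x - path[i - 1].x;
        const double dy = path[i].y - path[i - 1].y;
        const double len = std::sqrt(dx * dx + dy * dy);
        if (len == 0.0) continue;  // repeated point: no direction to record
        cosines.push_back(dx / len);
        sines.push_back(dy / len);
    }
    std::vector<double> features(sines);
    features.insert(features.end(), cosines.begin(), cosines.end());
    return features;
}
```

For the three segments visible in the sample path above, this gives sines of about 0.45, 0.55, 0.45 and cosines of about 0.89, 0.83, 0.89, which agree with values in the sample vector.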

Recognition algorithm.

  1. record the mouse path
  2. smooth the path down to its base points
  3. transform the points into a vector of angles
  4. compute the sines and cosines of the angles
  5. pass the values (cosines and sines) to the network's inputs
  6. apply the softmax function to the network's output vector
  7. find and verify the winner
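The last two steps could look something like the sketch below. It is not taken from GestureApp: the function names are hypothetical, and "verify" is assumed here to mean "the winner's probability must reach a minimum threshold", which may differ from the check the application really performs.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax: turns raw network outputs into values that sum to 1.
// Subtracting the maximum first avoids overflow in exp().
std::vector<double> softmax(const std::vector<double>& outputs) {
    std::vector<double> probs(outputs.size());
    if (outputs.empty()) return probs;
    const double maxOut = *std::max_element(outputs.begin(), outputs.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        probs[i] = std::exp(outputs[i] - maxOut);
        sum += probs[i];
    }
    for (std::size_t i = 0; i < probs.size(); ++i) probs[i] /= sum;
    return probs;
}

// Returns the index of the most probable gesture, or -1 when even the best
// candidate is below minProbability (an assumed verification rule).
int findWinner(const std::vector<double>& probs, double minProbability) {
    if (probs.empty()) return -1;
    const std::size_t best = static_cast<std::size_t>(
        std::max_element(probs.begin(), probs.end()) - probs.begin());
    return probs[best] >= minProbability ? static_cast<int>(best) : -1;
}
```

A threshold keeps the application from reporting a gesture when no output clearly dominates; its value would have to be tuned by experiment.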

Neural network architecture.

  • input layer : 32 synapses
  • hidden layer : 32 neurons
  • output layer : 29 axons (one for each gesture)
  • fully connected layers
  • transfer function : log-sigmoid
  • incremental training algorithm, standard back-propagation method
  • momentum, variable learning rate (slowly reduced)
  • input noise
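A forward pass through a network of this shape might be sketched as follows. The `Layer` type and the helper functions are hypothetical names, the bias terms are an assumption (the list above does not mention them), and the log-sigmoid is taken to be the usual logistic function 1 / (1 + e^-x) applied in both layers.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Log-sigmoid transfer function: maps any input into the range (0, 1).
double logSigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One fully connected layer. Bias terms are an assumption of this sketch.
struct Layer {
    std::vector<std::vector<double>> weights;  // weights[j][i]: input i -> neuron j
    std::vector<double> biases;                // one bias per neuron

    std::vector<double> forward(const std::vector<double>& in) const {
        std::vector<double> out(weights.size());
        for (std::size_t j = 0; j < weights.size(); ++j) {
            double sum = biases[j];
            for (std::size_t i = 0; i < in.size(); ++i)
                sum += weights[j][i] * in[i];
            out[j] = logSigmoid(sum);
        }
        return out;
    }
};

// Creates a layer with all weights and biases set to zero.
// A real network would start from small random weights instead.
Layer makeLayer(std::size_t inputs, std::size_t neurons) {
    Layer layer;
    layer.weights.assign(neurons, std::vector<double>(inputs, 0.0));
    layer.biases.assign(neurons, 0.0);
    return layer;
}

// Passes the inputs through the hidden layer, then through the output layer.
std::vector<double> runNetwork(const Layer& hidden, const Layer& output,
                               const std::vector<double>& inputs) {
    return output.forward(hidden.forward(inputs));
}
```

With the sizes listed above, the hidden layer would be created as `makeLayer(32, 32)` and the output layer as `makeLayer(32, 29)`, so that `runNetwork` returns one value per gesture.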

Application

Training

Sample Image

Before testing the recognition ability you must train the network (or load a saved image of a trained net from a file). You can customize the parameters of the training process, namely: the maximum number of cycles, the momentum value, the learning rate, and the minimum value of the mean square error (in other words, the "target error"). The training process stops as soon as either condition is met: the maximum number of cycles is reached or the target error is achieved. During the training process you can keep an eye on the error graph, the current gesture (with noise), and a 2D presentation of the network.

Testing

As soon as you have a trained net, you can test it. Select the patterns (or test all of them), a speed value, and a noise level. You can also familiarize yourself with the ideal presentation of the gestures by setting the noise and the speed to their minimum values.

Recognition

To have a mouse gesture recognized, hold down the right mouse button while moving the mouse. For example, for the "left" gesture, press the right mouse button and move the mouse to the left. If the neural network can recognize the gesture, you will see the name, probability, and ideal presentation of the winner. Because of the freeware nature of GestureApp, the mouse path must have at least 16 points :(. Sorry, I haven't implemented a "stretch a path" feature so far.
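For the curious, one possible way to "stretch" a short path is to resample it to a fixed number of points spaced evenly along its length. The sketch below only illustrates that idea; GestureApp does not contain it, and the names `Point` and `resamplePath` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type: screen coordinates of one recorded mouse position.
struct Point { double x; double y; };

// Returns `count` points spaced evenly along the polyline `path`,
// using linear interpolation between the recorded points.
std::vector<Point> resamplePath(const std::vector<Point>& path, std::size_t count) {
    std::vector<Point> result;
    if (path.empty() || count == 0) return result;
    if (path.size() == 1) return std::vector<Point>(count, path[0]);

    // Cumulative length of the path up to each recorded point.
    std::vector<double> cum(path.size(), 0.0);
    for (std::size_t i = 1; i < path.size(); ++i) {
        const double dx = path[i].x - path[i - 1].x;
        const double dy = path[i].y - path[i - 1].y;
        cum[i] = cum[i - 1] + std::sqrt(dx * dx + dy * dy);
    }

    const double total = cum.back();
    std::size_t seg = 1;  // index of the end point of the current segment
    for (std::size_t k = 0; k < count; ++k) {
        const double target = (count > 1) ? total * k / (count - 1) : 0.0;
        while (seg + 1 < path.size() && cum[seg] < target) ++seg;
        const double segLen = cum[seg] - cum[seg - 1];
        double t = (segLen > 0.0) ? (target - cum[seg - 1]) / segLen : 0.0;
        if (t < 0.0) t = 0.0;
        if (t > 1.0) t = 1.0;  // guard against rounding at the last point
        Point p;
        p.x = path[seg - 1].x + t * (path[seg].x - path[seg - 1].x);
        p.y = path[seg - 1].y + t * (path[seg].y - path[seg - 1].y);
        result.push_back(p);
    }
    return result;
}
```

Resampling a two-point stroke from (0, 0) to (10, 0) to five points gives x values 0, 2.5, 5, 7.5 and 10, so even a very short stroke could be expanded to the required number of points.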

Note: the direction is very important.

The network is trained to recognize gestures, not 2D images. Hence, you can draw the "circle" gesture a thousand different ways, but the only valid way is: press the mouse button, move the mouse to the right and down, and so on. Once more: it's a gesture, not a 2D image.
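A small sketch shows why direction matters when a path is encoded as cosines and sines. The names `Direction` and `segmentDirection` are hypothetical and the code is only an illustration: drawing the same line in the opposite direction flips the sign of both values, so the network sees a completely different input.

```cpp
#include <cmath>

// Hypothetical result type: direction of one path segment.
struct Direction { double cosine; double sine; };

// Direction of the segment from (x0, y0) to (x1, y1).
// A zero-length segment yields (0, 0).
Direction segmentDirection(double x0, double y0, double x1, double y1) {
    const double dx = x1 - x0;
    const double dy = y1 - y0;
    const double len = std::sqrt(dx * dx + dy * dy);
    Direction d = {0.0, 0.0};
    if (len > 0.0) {
        d.cosine = dx / len;
        d.sine = dy / len;
    }
    return d;
}
```

For a horizontal stroke, `segmentDirection(0, 0, 10, 0)` has a cosine of 1, while the same stroke drawn backwards, `segmentDirection(10, 0, 0, 0)`, has a cosine of -1. On screen the two strokes look identical, but as gestures they are opposites.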

Mouse gestures

Compatibility

Compatible with Win2k, WinXP, Win98, and WinMe. Unfortunately, it doesn't work on WinNT because it needs the AlphaBlend API.

Acknowledgement

Special Thanks:
My wife Julia for her nice artwork ;)

And thanks to:
Pedro Pombeiro for Selection slider control

Downloads

Download application - 158 Kb
Download source - 101 Kb

