Sunday, March 30, 2008

Taiwan Sign Language (TSL) Recognition Based on 3D Data and Neural Networks

Summary:

This paper presents a neural network based approach to recognizing 20 static gestures of Taiwanese Sign Language. The authors propose a back-propagation neural network that classifies gestures captured with a vision-based capture device called VICON. Markers placed on the dorsal surface of the hand provide the features for each gesture. These features are distance measures of the marker positions relative to some reference. The distances are normalized to account for variation between users' hands and are then used as the feature inputs to the neural network. The network is trained on data of the same kind collected from the users. The authors report that they used data from 10 students, each of whom repeated each of the 20 gestures 15 times, giving 3000 data samples in all. Of these 3000 samples, 212 were reported to have missing values and were not used. Of the remaining 2788, 1350 samples were used for training and 1438 for testing.
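The feature extraction and the sample counts above can be sketched roughly as follows. This is only an illustration, not the authors' code: the choice of reference point, the `scale` value used for normalization, and the marker coordinates are all assumptions, since the summary only says that distances are measured "relative to some reference" and then normalized.

```python
import math


def distance_features(markers, reference, scale):
    """Distance of each 3D marker from `reference`, divided by `scale`.

    `reference` and `scale` are hypothetical: the paper's exact reference
    point and normalization constant are not given in this summary.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return [math.dist(marker, reference) / scale for marker in markers]


# Made-up marker positions for one hand (arbitrary units).
reference = (0.0, 0.0, 0.0)
small_hand = [(3.0, 4.0, 0.0), (0.0, 6.0, 8.0)]
# The same hand shape, twice as large.
large_hand = [(6.0, 8.0, 0.0), (0.0, 12.0, 16.0)]

small_features = distance_features(small_hand, reference, scale=10.0)
large_features = distance_features(large_hand, reference, scale=20.0)

# Sample counts as reported in the summary.
total_samples = 10 * 20 * 15          # students * gestures * repetitions
usable_samples = total_samples - 212  # drop samples with missing values
train_samples, test_samples = 1350, 1438
```

With a suitable scale, the two hands above produce identical feature vectors, which is the point of normalizing. The counts also add up: the training and test sets together account for all 2788 usable samples.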

Their NN architecture consists of 15 input neurons and 20 output neurons, along with 2 hidden layers. With 250 neurons in each of the two hidden layers, they report an accuracy of 94.5% on the test data and 98.5% on the training data (the latter is not important). The 15 input neurons correspond to the 15 features used, and the 20 output neurons give the output probability of each gesture.
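To make the shape of the network concrete, here is a minimal forward-pass sketch of a 15-250-250-20 network in plain Python. Only the layer sizes come from the paper summary. The sigmoid hidden activation, the softmax output, the random weight range, and all function names are assumptions, and the weights are untrained, so the predicted class is meaningless.

```python
import math
import random

# Layer sizes from the summary: 15 inputs, two hidden layers of 250, 20 outputs.
LAYER_SIZES = [15, 250, 250, 20]


def count_parameters(sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def init_network(sizes, seed=0):
    """Small random weights and zero biases (assumed initialization)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, features):
    """Propagate one feature vector through the network."""
    activations = features
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activations = softmax(sums) if is_output else [sigmoid(s) for s in sums]
    return activations


network = init_network(LAYER_SIZES)
probabilities = forward(network, [0.5] * 15)  # dummy normalized distances
predicted_gesture = max(range(len(probabilities)), key=probabilities.__getitem__)
```

Counting weights and biases this way gives 71,770 trainable parameters, which is a lot for 1350 training samples and may be worth keeping in mind when reading the reported accuracies.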


Discussion:


This paper was simple and straightforward. The feature extraction was simple, since only the distances between the markers were used and the chosen gestures involved no occlusion. Since the gestures are static, I think this is a 2D problem rather than a 3D one, as the third dimension is (or can be) kept constant for static gestures. The training and test data were collected with a lot of care, so recognition may suffer if input comes from users outside the study who have been given little instruction. There is not much of a take-home message from this paper, except for their way of obtaining distance metrics that can be used as features.
