Wednesday, March 19, 2008

A Hidden Markov Model Based Sensor Fusion Approach for Recognizing Continuous Human Grasping Sequences

This paper presents a method aimed at teaching robots human grasping sequences by observing a human performing grasps. Though the eventual aim of the system is to use vision alone for this purpose, the authors use this glove-based approach as a faster alternative. They argue that grasping postures, combined with tactile sensor feedback, can be used to capture the grasping sequence. Accordingly, they use an 18-sensor CyberGlove along with tactile sensors sewn under the glove at positions on the hand identified as the spots with the highest chance of detecting contact using the smallest number of sensors.

For classification they use the Kamakura grasp taxonomy, which separates grasps into 14 different classes according to their purpose, shape, and contact points. With this taxonomy it is easier to identify the grasps that humans use in everyday life. The Kamakura taxonomy groups these classes into 4 major categories (a small sketch of the mapping follows the list):

1. 5 power grasps
2. 4 intermediate grasps
3. 4 precision grasps
4. 1 thumb-less grasp
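
To make the structure concrete, here is a minimal Python sketch of the taxonomy as a category-to-classes mapping; the individual class names are my placeholders, since the summary above only gives the category counts.

    # A minimal sketch of the taxonomy's structure. The category counts
    # (5 + 4 + 4 + 1 = 14) come from the paper; the individual class names
    # are hypothetical placeholders, not Kamakura's actual labels.
    KAMAKURA_TAXONOMY = {
        "power":        ["power_1", "power_2", "power_3", "power_4", "power_5"],
        "intermediate": ["intermediate_1", "intermediate_2", "intermediate_3", "intermediate_4"],
        "precision":    ["precision_1", "precision_2", "precision_3", "precision_4"],
        "thumbless":    ["thumbless_1"],
    }

    # Flatten to the 14 grasp classes the recognizer must distinguish.
    GRASP_CLASSES = [c for classes in KAMAKURA_TAXONOMY.values() for c in classes]
    assert len(GRASP_CLASSES) == 14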

Each grasp is modeled as an HMM and used as a classifier. The data for the HMMs is obtained from the CyberGlove and fused with the readings from the tactile sensors for the particular grasp. This fusion is needed because the glove data alone may be ambiguous, since the hand shape between two grasps may not change significantly. They obtain 16 feature values from the glove and 16 tactile sensor values, along with the one maximum tactile value, to form the feature vector for a grasp (see the sketch below). Their system takes in both inputs simultaneously and learns to weigh their importance by adjusting parameters during training.
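
As a concrete illustration of this fusion, the per-frame feature vector might be assembled as below. This is a minimal sketch assuming NumPy arrays; the function and variable names are mine, not the authors'.

    import numpy as np

    def build_feature_vector(glove_joints, tactile):
        """Fuse one frame of readings into a 33-dimensional feature vector:
        16 glove joint angles + 16 tactile values + 1 maximum tactile value."""
        glove_joints = np.asarray(glove_joints, dtype=float)
        tactile = np.asarray(tactile, dtype=float)
        assert glove_joints.shape == (16,) and tactile.shape == (16,)
        return np.concatenate([glove_joints, tactile, [tactile.max()]])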

Their model uses 9-state HMMs, trained offline for each gesture. Along with the 14 HMMs for the grasp classes, a junk class was also trained to catch garbage input. They make the simplifying assumption that every grasp must be followed by a release; this ensures segmentation of the gesture stream, with the maximum tactile value giving a cue for distinguishing grasp from non-grasp (a classification sketch follows).
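
The authors use the Georgia Tech toolkit mentioned below, but the core idea, one HMM per class chosen by maximum likelihood over the 14 grasp models plus the junk model, can be sketched with the hmmlearn library. Everything here (library choice, Gaussian observation model, training settings, the tactile threshold in the segmentation cue) is my assumption, not the authors' configuration.

    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    N_STATES = 9  # 9-state HMMs, as in the paper

    def train_models(training_data):
        """training_data maps a class label (14 grasps + 'junk') to a list
        of observation sequences, each of shape (T_i, 33)."""
        models = {}
        for label, sequences in training_data.items():
            X = np.vstack(sequences)                   # all frames stacked
            lengths = [len(seq) for seq in sequences]  # sequence boundaries
            models[label] = GaussianHMM(
                n_components=N_STATES, covariance_type="diag", n_iter=50
            ).fit(X, lengths)
        return models

    def classify(models, sequence):
        """Return the label whose HMM gives the segment the highest log-likelihood."""
        return max(models, key=lambda label: models[label].score(sequence))

    def is_grasping(frame, threshold=0.1):
        """Hypothetical grasp/non-grasp cue from the maximum tactile value
        (the last feature), used to segment grasps from releases."""
        return frame[-1] > threshold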

For modeling the HMMs they used the Georgia Tech HMM toolkit and collected 112 training samples and 112 testing samples from 4 different users. For the single-user model (trained on one user's data and tested on the same user's data) they report a maximum accuracy of 92.2% and a minimum of 76.8%; for the multi-user system (trained on all users and tested on a given user's data) they report a maximum accuracy of 92.2% and a minimum of 89.9%. They suggest that with more user data, the single-user model may become better than the multi-user model. They also claim that most of the recognition errors came from a small set of grasps for which the system relied solely on tactile information to distinguish the grasps, and they believe improved sensor technology may improve the results.


Discussion:

This paper presented a new method that utilizes tactile information for gesture recognition. It made a lot of sense to me, since common gestures for day-to-day tasks carry a great deal of tactile information that can be exploited to differentiate between two similar-looking gestures. For example, a tight fist and a hollow fist (sometimes used to show an 'O') may look similar to the glove, but including the tactile information can distinguish between the two.

Also, it would be interesting if vision could be used to add more flexibility to the system, since a similar-looking gesture (based on tactile and CyberGlove data) at a different position in space may convey a different meaning (though this could also be done using the Flock of Birds tracker). This paper also gave me another approach to the segmentation problem, based on utilizing tactile information.

1 comment:

Paul Taele said...

This was definitely one of my favorite papers because it was different and also attacked an overlooked problem that we hadn't seen in the other papers. I hadn't thought of grasping as a viable feature for gesture recognition until you brought it up. A cunning idea, Mr. Pankaj.