Wednesday, April 23, 2008

Toward Natural Gesture/Speech HCI: A Case Study of Weather Narration

Summary:

This paper presents a gesture recognition system which can capture the motive behind the gesture rather than the description of the gesture for recognition. This means that if the user is pointing back, he can point back in any intuitive way as he likes. The system consists of the HMM which is trained on the global features of the gesture taking head as the reference (as shown in the diagram below).Color segmentation was used to extract out the skin which was tracked using the Predictive Kalman filtering. Authors have also suggested that in certain domains , certain speech follows the particular gesture which can be used as another feature to remove ambiguity in recognition.Authors observed that in their weather speaker domain, 85% of the time meanigful gesture is accompanied by a similar verbal word.. Using this knowledge, the correctness was improved from 63%-92% and accuracy to 62%-75%.



Discussion:


I liked that hey have pointed out the correlation between the words and the accompanying gestures. I think such a system which uses context (derived from words) would be a state of art development in the field of gesture recognition. They results (Though I am not convinced about their approach) demonstrate that context can improve accuracy. However, I am not happy with their too simplistic approach as lot of ambiguity can be introduced in such a system because of 3D nature of the problem and recognition using angles in 2D plane. Considering simple weather example, this may work to some extent, but results show that there is not much to expect from such a system in more complex situation.

No comments: