Wednesday, April 16, 2008

Discourse Topic and Gestural Form

In this paper the authors try to find out to what extent gestures depend on the discourse topic (i.e. are speaker-independent) and to what extent they depend on the individual speaker. They present a Bayesian framework that uses an unsupervised technique to quantify both, and they use a vision-based approach to extract the gestures.

Their approach uses visual features that describe motion based on so-called spatiotemporal interest points, which are high-contrast image regions such as corners and edges that undergo complex motion. From the detected points, visual, spatial and kinematic characteristics are extracted to form a large feature vector, to which PCA is applied to reduce the dimensionality. The reduced feature vectors are used to fit a mixture model, from which a codebook is obtained. The dataset consists of 33 short videos (3 minutes in duration) of dialogues involving 15 speakers, each describing one of five predetermined topics. The speakers ranged in age from 18 to 32 and were native English speakers.
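To make the codebook step concrete, here is a minimal sketch of how descriptors can be quantized into codewords. It is not the authors' implementation: it swaps in plain k-means (hard assignments) for the mixture model, skips the PCA step, and uses made-up 2-D toy descriptors. The function names `build_codebook` and `quantize` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))


def build_codebook(descriptors, num_codewords, iterations=20):
    """Plain k-means, used here as a stand-in for the paper's mixture model.

    Centroids start at the first num_codewords descriptors, purely so that
    this toy example is deterministic.
    """
    centroids = [list(p) for p in descriptors[:num_codewords]]
    for _ in range(iterations):
        clusters = [[] for _ in range(num_codewords)]
        for p in descriptors:
            clusters[nearest(p, centroids)].append(p)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[k] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return centroids


def quantize(descriptors, centroids):
    """Replace each descriptor by the index of its nearest codeword."""
    return [nearest(p, centroids) for p in descriptors]


# Toy stand-in for the reduced feature vectors: two well-separated groups.
descriptors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
               (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
codebook = build_codebook(descriptors, num_codewords=2)
codewords = quantize(descriptors, codebook)
print(codewords)
```

After this step every detected interest point is represented by a single discrete codeword, which is the kind of input a topic model over gestures can work with.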

In the experiment each speaker talked to another participant, and they were not asked to make gestures. The scenarios involved describing “Tom and Jerry” and mechanical devices (piston, candy, pinball machine and a toy). Of the recorded gestures, 12% were classified as topic-specific when the correct topic labels were given, and with corrupted labels this value dropped to 3%. This indicates that there is a connection between discourse topic and gestural form which is independent of the speaker.
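The idea behind the corrupted-label comparison can be sketched as follows. This is only an illustration under loud assumptions: the paper uses a Bayesian model, whereas this sketch uses a much simpler made-up statistic (`topic_purity`, a hypothetical name) on toy data, and "corruption" here simply means shuffling the topic labels.

```python
import random
from collections import Counter, defaultdict


def topic_purity(codewords, topics):
    """Share of gestures that fall under the most common topic of their codeword.

    A hypothetical stand-in statistic, not the measure used in the paper.
    """
    by_codeword = defaultdict(Counter)
    for word, topic in zip(codewords, topics):
        by_codeword[word][topic] += 1
    majority = sum(max(counts.values()) for counts in by_codeword.values())
    return majority / len(codewords)


def corrupted_purity(codewords, topics, trials=200, seed=0):
    """Average purity after shuffling the topic labels (the corrupted control)."""
    rng = random.Random(seed)
    shuffled = list(topics)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(shuffled)
        total += topic_purity(codewords, shuffled)
    return total / trials


# Toy data in which the codeword is fully determined by the topic.
codewords = [0] * 10 + [1] * 10
topics = ["cartoon"] * 10 + ["piston"] * 10

observed = topic_purity(codewords, topics)
baseline = corrupted_purity(codewords, topics)
print(observed, baseline)
```

If the statistic stays clearly higher with the true labels than with the shuffled ones, the association between topic and gesture form is unlikely to be an artifact of the method, which is the same reasoning as the 12% versus 3% comparison.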

Discussion:

With correct labels, 12% of gestures are classified as topic-specific, and with corrupted labels only 3% are. To convey this message, instead of simply having a human judge take notes, the authors used complex machine learning with a vision-based approach, which they admitted must be somewhat corrupted by computer vision errors. I am not very impressed with the paper.
