This paper presents a method for recognizing gestures from acceleration data and applies it to musical performance control. The authors argue that emotion and expression depend heavily on the force with which a gesture is performed, not just on the gesture itself. They therefore place an accelerometer on the back of the hand to capture that force. The same data is used to recognize the gesture by analyzing the components of the acceleration vector in three planes (x-y, y-z, z-x) and capturing how the acceleration changes over time.
From the time-series acceleration data projected onto these planes, 11 parameters (1 intensity component, 1 rotational component, 1 main motion component and 7 direction distribution components, computed by measuring density) are computed for each plane, giving 33 parameters per gesture. These 33 parameters are then used to recognize the gesture.
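To make the projection step concrete, here is a rough sketch of how per-plane features could be computed. It is not the paper's definition: the mean-magnitude intensity, the magnitude-weighted angular histogram, and the names `PLANES`, `direction_histogram` and `plane_features` are all my own assumptions, and the rotational and main-motion components are left out.

```python
import math

# Index pairs selecting the two axes of each plane from an (ax, ay, az) sample.
PLANES = {"xy": (0, 1), "yz": (1, 2), "zx": (2, 0)}


def direction_histogram(points, bins=7):
    """Magnitude-weighted histogram of 2-D directions, normalized to sum to 1.

    The binning scheme is an assumption made for illustration only.
    """
    counts = [0.0] * bins
    for u, v in points:
        mag = math.hypot(u, v)
        if mag == 0.0:
            continue  # no direction to record for a zero vector
        angle = math.atan2(v, u) % (2 * math.pi)
        counts[int(angle / (2 * math.pi) * bins) % bins] += mag
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def plane_features(samples, bins=7):
    """Concatenate (intensity, direction histogram) for the x-y, y-z and z-x planes."""
    if not samples:
        raise ValueError("need at least one acceleration sample")
    features = []
    for i, j in PLANES.values():
        points = [(s[i], s[j]) for s in samples]
        # Intensity here is the mean in-plane magnitude (an assumed definition).
        features.append(sum(math.hypot(u, v) for u, v in points) / len(points))
        features.extend(direction_histogram(points, bins))
    return features
```

With these assumptions each plane contributes 1 + `bins` values, so the vector is shorter than the 33 parameters described in the paper.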
For recognition, gesture samples are first collected from the user to build standard patterns. For a new gesture, the difference between its parameters and the mean of each standard pattern is computed and normalized by the standard deviation, which gives a weighted error. The gesture is recognized as the standard pattern with the minimum weighted error.
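The matching rule described above can be sketched as follows. This is a minimal illustration, assuming the weighted error is a sum of squared, standard-deviation-normalized differences; the paper may combine the terms differently. The function names and the `eps` guard against zero deviation are hypothetical.

```python
import math


def fit_standard_pattern(samples):
    """Return per-parameter (means, stds) over a user's training samples for one gesture."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[i] for s in samples) / n for i in range(dims)]
    stds = [
        math.sqrt(sum((s[i] - means[i]) ** 2 for s in samples) / n)
        for i in range(dims)
    ]
    return means, stds


def weighted_error(features, means, stds, eps=1e-6):
    """Sum of squared differences from the mean, each scaled by the standard deviation.

    eps avoids division by zero when a parameter never varied in training
    (an assumption, not something taken from the paper).
    """
    return sum(
        ((f - m) / max(s, eps)) ** 2 for f, m, s in zip(features, means, stds)
    )


def recognize(features, patterns):
    """Return the name of the standard pattern with the minimum weighted error."""
    return min(patterns, key=lambda name: weighted_error(features, *patterns[name]))
```

Here `patterns` would be a dictionary mapping a gesture name to the output of `fit_standard_pattern`, and `features` would be the 33-parameter vector of the gesture to classify.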
They tested the approach by generating music from the gestures of a conductor. They also use dynamic linear prediction to predict the tempo from previous tempo values. According to the authors, this gives real-time results compared to an image-processing-based approach.
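As a rough idea of what predicting tempo from past values looks like, the sketch below extrapolates the next tempo with a least-squares line fitted to the most recent values. This is a stand-in that I chose for illustration, not the dynamic linear prediction used in the paper; `predict_next_tempo` and its `window` parameter are hypothetical.

```python
def predict_next_tempo(tempos, window=4):
    """Extrapolate the next tempo from a least-squares line through recent values."""
    recent = tempos[-window:]
    n = len(recent)
    if n == 0:
        raise ValueError("need at least one previous tempo")
    if n == 1:
        return recent[0]  # nothing to fit, so repeat the last value
    mean_x = (n - 1) / 2
    mean_y = sum(recent) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(recent))
    slope = sxy / sxx
    # Evaluate the fitted line one step past the last observed index.
    return mean_y + slope * (n - mean_x)
```

A steadily accelerating conductor (100, 102, 104, 106 beats per minute) would give a prediction of 108 under this model, so the accompaniment can be scheduled before the next beat is actually detected.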
They report 100% accuracy for the same user, with performance declining for different users.
Discussion:
This paper presents a simple but interesting use of acceleration data for gesture recognition. However, I think a given user can perform the same gesture quite differently (in terms of acceleration data) depending on energy and enthusiasm, so the approach may fail even for the same user on whom the system was trained. I guess they use one threshold to detect the start of a gesture and another to mark its end. I believe the start of a gesture will show an abrupt acceleration that then stays roughly constant for the rest of the gesture. At a change of gesture there will be another abrupt change in acceleration (direction and magnitude), followed by a decrease in acceleration leading to the final stop. These changes in acceleration can be used to segment the gestures.
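The two-threshold idea I am guessing at could look something like the sketch below. It is my own illustration, not the paper's segmentation method. It assumes gravity has already been removed from the samples, and the threshold values, the `min_quiet` parameter and the function name are all hypothetical.

```python
import math


def segment_gestures(samples, start_threshold=1.5, end_threshold=0.5, min_quiet=3):
    """Split (ax, ay, az) samples into gestures; returns (start, end) pairs, end exclusive.

    A gesture starts when the magnitude reaches start_threshold and ends once
    the magnitude has stayed below end_threshold for min_quiet samples in a row.
    """
    segments = []
    start = None
    quiet = 0
    for i, (ax, ay, az) in enumerate(samples):
        magnitude = math.sqrt(ax * ax + ay * ay + az * az)
        if start is None:
            if magnitude >= start_threshold:
                start = i
                quiet = 0
        elif magnitude < end_threshold:
            quiet += 1
            if quiet >= min_quiet:
                # The gesture ended where the quiet run began.
                segments.append((start, i - min_quiet + 1))
                start = None
        else:
            quiet = 0
    if start is not None:
        segments.append((start, len(samples)))  # gesture still active at end of data
    return segments
```

Using a lower end threshold than start threshold keeps a brief dip in acceleration in the middle of a gesture from cutting it in two.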
I am not sure whether they used a series of gestures or just one gesture at a time in their experiment. It would also be interesting to see them train on normalized data from multiple users and then check the accuracy across multiple users.