Wednesday, February 6, 2008

A Similarity Measure for Motion Stream Segmentation and Recognition

This paper presents an SVD (Singular Value Decomposition) approach for segmenting and recognizing similar gestures in motion streams, based on a similarity measure the authors introduce.

Any motion capture device records sensor readings sampled over time. In the resulting data matrix, each column holds the readings from one sensor and each row holds one time sample. Even when the same gesture is repeated, it is very hard to get the same number of time samples, because similar motions still vary in speed and duration. As a result, the matrices for the same motion usually differ in their number of rows. Also, with a large gesture matrix it is not obvious how to derive a measure directly from the raw data that captures its structure well enough to tell one gesture from another.

This paper presents an approach based on principal component analysis (PCA), which is widely used in machine learning and pattern recognition to capture the directions of maximum variance in the data. These directions are known as the principal components (PCs), and by projecting the data onto them as a basis we can obtain the discriminatory structure of the data. Usually only the first few PCs are enough for discrimination. Ideally, the corresponding principal components of two similar gestures would be nearly parallel, but because of variations in the motion they are not exactly so. It is argued that for similar gestures, corresponding principal components (eigenvectors) should contribute equally to the parallelness. The authors also suggest that, since the eigenvalues are related to the variances along the principal components (eigenvectors), they can be used to give different weights to the different eigenvector pairs.
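To make the "nearly parallel" idea concrete, here is a minimal NumPy sketch, not the authors' code. The helper name `principal_directions`, the synthetic 6-sensor data, and the choice to leave the matrices uncentered are all assumptions made for illustration. It builds two versions of the same made-up gesture with different numbers of time samples and compares their first principal directions.

```python
import numpy as np

def principal_directions(data, k):
    """Return the first k right singular vectors (as rows) and the
    eigenvalues of data.T @ data. Hypothetical helper for illustration."""
    # Rows are time samples, columns are sensors.
    _, s, vt = np.linalg.svd(data, full_matrices=False)
    return vt[:k], s ** 2

def synthetic_gesture(n_samples, mixing, rng):
    """Made-up gesture: three latent signals spread over the sensor columns."""
    t = np.linspace(0.0, 1.0, n_samples)
    latent = np.column_stack([np.sin(2 * np.pi * t),
                              0.5 * np.cos(2 * np.pi * t),
                              0.2 * t])
    noise = 0.01 * rng.normal(size=(n_samples, mixing.shape[1]))
    return latent @ mixing + noise

rng = np.random.default_rng(0)
# Orthonormal rows, so each latent signal maps to its own sensor direction.
q, _ = np.linalg.qr(rng.normal(size=(6, 3)))
mixing = q.T

slow = synthetic_gesture(120, mixing, rng)  # same gesture, 120 time samples
fast = synthetic_gesture(80, mixing, rng)   # same gesture, 80 time samples

pcs_slow, _ = principal_directions(slow, k=1)
pcs_fast, _ = principal_directions(fast, k=1)

# |cos(angle)| between the first principal directions; the absolute value
# is needed because the sign of a singular vector is arbitrary.
first_pc_cosine = abs(float(pcs_slow[0] @ pcs_fast[0]))
print(slow.shape, fast.shape, first_pc_cosine)
```

The two matrices have different row counts, yet the principal directions live in the same sensor space (one entry per column), which is what makes them comparable.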

Based on these observations, the authors define a similarity measure called k-Weighted Angular Similarity, or k-WAS. It captures the similarity between two data matrices by combining a kind of average of their normalized eigenvalues with the contribution of each eigenvector pair to the parallelness. The exact formula is given in the paper.
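As a rough reconstruction from the description above (and not a copy of the paper's definition), the measure can be sketched as follows. The assumptions are mine: eigenvalues are taken as the squared singular values of each data matrix, they are normalized by the sum over all eigenvalues rather than only the first k, the two normalized weights are averaged, and each eigenvector pair contributes the absolute value of its dot product. The function name `kwas` is also just a label for this sketch.

```python
import numpy as np

def kwas(a, b, k):
    """Sketch of a k-weighted angular similarity between two motion matrices.

    a, b: arrays with time samples as rows and sensors as columns.
          They may have different numbers of rows.
    k:    number of leading eigenvector pairs to compare.
    """
    if a.shape[1] != b.shape[1]:
        raise ValueError("both matrices need the same number of sensor columns")

    _, sa, va = np.linalg.svd(a, full_matrices=False)
    _, sb, vb = np.linalg.svd(b, full_matrices=False)
    if k > min(len(sa), len(sb)):
        raise ValueError("k exceeds the number of available eigenvectors")

    # Eigenvalues of a.T @ a are the squared singular values of a.
    # Assumption: normalize by the sum over all eigenvalues.
    wa = sa ** 2 / np.sum(sa ** 2)
    wb = sb ** 2 / np.sum(sb ** 2)

    # |u_i . v_i| for each of the first k eigenvector pairs.
    cosines = np.abs(np.sum(va[:k] * vb[:k], axis=1))

    # Average the two normalized weights and sum the weighted cosines.
    return float(0.5 * np.sum((wa[:k] + wb[:k]) * cosines))

rng = np.random.default_rng(1)
gesture = rng.normal(size=(100, 22))  # made-up data with 22 sensor columns
subsampled = gesture[::2]             # same columns, half the time samples

print(kwas(gesture, gesture, k=22))
print(kwas(gesture, subsampled, k=3))
```

Under these assumptions the value lies between 0 and 1, and comparing a matrix with itself using all eigenvector pairs gives 1 up to floating-point error. With a smaller k the self-similarity drops below 1, because the weights of the discarded eigenvalues are left out.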

To evaluate the similarity measure, the authors tried it on data from two capture methods: a CyberGlove and a video-based motion capture system called VICON. For the CyberGlove data, just 3 eigenvector pairs (out of 22 possible) were sufficient for accuracy, whereas for the motion capture system 5 out of a possible 54 eigenvector pairs were sufficient.

For comparison, the authors present their results alongside existing measures such as MAS and Eros, both of which k-WAS outperformed.





Discussion:

I liked the paper as it was pretty straightforward with good results. I liked the fact that they were able to capture the variance using the weighted PCA approach, which is a popular and simple approach for data analysis. Their weighted PCA was similar to many existing works in other domains, though it was different from many works in the field of motion recognition, which normally rely on complex HMMs and a lot of training. Apart from this, their system is unsupervised, which makes it even more attractive. However, since they are working with just the sensor values, they have removed the temporal nature of the gestures from consideration. Hence it is quite possible that two gestures which produce the same sensor values, but at different times, would be considered similar. I was impressed that they have identified this problem with their approach and suggested a dynamic time warping approach, though I believe they have not implemented it.

As far as my knowledge of PCA is concerned, I believe it is very prone to noise, and noisy data may cause great variation between similar-looking gestures, affecting a similarity measure based on PCA. Also, because the variance information is exploited, if the gestures don’t vary a lot (which means the information is not contained in the variance), PCA may not be helpful. I believe using PCA with some temporal approach would be a nice contribution.

1 comment:

Grandmaster Mash said...

You can probably incorporate the temporal data by performing PCA on the derivative of a window of matrix data. Shouldn't be too tough.