<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-9183475114237580876</id><updated>2011-08-01T13:07:23.702-07:00</updated><title type='text'>Haptics</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>45</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-187583981438075164</id><published>2008-04-23T09:56:00.000-07:00</published><updated>2008-04-23T11:47:41.094-07:00</updated><title type='text'>Toward Natural Gesture/Speech HCI: A Case Study of Weather Narration</title><content type='html'>Summary:&lt;br /&gt;&lt;br /&gt;This paper presents a gesture recognition system which can capture the motive behind the gesture rather than the description of the gesture for recognition. 
This means that a user who wants to point backward can do so in whatever way feels intuitive. The system consists of an HMM trained on global features of the gesture, taking the head as the reference (as shown in the diagram below).&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_3DcJ3hARvOY/SA-D_Nj2AjI/AAAAAAAAAZA/fZzeY4uA-pk/s1600-h/untitled.bmp"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://4.bp.blogspot.com/_3DcJ3hARvOY/SA-D_Nj2AjI/AAAAAAAAAZA/fZzeY4uA-pk/s320/untitled.bmp" alt="" id="BLOGGER_PHOTO_ID_5192514017376469554" border="0" /&gt;&lt;/a&gt;Color segmentation was used to extract the skin regions, which were tracked using predictive Kalman filtering. The authors also suggest that in certain domains particular spoken words accompany particular gestures, and that this co-occurrence can serve as an additional feature to remove ambiguity in recognition. They observed that in their weather-narration domain a meaningful gesture is accompanied by a related spoken word 85% of the time. Using this knowledge, correctness improved from 63% to 92% and accuracy from 62% to 75%.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I liked that they have pointed out the correlation between words and the accompanying gestures. I think a system that uses context (derived from words) would be a state-of-the-art development in the field of gesture recognition. Their results (though I am not convinced by their approach) demonstrate that context can improve accuracy. However, I am not happy with their overly simplistic approach: a lot of ambiguity can be introduced in such a system because the problem is three-dimensional while recognition uses angles in a 2D plane. 
For a simple weather-narration example this may work to some extent, but the results show that not much can be expected from such a system in more complex situations.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-187583981438075164?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/187583981438075164/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=187583981438075164' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/187583981438075164'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/187583981438075164'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/toward-natural-gesturespeech-hci-case.html' title='Toward Natural Gesture/Speech HCI: A Case Study of Weather Narration'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_3DcJ3hARvOY/SA-D_Nj2AjI/AAAAAAAAAZA/fZzeY4uA-pk/s72-c/untitled.bmp' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-127683203281502581</id><published>2008-04-23T09:30:00.000-07:00</published><updated>2008-04-23T09:54:21.181-07:00</updated><title type='text'>3D Object Modeling Using Spatial and Pictographic Gestures</title><content type='html'>This 
paper presents a merger of graphics and haptics for creating objects in an augmented reality setup. The authors give a brief introduction to superquadrics (superellipsoids) and how they are used to create various shapes in graphics. In the paper, hand gestures are mapped to operations such as tapering, twisting and bending, which are applied to the graphic object through various functional mappings. Using self-organizing maps, the system allows users to specify their own gestures for each operation. In the example, the creation of primitive shapes is the first step. The primitive shapes are then used to build up complex shapes such as a teapot and a vase. The system was also tested to see whether non-existing objects can be created by hand and how precisely real objects can be modeled.&lt;br /&gt;&lt;br /&gt;The system consists of two CyberGloves, a Polhemus tracker and a large 200-inch projection screen on which the objects are projected. Wearing special LCD shutter glasses, the user can see the objects and manipulate them using his hands and pre-specified gestures.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;I am not sure how easy or difficult it is to manipulate objects in augmented reality. I believe that with some haptic feedback it would have been more natural. However, I liked the paper because it was different and had a very practical application. 
Such a system could also be used by a multi-user design team to create designs in the virtual world by interacting with the object together.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-127683203281502581?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/127683203281502581/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=127683203281502581' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/127683203281502581'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/127683203281502581'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/3d-object-modeling-using-spatial-and.html' title='3D Object Modeling Using Spatial and Pictographic Gestures'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7502002469831278358</id><published>2008-04-23T08:33:00.000-07:00</published><updated>2008-04-23T09:30:45.277-07:00</updated><title type='text'>Device Independence and Extensibility in Gesture Recognition</title><content type='html'>This paper presents a multi-layer recognizer that is claimed to be a device-independent recognition system able to work with different gloves. 
The data from the gloves is converted into posture predicates (such as whether a finger is bent or open), temporal predicates (changes in the shape of the posture over time) and gestural predicates (motion of the postures). The predicates essentially encode the states of the fingers. A feed-forward neural network thus produces a 32-dimensional boolean vector, which is matched against the templates. The template matcher works by computing the Euclidean distance between the observed predicate vector and every known template. The template at the shortest distance from the observed predicate vector is selected as the gesture, with a certain confidence value. Confidence values are basically weights applied to individual predicates so that the information from all predicates is equally utilized (as some predicates may give less information while others may give more).&lt;br /&gt;&lt;br /&gt;They tested their system on the ASL alphabet (excluding the letters J and Z) and reported accuracies from the high 50s to the mid 60s (percent) while varying the sensors available on the 22-sensor CyberGlove. They also reported that accuracy increased when a bigram context model was added to the recognition.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;There is nothing great in the paper. Using predicates is an easy approach, but there is always a scalability issue when users change. This method will not be suitable for a multi-user environment. Even a single user would not be able to get higher accuracy out of this system because of the intrinsic variability in human gestures. That is why the accuracy is so low. Also, using less sensor data from the same kind of glove does not make it a different glove. As I understand it, a different glove means altogether different sensors and architecture. 
I do not agree with the claim of device independence that they make in the paper.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7502002469831278358?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7502002469831278358/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7502002469831278358' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7502002469831278358'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7502002469831278358'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/device-independence-and-extensibility.html' title='Device Independence and Extensibility in Gesture Recognition'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-5931346693821012716</id><published>2008-04-16T15:25:00.000-07:00</published><updated>2008-04-16T15:26:34.231-07:00</updated><title type='text'>Discourse Topic and Gestural Form</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;In this paper the authors try to determine the extent to which gestures depend on the topic (i.e. are speaker-independent) and the extent to which they depend on the speaker. 
&lt;span style=""&gt; &lt;/span&gt;The presented framework is a Bayesian one that uses an unsupervised technique to quantify both extents. They have used a vision-based approach for this.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Their approach uses visual features that describe motion based on so-called spatiotemporal interest points, which are high-contrast image regions such as corners and edges that undergo complex motion. From the detected points, visual, spatial and kinematic characteristics are extracted to form a large feature vector, to which PCA is applied to reduce the dimensionality. The reduced feature vectors are used to fit a mixture model, from which a codebook is obtained. The dataset consists of 33 short videos (about 3 minutes each) of dialogues involving 15 speakers, each describing one of five predetermined topics. The participants ranged in age from 18 to 32 and were native English speakers.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;In the experiment each participant talked to another speaker, though they were not asked to make gestures. The scenarios involved describing “Tom and Jerry” and mechanical devices (piston, candy, pinball machine and a toy). It was observed that 12% of the recorded gestures were classified as topic-specific when the correct topic labels were used, and that this value dropped to 3% with corrupted labels. 
This indicates that there is a connection between discourse topic and gestural form that is independent of the speaker.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;12% of gestures are classified as topic-specific when correct labels are given, and 3% when the labels are corrupted. And to convey this message, instead of simply having a human judge take notes, they used complex machine learning with a vision-based approach, which they admit must be somewhat corrupted by computer vision errors. I am not very impressed with the paper.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-5931346693821012716?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/5931346693821012716/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=5931346693821012716' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5931346693821012716'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5931346693821012716'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/discourse-topic-and-gestual-form.html' title='Discourse Topic and Gestural Form'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-5609893605942300265</id><published>2008-04-14T00:05:00.000-07:00</published><updated>2008-04-14T00:08:04.534-07:00</updated><title type='text'>Glove-TalkII - A Neural-Network Interface which Maps Gestures to Parallel Formant Speech Synthesizer Controls</title><content type='html'>This paper presents a gesture-based speech synthesizer called Glove-TalkII that can be used for communication. The idea behind the work is that the subtasks involved in speech generation (tongue motion, alphabet generation, sound used) can be identified and mapped to suitable actions on the sensor devices. The device consists of a CyberGlove (18 sensors) for gestures, a Polhemus tracker for emulating the up-down tongue motion for consonants, and a foot pedal for sound variation. The input from the three sensors is fed to three different neural networks, each trained on the data for its particular subtask.&lt;br /&gt;&lt;br /&gt;The three networks are the vowel/consonant network, the consonant network and the vowel network. The vowel/consonant network is responsible for deciding whether to emit a vowel or a consonant sound based on the configuration of the user's hand. The authors claim that the network can interpolate between hand configurations to produce smooth but rapid transitions between vowels and consonants. It is a 10-5-1 feed-forward network with sigmoid activation functions. They trained it using 2600 examples of consonants and 700 examples of vowels. The training data was collected from an expert user; the vowel data was collected from the same user by requiring him to move his hand up and down. 
The test set consists of 1614 examples, and the MSE was reduced to 10^-4 after training.&lt;br /&gt;&lt;br /&gt;The vowel network is a 2-11-8 feed-forward network whose hidden units are RBFs, each centered to respond to one of the cardinal vowels. The outputs are 8 sigmoid units representing 8 synthesizer control parameters. The network was trained on 1100 examples with some noise added. It was tested on 550 examples and gave an MSE of 0.0016 after training.&lt;br /&gt;&lt;br /&gt;The consonant network is a 10-14-9 network whose 14 hidden units are normalized RBF units. Each RBF unit is centered at a hand configuration determined from the training data. The 9 output units are responsible for the nine control parameters of the formant synthesizer. The network was trained on scaled data comprising 350 approximant, 1510 fricative and 700 nasal examples. The test data consists of 255 approximant, 960 fricative and 165 nasal examples, giving an MSE of 0.005.&lt;br /&gt;&lt;br /&gt;They tested their product on an expert pianist; after 100 hours of training, his speech with Glove-TalkII was found to be intelligible, though still with errors in polysyllabic words.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;I just wonder how difficult or easy it is to control your speech with a foot pedal, data glove and Polhemus, all of which need to move in synchronization for correct pronunciation. Also, it took 100 hours of training for the results to become intelligible, so I doubt that this system is comfortable at all. 
But for a paper that is 10 years old, it is a decent first step and a different approach.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-5609893605942300265?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/5609893605942300265/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=5609893605942300265' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5609893605942300265'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5609893605942300265'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/glove-talkii-neural-network-interface.html' title='Glove-TalkII - A Neural-Network Interface which Maps Gestures to Parallel Formant Speech Synthesizer Controls'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-9176293629558089207</id><published>2008-04-13T21:11:00.000-07:00</published><updated>2008-04-13T23:03:47.760-07:00</updated><title type='text'>Feature selection for grasp recognition from optical markers</title><content type='html'>&lt;div&gt;This paper presents a feature selection methodology for picking the relevant features out of a large feature set for better recognition results. 
The aim of the project is to assign a gesture class to an input using as few features as possible. In their approach, the authors placed several markers on the back of the hand and used calibrated cameras to track the markers in a controlled environment. They favor this method because it doesn’t affect the subject's natural grasping or the natural contact with the object.&lt;br /&gt;The marker positions are expressed in a local coordinate system that is invariant to the pose.&lt;br /&gt;&lt;br /&gt;To classify the gestures they used a linear logistic regression classifier, which is also used for subset selection in a supervised manner. They used three markers at a time as a single feature vector and then used cross-validation with subset selection over the number of features to arrive at the best feature set. They tried both the backward and the forward approach to subset selection and observed that the error of the two approaches differs by just 0.5%.&lt;br /&gt;&lt;br /&gt;In their experiment, a hand pose is a 90-dimensional vector representing the 30 markers on the hand. Their domain for the experiment is the daily functional grasps shown in the figure below. &lt;img id="BLOGGER_PHOTO_ID_5188977612707539682" style="DISPLAY: block; MARGIN: 0px auto 10px; CURSOR: hand; TEXT-ALIGN: center" alt="" src="http://1.bp.blogspot.com/_3DcJ3hARvOY/SALzpXjjKuI/AAAAAAAAAYI/pLpUoNt_Ags/s320/Untitled21.jpg" border="0" /&gt;&lt;br /&gt;&lt;br /&gt;They collected their data using 46 objects, which were grasped in multiple ways. The objects were divided into two sets, A and B: A contains 38 objects with 88 object-grasp pairs and B contains 8 objects with 19 object-grasp pairs. They collected data from 3 subjects for set B and 2 subjects for set A. 
Then they used 2-fold cross-validation; with the full 30-marker data they obtained an accuracy of 91.5%, while with the 5 markers chosen by subset selection they achieved 86% accuracy.&lt;br /&gt;&lt;br /&gt;They also evaluated their feature subset on classifiers trained on different data, using both the 5-marker and the 30-marker set. They trained 4 classifiers in total (two on the data from subject 1 and subject 2 respectively from object set A, a third on the combined data of subject 1 and subject 2 from object set A, and a fourth on the combined set A+B from subject 1 and subject 2).&lt;br /&gt;&lt;br /&gt;They observed that the accuracy was sensitive to whether data from the test subject was used for training or not. With that data included, the accuracy for the user was 80-93% in the reduced space (5 markers) and 92-97% with 30 markers. For a totally new user tested on a classifier trained on a single user's data, accuracy was an abysmal 21-65% in the reduced marker space. However, the retention of accuracy was over 100% in all cases.&lt;br /&gt;&lt;br /&gt;In their analysis they also observed that recognition for all three subjects did well on cylindrical and pinch grasps, whereas spherical and lateral tripod grasps performed poorly because of the similarity between three-finger precision grasps.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper has nothing new except the complex linear logistic regression classifier. Their analysis is also based on a small set of users and hence cannot be generalized to most cases. I think that with more users with different hand sizes it would have been a better paper. Also, I don’t understand why so many papers make a point of claiming that accuracy increases when samples from the user are included in the training set. I think it is a very simple and easily digested fact that needs no explanation. 
Also, it would have been nice if they had mapped the relations between the grasping patterns of the users, which could have been used to build a more generalized set of features for a group of users sharing similar patterns. The work is very similar to our paper on sketch.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-9176293629558089207?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/9176293629558089207/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=9176293629558089207' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/9176293629558089207'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/9176293629558089207'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/feature-selection-for-grasp-recognition.html' title='Feature selection for grasp recognition from optical markers'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_3DcJ3hARvOY/SALzpXjjKuI/AAAAAAAAAYI/pLpUoNt_Ags/s72-c/Untitled21.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-5206268925764129779</id><published>2008-04-13T19:28:00.000-07:00</published><updated>2008-04-13T21:11:21.439-07:00</updated><title type='text'>RFID-enabled Target Tracking and Following with a Mobile Robot Using Direction Finding Antennas</title><content type='html'>&lt;p&gt;Summary: This paper presents a novel application of RFID for robot target tracking and following. The authors use cheap but effective RFID technology for the purpose. The direction-finding antennas receive the signal from the RFID transponder, which induces a voltage that depends on the orientation angle of the antenna with respect to the transponder (the sine of the angle). This variation in voltage can be used to find the direction of the transponder. To find the direction correctly, they use dual antennas at a 90 degree phase difference. The ratio of the two voltages can be calibrated to the angle of the transponder, since v1/v2 = tan(theta), where theta is the angle between the transponder direction and the line of sight of the transponder with respect to the vertex of the angle formed by the two antennas.&lt;br /&gt;&lt;br /&gt;The set of antennas is connected to an actuator, controlled by a microcontroller, which is programmed to rotate the antennas so that the ratio v1/v2 remains 1. This ensures that the transponder is always in the direction given by the angle bisector of the two antennas. The change in direction triggered by the actuator is mapped to the actual direction of motion of the robot. Thus the robot can be made either to track another robot or to reach some static destination.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper, as everyone discussed, is more relevant to robotic navigation than to RFID. 
Still, it gives a good cue about using voltage ratios for direction tracking. Maybe we can combine this information with the data gloves to get 3D information about the hand position. I think that if we use a similar setup (orthogonal antennas) as a receiver and some RFID transmitters (tags) on the gloves, the ratio can be used to track the hand in 3D space with respect to the vertex of the receiver. Overall I liked this simple and straightforward approach.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-5206268925764129779?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/5206268925764129779/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=5206268925764129779' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5206268925764129779'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5206268925764129779'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/rfid-enabled-target-tracking-and.html' title='RFID-enabled Target Tracking and Following with a Mobile Robot Using Direction Finding Antennas'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7589227135871603381</id><published>2008-04-13T18:20:00.000-07:00</published><updated>2008-04-13T19:28:19.154-07:00</updated><title type='text'>Gesture Recognition Using an Acceleration Sensor and Its Application to Musical Performance Control</title><content type='html'>This paper presents a method for recognizing gestures from acceleration data and its application to musical performance control. The authors argue that emotion and expression depend very much on the force with which a gesture is performed rather than just on the gesture itself. They therefore place an accelerometer on the back of the hand to capture the force with which the gesture is performed. This information is also used to recognize the gesture by analyzing the components of the acceleration vector in three planes (x-y, y-z, z-x) and capturing the temporal change in the acceleration.&lt;br /&gt; Using the temporal information available from the time-series acceleration data projected onto the planes, 11 parameters (1 intensity component, 1 rotational component, 1 main motion component and 8 direction distribution components, computed by measuring density) are calculated for each plane, giving 33 parameters for the given gesture. These 33 parameters are then used to recognize the gesture.&lt;br /&gt;&lt;br /&gt;For recognition, gesture samples are first collected from the user to form standard patterns. The difference between the parameters of an observed gesture and the mean of each standard pattern, normalized by the standard deviation, is then computed, which gives the error measure (weighted error). 
The gesture is recognized as the standard pattern that gives the minimum weighted error.&lt;br /&gt;&lt;br /&gt;They tested their approach by generating music from the gestures of a conductor. They also used dynamic linear prediction to predict the tempo from the previous tempo. According to them, this gives real-time results compared to the image-processing-based approach.&lt;br /&gt;&lt;br /&gt;They claim 100% accuracy for the user the system was trained on, with performance declining for a different user.&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper presents a simple but interesting use of acceleration data for gesture recognition. However, I think a given user can perform the same gesture quite differently (in terms of acceleration data) depending on energy and enthusiasm, so their approach may fail even for the same user (for whom the system has been trained). I guess they use one threshold to determine the start of the gesture and another to mark the end. I believe the start will show an abrupt acceleration, which becomes roughly constant for the rest of the gesture; a change of gesture will then show an abrupt change in acceleration (direction and magnitude), followed by a decrease in acceleration leading to the final stop. These changes in acceleration can be used for segmenting the gesture.&lt;br /&gt;&lt;br /&gt;I am not sure whether they used a series of gestures or just one gesture at a time in their experiment. 
It would also be interesting to train on normalized data from multiple users and then check the accuracy across multiple users.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7589227135871603381?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7589227135871603381/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7589227135871603381' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7589227135871603381'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7589227135871603381'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/gesture-recognition-using-acceleration.html' title='Gesture Recognition Using an Acceleration Sensor and Its Application to Musical Performance Control'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-6044443510340886453</id><published>2008-04-03T19:34:00.000-07:00</published><updated>2008-04-03T20:16:49.071-07:00</updated><title type='text'>Activity Recognition using Visual Tracking and RFID</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Summary:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;In this paper, the authors present the use of computer vision and RFID information 
for capturing the activity of the subject. Using standard skin-color-based segmentation, they obtain the skin regions (hands and face). Using the area of the bounding box, they recognize the hands and track them.&lt;br /&gt;&lt;br /&gt;Every object carries an RFID tag, which is read by an RFID reader. The reader has an antenna that emits the energy used to charge the RFID tag. The antenna is periodically switched off; during that time the capacitor in the tag loses charge, and the tag communicates its ID back to the reader by modulating the energy. The authors modify this mechanism so that the reader can additionally capture the voltage values emitted by the RFID tags.&lt;br /&gt;&lt;br /&gt;Since a tag receives the maximum signal when it is normal to the antenna field, the variation in the energy received and emitted can be used to capture orientation information about the object being manipulated by the user.&lt;br /&gt;&lt;br /&gt;In a nutshell:&lt;br /&gt;&lt;br /&gt;The RFID tag is used to determine whether an object is being manipulated by the user, and which object it is.&lt;br /&gt;&lt;br /&gt;Computer vision is used to obtain the position of the hands. If they are close to the object being manipulated (as measured through RFID), the user is said to be approaching the object.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Tracking the hands, together with the orientation information from the RFID tags, is used to recognize activity. As an example, they show the activity of a subject in a retail store environment.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Discussion:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This is a really nice and novel method of recognizing activity, and maybe it can also be helpful in Dr Hammond's favorite "workshop saw" activity. 
But my concerns are:&lt;br /&gt;&lt;br /&gt;1. Whether RFID can provide a good estimate of distance.&lt;br /&gt;2. Whether the tags are unaffected by the presence of other magnetic devices nearby.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Overall it is a nice approach and, as discussed in class, a cheap direction to look at for future research.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-6044443510340886453?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/6044443510340886453/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=6044443510340886453' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6044443510340886453'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6044443510340886453'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/activity-recognition-using-visual.html' title='Activity Recognition using Visual Tracking and RFID'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7831425218159009005</id><published>2008-04-03T19:08:00.000-07:00</published><updated>2008-04-03T19:29:58.238-07:00</updated><title type='text'>Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface 
Prototypes</title><content type='html'>Summary:&lt;br /&gt;&lt;br /&gt;The authors present a simple yet robust method for recognizing single-stroke gestures (sketch gestures) without any complex machine learning algorithm. The idea behind the $1 recognizer is simple:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt; Capture different ways of drawing a shape.&lt;/li&gt;&lt;li&gt; Resample the data points so that the overall shape is preserved.&lt;/li&gt;&lt;li&gt; Rotate the sampled points by an indicative angle, which they define as the angle between the gesture's first point and the centroid. According to them, this rotation helps in finding the best match.&lt;/li&gt;&lt;li&gt;The sampled points are then compared against the templates available in the database.&lt;/li&gt;&lt;li&gt;The closest matching template is the recognized result.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Since the sampled points depend on the drawing style of the user, two similar-looking shapes may match different templates because of variability in drawing style. This is handled by providing the database with samples of all possible drawing styles.&lt;br /&gt;&lt;br /&gt;They compared their algorithm with the Rubine algorithm and reported better accuracy than Rubine. With 1 template per gesture they obtained an accuracy of 97%, which improved to 99% with 3 templates per gesture.&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;I have the same thoughts as Brandon. Because this is an instance-based algorithm, it will not give fast results if the database is too large. When we present a gesture to the system, it does all the preprocessing and searches through the data set. Even if we repeat the same gesture, it searches the complete data set again. It keeps no record of the expected gesture even when a gesture is repeated. 
In other words, it is an instance-based algorithm whose memory is erased after each search. Maybe with some bookkeeping we can improve the algorithm. Otherwise it is a beautiful and simple algorithm.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7831425218159009005?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7831425218159009005/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7831425218159009005' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7831425218159009005'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7831425218159009005'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/gestures-without-libraries-toolkits-or.html' title='Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7685238356484437063</id><published>2008-04-03T18:34:00.000-07:00</published><updated>2008-04-03T19:08:28.053-07:00</updated><title type='text'>Enabling Fast and Effortless Customisation in Accelerometer Based Gesture Interaction</title><content type='html'>This paper presents the use of acceleration data for gesture recognition 
using HMMs. The gestures are used to operate the controls of a DVD player. The authors use accelerometers to capture acceleration values and, according to them, these signal patterns can be used to generate models that allow gestures to be recognized with an HMM. The process involves:&lt;br /&gt;&lt;br /&gt;1. Using the sensors to obtain the accelerometer data.&lt;br /&gt;2. Resampling the data and normalizing it to equal length and amplitude. The data is reduced to 40 sample points per gesture.&lt;br /&gt;3. Sending the data for a gesture to the vector quantizer to reduce its dimensionality to 1-D (assigning labels).&lt;br /&gt;4. Using the vector-quantized data to train the HMM, which is then used for recognition.&lt;br /&gt;&lt;br /&gt;For training, they add artificial noise to introduce variability and thus enlarge their training data set. They found that Gaussian noise with SNR=3 gave the best accuracy. They collected 30 distinct 3D acceleration vectors from one person and selected 8 gestures for the control. They then added noise to introduce variability and obtain additional data, and used the data for cross-validation to find the best training/testing set. They obtained an accuracy of 97.2% with SNR=3.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper was strange to me. If I understand correctly, they use the same data for vector quantization (after adding noise, too) and then test on part of that data. And after vector quantization, why are they using HMMs? 
Also, adding noise cannot introduce the variability that different users would introduce, so their accuracy means nothing to me.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7685238356484437063?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7685238356484437063/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7685238356484437063' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7685238356484437063'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7685238356484437063'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/04/this-paper-presents-use-of-acceleration.html' title='Enabling Fast and Effortless Customisation in Accelerometer Based Gesture Interaction'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-4370546188584571787</id><published>2008-03-30T23:13:00.000-07:00</published><updated>2008-03-30T23:48:54.397-07:00</updated><title type='text'>SPIDAR G&amp;G: A Two-Handed Haptic Interface for Bimanual VR Interaction</title><content type='html'>&lt;span style="font-weight: bold;"&gt;Summary:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This paper presents a haptic device called SPIDAR, which is used to interact with the virtual 
world. The device consists of a ball in the center, attached to several pulleys through strong nylon threads. The system gives the user 6 DOF; the 7th is grasp, which is provided by a pressure sensor on the ball.&lt;br /&gt;&lt;br /&gt;Interaction with the virtual world happens by moving the SPIDAR ball within the restricted space provided by this arrangement (each SPIDAR corresponds to a single object). Moving the ball in a given direction creates tension in some of the strings. This tension drives the pulley against the resistance provided by the motor. This motion is captured and used as the input for the motion of the associated object. The authors state that such a system can be beneficial in tele-operation, medical operations, molecular simulations, etc.&lt;br /&gt;&lt;br /&gt;The system of two SPIDARs was tested on three users. Each user was asked to control a sphere in a virtual world with one hand (one SPIDAR) and to use the other hand (the other SPIDAR) to touch a pointer to marks on the sphere. They observed that people preferred SPIDAR G&amp;amp;G (the two-handed version) to SPIDAR-G (the single-handed version) because the two-handed version seemed much more intuitive. They also found that users performed better when provided with haptic feedback.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Discussion:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This device was developed by the authors themselves and nicely combines the mechanics of strings with computer manipulation of the data. Though the device gives good feedback, movement of the ball is restricted by the strings, as they may interfere with each other. 
Also, we have to apply a balanced force to interact with the system: it is not fixed and may fall over with too much force, while too little force may not give the desired result.&lt;br /&gt;&lt;br /&gt;However, the cost of such a system is a limiting factor, and no new work has been reported, which limits my knowledge of the current state of the system. Also, since there is one SPIDAR per object, the interaction is very limited. Maybe with some kind of switch a single SPIDAR could be used to interact with other objects at the press of a button. It would also be interesting if similar objects could be grouped together so that a single SPIDAR could manipulate them in the virtual world.&lt;br /&gt;&lt;br /&gt;Having personally used the system, I believe it is a standout application and very useful in terms of the interaction response the device provides.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-4370546188584571787?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/4370546188584571787/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=4370546188584571787' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4370546188584571787'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4370546188584571787'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/spidar-g-two-handed-haptic-interface.html' title='SPIDAR G&amp;G: A Two-Handed Haptic Interface for Bimanual VR 
Interaction'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-720867453843920117</id><published>2008-03-30T23:06:00.000-07:00</published><updated>2008-03-30T23:10:49.563-07:00</updated><title type='text'>Gesture Recognition with a Wii Controller</title><content type='html'>Summary:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Get a Wii controller and obtain the acceleration values.&lt;/li&gt;&lt;li&gt;Filter the values and remove some of the redundant values.&lt;/li&gt;&lt;li&gt;Use vector quantization (k-means to form clusters).&lt;/li&gt;&lt;li&gt;Feed to an HMM with a Bayes classifier.&lt;/li&gt;&lt;li&gt;Get the results (90%).&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;Nothing to discuss, as I just see it as an application paper.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-720867453843920117?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/720867453843920117/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=720867453843920117' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/720867453843920117'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/720867453843920117'/><link rel='alternate' type='text/html' 
href='http://pankaj-haptics.blogspot.com/2008/03/gesture-recognition-with-wii-controller.html' title='Gesture Recognition with a Wii Controller'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-6980680730870147457</id><published>2008-03-30T22:29:00.000-07:00</published><updated>2008-03-30T23:04:26.740-07:00</updated><title type='text'>Taiwan Sign Language (TSL) Recognition Based on 3D Data and Neural Networks</title><content type='html'>Summary:&lt;br /&gt;&lt;br /&gt;This paper presents a neural-network-based approach to recognizing 20 static gestures of Taiwanese Sign Language. The authors propose a back-propagation neural network to recognize the gestures, which are captured using a vision-based capture system called VICON. Using markers on the dorsal surface of the hand, they capture the features of a given gesture. The features are distance measures of the marker positions relative to a reference. The distances are normalized to account for different hand sizes and then used as the feature inputs to the neural network. The network is trained on similar data obtained from the users. The authors report using data from 10 students, each of whom repeated each of the 20 gestures 15 times, giving 3000 data samples in all. Of these 3000 samples, 212 were reported to have missing values and were not used. Of the remaining 2788, 1350 samples were used for training and 1438 for testing.&lt;br /&gt;&lt;br /&gt;Their NN architecture consists of 15 input neurons and 20 output neurons, along with 2 hidden layers. 
With 250x250 neurons in the hidden layers, they report an accuracy of 94.5% on the test data, while on the training data it was 98.5% (not important). The 15 input neurons correspond to the 15 features used, and the 20 output neurons give the output probability of each gesture.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;This paper was simple and straightforward. The way of obtaining the features was simple, since only the distances between the markers were used and the chosen gestures had no occlusion. Since the gestures are static, I think this is a 2D problem rather than a 3D problem, as the third dimension is (or can be) always constant for static gestures. The training and testing data were obtained with much care, so recognition may suffer if input data is taken from outside users without much instruction. There is not much of a take-home message in this paper, except their way of obtaining distance metrics that can be used as features.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-6980680730870147457?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/6980680730870147457/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=6980680730870147457' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6980680730870147457'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6980680730870147457'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/taiwan-sign-language-tsl-recognition.html' title='Taiwan Sign Language (TSL) 
Recognition Based on 3D Data and Neural Networks'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-3376197952998389201</id><published>2008-03-30T20:45:00.000-07:00</published><updated>2008-03-30T22:06:46.371-07:00</updated><title type='text'>Hand Gesture Modelling and Recognition Involving Changing Shapes and Trajectories, Using a Predictive EigenTracker</title><content type='html'>Summary:&lt;br /&gt;&lt;br /&gt;This paper presents a vision-based approach to gesture recognition. Unlike HMM-based methods, the technique needs no training, so it adapts faster to new gestures. The only requirement (which is not so nice) is that the gestures should be well distinguishable. The algorithm obtains the affine transforms of the image frames and projects the images into the eigenspace. In the eigenspace, since only the hand is moving while the background is stationary, the first few PCA components capture the maximum variance, i.e. the motion of the hand in each frame.&lt;br /&gt;&lt;br /&gt; This method is inspired by a similar method called EigenTracker, but differs from it in its added predictive modality. The predictive nature of the proposed method makes it versatile enough to track hand motion on the fly, without requiring an offline description of the orientation and physical dimensions of the object, as the earlier EigenTracker did. This predictive behavior comes from using skin color to segment out the hand and then a particle filter to track it.  
The position of the hand is obtained from the significant eigenvectors in the eigenspace, since only the hand is in motion while the rest of the background is stationary. Variations in the direction of motion are detected when the error between the prediction and the actual track exceeds a certain threshold. Tracking the hand, together with the information about changes in the motion track (captured by the error between the predicted and actual position), gives the structure of the gesture (assuming linear motion between changes), which can be matched against the available gesture set (decided offline).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The nice part of the paper is the use of a particle filter to obtain the segments of the gesture by measuring the error between the predicted and actual motion, and the use of the affine eigenspace to capture the hand motion. The proposed method is quite different from most of the papers we have read, though the results are not as impressive. As many in the class suggested, 100% accuracy doesn't mean much, as the test data is 80% similar to the training data (64/80). Also, they use the PCA space to obtain the maximum variance, which is not robust to noise. I agree that with their simple, steady background and black arms it would work, but in many real-life situations this may not be feasible. 
I would have been happier with some complex gestures and slightly lower accuracy than with 100% accuracy (which does not impress me) on very distinct, simple gestures.&lt;br /&gt;&lt;br /&gt;However, I like vision-based techniques because they give gestures much more freedom and space than gloves do, so I would add this paper to my favorites of the semester.&lt;br /&gt;&lt;br /&gt;With the advancement of digital imaging techniques and capture devices, it is possible to replace the background with some other stationary background. So if the relative motion of the hand is faster than the change in the background, we should be able to capture the hand motion and blend it with an artificial background. With such an approach, we can tackle the noise associated with a changing background in PCA-based techniques.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-3376197952998389201?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/3376197952998389201/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=3376197952998389201' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3376197952998389201'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3376197952998389201'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/hand-gesture-modelling-and-recognition.html' title='Hand Gesture Modelling and Recognition Involving Changing Shapes and Trajectories, Using a Predictive 
EigenTracker'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-2405625698335825431</id><published>2008-03-21T21:54:00.000-07:00</published><updated>2008-03-21T22:05:41.660-07:00</updated><title type='text'>Wiizards: 3D Gesture Recognition for Game Play Input</title><content type='html'>This paper presents Wii controllers as wands for casting spells in a game. The data from the accelerometers is used to obtain the xyz coordinates, which are used to frame the gesture. There are certain gestures that can be used to cast a spell on the opponent. The gestures are classified as actions, modifiers, and blockers, and HMMs are used for recognition (another HMM paper, though without data gloves).&lt;br /&gt;&lt;br /&gt;Training involved 7 different players, who were asked to perform a given gesture over 40 times. The HMM reached a maximum accuracy of 93% with 15 states and 90% with 10 states, using test data from the same users as used for training. 
With a new user there is a drastic drop to 50%.&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;Not at all impressive, though something different. I am tired of explaining HMMs, and most of the results they presented were quite obvious, as all training-based algorithms improve as more data becomes available. Nothing much to say.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-2405625698335825431?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/2405625698335825431/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=2405625698335825431' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/2405625698335825431'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/2405625698335825431'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/wiizards-3d-gesture-recognition-for.html' title='Wiizards: 3D Gesture Recognition for Game Play Input'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-2882980313339433192</id><published>2008-03-21T21:21:00.000-07:00</published><updated>2008-03-21T21:52:19.872-07:00</updated><title type='text'>TIKL: Development of a Wearable Vibrotactile Feedback Suit for Improved Human Motor Learning</title><content 
type='html'>This paper makes the case that a wearable real-time vibrotactile feedback suit can improve training when subjects learn to mimic the skills of an expert. Subjects may fail to mimic the minute details of the teacher's motion and joint angles, either because those details are not observable or because the demonstration does not convey the relative positions of the various joints. Their system provides real-time feedback at the joints where the subject is not meeting the criterion, so subjects learn where they went wrong relative to their other joint positions and can correct themselves accordingly.&lt;br /&gt;&lt;br /&gt;This kind of feedback, supplementing the conventional auditory and visual feedback, would definitely help learners by indicating errors in body kinematics through vibration at the points in error. The system uses a Vicon motion capture system to track motion, and the suit, which must be worn during training, contains the vibrotactile actuators that provide the feedback.&lt;br /&gt;&lt;br /&gt;Suggested applications include sports training, dance, and other similar activities. They also conducted a user study with around 40 participants, of whom only 20 were given suits; the other 20 were trained without suits. Participants with the suit performed better because of the vibratory feedback: they showed a 27% improvement in accuracy and a 23% faster learning rate than their non-suited counterparts under otherwise similar conditions.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;It was a nice, well-written paper with good explanations.
I liked the approach; however, it would have been nice if feedback could also be provided for the legs and the upper body, as wrong placement of the three parts of the body can cause injuries. Such a system also offers an excellent way to run a remote teaching school, since an instructor may not be physically available and less skilled students can be trained online. I did some similar &lt;a href="http://sachuin23.googlepages.com/Student_Paper.pdf"&gt;work&lt;/a&gt; as an undergraduate, when I used an ANN to teach less skilled drivers real-time steering control for driving.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-2882980313339433192?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/2882980313339433192/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=2882980313339433192' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/2882980313339433192'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/2882980313339433192'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/tikl-development-of-wearable.html' title='TIKL: Development of a Wearable Vibrotactile Feedback Suit for Improved Human Motor Learning'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-765707821960960114</id><published>2008-03-21T20:10:00.000-07:00</published><updated>2008-03-21T21:20:33.083-07:00</updated><title type='text'>Spatio temporal extension to isomap nonlinear dimension reduction</title><content type='html'>This paper presents spatio-temporal Isomap (ST-Isomap), which is basically an extension of the conventional Isomap proposed by Tenenbaum et al. in 2000. The method captures temporal relationships within local neighborhoods, which are then propagated globally via a shortest-path mechanism. ST-Isomap aims to deal with &lt;span style="font-style: italic;"&gt;proximal disambiguation&lt;/span&gt;, which means distinguishing between data that are spatially close in the input space but structurally different, and with &lt;span style="font-style: italic;"&gt;distal correspondence&lt;/span&gt;, which means finding common structure between data that are far apart in the space. By finding these measures, one can recover the spatio-temporal structure of the data.
As per the paper, proximal disambiguation and distal correspondence are pairwise concepts, and as such the existing pairwise dimension reduction needs to be augmented to include spatio-temporal relationships.&lt;br /&gt;&lt;br /&gt;The approach involves:&lt;br /&gt;&lt;br /&gt;1) Windowing the input data into temporal blocks, which basically serve as a history of each data point.&lt;br /&gt;&lt;br /&gt;2) Computing a sparse distance matrix D from the local neighborhoods using Euclidean distance.&lt;br /&gt;&lt;br /&gt;3) Using the distance matrix from the previous step to obtain the common temporal neighbors (CTN), which are either local CTN or K-nearest nontrivial neighbors.&lt;br /&gt;&lt;br /&gt;4) Using the above measure to reduce the distance between points with common and adjacent temporal relationships.&lt;br /&gt;&lt;br /&gt;5) Using the resulting metric to compute the shortest-path distance between every pair of points with Dijkstra's algorithm.&lt;br /&gt;&lt;br /&gt;6) Applying classical MDS to obtain an embedding that preserves the spacing.&lt;br /&gt;&lt;br /&gt;Steps 1, 3, and 4 are the contribution of the paper. These steps introduce the temporal information into Isomap.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;For continuous data, the paper applies ST-Isomap with the &lt;span style="font-style: italic;"&gt;K-nearest nontrivial neighbor&lt;/span&gt; metric, which finds the best-matching neighbor from each individual trajectory and removes redundant neighbor selections. For non-continuous data, the local segmented common temporal neighbor metric is used instead.
The segmented common temporal neighborhood approach is based on the logic that a pair of points is spatio-temporally similar if the points are spatially similar and the points they transition to are also spatially similar.&lt;br /&gt;&lt;br /&gt;They applied their method to a tele-operated NASA robot grasping wrenches placed at various locations, using kinematic motion data obtained from human subjects. They also gave a comparison with PCA and standard Isomap, and showed some comparison with HMMs.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;They have added another important dimension to the data, one that can capture motions that repeat in time and are structurally similar though temporally different, for example a spiral motion. Adding temporal information makes a lot of sense, as many dimensionality reduction techniques may give false results because they cannot recognize repetition as a temporal characteristic of the data and instead treat it as redundancy, which it is not (I mean, not all of the data is redundant).&lt;br /&gt;&lt;br /&gt;I can now see that for gestures and motion capture we cannot use PCA if the gesture contains repeated motion; for such cases ST-Isomap is the way to capture the embedded motion that characterizes the gesture.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-765707821960960114?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/765707821960960114/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=765707821960960114' title='3 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/9183475114237580876/posts/default/765707821960960114'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/765707821960960114'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/spatio-temporal-extension-to-isomap.html' title='Spatio temporal extension to isomap nonlinear dimension reduction'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7783223289363672919</id><published>2008-03-19T21:43:00.000-07:00</published><updated>2008-03-19T22:10:04.514-07:00</updated><title type='text'>Articulated Hand Tracking by PCA-ICA Approach</title><content type='html'>&lt;div style="text-align: justify;"&gt;This paper presents a vision-based approach for capturing and recognizing hand postures. The authors suggest using PCA to locate the hand in the image frame and then ICA to extract intrinsic features of the hand in order to identify hand motions. The authors describe ICA as a way to find a linear, non-orthogonal coordinate system in any multivariate data.
The goal of ICA is to perform a linear transformation that makes the resulting variables as statistically independent of each other as possible.&lt;br /&gt;&lt;br /&gt;They represented hand motions by building a hand model in OpenGL and then using the degrees of freedom of the fingers to enumerate the possible combinations in which fingers touch the palm.&lt;br /&gt;&lt;br /&gt; They used data gloves to obtain joint information for the 31 possible combinations. They then used that data to obtain the model parameters for the combinations generated by the OpenGL model over a time span, and obtained a roughly 2000-dimensional vector for each posture.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;With PCA they reduced the dimensionality of the problem and were able to locate the position of maximum variance in the image frame. Then, using the ICA model, in which each basis represents the motion of a particular finger, they obtained the hand pose for a given time frame. They used particle filtering to track hands in accordance with Bayes' theorem.&lt;br /&gt;&lt;br /&gt; They employed an edge and silhouette model to match the hand frame against the OpenGL model and then estimated the closest match between the hand image and the OpenGL models. By superimposing the OpenGL model on the hand image, they were able to recognize the posture.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;I liked that this paper took a different approach, though I don't agree that the fingers are statistically independent for all hand postures. But considering the simplicity of their gestures, it might work. I like that they used PCA for global hand tracking, but PCA requires the background to be stable, with only the hand moving, in order to track the variance. If there is some change in the background (for example, the user moves a bit), PCA may give erroneous global results.
For a limited region, though, it may be a feasible and the simplest approach.&lt;br /&gt;&lt;br /&gt;I would like to think more about the feasibility of ICA for intrinsic finger tracking, though at present I believe it is not possible to track fingers with this approach for the kind of complex motions we are aiming at.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7783223289363672919?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7783223289363672919/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7783223289363672919' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7783223289363672919'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7783223289363672919'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/articulated-hand-tracking-by-pca-ica.html' title='Articulated Hand Tracking by PCA-ICA Approach'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7263387409008977443</id><published>2008-03-19T21:17:00.000-07:00</published><updated>2008-03-19T21:42:31.740-07:00</updated><title type='text'>The 3D Tractus: A three Dimensional Drawing Board</title><content type='html'>This paper presents a drawing system to draw shapes in 
3D. It consists of a Tablet PC mounted on top of a mechanical structure that can move up or down using a dead counterweight and a push from the user. The authors believe that providing mechanical motion in the Z direction makes it more intuitive for users to sketch in 3D. The mechanical device is constructed to minimize the human effort of pushing and pulling.&lt;br /&gt;&lt;br /&gt;To capture the 3D data, they used a simple potentiometer whose resistance varies as the mechanical structure moves up and down. This signal is calibrated to a Z value using an analog-to-digital converter and sent to the PC over a USB connection.&lt;br /&gt;&lt;br /&gt;The user interface consists of a 2D drawing pad and a window that displays a view of the object being drawn. To provide a depth cue in 2D, the authors tried different color cues but found them confusing. They also tried varying line thickness but found that non-intuitive as well. Finally, they decided simply to tell users that all thin strokes shown lie below and that users are drawing the top strokes. Similarly, the authors found that perspective views were more intuitive and helpful than orthographic projections, so the window that displays the object being drawn shows a perspective view. Their system also supports deletion so that users can edit their sketches.&lt;br /&gt;&lt;br /&gt;They conducted a user study in which they asked art students to get familiar with the system and use it to draw certain sketches. The authors observed that users liked the system, though every user agreed that it was easier to push the table down than to pull it up. Users also reported that it would have been better if they could tilt the surface in the direction in which they were sketching the object.
Also, they complained about alignment issues, as they found it difficult to match the 3D symmetry of the object being drawn (like the top and bottom of a box).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;I think it was a cool idea but very uncomfortable, as the user has to push and pull the table, which seems quite unintuitive to me. Also, I would not be surprised if the user unintentionally pushed the table while sketching, as some people tend to sketch with a heavy hand.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7263387409008977443?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7263387409008977443/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7263387409008977443' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7263387409008977443'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7263387409008977443'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/3d-tractus-three-dimensional-drawing.html' title='The 3D Tractus: A three Dimensional Drawing Board'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-4573065072081296810</id><published>2008-03-19T19:31:00.001-07:00</published><updated>2008-03-19T20:16:51.976-07:00</updated><title type='text'>A hidden Markov Model Based Sensor fusion Approach for Recognizing Continuous human grasping Sequence</title><content type='html'>This paper presents a method aimed at teaching robots human grasping sequences by observing human grasping. Though the main aim of the system is to use vision alone for this purpose, they are using this approach as a faster alternative. The authors argue that grasping postures together with tactile sensor feedback can be used to capture the grasping sequence. As such, they used an 18-sensor CyberGlove along with tactile sensors sewn under the glove at positions on the hand identified as the spots with the greatest chance of detecting contact using a small number of sensors.&lt;br /&gt;&lt;br /&gt;For classification they used the Kamakura grasp taxonomy, which separates grasps into 14 classes according to their purpose, shape, and contact points. With this taxonomy, it is easier to identify the grasps that humans use in everyday life. As per the Kamakura taxonomy, grasps are classified into 4 major categories:&lt;br /&gt;&lt;br /&gt;1. 5 power grasps&lt;br /&gt;2. 4 intermediate grasps&lt;br /&gt;3. 4 precision grasps&lt;br /&gt;4. 1 thumbless grasp&lt;br /&gt;&lt;br /&gt;Each grasp is modeled as an HMM and used as a classifier. The data for the HMM is obtained through the CyberGlove and is fused with the data from the tactile sensors for the particular grasp.
This is done because information from the data glove alone may not be sufficient, as the shape of the hand may not change significantly between two grasps. They obtained 16 feature values from the glove and 16 tactile sensor values, along with the one maximum sensor value, to form the feature vector for a particular grasp. Their system takes in both inputs simultaneously and learns to weigh their importance by adjusting parameters during training.&lt;br /&gt;&lt;br /&gt;Their model uses 9-state HMMs, and the HMMs are trained offline for each gesture. Along with the 14 HMMs for the grasp classes, a junk class for garbage collection was also trained. They made the simple assumption that each grasp must be followed by a release. This is done to ensure segmentation of the gesture, and the maximum sensor value gives a cue about grasp versus non-grasp.&lt;br /&gt;&lt;br /&gt;For modeling the HMMs they used the Georgia Tech HMM toolkit and collected 112 training samples and 112 testing samples from 4 different users. They reported a maximum accuracy of 92.2% and a minimum of 76.8% for the single-user model (trained on one user's data and tested on the same user's data), and a maximum of 92.2% and a minimum of 89.9% for the multi-user system (trained on all users and tested on a given user's data). They also suggested that with more user data, the single-user model may become better than the multi-user model. They also claimed that most of the recognition errors came from a small set of grasps for which the system relied solely on tactile information to distinguish the grasps. They believe that improved sensor technology may improve the results.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper presented a new method that uses tactile information for gesture recognition.
It made a lot of sense to me, as common gestures for day-to-day tasks carry a lot of tactile information that can be exploited to differentiate between two similar-looking gestures. For example, a tight fist and a hollow fist (sometimes used to show O) may look similar to the glove, but including the tactile information can distinguish between the two.&lt;br /&gt;&lt;br /&gt;Also, it would be interesting if vision could be used to add more flexibility to the system, as a similar-looking gesture (based on tactile and CyberGlove data) at a different position in space may convey a different meaning (though this can be done using the Flock of Birds too). This paper also gave me another approach to the segmentation problem, based on using the tactile information.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-4573065072081296810?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/4573065072081296810/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=4573065072081296810' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4573065072081296810'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4573065072081296810'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/hidden-markov-model-based-sensor-fusion_19.html' title='A hidden Markov Model Based Sensor fusion Approach for Recognizing Continuous human grasping Sequence'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-3279734347611670159</id><published>2008-03-03T21:01:00.000-08:00</published><updated>2008-03-03T21:05:25.094-08:00</updated><title type='text'>Temporal Classification:Extending the Classification Paradigm to Multivariate Time Series</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper is essentially part of a much more detailed thesis dealing with Australian Sign Language. They use two gloves to capture data and analyze the recognition rate. One is the Nintendo glove and the other is a device called the Flock. The Nintendo glove is a low-cost glove with cheap sensors, while the Flock is a more sophisticated, superior device. The data obtained is fed to their classifier, called Tclass, which looks like a decision-tree type classifier, and results were obtained using different Tclass parameters.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Since the data obtained from the Nintendo glove is noisy, they used smoothing to tune their results, which helped improve accuracy.
In the end they used a voting methodology over the best learners, similar to AdaBoost, to improve the accuracy and decrease the error.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;With the Flock, they did the same thing; however, they found that smoothing actually hurt the recognition results, as the sensor data is already clean, with almost no noise. &lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;They have shown that their classifier, Tclass, was able to achieve a low error rate by using the ensemble. They tested on Nintendo and Flock data, and smoothing worked with the Nintendo data but not with the Flock data. I believe this is because the Nintendo data is so noisy that the distinguishing features are suppressed by the noise, whereas with the Flock data, smoothing the already clean data lowers accuracy because the distinguishing features themselves are smoothed away.
It would have been nice to read what Tclass actually is and what it does. I believe the only good thing in the paper was the Tclass classifier, which appears to be some kind of decision-tree-based classifier and needs to be investigated.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-3279734347611670159?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/3279734347611670159/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=3279734347611670159' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3279734347611670159'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3279734347611670159'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/temporal-classificationextending.html' title='Temporal Classification:Extending the Classification Paradigm to Multivariate Time Series'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-244797626247404290</id><published>2008-03-03T15:10:00.000-08:00</published><updated>2008-03-03T15:12:31.120-08:00</updated><title type='text'>Using Ultrasonic Hand Tracking to augment Motion Analysis Based Recognition of manipulative Gestures</title><content type='html'>&lt;p class="MsoNormal" 
style="text-align: justify;"&gt;This paper introduces ultrasonic sensors alongside accelerometers and gyroscopes for capturing gestures in order to recognize activity. The claimed contribution is the use of ultrasonics for motion analysis and the combination of information from the sensors to refine the results. The information obtained from the different sensors is processed by a classifier so that the motion can be recognized. The classifiers used are HMMs, C4.5, and KNN. To capture the sensor data, three ultrasonic beacons are placed on the ceiling and the listeners are placed on the user’s arms. It is reported that ultrasonic sensing suffers from reflections, occlusion, and low temporal resolution (low sampling rate), so the information provided by the ultrasonic sensors alone is not reliable and contains many false responses. In addition, there is noise in the sensor input over the time frames that cannot be smoothed with Kalman filters, as the sampling rate is much lower than the frequency of hand movement.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;In their experiment they took bicycle repair as the example and chose the following gestures: pumping, screwing and unscrewing screws, different pedal turnings, assembly of parts, wheel spinning, and removing or placing the carrier object.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;They tried approaches that fall into two categories:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;ol style="margin-top: 0in;" start="1" type="1"&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Model Based 
Approaches &lt;/li&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Frame based approaches.&lt;/li&gt;&lt;/ol&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;In the model-based approach they use HMMs on the sensor information obtained from 2 gyroscopes and 3 accelerometers on the user’s right hand, and from the same set of sensors on the upper right arm. &lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;In the frame-based approach, feature vectors are computed for each time frame and used for either training or testing the classifier. The features extracted are the mean, standard deviation, and median of the raw sensor data. This approach captures local characteristics of the signal, which can be exploited for training or testing the classifier. They use no overlap between adjacent windows, and use part of the feature vectors for training and part for testing. For comparison, they use the K-means and C4.5 classifiers to test the frame-based approach.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;They also proposed the use of plausibility analysis for classification, which means restricting the search space for the vectors and using the information from both the frame-based and model-based approaches for classification. For example, the gesture recognized by the HMMs is checked against the imposed constraint; if the constraint is satisfied, the gesture is selected, otherwise the next-best gesture satisfying both constraints is selected.
&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;In the results, they report that the ultrasonic data gave a classification rate of about 58.7% with the C4.5 classifier and 60.3% with K-means. They argue that the results suffered because most of their gestures cannot be distinguished using hand locations alone. They then used a kind of ensemble to merge in the inputs from the accelerometers and gyroscopes, and obtained high classification rates (in the 90% range). They report almost 100% for certain gestures, while for gestures that are ambiguous and can be confused with others the classification rate dropped.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;    &lt;p class="MsoNormal" style="text-align: justify;"&gt;They have simply merged the ultrasonic data with the accelerometer data, combining the sensory information about the local movement of a body part with the global position of that part. It is no surprise that this yields better results, since there are now two layers of classification (global and then local) and the vote goes to the gesture that meets the requirements of both layers. The method is intuitive but needs a specialized room, as ultrasonic waves are reflected by metallic objects and occlusion also affects the response. They have not addressed the occlusion issues that matter for our requirements: we are dealing with finger and hand movements, which cannot be kept free of occlusion with ceiling-mounted beacons. 
Maybe we could use an array of beacons on the ground, top and sides to capture the responses, but that would restrict the domain of application for the user.&lt;o:p&gt;&lt;br /&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Also, I feel that instead of wired accelerometers and gyroscopes, it would be nice to use the wireless sensors described in last week’s paper on “ASL for game development”. They would be really lightweight and more practical to use.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-244797626247404290?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/244797626247404290/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=244797626247404290' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/244797626247404290'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/244797626247404290'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/03/using-ultrasonic-hand-tracking-to.html' title='Using Ultrasonic Hand Tracking to augment Motion Analysis Based Recognition of manipulative Gestures'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-3455345789090012236</id><published>2008-02-27T13:30:00.000-08:00</published><updated>2008-02-27T22:58:30.262-08:00</updated><title type='text'>American Sign Language recognition in Game Development for Deaf children</title><content type='html'>This paper presents a system called CopyCat, a computer game that uses gesture recognition technology to teach ASL to deaf children. The motivation is to expose children to sign language at the age when they can pick up the skill relatively quickly. Since statistics show that 90% of deaf children are born to hearing parents, who may not know ASL, such a system can be of great help. The system is interactive: before the game begins, it shows the children a video demonstrating the related signs. The game then encourages the children to interact with it by correctly making the signs that carry out the assigned task. 
Thus the children learn to interact with the virtual environment by making the correct signs, which map to actions in the game. The setup is shown in the figure below:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_3DcJ3hARvOY/R8Za0zWlL7I/AAAAAAAAASo/8UEEgN47Vy0/s1600-h/Untitled.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 264px; height: 176px;" src="http://1.bp.blogspot.com/_3DcJ3hARvOY/R8Za0zWlL7I/AAAAAAAAASo/8UEEgN47Vy0/s320/Untitled.jpg" alt="" id="BLOGGER_PHOTO_ID_5171921085266210738" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;According to the authors, no ASL recognition engine exists to test the setup, so they conducted their experiments as a Wizard of Oz study, with a human wizard who will eventually be replaced by the computer. Since the system is at a nascent stage, they have limited the ASL to one-handed and two-handed gestures, with no facial or other expressions. The vocabulary was chosen to fit the constraints of the system while matching what is taught in a real classroom. The system follows a push-to-sign approach: the user pushes a button to activate the recognition system and then makes the gesture to be recognized.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_3DcJ3hARvOY/R8ZbazWlL9I/AAAAAAAAAS4/tSX4iA41kaw/s1600-h/Untitled.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://1.bp.blogspot.com/_3DcJ3hARvOY/R8ZbazWlL9I/AAAAAAAAAS4/tSX4iA41kaw/s320/Untitled.jpg" alt="" id="BLOGGER_PHOTO_ID_5171921738101239762" border="0" /&gt;&lt;/a&gt;The system consists of colored gloves and wireless accelerometers, which capture the motion of the hands through the effect of gravity on the X, Y and Z axes.  
All the related hardware was developed in house. Since the gloves are of different colors, they use a color segmentation approach in which the discriminating information between the background and the glove color, based on the HSV histogram, is used for segmentation. The data from the vision-based and sensor-based approaches is provided to the trained HMM, which then recognizes the sign and triggers the mapped action in the game. The HMM toolkit proposed for the system is GT2k, developed at Georgia Tech. A human observer prunes the responses and labels them as correct or incorrect. The system architecture is shown in the figure to the left.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;They report their results for user-dependent and user-independent models. For the user-dependent models, they obtained an accuracy of 93.39% by training on 90% of the data and testing on the remaining 10%, repeated 100 times. For the user-independent models, they obtained an accuracy of 86.28%.&lt;br /&gt;They report an average success rate of 92.96% over all samples at the word level. At the sentence level, however, they report lower accuracy, because words can be deleted or inserted.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper presents a nice system that can teach ASL to children with hearing disabilities in an interactive way, through a game, which makes it more exciting than boring classes. 
A Wizard of Oz study helps to understand how children would like to interact with the system, and thus gave an idea of what the system should look like and how it should interact. The use of vision together with simple Bluetooth wireless adapters was interesting, as it frees the users from wires that could make things messy. Overall it was a nice paper with a nice practical application, but I still could not find GT2k anywhere online!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-3455345789090012236?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/3455345789090012236/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=3455345789090012236' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3455345789090012236'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3455345789090012236'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/american-sign-language-recognition-in.html' title='American Sign Language recognition in Game Development for Deaf children'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_3DcJ3hARvOY/R8Za0zWlL7I/AAAAAAAAASo/8UEEgN47Vy0/s72-c/Untitled.jpg' height='72' 
width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-8274317427643653681</id><published>2008-02-27T09:53:00.000-08:00</published><updated>2008-02-27T11:44:54.548-08:00</updated><title type='text'>A Method for recognizing a Sequence of Sign Language Words represented in a Japanese Sign Language</title><content type='html'>This paper presents a recognition system for Japanese Sign Language. The authors point out that superfluous gestures are sometimes included in the recognition results, which is not desirable. These superfluous gestures are caused by incorrect segmentation, that is, by incorrect identification of the borders of the signed words, so they need to be segmented out. The authors also intend to identify whether a gesture is made with one hand or two.&lt;br /&gt;&lt;br /&gt;To identify the borders they propose two measures. One is based on the change in velocity, and the other on the dynamic change in the direction of hand movement. A segmentation point is registered if a measure exceeds a certain threshold. Since the hand movement measure can yield several different borders because of noise, they propose taking as the border the one closest to the border detected from the change in velocity. Using this information, gestures can be segmented from the stream of gestures and sent to the recognition system for identification.&lt;br /&gt;&lt;br /&gt;To identify which hand is used in the gesture, they calculate several parameters representing the difference between the movements of the right and the left hand. 
These measures are basically the velocity ratio of the hands relative to each other, and the difference of the squared velocity values normalized by the duration of the gesture. These parameters are calculated separately for the left and the right hand; if the value is below a certain threshold for both hands, then both hands are being used, otherwise only one hand is being used. They use another measure, based only on the relative velocity of the hands, to determine which hand is used in the gesture.&lt;br /&gt;&lt;br /&gt;The sequence candidates are generated by evaluating a measure that takes into account whether an identified segment is a transition or a word, and considering only the words. The segments identified as words are combined with the segments identified as transitions using a weighted sum, and this is used for the sentence.&lt;br /&gt;&lt;br /&gt;They evaluated this system using 100 samples from JSL, which included 575 word segments and 571 transition segments. Of the transitions, 458 (80.2%) were correctly recognized and 64 (11.2%) were misjudged as words.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper presents an approach similar to what we have been using in sketch recognition, i.e. taking cues about the stroke from speed and direction. I am not sure we can use the velocity cues with much accuracy, as gestures are normally made very fast. However, I liked that they also used the change in hand movement and then combined both cues to identify the border. I did not like the flow of the paper, though, and I was confused by the language too. It was not very clear how they obtained the thresholds, or whether people with different styles and speeds can use the system with the same thresholds. 
Also, they admit that some of the gestures were misidentified because their system does not use any spatial information, which may be the cause of error in many gestures.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-8274317427643653681?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/8274317427643653681/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=8274317427643653681' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8274317427643653681'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8274317427643653681'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/this-paper-presents-recognition-system.html' title='A Method for recognizing a Sequence of Sign Language Words represented in a Japanese Sign Language'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-3503420394289506239</id><published>2008-02-25T10:09:00.000-08:00</published><updated>2008-02-25T10:26:51.135-08:00</updated><title type='text'>Computer Vision based gesture recognition for an augmented reality interface</title><content type='html'>This paper presents a computer vision based 
gesture recognition method for an augmented reality interface. The presented work can recognize a 3D pointing gesture, a click gesture and five static gestures, which are used to interact with objects in the augmented reality setup.  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;It is stated that, to interact in a virtual environment, the system should let the user select an object by pointing at it and clicking. For any such system it is therefore important to provide these two as the primary features. To make the pointing gesture intuitive, they use the index finger as the pointer. They also use some basic gestures, very different from each other, shown in the figure below:&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_3DcJ3hARvOY/R8MH8zWlL4I/AAAAAAAAASQ/NvfLG-m_j-M/s1600-h/fri_blog.jpg"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://2.bp.blogspot.com/_3DcJ3hARvOY/R8MH8zWlL4I/AAAAAAAAASQ/NvfLG-m_j-M/s320/fri_blog.jpg" alt="" id="BLOGGER_PHOTO_ID_5170985538309926786" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;By constraining the users to perform the gestures in one plane, they restrict the problem to 2D although it is 3D in nature; it is stated that after a few trials users were able to adapt to the constraint without much difficulty.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: 
justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;As a first step, the system segments the fingers so that they can be distinguished from the placeholder objects. For this they use color cues to separate the skin from the other objects. To deal with changes in intensity and illumination, they use a color space that is invariant to illumination changes, namely the normalized RGB space (chromaticity). In this space, different objects form different clusters. For each cluster, the mean and covariance matrix are used to build a confidence ellipse, and the distance of a pixel's chromaticity from the cluster is measured as a Mahalanobis distance, so that pixels with different chromaticity values receive different labels. A blob of the predetermined size is then labeled as the hand, and small blobs, which are actually misclassified objects, are discarded. &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_3DcJ3hARvOY/R8MIODWlL6I/AAAAAAAAASg/3_0JqyvjpXE/s1600-h/chro_fri.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://3.bp.blogspot.com/_3DcJ3hARvOY/R8MIODWlL6I/AAAAAAAAASg/3_0JqyvjpXE/s320/chro_fri.jpg" alt="" id="BLOGGER_PHOTO_ID_5170985834662670242" border="0" /&gt;&lt;/a&gt;Missing pixels in the hand blob are filled in using morphological operators. To handle dynamic range issues, only pixels with a certain minimum intensity are considered for the process. 
At the upper end, pixels that have at least one channel at 255 are discarded.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Since each gesture can be recognized by the number of fingers, they use a polar transformation and a number of concentric circles to count the fingers crossing each radius. The click gesture is a movement of the thumb, and they detect whether the thumb has moved by comparing the bounding boxes over a series of frames.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;Discussion:&lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;I chose this paper because I thought it would be nice to talk about the role of gestures in augmented reality. This was a very simple paper, recognizing very simple gestures. The good parts are their hand segmentation approach and some new ideas for the augmented reality office, which have come up in some of our discussions. I did not like that, although they claim to recognize 3D gestures, they reduce the problem to simpler 2D by constraining users to move in a plane. I believe their recognition approach, based on counting fingers, cannot work in 3D, as occlusion between the fingers would give ambiguous recognition results. 
However, I liked the approach they presented, since with such a system a design meeting would be much more interactive and less confusing.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-3503420394289506239?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/3503420394289506239/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=3503420394289506239' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3503420394289506239'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3503420394289506239'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/computer-vision-based-getsure.html' title='Computer Vision based gesture recognition for an augmented reality interface'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_3DcJ3hARvOY/R8MH8zWlL4I/AAAAAAAAASQ/NvfLG-m_j-M/s72-c/fri_blog.jpg' height='72' 
width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-8821723460631179151</id><published>2008-02-24T22:41:00.000-08:00</published><updated>2008-02-24T22:43:07.840-08:00</updated><title type='text'>Georgia Tech Gesture Toolkit: Supporting Experiments in</title><content type='html'>This paper presents the gesture recognition toolkit developed at Georgia Tech, based on hidden Markov models. The toolkit is inspired by HTK from Cambridge University, which is used for speech recognition. The main motivation behind the toolkit is to give researchers an adequate tool so that they can concentrate on their gesture recognition research, rather than delving into the intricacies of speech recognition to understand HMMs, which have been widely researched by the speech community.&lt;br /&gt;&lt;br /&gt;The toolkit provides tools for preparation, training, validation and recognition using HMMs. Preparation requires the user to design appropriate models, determine an appropriate grammar and provide labeled examples of the gestures to be performed. All these steps require some analysis of the available data and the gestures involved. The validation step evaluates the potential performance of the overall system. Validation approaches such as cross-validation and leave-one-out are used in the paper. In cross-validation, a portion of the data is used for training and the rest for testing, whereas in leave-one-out one data sample is held out for testing and the model is trained on the remaining samples, iterating over all samples. 
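&lt;br /&gt;&lt;br /&gt;The difference between the two validation schemes can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not GT2k code: the function names and sample labels are hypothetical, and the first function is a plain single split as described above rather than a full k-fold procedure.

```python
def holdout_split(samples, train_fraction=0.9):
    """Split once: the first part is for training, the rest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def leave_one_out_splits(samples):
    """Yield (training, held_out) pairs, holding out each sample in turn."""
    for i in range(len(samples)):
        training = samples[:i] + samples[i + 1:]
        yield training, samples[i]


# Hypothetical sample labels standing in for recorded gesture examples.
gestures = ["g1", "g2", "g3", "g4"]

train, test = holdout_split(gestures, train_fraction=0.75)
print("single split:", train, test)

for training, held_out in leave_one_out_splits(gestures):
    print("held out:", held_out, "trained on:", training)
```

With four samples, the single split trains on three and tests on one, while leave-one-out produces four rounds, each holding out a different sample.&lt;br /&gt;&lt;br /&gt;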
Training uses the information from the preparation stage to train a model for each gesture, and recognition, based on the HMMs, classifies new data using the trained models.&lt;br /&gt;&lt;br /&gt;Since any practical application needs the system to understand continuous gestures, the authors propose a rule-based grammar for this purpose. With such a grammar, complex gestures can be described by a set of simple rules.&lt;br /&gt;&lt;br /&gt;The toolkit has been used in various projects at Georgia Tech, and the authors give brief details of them. The first application is a gesture-based system for changing radio stations. The data is obtained from LED sensors: as a gesture is made, some of the LEDs are occluded, which provides information about the gesture. This information is used to train a model, which can then be used for recognition. The authors report correctly classifying 249 out of 251 gestures with this approach. Another project is secure entry based on patterned eye blinks. In this project, face recognition is coupled with eye blinking to build a person recognition model. Optical flow from the images was used to capture the blinking pattern. It was observed that a nine-state left-to-right HMM was able to model the blinking pattern. This model achieved an accuracy of 89.6%. Another project deals with integrating computer vision with sensing devices such as accelerometers and other mobile sensors for sensing motion. For correct recognition, gloves of different colors are used. Features are obtained with both techniques and integrated into a combined feature vector representing a given gesture for the recognition process. For this project, they used a five-state left-to-right HMM with self-transitions and two skip states. 
This project had not been implemented by the time the paper was published, so no results are available. The authors also mention the use of an HMM-based approach to recognize the actions of a workman in a workshop. They use vision and sensors to obtain the features used to recognize the gestures. Since workmen are supposed to perform a series of gestures in order, their model keeps track of the moves and reports an error if a gesture is missed. According to them, this system achieved an accuracy of 93.33%.&lt;br /&gt;&lt;br /&gt;Discussion:&lt;br /&gt;&lt;br /&gt;This paper just provides an overview of a new toolkit for gesture recognition that was being developed at Georgia Tech on top of HTK from Cambridge University. Although after reading the paper I was happy that they have something ready for gestures, I was disappointed to find no code on the project webpage, which according to the paper should have been available as early as 2003. The projects in which they apply the technique also don’t look very attractive: with Bluetooth and wireless remotes, changing channels is much easier than making gestures, and it is quite possible that a slight unintended occlusion could trigger a channel change. I also believe that voice technology is now far superior for this purpose. The eye-blink-based entry project was also something I did not like. It is not very difficult to copy a blinking pattern, and making and remembering complex blinking patterns is no easy task. (It is torture for the eyes if you have an eye infection :)). With fingerprint biometrics and retinal signatures, establishments are more secure. 
I have experience of dealing with people in a workshop, and I know their motions are highly mechanized and measured to meet fast manufacturing requirements, but there are still many unintended motions (after all, they are human) which the presented system could interpret as gestures and raise the alarm. Also, it would be really troublesome to work in a workshop with accelerometers on your body, which could even affect efficiency.&lt;br /&gt;&lt;br /&gt;Well, there is not much more to say about the paper. If this toolkit is available for download somewhere, it would be helpful and good to have a look at it. Maybe it can save us from some hard-core programming.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-8821723460631179151?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/8821723460631179151/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=8821723460631179151' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8821723460631179151'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8821723460631179151'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/georgia-tech-gesture-toolkit-supporting.html' title='Georgia Tech Gesture Toolkit: Supporting Experiments in'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-4386392680404558150</id><published>2008-02-20T14:52:00.000-08:00</published><updated>2008-02-20T15:30:54.999-08:00</updated><title type='text'>Television Control by Hand Gestures</title><content type='html'>This paper presents a simple vision-based approach that uses the hand as a natural means of television control. The setup requires a camera on the television set, which keeps track of the hand gestures made by a user sitting at a particular place. When a gesture is observed, the system is triggered and an icon appears on the television to show that the hand has been detected. After that, an icon appears that can be moved by moving the hand. To select a control, the icon needs to be kept stationary, hovering over the control button. The gesture recognition system also works when the television is switched off, and detection of the gesture switches on the control mode.&lt;br /&gt;&lt;br /&gt;To detect the hand in the image, they use a normalized correlation measure between a hand template and the image frame. The location of the hand in the image frame is the region of maximum correlation. Either orientation information or pixel information can be used to measure the correlation; they found that orientation information works slightly better. To measure speed, they use derivative information. The stationary background is removed by simply keeping a running average of the image frames of the scene and subtracting it from the frames. 
This removes stationary objects from the image frame. To end the control mode, the user just has to close the hand.&lt;br /&gt;&lt;br /&gt;To save the computational cost of searching for the template over the complete image frame, the system finds the position of the best match of the current filter and then searches only the local region with the other templates to find the position and value of the filter giving the best correlation match.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;To avoid accidentally pressing a trigger, they use a threshold based on the time of inactivity. Their system is limited by its narrow field of view, which is 25 degrees when searching for the trigger and 15 degrees when tracking.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Discussion.&lt;br /&gt;&lt;br /&gt;This is a simple vision-based approach that tracks the hand and uses its motion to control the volume of the television set. Since they are using orientation, I believe that if someone holds the hand at a slant to the principal axis of the camera, template matching will be difficult, because the template has to match the shape of the object. Also, considering the complexity of current television controls, I believe such a remote control would be difficult to use, as it might be very tiring. 
Also, I cannot tell whether the distance from the television affects recognition, since the apparent size of the hand would change.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-4386392680404558150?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/4386392680404558150/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=4386392680404558150' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4386392680404558150'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4386392680404558150'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/television-control-by-hand-gestures.html' title='Television Control by Hand Gestures'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-4196408652431287139</id><published>2008-02-19T22:51:00.000-08:00</published><updated>2008-02-19T22:59:04.284-08:00</updated><title type='text'>3D Visual Detection of Correct NGT Production</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper presents a vision-based technique to recognize gestures of NGT (Dutch Sign Language). 
Many NGT signs contain periodic motion, e.g. repetitive rotation, or moving the hands up and down, left and right, or back and forth. As such, the motion must be captured in 3D. For this purpose, the user is confined to a specific region in front of a pair of stereo cameras with wide-angle lenses, with no obstruction between the cameras and the hands, so that the gestures are captured clearly. The complete setup and architecture of such a system is shown in the figure below.&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_3DcJ3hARvOY/R7vO5DWlL3I/AAAAAAAAARY/Wskhxn3DiXM/s1600-h/image2.bmp"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 382px; height: 267px;" src="http://3.bp.blogspot.com/_3DcJ3hARvOY/R7vO5DWlL3I/AAAAAAAAARY/Wskhxn3DiXM/s320/image2.bmp" alt="" id="BLOGGER_PHOTO_ID_5168952476885659506" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Since the hands are the region of interest, it is very important that their motion be tracked. The authors therefore propose a scheme that segments the skin out of the image frames using skin models trained on positive and negative skin examples specified by the users. Skin color is modeled by a 2D Gaussian perpendicular to the main direction of the distribution of the positive skin samples in RGB space, which is obtained by a sampling consensus method called RANSAC. To compensate for the uncertainty in the chrominance direction, the Mahalanobis distance of the color to the brightness is measured and divided by the color intensity. 
This provides a kind of normalized pixel intensity, which takes care of very bright regions on the skin caused by varied light sources and their different directions.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;In order to track the gestures, it is important to follow the detected blobs (head and hands) and their movement. As such, several frames per second are captured to record the sequence, and the blobs are tracked in each frame using a best-template-match approach. It is important, though, that the right blob be recognized as right and the left blob as left. Occlusion may also disrupt gesture recognition; to prevent this, the blobs are reinitialized if occlusion lasts for more than 20 frames. For reference, a synchronization point is selected for each, which is not considered for training. Various features, such as 2D angles, the tangent of the displacement between consecutive frames, and upper, lower and side angles, are extracted from each frame. &lt;span style=""&gt; &lt;/span&gt;The time signal formed by the measurement of the features is warped onto the reference point by Dynamic Time Warping, which yields a list of time correspondences between the new gesture and the reference sign. For classification, a Bayesian classifier based on the independent feature assumption is trained for each gesture. The classifier is trained on data of 120 different NGT signs performed by 70 different persons, using 7-fold cross-validation. Their system achieved 95% true positives, and its response time was just 50 ms, which allows real-time application.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This is a vision-based paper that throws in a lot of image processing, and the classification is basically recognition of the patterns created by the hands and head observed in the image frames. 
Since they are tracking blobs, I believe they have to specify the synchronization point for each person who is going to use the system. &lt;span style=""&gt; &lt;/span&gt;Also, if the head moves while gestures are performed, it might affect certain gestures, as the head may be confused with a hand blob. There should also be no object in the background that resembles a blob, as it could cause misrecognition. The system also requires the person to be constrained in an arrangement which may not be comfortable. I wonder whether they ran any user study of their system while collecting data. As far as the image processing is concerned, they have done a pretty good job of handling chromaticity changes by training the system under various illumination conditions and using a normalized chromaticity value for each color.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-4196408652431287139?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/4196408652431287139/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=4196408652431287139' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4196408652431287139'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4196408652431287139'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/real-time-locomotion-control-by-sensing_19.html' title='3D Visual Detection of Correct NGT Production'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_3DcJ3hARvOY/R7vO5DWlL3I/AAAAAAAAARY/Wskhxn3DiXM/s72-c/image2.bmp' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-8165089205797860974</id><published>2008-02-19T22:44:00.000-08:00</published><updated>2008-02-19T22:48:59.044-08:00</updated><title type='text'>Real-time Locomotion Control by Sensing Gloves</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_3DcJ3hARvOY/R7vNFjWlL2I/AAAAAAAAARQ/diMZcOXSk2E/s1600-h/image1.bmp"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://1.bp.blogspot.com/_3DcJ3hARvOY/R7vNFjWlL2I/AAAAAAAAARQ/diMZcOXSk2E/s320/image1.bmp" alt="" id="BLOGGER_PHOTO_ID_5168950492610768738" border="0" /&gt;&lt;/a&gt;&lt;span style=";font-family:&amp;quot;;" &gt;This paper presents a method for controlling virtual creatures by using the fingers of your hand to drive their natural gait. Using a P-5 glove, the author maps the response of certain sensors of the glove to the motion of the virtual creature. It is like making a stick man walk by moving your fingers on a table in a fashion resembling human gait. The two stages of the system are shown on the left.&lt;/span&gt;&lt;br /&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;          &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify; line-height: normal;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Since such a system requires calibration to match the finger movements with the walking pattern of the character, a calibration process forms the first stage. 
In the calibration process, the user mimics the motion of the on-screen character with the fingers. After this, the auto-correlation between the trajectories is calculated, which maps the topology of the finger movement to the movement of the virtual character. It is assumed that the character performs simple movements like walking, hopping, running and trotting.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify; line-height: normal;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;After calibration, the next step is the control process, in which&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:12;"  &gt; &lt;span style="font-size:100%;"&gt;the player performs a new movement with the hand; the corresponding motion of the character is then generated from the mapping function obtained during the calibration stage and displayed in real time. The user can change the movement of the fingers and the wrist to produce similar but different motions.&lt;o:p&gt;&lt;br /&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify; line-height: normal;"&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The system was tested with a CyberGlove and a Flock of Birds tracker, because the P-5 is sensitive to external noise and its 3D location sensor is not very sensitive when the hand is far from the tower. In their test, users were given 30 minutes to get familiar with the system and were then asked to do certain tasks using both the keyboard and the sensing gloves. 
It was observed that users had fewer collisions with the sensing gloves, though the time taken was longer.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify; line-height: normal;"&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;    &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify; line-height: normal;"&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Discussion:&lt;br /&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify; line-height: normal;"&gt;&lt;span style=";font-family:&amp;quot;;font-size:12;"  &gt;&lt;span style="font-size:100%;"&gt;This is a simple mapping of finger movements to the movements of a simple stick-type character in the virtual world. I can understand that the users took more time with the sensing gloves, as most of us are more familiar with the keyboard and would obviously be faster with it than with the gloves. Also, working with gloves can be tiresome: with a keyboard, pressing one key does the task, while here we have to mimic each and every step with hand movements. In zigzag, maze-like movements requiring sudden turns, it could be really painful. However, overall it is a nice application for small games. 
It could also be useful for small virtual tours in 3D space.&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-8165089205797860974?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/8165089205797860974/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=8165089205797860974' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8165089205797860974'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8165089205797860974'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/real-time-locomotion-control-by-sensing.html' title='Real-time Locomotion Control by Sensing Gloves'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_3DcJ3hARvOY/R7vNFjWlL2I/AAAAAAAAARQ/diMZcOXSk2E/s72-c/image1.bmp' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-4524669123187199762</id><published>2008-02-17T23:28:00.001-08:00</published><updated>2008-02-17T23:28:31.467-08:00</updated><title type='text'>A Survey of Hand Posture and Gesture Recognition 
Techniques and Technology</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper is a survey of the hand posture and gesture recognition techniques used in the literature. Chapter 3 deals with the various algorithmic techniques that have been used over the years for this purpose. The author classifies the approaches in the literature into 3 major categories:&lt;/p&gt;  &lt;ol style="margin-top: 0in;" start="1" type="1"&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Feature Extraction, Statistical models.&lt;/li&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Learning Algorithms. &lt;/li&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Miscellaneous Algorithms.&lt;/li&gt;&lt;/ol&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;i style=""&gt;Feature Extraction, Statistical Models:&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This category of methods deals with extracting features, in the form of mathematical quantities, from the available data, which is captured through sensors (gloves) or images. It is further divided into subcategories such as:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Template Based&lt;/i&gt;&lt;/b&gt;:&lt;span style=""&gt;  &lt;/span&gt;In this approach, the data obtained is compared against some reference data and, using thresholds, is categorized as one of the gestures available in the reference data. 
This is a simple approach with little calibration, but it suffers from noise and doesn’t work with overlapping gestures.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Feature extraction:&lt;/i&gt;&lt;/b&gt; This approach extracts low-level information from the data and combines it to produce high-level semantic feature information, which can be used to classify gestures/postures. The methods in this subcategory usually deal with capturing the changes and measuring certain qualities during those changes. The collection of these values is used to label a posture, which can subsequently be extended to a gesture. However, this method suffers from heavy computational cost, and it fails unless a gesture is framed by a specific sequence.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Active shape Models:&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Active shape models deal with image-based gesture recognition systems in which a contour, roughly the shape of the feature to be tracked, is placed on the image. The contour is then manipulated by moving it iteratively towards nearby edges, which deforms the contour to fit the feature. This suffers from the drawback that it can capture only those gestures that can be performed with open-hand postures. Also, there is very little work in this direction. 
However, with a limited set of gestures meeting the open-hand criterion, this method has been found to work in real time. Also, stereo cameras cannot be used.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Principal component analysis:&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This method is basically a dimension reduction method in which the significant eigenvectors (ranked by their eigenvalues) are used to project the data. This approach captures the significant variability in the data and can thus be used to identify gestures and postures in a vision-based system. Although this method could also be applied to glove-based approaches, at that time (1999) it had been exploited only in vision-based techniques. A drawback of this method is that there should be variance in at least one direction: if the variance is uniformly distributed in the data, it will not yield relevant principal vectors. Also, if there is noise, PCA would treat it as significant variation too. Besides, this method is sensitive to scaling in hand size and position, which can be taken care of by normalization. 
Even then, this method is user dependent.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Linear finger Tip model:&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This method requires special markers on the fingertips, whose motion is then segmented from the scene image. This motion is analyzed for the possible gesture. This method works well for simple gestures and deals with the initial and final positions with good recognition. However, the system cannot work in real time and recognizes only a small set of gestures, due to the limited possible finger motions. Also, curvilinear motions are not taken into account.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Cause analysis&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This method is based on the interaction of humans with the environment and on capturing the body kinematics and dynamics. It also suffers from limited gesture sets, and no orientation information can be used. Besides, this system cannot work in real time. 
&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;Learning Algorithms&lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;These are machine learning algorithms that learn the gesture through data manipulation and weight assignment. The popular techniques in this subcategory are:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Neural Networks:&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This method is based on modeling the element of the human nervous system called the neuron and its interaction with other neurons to transfer information. Each node (neuron) consists of an input function, which computes the weighted sum, and an activation function, which generates the response based on the weighted sum. There are two types of NN, feed-forward and recurrent. Methods based on this approach have to deal with the heavy offline training and computation cost. Also, for complex systems, such a model could be very complex. 
Also, &lt;span style=""&gt; &lt;/span&gt;addition of each new gesture/posture requires complete retraining of the network.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Hidden Markov Model:&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This method has been widely exploited for temporal gesture recognition. An HMM consists of states and state transitions with observation probabilities. A separate HMM is trained for each gesture, and a gesture is recognized as the one whose HMM generates the maximum probability. This method also suffers from the training time involved and from its complex working, as the results are hard to predict because of the hidden nature of the model. For gesture recognition, the Bakis HMM is commonly used.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Instance based learning:&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;An instance is the vector of features of the entity to be classified. These techniques involve computing the distance between the given data vector and the instances in the database. 
This method is very expensive: to recognize instances we need to maintain a large database, and the computation has to be performed for each instance to be recognized, even when a previously seen instance is presented again.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;Miscellaneous Techniques: &lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;The Linguistic Approach&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This method uses a formal grammar to represent hand gestures and postures, although only a limited set. It involves simple gestures requiring the fingers to be extended in various configurations, which are mapped to the formal grammar specified by specific tokens and rules. The system involves a tracker and a glove. 
This system has poor accuracy and a very limited gesture set.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Appearance based models:&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;These models are based on the observation that humans are able to recognize fine actions from very low-resolution images with little or no information about the 3D nature of the scene. These methods measure regional statistics of a particular region of the image, based on the intensity values of the region. This method is simple but is unable to capture fine details in the gesture.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Spatio-temporal vector Analysis&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This method tracks the movement of the hand through the sequence of images of the scene. The information about the motion is obtained from derivatives, and it is assumed that, under a static background, the hand is the fastest-changing object in the scene. The flow field is then refined using refinement and variance constraints. 
This flow field captures the characteristics of the given gesture.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;i style=""&gt;Application of the method:&lt;o:p&gt;&lt;/o:p&gt;&lt;/i&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This section introduces applications of postures and gestures in various domains, such as:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;i style=""&gt;Sign language&lt;/i&gt;: where accuracies as high as 90% have been obtained under some constraints.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;i style=""&gt;Gesture to speech&lt;/i&gt;: &lt;span style=""&gt; &lt;/span&gt;in which hand gestures are converted to speech.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;i style=""&gt;Presentations&lt;/i&gt;: &lt;span style=""&gt; &lt;/span&gt;Hand motion and gestures are used to generate presentations. 
&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;i style=""&gt;Multimodal interaction&lt;/i&gt;: Hand gestures and motion are incorporated along with speech to generate better user interfaces.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Human Robot interaction: Hand gestures are used as a natural mode to control robots.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Other domains include virtual environment interaction, 3D modeling in virtual environments, and television control.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper presents a survey of the various approaches used for gesture recognition, neatly classified into three approaches, each further elaborated with the methods it involves. Most of the paper deals with work done prior to 1999, and the advancements of the past 9 years are definitely worth exploring, considering the advances in computational power, sensors, gloves, vision capturing devices and human touch interfaces. It was interesting to note that the problems with the techniques that we discuss in class have been known to the research community for the last 9 years, yet only a few have been resolved, and those only partially. 
This is mainly because of the complexity of the gestures themselves and the absence of any robust segmentation approach.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/4524669123187199762/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=4524669123187199762' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4524669123187199762'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4524669123187199762'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/survey-of-hand-posture-and-gesture.html' title='A Survey of Hand Posture and Gesture Recognition Techniques and Technology'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-5128618322164138062</id><published>2008-02-17T20:14:00.001-08:00</published><updated>2008-02-17T20:21:43.962-08:00</updated><title type='text'>Dynamic Gesture Recognition System for Korean Sign language</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;In this paper the authors describe a method based on their proposed fuzzy min-max neural networks. Given the variety of possible gestures in KSL, the authors restrict themselves to 25 gestures which, according to them, are the most common and basic ones. The sensory information about the gestures is obtained from a data glove that generates 16 responses, which are further reduced to capture only the directional changes in the postures. Based on their data, they define 10 basic direction types that capture this directional change information. To reduce data processing time and filter the data effectively, the x and y ranges are divided into 8 regions (based on the deviation they observed in these directions), which form the local coordinate system. The directional information is stored in 5 cascading registers: for each time unit, a register records “+” for right/upper motion, “-” for left/lower motion, and “x” for a don't-care position. 
The change in direction is determined by comparing the previous and new positions, as given by the values in these registers.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;According to the authors, the 25 gestures involve 14 postures, which are recognized by the fuzzy min-max neural network. This network requires no pre-learning of the postures and can adapt online.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;For recognition, the raw data generated by a gesture is fed to the system, which transforms it into a much smaller data set that is then used to identify the direction class. Once the direction class is recognized, the posture recognition method is used to identify the gesture.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The complete system is represented by the figure shown below:&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_3DcJ3hARvOY/R7kHkzWlL1I/AAAAAAAAAQw/We7qyROUkQ8/s1600-h/pap.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://4.bp.blogspot.com/_3DcJ3hARvOY/R7kHkzWlL1I/AAAAAAAAAQw/We7qyROUkQ8/s320/pap.jpg" alt="" id="BLOGGER_PHOTO_ID_5168170376225959762" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  
&lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;I do not have much to say about this paper because I have no clue what the authors wanted to convey by presenting the fuzzy min-max neural network, which looks like a simple template matching system based on direction segmentation of the postures. Apart from the direction values used to classify the gestures, nothing is impressive. Also, the 25 gestures and 14 postures they consider look simple, involving just finger movements (no yaw or pitch). Another flaw is that the paper makes no mention of a user study: the min-max values derived from the data may be very user specific, and the complete setup might require retuning with new data. As for the results, reporting “about 85%” is vague, as it is not clear whether the actual figure is close to 85%, above it, or below it. It would have been much better if they had conducted several experiments and reported an exact average classification rate. It would also have been nice if they had spent some time explaining their min-max NN, which looks very confusing to me in the diagram presented.&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/5128618322164138062/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=5128618322164138062' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5128618322164138062'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5128618322164138062'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/dynamic-gesture-recognition-system-for_17.html' title='Dynamic Gesture Recognition System for Korean Sign 
language'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_3DcJ3hARvOY/R7kHkzWlL1I/AAAAAAAAAQw/We7qyROUkQ8/s72-c/pap.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-3218740916884325735</id><published>2008-02-13T14:12:00.000-08:00</published><updated>2008-02-13T14:14:32.241-08:00</updated><title type='text'>Shape your Imagination: Iconic Gestural Based Interaction</title><content 
type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper is essentially an observational study in which computer scientists observe the iconic gestures of people from different educational backgrounds, testing the hypothesis that iconic gestures can serve as a natural and intuitive HCI technique for conveying spatial information. The subjects, 5 males and 7 females aged 21 to 31, came from a variety of educational backgrounds: languages, science, politics, nutrition and health, and library.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Subjects were seated in a quiet room, presented with the name of a shape or an object, and asked to convey it non-verbally using hand gestures. The primitive shapes chosen were circle, triangle, square, cube, cylinder, sphere and pyramid. The complex shapes chosen were football, chair, French baguette, table, vase, car, house and table lamp. These shapes were presented to the subjects in order.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The study found that subjects preferred to use two hands to draw a virtual description (e.g. boundary tracing) for the primitive shapes as well as some simple 2D shapes. For the complex shapes, they tended to use iconic gestures. Subjects also used pantomimic, deictic and body gestures, sometimes in conjunction with and sometimes in place of the iconic gesture. Some of the items (mostly complex 3D ones) proved too complex for the users. 
Subjects also took more time to come up with representative gestures for complex items.&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;The paper is pretty straightforward. It is an observational study in which some commonly understood and observed facts were studied at some expense. It would have been more interesting if the subjects had been asked to communicate ideas or sentences using gestures rather than just objects.&lt;br /&gt;&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/3218740916884325735/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=3218740916884325735' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3218740916884325735'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3218740916884325735'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/shape-your-imagination-iconic-gestural.html' title='Shape your Imagination: Iconic Gestural Based Interaction'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-8685339872284904037</id><published>2008-02-13T12:39:00.000-08:00</published><updated>2008-02-13T12:44:02.610-08:00</updated><title type='text'>Simultaneous Gesture Segmentation and Recognition based on Forward Spotting Accumulative HMMs</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;This paper presents a method based on a forward spotting scheme that performs gesture segmentation and recognition using sliding windows and moving average HMMs. The sliding window computes the observation probability of gesture or non-gesture from a number of consecutive observations within the window, and thus captures the dynamics of the gesture without any abrupt start or end. Using the forward procedure, the averaged conditional probability is calculated for each observation window, and the HMMs are trained using these probability measures and transitions. After the required gestures have been trained, a non-gesture model is also trained, which accounts for everything that is not classified as a gesture. 
This model is trained in the same way as the gesture models, but using non-gesture data as input.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;To segment the gesture, the authors argue that since a gesture consists of a number of postures, we can &lt;i style=""&gt;segment&lt;/i&gt; it by dividing it into postures and then looking at the cumulative probability over the window. According to the authors, for a sequence of gestural postures the cumulative probability of the sequence will be higher than the non-gesture probability, and for a sequence of postures from a non-gesture it will be lower than the non-gesture probability. Since the sliding windows are small, we can locate the state at which the probability dropped below the non-gesture probability and use that state for segmentation.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;The complete system has a hierarchical structure in which all gesture HMMs are grouped in one class and there is a separate class for the non-gesture HMM. An input posture sequence is fed both to the gesture HMM class and to the non-gesture class. 
The complete structure is shown in the figure below.&lt;/span&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_3DcJ3hARvOY/R7NWSzWlLyI/AAAAAAAAAQY/Efz5ueGPDHg/s1600-h/diagram.PNG"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://2.bp.blogspot.com/_3DcJ3hARvOY/R7NWSzWlLyI/AAAAAAAAAQY/Efz5ueGPDHg/s320/diagram.PNG" alt="" id="BLOGGER_PHOTO_ID_5166568078546644770" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;It is similar to a standard HMM setup, except that a non-gesture HMM is trained as well; it gives a higher probability for non-gestures, whereas the gesture-trained HMMs give higher probabilities for their respective gestures.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Discussion:&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;I liked the approach of dividing the gestures into postures and then using the sequence of postures over small windows to determine the segmentation spot. However, I was disappointed by their simple observation sequences, as each of their gestures was far more distinct and simple than what we are aiming at. Besides, the computational time of this approach is also questionable, as it doesn’t look like a real-time approach to me if we are looking at fine postures to frame a gesture. 
However, I believe that for very distinct gestures like theirs we can use low-resolution images, which may work close to real time. I believe the system is also very user dependent, as non-gestural movements may vary from person to person, and some non-gestural movements of one person may correspond to gestural movements of another (I mean, it is not uncommon for people to do even stretching in different ways).&lt;/span&gt;&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/8685339872284904037/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=8685339872284904037' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8685339872284904037'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/8685339872284904037'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/simultaneous-gesture-segmentation-and.html' title='Simultaneous Gesture Segmentation and Recognition based on Forward Spotting Accumulative HMMs'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_3DcJ3hARvOY/R7NWSzWlLyI/AAAAAAAAAQY/Efz5ueGPDHg/s72-c/diagram.PNG' height='72' 
width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-1048732678162188472</id><published>2008-02-13T11:31:00.001-08:00</published><updated>2008-02-13T11:31:50.506-08:00</updated><title type='text'>A Survey of POMDP Applications</title><content type='html'>&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;This paper was an introduction to a new stochastic method called POMDP which stands for “Partially observable Markov Decision Processes”. This paper presents the wide application domains of the method and its applications in various processes that we observe almost daily.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;This model takes into account the uncertainty in any decision process and extends the already prevailing MDP (Markov Decision process).The POMDP model consists of the following:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;br /&gt;a finite set of states -&lt;b style=""&gt;S&lt;/b&gt;&lt;br /&gt;a finite set of actions -&lt;b style=""&gt;A&lt;/b&gt;&lt;br /&gt;a finite set of observations- &lt;b style=""&gt;Z &lt;/b&gt;&lt;br /&gt;a state transition function -&lt;b style=""&gt;τ &lt;/b&gt;&lt;br /&gt;an observation function -&lt;b style=""&gt;o&lt;/b&gt;&lt;br /&gt;an immediate reward function &lt;b style=""&gt;-r &lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;The states represent all the possible underlying states in the process which may not be even directly visible. 
The state transition function accounts for uncertainty by assigning a probability to each transition. The action set contains all control choices available at a particular time, whereas the observation set contains all possible observations available in a particular state. The reward function gives the immediate utility of performing an action in each of the underlying process states. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;The main objective of the model is to derive a control policy that yields the optimal decision from the limited information available. This is important because in many cases complete information about the process is either not available or very expensive to obtain. Thus, this approach minimizes the cost associated with the decision and also the computational complexity of the decision. The author demonstrates this objective with examples of POMDP applications in various domains, such as structural inspection, elevator control policies, fisheries, autonomous robots, network troubleshooting, behavioral ecology, machine vision, distributed database queries, marketing, questionnaire design, machine maintenance, weapon allocation, corporate policy, moving target search, search and rescue, target identification, education, and medical diagnosis. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Since the model depends heavily on detailed partial information about the process, the author warns that it may not be very useful if that information is unavailable. 
In other words, the model requires us to specify every possible observation and the immediate reward for each state, action and observation. This information may not always be available for every application domain. The other important problem highlighted by the author is the user interface issue, which concerns how all this information can be provided to a system. Apart from this, there are also issues such as computational complexity.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Discussion:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;This paper provided clear motivation for the use of POMDPs in a variety of domains, some of which seemed interesting. 
I was impressed by the way the author highlighted all the information available from a given process and framed it as requirements for the POMDP. As far as implementation details are concerned, there was almost nothing in the paper, which was a bit disappointing, but keeping in mind the many non-engineers who deal with uncertainty (such as ecologists and fishery managers), this is a nice overview. I believe POMDPs have good applications in many search and limited-information navigation systems, as the system is capable of deciding based on the available information. Keeping in mind our goals, I believe the system could be useful if we have a limited posture sequence forming a distinct gesture and the library of such gestures is small. Otherwise, it is computationally very expensive and we may not be able to provide all the information required to model a decision. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-1048732678162188472?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/1048732678162188472/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=1048732678162188472' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/1048732678162188472'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/1048732678162188472'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/survey-of-pomdp-applications.html' title='A Survey of POMDP Applications'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-6318436328168868548</id><published>2008-02-06T15:35:00.000-08:00</published><updated>2008-02-06T15:36:03.949-08:00</updated><title type='text'>A Similarity Measure for Motion Stream Segmentation and Recognition</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper presents an SVD (Singular Value Decomposition) approach for segmenting and recognizing similar gestures, based on a similarity measure that the authors introduce. &lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Any motion capture device records sensor readings sampled over time. In the resulting data matrix, the readings from each sensor occupy the columns, whereas the time samples occupy the rows. 
Hence, even for nearly identical gestures, it is very difficult to obtain the same number of time samples, because of variations in even similar motions. As a result, the matrices for the same motion usually differ in dimension. Also, given the large amount of data in a gesture matrix, it is hard to arrive directly at a measure that captures the structure of the data and can be exploited to distinguish one gesture from others. &lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper presents an approach based on principal component analysis, which has been widely used in machine learning and pattern recognition to capture the components of maximum variance in the data. These components are known as the principal components (PCs), and by projecting the data onto them as a basis, we can obtain the discriminatory structure of the data. Usually only the first few PCs are enough for discrimination purposes. Ideally, the principal components of similar gestures should be nearly parallel, but because of variations, they are not exactly so. It is argued that for similar gestures, corresponding principal components (eigenvectors) should contribute equally to the parallelness. The authors also suggest that, since the eigenvalues are related to the variances along the principal components (eigenvectors), they can be used to give different weights to the different eigenvector pairs. 
&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Based on these observations, the authors define a similarity measure called &lt;i style=""&gt;k&lt;/i&gt;-Weighted Angular Similarity, or &lt;i style=""&gt;k&lt;/i&gt;-WAS, which captures the similarity between two data matrices, based on a kind of average of the normalized eigenvalues of the two matrices and the contribution of each eigenvector pair to the parallelness. Numerically, for data matrices Q and P whose i-th eigenvalues are σ&lt;sub&gt;i&lt;/sub&gt; and λ&lt;sub&gt;i&lt;/sub&gt; and whose corresponding eigenvectors are u&lt;sub&gt;i&lt;/sub&gt; and v&lt;sub&gt;i&lt;/sub&gt;, it is given by&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: center;" align="center"&gt;Ψ(Q, P) = ½ Σ&lt;sub&gt;i=1..k&lt;/sub&gt; (σ&lt;sub&gt;i&lt;/sub&gt; / Σ&lt;sub&gt;j&lt;/sub&gt; σ&lt;sub&gt;j&lt;/sub&gt; + λ&lt;sub&gt;i&lt;/sub&gt; / Σ&lt;sub&gt;j&lt;/sub&gt; λ&lt;sub&gt;j&lt;/sub&gt;) |u&lt;sub&gt;i&lt;/sub&gt; · v&lt;sub&gt;i&lt;/sub&gt;|&lt;/p&gt;  &lt;p class="MsoNormal" 
style="text-align: justify;"&gt;To evaluate their similarity measure, the authors tested it on data from two capture methods: the CyberGlove and the video-based motion capture system called VICON. They observed that for the CyberGlove just 3 eigenvector pairs (out of 22 possible) were sufficient for accuracy, whereas for the motion capture system 5 out of a possible 54 eigenvector pairs were sufficient.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;For comparison, the authors present their results against existing measures such as MAS and Eros, both of which were outperformed by &lt;i style=""&gt;k&lt;/i&gt;-WAS.&lt;/p&gt;&lt;br /&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;br /&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;  &lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;I liked the paper as it was pretty straightforward with good results. I liked the fact that they were able to capture the variance using the weighted PCA approach, which is a popular and simple approach for data analysis. Their weighted PCA is similar to many existing works in other domains, though it differs from many works in the field of motion recognition, which normally rely on complex HMMs and a lot of training. Apart from this, their system is unsupervised, which makes it even more attractive. 
However, since they work with just the sensor values, they have removed the temporal nature of the gestures from consideration; hence, it is quite possible that two gestures that produce the same sensor values at different times would be considered similar. I was impressed that they identified this problem with their approach and suggested dynamic time warping, though I believe they have not implemented it.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;As far as my knowledge of PCA goes, I believe it is very sensitive to noise, and noisy data may cause large variation between similar-looking gestures, affecting a similarity measure based on PCA. Also, because the variance information is exploited, if the gestures don’t vary a lot (which means the information is not contained in the variance), PCA may not be helpful. 
I believe using PCA with some temporal approach would be a nice contribution.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;    &lt;p class="MsoNormal" style="text-align: center;" align="center"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-6318436328168868548?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/6318436328168868548/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=6318436328168868548' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6318436328168868548'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6318436328168868548'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/similarity-measure-for-motion-stream.html' title='A Similarity Measure for Motion Stream Segmentation and Recognition'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-4848460750918515429</id><published>2008-02-06T14:00:00.000-08:00</published><updated>2008-02-06T14:06:06.259-08:00</updated><title type='text'>Cyber Composer: Hand Gesture-Driven Intelligent Music 
Composition and Generation</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper introduces a novel way of generating music from the hand motions and gestures of the user when real instruments are not available. To stay consistent with existing music theory, the system embeds that theory and is controlled by the movements of motion-sensing gloves. The paper provides a nice overview of the relevant music theory and of how it has been incorporated into the system.&lt;br /&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The system consists of a pair of CyberGloves, each fitted with a 3D Polhemus sensor to track its 3D position, a background music generation module, a melody generation module and the main program. Keeping in mind the industry standard for music, the authors use the MIDI format, which according to them allows multiple instrument sounds and expressions. Apart from this, the availability of APIs to control input and output was also an encouragement to use the MIDI format. &lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The sensor readings from the gloves are sent to the Cyber Composer main program. The main program transforms the signals into motion triggers, which are used as the inputs to the various modules of the system. 
These inputs trigger the corresponding events in the modules, which are routed to the musical interface connected to a standard MIDI output device (a Roland sound module in this case), which generates the music.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The authors have tried to keep the hand movements very intuitive, to suit a lay person as well as an expert musician. To make the system smooth to use, musical components were mapped to particular motions, keeping in mind their usage, frequency and flexibility. For example:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;ul style="margin-top: 0in;" type="disc"&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Rhythm is mapped to wrist motion in the right or left direction.&lt;/li&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Pitch is mapped to the height of the right hand above the ground, relative to the last note.&lt;/li&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Dynamics of the melody note are controlled by the flexion of the right-hand fingers.&lt;/li&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Volume is controlled by the extent of extension of the right-hand fingers.&lt;/li&gt;&lt;li class="MsoNormal" style="text-align: justify;"&gt;Cadence (end of the music) is triggered by bending the left-hand fingers completely.&lt;/li&gt;&lt;/ul&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;To add another instrument, the left hand is lifted higher than the right hand, which enables the dual-instrument mode, and the second instrument starts playing in harmony with the first.&lt;/p&gt;  &lt;p 
class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The paper raised high expectations at the start but later turned out to be a bit of a disappointment. There is no mention of implementation details. As per their description, the signals from the motions are used to trigger events, and any motion of the part of the hand corresponding to a musical component will trigger an event. For example, even if the wrist is moved slightly and unintentionally, an event will be triggered. There is no particular gesture associated, just the motion of the parts, which is not a good approach. There is no training involved, so I don’t understand how the motions of different users will correspond to the same rhythm and harmony. Also, the gesture for pitch increase is confusing: I am not able to understand how the musician will keep track of the relative position, and even if he does by some means, how he can control it in dual mode, as both hands are involved.  Besides, I don't understand whether this can be useful to musicians, as they often use many instruments to compose music and I don't see any way to add more than two instruments. Also, during some compositions, side music is added which may not be in exact harmony with the major notes; I don't see any way to incorporate that effect. There are many flaws in the approach, which may end up in another paper, so I am stopping here. 
I believe the approach could be refined by using machine learning models and simpler, better gestures that are neither conflicting nor confusing.&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-4848460750918515429?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/4848460750918515429/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=4848460750918515429' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4848460750918515429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4848460750918515429'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/02/cyber-composer-hand-gesture-driven.html' title='Cyber Composer: Hand Gesture-Driven Intelligent Music Composition and Generation'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-6079114223570868462</id><published>2008-01-30T13:24:00.001-08:00</published><updated>2008-01-30T13:28:56.475-08:00</updated><title type='text'>Online Interactive Learning of Gestures for Human/ Robot Interfaces</title><content type='html'>&lt;b style=""&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;In this paper, the authors present an interesting gesture recognition system that can be trained on the fly. Inspired by the human ability to learn new gestures by watching a human teacher, the authors try to develop a similar system, in which a human teacher can teach a robot a new gesture on the fly rather than depending on time-consuming offline training. According to the authors, even 2 examples were sufficient to teach a new gesture to the recognition system. This is quite close to the human way of teaching and interacting. The authors demonstrate the approach by developing a gesture recognition system that can recognize 14 letters of the sign language alphabet. For this system, they used an 18-sensor CyberGlove as the feature capture device. They chose the 14 letters such that there is little ambiguity between them, and they avoided the use of 6D Polhemus sensors, which provide orientation and position information.  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Keeping in mind the strength of temporal learning exhibited by HMMs and the highly stochastic behavior of human gestures, the authors use HMMs as the gesture classifiers. 
For computational simplicity, the authors use discrete HMMs; to that end, they apply a preprocessing technique in which the continuous stream of data is discretized by vector quantization of a series of short-time FFTs, an approach popular in the speech recognition community.&lt;/p&gt;    &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;br /&gt;This approach is basically a sampling technique that requires a windowing function to capture the prominent (dominant) frequencies in each window and use them as the features for training the HMMs. The vector quantizer then encodes each feature vector, and a codebook is formed in which each vector is represented by a single index. When a new gesture is provided, its features are matched against the codebook and assigned to the codebook entry with the lowest distance in the least-squares sense. The interesting part of such clustering (in the spectral domain) is that it is not task-specific and can easily be applied to many other recognition domains where features are consistent over time, such as handwriting recognition and face recognition.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;As far as the implementation is concerned, they use a modified form of HMM called the Bakis HMM, in which a state can transition only to itself or to one of the next two states. This ensures that the gestures to be classified are simple sequences of motions and non-cyclical in nature.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;To verify their classification rates, the authors use a confidence measure that captures the misclassification rate. 
Their results are very impressive: in one sample trial, with just 2 examples, they had a 1% error rate, which dropped significantly to 0.1% after 4 examples. In another sample trial, the error rate dropped from 2.4% with 2 training examples to 0 after 6 examples.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper presents a new approach to dealing with gestures when we need to train a system on the fly. The method is straightforward and is pretty accurate with little online training. I liked their approach of relating the HMMs used to model speech to those used for gestures. I believe it makes sense, considering that both are temporal in nature. However, in the spectral domain there is a good chance of added noise, which may lead to the wrong frequencies being selected during sampling. Windowing also brings another problem, called leakage. Ideally, after the FFT, the spectrum of a signal component at frequency ω should be non-zero only at ±ω; however, windowing produces non-zero values (of significant magnitude) at frequencies close to ±ω and small non-zero values at frequencies farther from ±ω. This leads to unwanted interference, which may affect the waveform and the spectral information. 
I am not sure whether this effect would influence the formation of the input vector obtained from the preprocessor. &lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Also, I believe the introduction of an acceleration-based method to segment gestures is a nice way of dealing with the complexities of natural interaction, as it would be very inconvenient and unnatural for people to pause between gestures.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Overall, I liked the approach and the insight it provided, as I was not aware of spectral HMMs being used in any domain other than speech.&lt;/p&gt;&lt;br /&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="text-align: justify;"&gt;(FYI: A good reference for spectral analysis of speech (&lt;a href="http://www.cs.dartmouth.edu/%7Edwagn/aiproj/speech.html"&gt;Click here&lt;/a&gt;))&lt;br /&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-6079114223570868462?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/6079114223570868462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=6079114223570868462' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6079114223570868462'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/6079114223570868462'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/01/online-interactive-learning-of-gestures.html' title='Online Interactive Learning of Gestures for Human/ Robot Interfaces'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-4917069231037823129</id><published>2008-01-28T18:02:00.000-08:00</published><updated>2008-01-28T18:03:22.035-08:00</updated><title type='text'>HoloSketch: A virtual reality Sketching/Animation Tool</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;Considering the technology of 1995, when it was published, this paper presents a nice method of extending 2D drawing to 3D. Users manipulate the virtual world, which they view on a large CRT monitor, using a hand-held, mouse-like device that the authors call a wand. The wand has 3 top buttons and one side button, along with a protruding rod that acts as a cursor. To make objects in the virtual world look close to naturally viewed objects, the computer calculates a new viewing matrix separately for each eye, and the refresh rate is kept at about 13 Hz or higher. 
This ensures that a high-resolution, close-to-reality image is observed in the virtual world.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;In extending 2D menus to 3D in the virtual world, certain modifications had to be made to prevent occlusion and to account for Fitts’s law, since the wand has to travel a greater distance in 3D than a cursor does in 2D. To cope with these problems, the authors proposed a large pop-up menu that fades to give way to other menus. All of this is manipulated using the wand.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;HoloSketch offers many features for users to explore. In drawing mode, users can draw objects such as cylinders, cones and rings, and can create another instance of an object using the left button; while that button is held down, the object can be manipulated in 3D space. An object is selected for editing by placing the wand tip on it and pressing the middle button. To make movement in 3D more intuitive, the side button gives the user a grasp-like feeling while working with the object in 3D. To prevent accidental button presses, the user must simultaneously press additional keyboard keys. HoloSketch supports three movements: rotation, orientation, and rotation plus orientation. These are also bound to designated keyboard keys. 
The authors highlight the need to deal with jitter and the noise it introduces, and propose a kind of 10X reduction mode that provides finer control. An interesting part of HoloSketch is its elementary animation: rotation, angle, movement between positions, scaling and oscillation.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;Discussion:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;This paper presents a nice method for sketching in 3D and adding some animation to the creations. It is interesting that the system gives close to real-time feedback, since it recalculates the viewing matrix for each head movement. For 1995 hardware this is indeed a good step; however, the reliance on a CRT monitor does not address use of the system in real-world scenarios. Virtual 3D projection technology combined with augmented reality glasses would be a nice addition to the interface. I also believe there should be an erase mode in which the user can erase a small part of the sketch, since doing so in 3D will be challenging when parts of the sketch are occluded. I also wondered what the 10X mode actually is, as the paper gives no reference for it. 
However, I believe that for a device like the wand, taking care of jitter was a good approach, though I was unable to fully understand it.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-4917069231037823129?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/4917069231037823129/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=4917069231037823129' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4917069231037823129'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/4917069231037823129'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/01/holosketch-virtual-reality.html' title='HoloSketch: A virtual reality Sketching/Animation Tool'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7309552700493120984</id><published>2008-01-27T16:53:00.000-08:00</published><updated>2008-01-27T16:54:00.789-08:00</updated><title 
type='text'>An architecture for gesture based control for mobile robots</title><content type='html'>&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;This paper presents an interesting method for controlling a mobile robot with hand gestures. The authors stress that it is very important for the robot to interpret the meaning of an action rather than just imitate it. Hand gestures are described in the paper as a natural and rich input modality for interacting with the robot.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;The complete system consists of a mobile robot, a CyberGlove, a Polhemus 6DOF position sensor and a geo-location sensor that tracks the position and orientation of the mobile robot. In addition, there are two servers, the Geo-Location server and the Gesture Recognition server, which communicate with each other. The task of the Geo-Location server is to keep track of the position and direction of the mobile robot in a 3 by 6 universal coordinate system. The role of the Gesture Recognition server is to interpret the gestures of the user, which are captured using the CyberGlove and the Polhemus sensor, and then provide the interpretation to the robot so that it can act on the input. All the components of the system are integrated within CyberRAVE, a multi-architecture robot positioning system for distributed robots, and the servers communicate using the CyberRAVE interface. To recognize the gestures, the authors use an HMM approach in order to take advantage of the temporal nature of the gestures. 
Instead of using the raw readings from all 18 sensors, they condensed the feature vector from 18 dimensions to 10 features by linearly combining certain responses. This feature vector is then augmented with the first derivative of the 10-dimensional feature vector to obtain a 20-dimensional column vector, which is reduced to a one-dimensional codeword using vector quantization. After examining the level of detail required for correct interpretation of the actions, the authors chose 32 codewords. The codebook for these codewords is trained offline; they experimented with 5000 measurements which, according to them, captured all the possible samples of gestures and non-gestures covering the entire span of the hand space. This set is then partitioned into 32 final clusters, and the centroid of each cluster forms the codeword for the gestures in that cluster. These 32 codewords are then used to define the 6 final gestures, each as a sequence of codewords. 
The selected gestures are:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Opening: Moving from a closed fist to a flat open hand.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Opened: Flat open hand.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Closing: Moving from a flat open hand to a closed fist.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Pointing: Moving from a flat open hand to index finger pointing, or from a closed fist to index finger pointing.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Waving Left: Fingers extended and waving left.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;Waving Right: Fingers extended and waving right.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" 
style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;HMMs, being learning-based models, are bound to converge to some gesture interpretation even if the input does not imply anything. To prevent this, an additional state called the wait state is introduced; it is the node state, with an equal transition probability to every gesture model and to itself. As each observation is made, the probability of being in each existing state is updated, and the probabilities are normalized to sum to 1. This model ensures that for all unidentified gestures, the probability of being in the wait state is the maximum. Thus, unwanted gestures are prevented from being recognized as one of the selected gestures. For a correct gesture, the model that represents that gesture yields the highest probability and is therefore selected as the interpretation of the gesture.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;&lt;span style="font-weight: bold;"&gt;Discussion:&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 12pt; line-height: 115%; font-family: &amp;quot;Times New Roman&amp;quot;,&amp;quot;serif&amp;quot;;"&gt;This paper presents a beautiful approach to using hand gestures as the mode of communication with a mobile robot. 
We humans use our hands for communication when words fail to convey the meaning, and building on the same idea, the authors have developed a beautiful way of controlling the actions of the robot. Though the communication takes place in a controlled environment, I consider it a good step towards more complex systems. I believe that, instead of using the CCD camera for geo-location, GPS could be used along with an on-board magnetometer and gyroscope to convey the position and orientation information to the server, which can communicate with the gesture interpretation server. This would give the robot more mobility as well as independence from the controlled environment. Also, on-board stereo cameras, along with IR and ultrasonic sensors, could be used for controlling the local motion of the robot. I believe an additional joystick could be used with the other hand to switch between the two modes when required, and also to control the orientation of the proposed on-board stereo cameras.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7309552700493120984?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7309552700493120984/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7309552700493120984' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7309552700493120984'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7309552700493120984'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/01/architecture-for-gesture-based-control.html' 
title='An architecture for gesture based control for mobile robots'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-5179661772887201729</id><published>2008-01-23T19:16:00.001-08:00</published><updated>2008-01-23T19:20:05.372-08:00</updated><title type='text'></title><content type='html'>&lt;p class="MsoNormal" style="text-align: center;" align="center"&gt;&lt;b style=""&gt;An Introduction to Hidden Markov Models&lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;b style=""&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The author presents a beautiful short tutorial on one of the best time-series models used in machine learning, the hidden Markov model (HMM). Hidden Markov models aim to explain the process that produced a certain response without knowing the underlying procedure, just by looking at the observations. In other words, in an HMM the underlying process that generated a response is not known, but it is inferred from the sequence of responses generated. This is a stochastic process which, like a Markov chain, involves state transition probabilities; however, unlike in a Markov chain, a state is not tied to a single fixed outcome, and each state can generate any of the model’s possible observations with a certain probability. 
This observation probability distribution (which may be discrete or continuous) is what allows every state of an HMM to generate any possible observation in the process.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;An HMM is represented by:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;T = Duration of observation sequence &lt;/p&gt;  &lt;p class="MsoNormal"&gt;O = Observation sequence of T observations&lt;/p&gt;  &lt;p class="MsoNormal"&gt;M = number of observation symbols&lt;br /&gt;N = number of states &lt;/p&gt;  &lt;p class="MsoNormal"&gt;Q = set of N states&lt;br /&gt;V = set of M possible observations&lt;br /&gt;A = state transition probability&lt;br /&gt;B&lt;sub&gt;t&lt;/sub&gt;(i) = observation probability distribution for a given state ‘i’ at time t&lt;br /&gt;π = initial state distribution&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;br /&gt;Usually an HMM is represented as λ = (A, B, π).&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;After this basic introduction, the author introduces the three problems of HMMs. The first problem is finding the probability of a given observation sequence given the model λ, represented by P(O|λ), which is computed efficiently with the forward or backward procedure. The second problem is estimating the optimum state sequence, which is done by the Viterbi algorithm (an induction procedure in which the best sequence up to a given time t is tracked). The third problem is the estimation of the model parameters, which is done by the Baum-Welch algorithm, a form of the &lt;i style=""&gt;Expectation Maximization&lt;/i&gt; algorithm. 
(FYI: The third problem is considered the most difficult of the three.)&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;After discussing the problems, the author introduces the different structures of HMMs: ergodic models, in which every state can be reached from every other state, and non-ergodic models, in which constraints are placed on the transitions. A good example of a non-ergodic model is the left-right model, in which transitions proceed in one direction only, from left to right.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;The author also highlights the issues involved in training HMMs and warns that if a symbol never occurs in the training set, it is assigned zero probability, and the model may then fail. Sufficient training data should therefore be made available to deal with this issue.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;This is a classic paper on HMMs and a short form of the much more detailed ‘Tutorial on HMM’, which is one of my favorite papers. 
I don’t find any drawbacks in the method, except that HMMs, like most machine learning algorithms, depend heavily on training data; still, they have been shown to perform much better than other methods on time-series analysis problems.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-5179661772887201729?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/5179661772887201729/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=5179661772887201729' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5179661772887201729'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/5179661772887201729'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/01/introduction-to-hidden-markov-model_23.html' title=''/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-7237010150815738652</id><published>2008-01-23T18:09:00.001-08:00</published><updated>2008-01-23T18:09:33.589-08:00</updated><title type='text'>Environmental Technology: Making the Real World Virtual</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;Reading this paper brings to my 
mind some of the sci-fi movies in which machines and humans interact with each other, making the world more comfortable and machines more useful. The author talks in similar terms about integrating computers and their power into the daily life of human beings. He suggests that the interaction between machines and humans should be natural and comfortable, and he mentions Sutherland’s head-mounted display, which he rejected because it was not comfortable. To demonstrate his intention, he worked within a framework in which the environment interacts with the user based on the user’s actions. He also describes another of his systems, in which two geographically separated individuals were able to communicate naturally by superimposing the image of the user over computer graphics that were controlled by a data tablet. In another form of this communication system, the hand images of the users were superimposed on computer graphics so that they could use their hands to communicate as if they were sitting together. The author also mentions the use of projection displays to create a virtual world in which the subject can fly by leaning in the direction in which he would like to fly. Another interesting application in the paper is VIDEOPLACE, which enables users to interact with 2D projections of images in a special room. VIDEOPLACE was extended to VIDEODESK, where interaction occurs in three dimensions, which could be helpful in 3D sculpting. 
The paper also describes a system in which children were animated in a graphical scene so that they could learn concepts by experiencing them.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;    &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;br /&gt;Discussion&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;    &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;br /&gt;This paper presents the author’s various efforts, over his research career, to bring computers into the daily life of humans through interaction. I believe this is the common goal of computer-human interaction: to take existing technologies, improve them, and make them more usable for the masses. For me, this paper is a source of motivation, showing how simple concepts and ideas from existing technology can be adapted, keeping in mind psychology and perceptual knowledge about the human brain, to build more useful computer technology. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-7237010150815738652?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/7237010150815738652/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=7237010150815738652' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7237010150815738652'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/7237010150815738652'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/01/environmental-technology-making-real.html' title='Environmental Technology: Making the Real World Virtual'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-9202434854283855113</id><published>2008-01-23T18:07:00.000-08:00</published><updated>2008-01-23T18:08:49.431-08:00</updated><title type='text'>American Sign Language Finger Spelling Recognition System</title><content type='html'>&lt;h3 style="text-align: justify;"&gt;&lt;span style="font-size: 11pt; font-weight: normal;"&gt;This paper presents a simple way of recognizing the sign gestures which after being recognized can be used as an input to a speech engine or text editing software for speaking or displaying the alphabet. 
The authors used an 18-sensor CyberGlove to capture the sensor response for the sign language gesture of each letter, and trained a neural network (perceptron) to recognize the letter corresponding to the gesture. The training input is an 18x24 matrix representing the sensor responses for 24 letters (all except J and Z), paired with a 24x24 identity matrix representing the target letters. This input is used to train a neural network using the MATLAB toolbox, which is then integrated with LabVIEW (a product developed by National Instruments for real-time applications) for real-time use.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h3&gt;  &lt;h3 style="text-align: justify;"&gt;&lt;span style="font-size: 11pt; font-weight: normal;"&gt;To recognize a gesture, the user makes the sign language gesture corresponding to a letter, which is then fed to the neural network framework running in LabVIEW. This framework responds with a 1x24 matrix with a ‘1’ at the position of the letter that the neural network interprets the gesture to be.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h3&gt;      &lt;h3 style="text-align: justify;"&gt;&lt;span style="font-size: 11pt; font-weight: normal;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;span style="font-weight: bold;"&gt;Discussion:&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h3&gt;  &lt;h3 style="text-align: justify;"&gt;&lt;span style="font-size: 11pt; font-weight: normal;"&gt;This is a straightforward and simple approach to recognizing sign language. However, its limitation is that it is very user dependent: it works only if the network is trained on data obtained from the user who intends to use the system. Another drawback is the omission of ‘J’ and ‘Z’, which makes it incomplete. It would have been better if some other gestures had been used for J and Z, which could be taught to the intended users. 
Also, a neural network is very sensitive to noise, and it would have been better if the authors had tried adding artificial noise during training, which would have made the training more robust and might have increased accuracy. I feel that, besides the letters, there should also be gestures for deletion and break. The system could then recognize letters that form words, with the completion of a word marked by the break gesture, and feed the word to a speech generation system that speaks it. Before using break, the word could be edited using the delete gesture, which deletes a letter.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h3&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-9202434854283855113?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/9202434854283855113/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=9202434854283855113' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/9202434854283855113'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/9202434854283855113'/><link rel='alternate' type='text/html' href='http://pankaj-haptics.blogspot.com/2008/01/american-sign-language-finger-spelling.html' title='American Sign Language Finger Spelling Recognition System'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9183475114237580876.post-3087711945883758707</id><published>2008-01-23T18:04:00.000-08:00</published><updated>2008-01-23T18:07:39.944-08:00</updated><title type='text'>Flexible Gesture Recognition for Immersive Virtual Environments.</title><content type='html'>&lt;p class="MsoNormal" style="text-align: justify;"&gt;This paper presents a gesture recognition method for interacting with a virtual environment in 3D. Highlighting the drawbacks of current means of gesture recognition, the authors emphasize that it is best to have an interaction system that is natural and does not require special clothing, backgrounds or other location-based constraints for recognizing gestures. As a solution, they propose using inexpensive glove-based devices to interact with the virtual environment, emphasizing the role hands play even when interacting with the natural environment. The authors point out that although gloves are good for interaction, the problem lies in measuring their orientation and position in space. With the cheaper P5 glove (which the authors used), the position in space cannot be found accurately, as the position sensor is an infrared sensor based on reflection of IR radiation, while the professional glove has no location sensor and requires additional electromagnetic Flock of Birds trackers to locate the position of the glove in space. Apart from this, professional gloves require an additional wire that connects the glove to the device and sends back the sensor information. Though this advanced setup gives the best estimate of position, it is cumbersome to use and is affected by metallic surfaces in the vicinity of the usage area. 
Keeping these problems in mind, the authors use gestures based on flexion information, incorporate additional information about orientation when defining gestures, and eliminate gestures that require hand motion in space over time. Since there are always some unintended movements between gestures, they add a time constraint to deal with them. With this constraint, only the postures that are held static for some predetermined time are recorded, and the remaining non-gesture movements are discarded. This method helps in constructing complex gestures formed from multiple small gestures. Based on the discussion above, the authors define a gesture as a sequence of successive postures.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;Each gesture is recorded as a 5D vector of finger flexions, where each dimension corresponds to a sensor value received from the P5 glove, together with orientation information and a further value indicating the relevance of the orientation. To recognize the gestures, they built a gesture manager that holds a template for each gesture defined by the authors. The template takes the form of the response from the P5 finger sensors, framed as a 5D vector of the flexion values corresponding to the particular gesture. To deal with the variability in gestures (even by the same person), they use an average over several similar gestures by the person as the template for a gesture. Each gesture corresponds to an identity that can trigger an event. For recognition, the input value is obtained from the glove and compared against the templates using a distance metric. 
If the library contains a gesture whose distance to the input is the minimum and lies within the defined threshold, the orientation is then compared; if that is also within its defined threshold, the input gesture is recognized. Once the gesture is recognized, the identity associated with it triggers the associated event. If there is no match, no gesture is returned.&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify; font-weight: bold;"&gt;Discussion:&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify;"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-align: justify; font-family: times new roman;"&gt;&lt;span style="font-size:130%;"&gt;This is a simple paper presenting an elegant way of interacting with the virtual environment. I liked the way the authors explained the previous work on image-based gesture recognition and its problems, as well as the problems with glove-based gesture recognition. By relying only on flexion values and orientation they have reduced the complexity of the problem, yet the solution is still very effective for simple interactions. But the method has several weaknesses. First, since we are dealing only with orientations and flexion values, the infrared receiver tower and the hand must be positioned consistently for both training and testing: a change in either position adds error to the input, because different positions (usually at an angle to the receiver) add a different phase to the reflected infrared (IR) radiation, so a different response is recorded. Also, we have to be very close to the tower to get orientation feedback, as the IR responses are not very strong at larger distances. 
Secondly, I believe that few gestures look very different from each other when only the fingers are used, so there is a good chance of misclassification among similar-looking gestures. Another drawback of the paper is that the authors do not mention the distance metric they used. I believe simple Euclidean distance is not a good measure of similarity, as a large response from even a single sensor can produce a large overall distance even though all the other sensor responses are close to the template. A normalized distance measure may provide a better solution. Apart from this, the biggest drawback is that there is no information about performance results, the gestures they tried, or whether some gestures caused ambiguity. I would also be interested to know whether a fully user-independent way of interacting with the virtual environment is possible, using normalized sensor responses as input.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt; &lt;/o:p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9183475114237580876-3087711945883758707?l=pankaj-haptics.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pankaj-haptics.blogspot.com/feeds/3087711945883758707/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=9183475114237580876&amp;postID=3087711945883758707' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3087711945883758707'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9183475114237580876/posts/default/3087711945883758707'/><link rel='alternate' type='text/html' 
href='http://pankaj-haptics.blogspot.com/2008/01/flexible-gesture-recognition-for.html' title='Flexible Gesture Recognition for Immersive Virtual Environments.'/><author><name>Pankaj</name><uri>http://www.blogger.com/profile/08364023494991232400</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://bp3.blogger.com/_3DcJ3hARvOY/R5f2tGH-aeI/AAAAAAAAAPU/tlJZTBh_2jg/S220/Sai+Baba+004.JPG'/></author><thr:total>6</thr:total></entry></feed>
