Wednesday, January 23, 2008

An Introduction to Hidden Markov Models

The author presents a beautiful short tutorial on one of the best-known time-series models used in machine learning, the hidden Markov model (HMM). Hidden Markov models try to explain the process that produced a sequence of responses without direct knowledge of the underlying procedure, just by looking at the observations. In other words, in an HMM the underlying process that generated the responses is never observed; it is inferred from the sequence of responses it produced. An HMM is a stochastic process which, like a Markov chain, involves state transition probabilities; unlike a Markov chain, however, the current state is hidden, and each state can emit any of the possible observation symbols with some probability. Because every state carries its own observation probability distribution (which may be discrete or continuous), an HMM can assign a probability to any possible observation sequence in the process.

An HMM is represented by:

T = Duration of observation sequence

O = Observation sequence of T observations

M = number of observation symbols
N = number of states

Q = set of N states
V = set of M possible observations
A = state transition probability
B = observation probability distribution; b_j(k) is the probability of observing symbol v_k while in state j
π = initial state distribution


Usually an HMM is represented compactly as λ = (A, B, π).
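
To make the notation concrete, here is a minimal sketch (my own, not from the paper) of a toy two-state, three-symbol HMM in Python, with made-up numbers, that samples a hidden state sequence together with the observations it emits:

import numpy as np

# Toy model lambda = (A, B, pi) with N = 2 states and M = 3 symbols.
A = np.array([[0.7, 0.3],        # state transition probabilities a_ij
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],   # observation probabilities b_j(k)
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])        # initial state distribution

def sample(T, rng=np.random.default_rng(0)):
    """Generate T observations; the states stay hidden to an observer."""
    states, obs = [], []
    s = rng.choice(2, p=pi)              # pick the initial state from pi
    for _ in range(T):
        states.append(int(s))
        obs.append(int(rng.choice(3, p=B[s])))   # emit a symbol from state s
        s = rng.choice(2, p=A[s])                # move to the next state
    return states, obs

states, obs = sample(T=10)
print("hidden states:", states)
print("observations :", obs)

An outside observer would only ever see the second list; the first list is exactly the "hidden" part the model is named after.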

After the basic introduction to HMMs, the author presents the three classic problems of HMMs. The first problem is computing the probability of a given observation sequence under the model λ, written P(O | λ); this is solved efficiently by the forward (or backward) procedure, which is mainly an induction over time. The second problem is finding the optimum hidden state sequence, which is done by the Viterbi algorithm, an induction that keeps track of the best partial state sequence up to each time t. The third problem is estimating the model parameters themselves, which is done with the Baum-Welch algorithm, a form of the Expectation-Maximization algorithm. (FYI: the third problem is considered the most difficult problem in HMMs.)
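
As a rough illustration of the first two problems, here is a hedged sketch (again my own code, not the paper's) of the forward procedure for P(O | λ) and the Viterbi recursion for the most likely state path, using the same toy model as above:

import numpy as np

A  = np.array([[0.7, 0.3], [0.4, 0.6]])            # same toy model as above
B  = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
obs = [0, 2, 1, 2, 2]                               # some observed symbol indices

def forward(obs, A, B, pi):
    """Problem 1: P(O | lambda) via the forward (induction) procedure."""
    alpha = pi * B[:, obs[0]]              # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # induction step
    return alpha.sum()                     # termination

def viterbi(obs, A, B, pi):
    """Problem 2: most likely hidden state sequence (log domain)."""
    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    delta = logpi + logB[:, obs[0]]
    back = []
    for o in obs[1:]:
        scores = delta[:, None] + logA     # scores[i, j]: best path ending with i -> j
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + logB[:, o]
    path = [int(delta.argmax())]
    for ptr in reversed(back):             # backtrack the stored pointers
        path.append(int(ptr[path[-1]]))
    return list(reversed(path)), float(delta.max())

print("P(O | lambda) =", forward(obs, A, B, pi))
print("best path     =", viterbi(obs, A, B, pi)[0])

Baum-Welch (the third problem) is omitted here; it re-estimates A, B, and π from expected-count versions of these same forward/backward quantities.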

After discussing the problems, the author introduces the different structures of HMMs: the ergodic model, in which every state can be reached from every other state (so each state may be revisited), and non-ergodic models, in which constraints are put on the allowed transitions. A good example of a non-ergodic model is the left-right model, in which transitions only move in one direction (from left to right) through the states, as in the sketch below.
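
For example, a left-right structure can be imposed simply by constraining the transition matrix; a minimal sketch with made-up numbers:

import numpy as np

# Left-right (Bakis) model: a_ij = 0 for j < i, so a state is never revisited
# once it has been left; an upper-triangular transition matrix encodes this.
A_left_right = np.array([[0.6, 0.3, 0.1],
                         [0.0, 0.7, 0.3],
                         [0.0, 0.0, 1.0]])   # final (absorbing) state
pi_left_right = np.array([1.0, 0.0, 0.0])    # must start in the first state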

The author also highlights the issues in training HMMs and warns that if a symbol never occurs in the training set, its estimated probability will be zero, and the model may then fail on sequences containing that symbol. Sufficient (and representative) training data should therefore be made available to deal with this issue.
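
One simple (if crude) fix, which the paper does not prescribe, is to floor or add-one smooth the estimated observation probabilities so that an unseen symbol never ends up with exactly zero probability; a hypothetical sketch:

import numpy as np

def smooth_emissions(counts, eps=1e-3):
    """counts[j, k]: how often symbol k was emitted from state j in training.
    Adding a small eps keeps unseen symbols from getting probability 0."""
    counts = counts + eps
    return counts / counts.sum(axis=1, keepdims=True)

counts = np.array([[12, 0, 3],     # symbol 1 never observed in state 0
                   [ 2, 9, 4]], dtype=float)
B_smoothed = smooth_emissions(counts)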

Discussion:

This is a classic paper on HMMs and a short form of the much more detailed 'Tutorial on HMMs', which is one of my favorite papers. I don't find any drawbacks in the method except that HMMs are heavily dependent on training data, like most machine learning algorithms, but they have been shown to perform much better than other methods on time-series analysis problems.

4 comments:

- D said...

Everything is training dependent. I don't think you necessarily need a lot of training data for HMMs, you just need a representative sample (like anything else).

Pankaj said...

Yes, I do agree that everything is training dependent, but what I mean to say is that if you do not have any instance of an example in the training data, HMMs are not going to be very good at classifying it. That applies to any machine learning technique!!! [:)]

Brandon said...

if only someone would develop an end-all, be-all learning algorithm that requires no training at all...

but then again, that would probably put us out of business

Paul Taele said...

Gah! What's with you guys and your mastery in the art of HMMs?! All this peer pressure is making me want to know more...

Perhaps it's because I just started taking Machine Learning that I'm reading this introductory paper as though it were some sort of alien concept (even w/ the brief intro Dr. Hammond gave to the SR class in the second part of that semester). The coin example explanation could have been done better, since I had to read it at least three times to make some progress in understanding it. I did, however, love the urn and ball example. I think that's the model HMM example to use.