MMIHMM: Maximum Mutual Information Hidden Markov Models

  • Nuria Oliver,
  • Ashutosh Garg

MSR-TR-2002-13

This paper proposes a new family of Hidden Markov Models named Maximum Mutual Information Hidden Markov Models (MMIHMMs). MMIHMMs have the same graphical structure as HMMs; however, the cost function being optimized is not the joint likelihood of the observations and the hidden states, but a weighted linear combination of the mutual information between the hidden states and the observations and the joint likelihood of the observations and the states. We present both theoretical and practical motivations for this cost function. Next, we derive the parameter estimation (learning) equations for both the discrete and continuous observation cases. Finally, we illustrate the superiority of our approach in different classification tasks by comparing the classification performance of MMIHMMs with that of standard maximum likelihood HMMs on synthetic and real, discrete and continuous, supervised and unsupervised data. We believe that MMIHMMs are a powerful tool for solving many of the problems associated with HMMs when used for classification and/or clustering.
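As a minimal sketch of the cost function described above (the notation is our assumption, not taken from the abstract: X denotes the observations, Q the hidden states, theta the model parameters, and alpha a trade-off weight), the weighted linear combination can be read as

% Hedged sketch of the MMIHMM objective: a weighted linear combination of
% the joint log-likelihood and the mutual information between hidden states
% and observations. The symbols and the (1-alpha, alpha) weighting are
% illustrative; the paper's exact notation and parameterization may differ.
\[
  F(\theta) \;=\; (1-\alpha)\,\log P_{\theta}(X, Q)
  \;+\; \alpha\, I_{\theta}(Q;\, X),
  \qquad \alpha \in [0, 1].
\]

Under this reading, alpha = 0 would recover standard maximum likelihood HMM training, while larger alpha increasingly rewards hidden states that are informative about the observations.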