Selecting hidden Markov model state number with cross-validated likelihood

Celeux, Gilles; Durand, Jean-Baptiste

doi:10.1007/s00180-007-0097-1

Original Paper
Published: 07 December 2007

Computational Statistics, Volume 23, pages 541–564 (2008)

Abstract

The problem of estimating the number of hidden states in a hidden Markov model is considered. Emphasis is placed on cross-validated likelihood criteria. Using cross-validation to assess the number of hidden states makes it possible to circumvent the well-documented technical difficulties of the order identification problem in mixture models. Moreover, from a predictive perspective, it does not require that the sampling distribution belong to one of the competing models. However, computing the cross-validated likelihood for hidden Markov models when only one training sample is available is difficult because the data are not independent. Two approaches are proposed to compute the cross-validated likelihood for a hidden Markov model. The first uses a deterministic half-sampling procedure; the second adapts the EM algorithm for hidden Markov models to take into account the randomly missing values induced by cross-validation. Numerical experiments on both simulated and real data sets compare different versions of the cross-validated likelihood criterion with penalised likelihood criteria, including BIC and a penalised marginal likelihood criterion. These experiments highlight the promising behaviour of the deterministic half-sampling criterion.
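As a rough illustration of the ingredients mentioned in the abstract, the sketch below computes the log-likelihood of a discrete-output hidden Markov model when some observations are treated as missing, and uses it to score the two halves of a single sequence. It is not the authors' implementation. The abstract does not specify the half-sampling scheme, so the odd/even split used here is an assumption, as are the function names `hmm_loglik` and `half_sampling_masks` and the toy parameters. Parameters are fixed by hand rather than estimated by EM.

```python
import numpy as np

def hmm_loglik(obs, observed, pi, A, B):
    """Log-likelihood of the observed entries of `obs` under a discrete HMM.

    Scaled forward recursion. Where observed[t] is False, y_t is treated as
    missing: the state distribution is propagated through A but not
    multiplied by an emission probability.
    pi: (K,) initial distribution; A: (K, K) transition matrix;
    B: (K, M) emission matrix with B[k, m] = P(y = m | state = k).
    """
    alpha = pi.copy()
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = alpha @ A               # one-step state prediction
        if observed[t]:
            alpha = alpha * B[:, obs[t]]    # weight by emission probability
        c = alpha.sum()                     # equals 1 when y_t is missing
        loglik += np.log(c)
        alpha = alpha / c                   # rescale to avoid underflow
    return loglik

def half_sampling_masks(n):
    """Assumed odd/even split: positions 1, 3, 5, ... (1-based) versus the rest."""
    first_half = np.arange(n) % 2 == 0
    return first_half, ~first_half

# Toy two-state model (hypothetical values, for illustration only).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],
              [0.3, 0.7]])
y = np.array([0, 0, 1, 1, 0, 1, 1, 1])

train, val = half_sampling_masks(len(y))
print("training-half log-likelihood:  ", hmm_loglik(y, train, pi, A, B))
print("validation-half log-likelihood:", hmm_loglik(y, val, pi, A, B))
```

In an actual selection procedure one would, for each candidate number of states, estimate the parameters on the training half (for example with an EM variant that skips the masked positions) and retain the number of states with the highest validation-half log-likelihood.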



Author information

Authors and Affiliations

Département de Mathématiques, INRIA Futurs, Orsay, Université Paris-Sud, Bâtiment 425, 91405, Orsay Cedex, France
Gilles Celeux
Laboratoire Jean Kuntzmann, INRIA Rhône-Alpes, Grenoble Universités, 51 rue des Mathématiques, B.P. 53, 38041 Grenoble Cedex 9, France
Jean-Baptiste Durand

Corresponding author

Correspondence to Jean-Baptiste Durand.

About this article

Cite this article

Celeux, G., Durand, JB. Selecting hidden Markov model state number with cross-validated likelihood. Comput Stat 23, 541–564 (2008). https://doi.org/10.1007/s00180-007-0097-1

Received: 14 December 2006
Accepted: 14 November 2007
Published: 07 December 2007
Issue date: October 2008
DOI: https://doi.org/10.1007/s00180-007-0097-1
