Robust Mixture Clustering Using Pearson Type VII Distribution

J Sun; Ata Kaban; JM Garibaldi

doi:10.1016/j.patrec.2010.07.015

Computer Science

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

A mixture of Student t-distributions (MoT) has been widely used to model multivariate data sets with atypical observations, or outliers, for robust clustering. In this paper, we developed a novel robust clustering approach by modeling the data sets using a mixture of Pearson type VII distributions (MoP). An EM algorithm is developed for the maximum likelihood estimation of the model parameters. An outlier detection criterion is derived from the EM solution. Controlled experimental results on synthetic data sets show that the MoP is more viable than the MoT. On average, the MoP performs comparably to, if not better than, the MoT in terms of outlier detection accuracy and out-of-sample log-likelihood. Furthermore, we compared the performance of the Pearson type VII and the Student t mixtures on the classification of several real pattern recognition data sets. The comparison favours the developed Pearson type VII mixtures. (C) 2010 Elsevier B.V. All rights reserved.

Original language	English
Pages (from-to)	2447-2454
Number of pages	8
Journal	Pattern Recognition Letters
Volume	31
Issue number	16
DOIs	https://doi.org/10.1016/j.patrec.2010.07.015
Publication status	Published - 1 Dec 2010

Keywords

Pearson type VII distribution
Robust learning
Outlier detection
Robust mixture modeling

Access to Document

10.1016/j.patrec.2010.07.015

Cite this

@article{87ae4b20561447198ad3257dbf9f8c92,
  title = "Robust Mixture Clustering Using Pearson Type VII Distribution",
  abstract = "A mixture of Student t-distributions (MoT) has been widely used to model multivariate data sets with atypical observations, or outliers, for robust clustering. In this paper, we developed a novel robust clustering approach by modeling the data sets using a mixture of Pearson type VII distributions (MoP). An EM algorithm is developed for the maximum likelihood estimation of the model parameters. An outlier detection criterion is derived from the EM solution. Controlled experimental results on synthetic data sets show that the MoP is more viable than the MoT. On average, the MoP performs comparably to, if not better than, the MoT in terms of outlier detection accuracy and out-of-sample log-likelihood. Furthermore, we compared the performance of the Pearson type VII and the Student t mixtures on the classification of several real pattern recognition data sets. The comparison favours the developed Pearson type VII mixtures. (C) 2010 Elsevier B.V. All rights reserved.",
  keywords = "Pearson type VII distribution, Robust learning, Outlier detection, Robust mixture modeling",
  author = "J Sun and Ata Kaban and JM Garibaldi",
  year = "2010",
  month = dec,
  day = "1",
  doi = "10.1016/j.patrec.2010.07.015",
  language = "English",
  volume = "31",
  pages = "2447--2454",
  journal = "Pattern Recognition Letters",
  publisher = "Elsevier",
  number = "16",
}

TY  - JOUR
T1  - Robust Mixture Clustering Using Pearson Type VII Distribution
AU  - Sun, J
AU  - Kaban, Ata
AU  - Garibaldi, JM
PY  - 2010/12/1
Y1  - 2010/12/1
N2  - A mixture of Student t-distributions (MoT) has been widely used to model multivariate data sets with atypical observations, or outliers, for robust clustering. In this paper, we developed a novel robust clustering approach by modeling the data sets using a mixture of Pearson type VII distributions (MoP). An EM algorithm is developed for the maximum likelihood estimation of the model parameters. An outlier detection criterion is derived from the EM solution. Controlled experimental results on synthetic data sets show that the MoP is more viable than the MoT. On average, the MoP performs comparably to, if not better than, the MoT in terms of outlier detection accuracy and out-of-sample log-likelihood. Furthermore, we compared the performance of the Pearson type VII and the Student t mixtures on the classification of several real pattern recognition data sets. The comparison favours the developed Pearson type VII mixtures. (C) 2010 Elsevier B.V. All rights reserved.
AB  - A mixture of Student t-distributions (MoT) has been widely used to model multivariate data sets with atypical observations, or outliers, for robust clustering. In this paper, we developed a novel robust clustering approach by modeling the data sets using a mixture of Pearson type VII distributions (MoP). An EM algorithm is developed for the maximum likelihood estimation of the model parameters. An outlier detection criterion is derived from the EM solution. Controlled experimental results on synthetic data sets show that the MoP is more viable than the MoT. On average, the MoP performs comparably to, if not better than, the MoT in terms of outlier detection accuracy and out-of-sample log-likelihood. Furthermore, we compared the performance of the Pearson type VII and the Student t mixtures on the classification of several real pattern recognition data sets. The comparison favours the developed Pearson type VII mixtures. (C) 2010 Elsevier B.V. All rights reserved.
KW  - Pearson type VII distribution
KW  - Robust learning
KW  - Outlier detection
KW  - Robust mixture modeling
U2  - 10.1016/j.patrec.2010.07.015
DO  - 10.1016/j.patrec.2010.07.015
M3  - Article
VL  - 31
SP  - 2447
EP  - 2454
JO  - Pattern Recognition Letters
JF  - Pattern Recognition Letters
IS  - 16
ER  -
