Graph-based approaches for over-sampling in the context of ordinal regression

María Pérez-Ortiz; Pedro Antonio Gutiérrez; César Hervás-Martínez; Xin Yao

doi:10.1109/TKDE.2014.2365780

Computer Science

Research output: Contribution to journal › Article › peer-review

41 Citations (Scopus)

Abstract

The classification of patterns into naturally ordered labels is referred to as ordinal regression or ordinal classification. Usually, this classification setting is by nature highly imbalanced, because there are classes in the problem that are a priori more probable than others. Although standard over-sampling methods can improve the classification of minority classes in ordinal classification, they tend to introduce severe errors in terms of the ordinal label scale, given that they do not take the ordering into account. A specific ordinal over-sampling method is developed in this paper for the first time in order to improve the performance of machine learning classifiers. The method proposed includes ordinal information by approaching over-sampling from a graph-based perspective. The results presented in this paper show the good synergy of a popular ordinal regression method (a reformulation of support vector machines) with the graph-based proposed algorithms, and the possibility of improving both the classification and the ordering of minority classes. A cost-sensitive version of the ordinal regression method is also introduced and compared with the over-sampling proposals, showing in general lower performance for minority classes.

Original language	English
Article number	6940273
Pages (from-to)	1233-1245
Number of pages	13
Journal	IEEE Transactions on Knowledge and Data Engineering
Volume	27
Issue number	5
DOIs	https://doi.org/10.1109/TKDE.2014.2365780
Publication status	Published - 1 May 2015

Keywords

imbalanced classification
ordinal classification
ordinal regression
Over-sampling

ASJC Scopus subject areas

Computational Theory and Mathematics
Information Systems
Computer Science Applications

Cite this

@article{33198a74144f46e180bedc18ac61237a,

title = "Graph-based approaches for over-sampling in the context of ordinal regression",

abstract = "The classification of patterns into naturally ordered labels is referred to as ordinal regression or ordinal classification. Usually, this classification setting is by nature highly imbalanced, because there are classes in the problem that are a priori more probable than others. Although standard over-sampling methods can improve the classification of minority classes in ordinal classification, they tend to introduce severe errors in terms of the ordinal label scale, given that they do not take the ordering into account. A specific ordinal over-sampling method is developed in this paper for the first time in order to improve the performance of machine learning classifiers. The method proposed includes ordinal information by approaching over-sampling from a graph-based perspective. The results presented in this paper show the good synergy of a popular ordinal regression method (a reformulation of support vector machines) with the graph-based proposed algorithms, and the possibility of improving both the classification and the ordering of minority classes. A cost-sensitive version of the ordinal regression method is also introduced and compared with the over-sampling proposals, showing in general lower performance for minority classes.",

keywords = "imbalanced classification, ordinal classification, ordinal regression, Over-sampling",

author = "P{\'e}rez-Ortiz, Mar{\'i}a and Guti{\'e}rrez, {Pedro Antonio} and Herv{\'a}s-Mart{\'i}nez, C{\'e}sar and Yao, Xin",

year = "2015",

month = may,

day = "1",

doi = "10.1109/TKDE.2014.2365780",

language = "English",

volume = "27",

pages = "1233--1245",

journal = "IEEE Transactions on Knowledge and Data Engineering",

issn = "1041-4347",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

number = "5",

}

TY - JOUR

T1 - Graph-based approaches for over-sampling in the context of ordinal regression

AU - Pérez-Ortiz, María

AU - Gutiérrez, Pedro Antonio

AU - Hervás-Martínez, César

AU - Yao, Xin

PY - 2015/5/1

Y1 - 2015/5/1

N2 - The classification of patterns into naturally ordered labels is referred to as ordinal regression or ordinal classification. Usually, this classification setting is by nature highly imbalanced, because there are classes in the problem that are a priori more probable than others. Although standard over-sampling methods can improve the classification of minority classes in ordinal classification, they tend to introduce severe errors in terms of the ordinal label scale, given that they do not take the ordering into account. A specific ordinal over-sampling method is developed in this paper for the first time in order to improve the performance of machine learning classifiers. The method proposed includes ordinal information by approaching over-sampling from a graph-based perspective. The results presented in this paper show the good synergy of a popular ordinal regression method (a reformulation of support vector machines) with the graph-based proposed algorithms, and the possibility of improving both the classification and the ordering of minority classes. A cost-sensitive version of the ordinal regression method is also introduced and compared with the over-sampling proposals, showing in general lower performance for minority classes.

KW - imbalanced classification

KW - ordinal classification

KW - ordinal regression

KW - Over-sampling

UR - http://www.scopus.com/inward/record.url?scp=84926628016&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2014.2365780

DO - 10.1109/TKDE.2014.2365780

M3 - Article

AN - SCOPUS:84926628016

SN - 1041-4347

VL - 27

SP - 1233

EP - 1245

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

IS - 5

M1 - 6940273

ER -
