Feature relevance determination for ordinal regression in the context of feature redundancies and privileged information

Lukas Pfannschmidt; Jonathan Jacob; Fabian Hinder; Michael Biehl; Peter Tino; Barbara Hammer

doi:10.1016/j.neucom.2019.12.133

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

98 Downloads (Pure)

Abstract

Advances in machine learning technologies have led to increasingly powerful models, in particular in the context of big data. Yet many application scenarios call for robustly interpretable models rather than optimum model accuracy; this is the case, for example, when potential biomarkers or causal factors are to be discovered from a set of given measurements. In this contribution, we focus on feature selection paradigms, which enable us to uncover the relevant factors of a given regularity based on a sparse model. We focus on the important specific setting of linear ordinal regression, i.e. data have to be ranked into one of a finite number of ordered categories by a linear projection. Unlike previous work, we consider the case that features are potentially redundant, such that no unique minimum set of relevant features exists. We aim to identify all strongly and all weakly relevant features as well as their type of relevance (strong or weak); we achieve this goal by determining feature relevance bounds, which correspond to the minimum and maximum feature relevance, respectively, when searching over all equivalent models. In addition, we discuss how this setting enables us to substitute some of the features, e.g. due to their semantics, and how to extend the framework of feature relevance intervals to the setting of privileged information, i.e. potentially relevant information that is available for training purposes only and cannot be used for the prediction itself.
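
The idea of relevance bounds can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a search over all equivalent models, whereas this toy assumes that a small, finite list of equivalent linear weight vectors is already given. The function names `relevance_intervals` and `relevance_type`, the tolerance `tol`, and the example weights are all hypothetical. A feature is labelled strongly relevant when its lower bound is above zero, weakly relevant when only its upper bound is above zero, and irrelevant otherwise.

```python
from typing import List, Sequence, Tuple


def relevance_intervals(models: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return (min, max) of |w_j| for each feature j over the given weight vectors.

    Assumption: `models` is a finite stand-in for the set of equivalent models.
    """
    if not models:
        raise ValueError("need at least one model")
    n_features = len(models[0])
    intervals = []
    for j in range(n_features):
        magnitudes = [abs(w[j]) for w in models]
        intervals.append((min(magnitudes), max(magnitudes)))
    return intervals


def relevance_type(interval: Tuple[float, float], tol: float = 1e-9) -> str:
    """Classify a feature from its relevance interval (tol is a hypothetical cutoff)."""
    lower, upper = interval
    if lower > tol:
        return "strong"      # used by every equivalent model
    if upper > tol:
        return "weak"        # used by some models, replaceable in others
    return "irrelevant"      # used by no model


# Hypothetical toy data: feature 0 is always needed, features 1 and 2 are
# redundant with each other, feature 3 is never used.
equivalent_models = [
    [1.0, 0.5, 0.0, 0.0],
    [1.0, 0.0, 0.5, 0.0],
    [1.0, 0.25, 0.25, 0.0],
]

intervals = relevance_intervals(equivalent_models)
types = [relevance_type(iv) for iv in intervals]
print(intervals)
print(types)
```

In this toy, the two redundant features receive the interval from 0 to 0.5: either one can be dropped as long as the other is kept, which is the situation in which no unique minimum set of relevant features exists.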

Original language	English
Pages (from-to)	266-279
Number of pages	14
Journal	Neurocomputing
Volume	416
Early online date	9 Apr 2020
DOIs	https://doi.org/10.1016/j.neucom.2019.12.133
Publication status	E-pub ahead of print - 9 Apr 2020

Keywords

Feature selection
Global feature relevance
Interpretability
Ordinal regression
Privileged information

Access to Document

10.1016/j.neucom.2019.12.133
Licence: None: All rights reserved

Pfannschmidt_et_al_Feature_relevance_determination_for_ordinal_regression_Neurocomputing_2019
Accepted author manuscript, 723 KB
Licence: Creative Commons: Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)

https://www.sciencedirect.com/science/article/pii/S0925231220305038
Licence: None: All rights reserved

Cite this

@article{31a2ff1722cc42459c875089b3cce89c,

title = "Feature relevance determination for ordinal regression in the context of feature redundancies and privileged information",

abstract = "Advances in machine learning technologies have led to increasingly powerful models in particular in the context of big data. Yet, many application scenarios demand for robustly interpretable models rather than optimum model accuracy; as an example, this is the case if potential biomarkers or causal factors should be discovered based on a set of given measurements. In this contribution, we focus on feature selection paradigms, which enable us to uncover relevant factors of a given regularity based on a sparse model. We focus on the important specific setting of linear ordinal regression, i.e. data have to be ranked into one of a finite number of ordered categories by a linear projection. Unlike previous work, we consider the case that features are potentially redundant, such that no unique minimum set of relevant features exists. We aim for an identification of all strongly and all weakly relevant features as well as their type of relevance (strong or weak); we achieve this goal by determining feature relevance bounds, which correspond to the minimum and maximum feature relevance, respectively, if searched over all equivalent models. In addition, we discuss how this setting enables us to substitute some of the features, e.g. due to their semantics, and how to extend the framework of feature relevance intervals to the setting of privileged information, i.e. potentially relevant information is available for training purposes only, but cannot be used for the prediction itself.",

keywords = "Feature selection, Global feature relevance, Interpretability, Ordinal regression, Privileged information",

author = "Lukas Pfannschmidt and Jonathan Jacob and Fabian Hinder and Michael Biehl and Peter Tino and Barbara Hammer",

year = "2020",

month = apr,

day = "9",

doi = "10.1016/j.neucom.2019.12.133",

language = "English",

volume = "416",

pages = "266--279",

journal = "Neurocomputing",

issn = "0925-2312",

publisher = "Elsevier",

}

TY - JOUR

T1 - Feature relevance determination for ordinal regression in the context of feature redundancies and privileged information

AU - Pfannschmidt, Lukas

AU - Jacob, Jonathan

AU - Hinder, Fabian

AU - Biehl, Michael

AU - Tino, Peter

AU - Hammer, Barbara

PY - 2020/4/9

Y1 - 2020/4/9

N2 - Advances in machine learning technologies have led to increasingly powerful models in particular in the context of big data. Yet, many application scenarios demand for robustly interpretable models rather than optimum model accuracy; as an example, this is the case if potential biomarkers or causal factors should be discovered based on a set of given measurements. In this contribution, we focus on feature selection paradigms, which enable us to uncover relevant factors of a given regularity based on a sparse model. We focus on the important specific setting of linear ordinal regression, i.e. data have to be ranked into one of a finite number of ordered categories by a linear projection. Unlike previous work, we consider the case that features are potentially redundant, such that no unique minimum set of relevant features exists. We aim for an identification of all strongly and all weakly relevant features as well as their type of relevance (strong or weak); we achieve this goal by determining feature relevance bounds, which correspond to the minimum and maximum feature relevance, respectively, if searched over all equivalent models. In addition, we discuss how this setting enables us to substitute some of the features, e.g. due to their semantics, and how to extend the framework of feature relevance intervals to the setting of privileged information, i.e. potentially relevant information is available for training purposes only, but cannot be used for the prediction itself.

AB - Advances in machine learning technologies have led to increasingly powerful models in particular in the context of big data. Yet, many application scenarios demand for robustly interpretable models rather than optimum model accuracy; as an example, this is the case if potential biomarkers or causal factors should be discovered based on a set of given measurements. In this contribution, we focus on feature selection paradigms, which enable us to uncover relevant factors of a given regularity based on a sparse model. We focus on the important specific setting of linear ordinal regression, i.e. data have to be ranked into one of a finite number of ordered categories by a linear projection. Unlike previous work, we consider the case that features are potentially redundant, such that no unique minimum set of relevant features exists. We aim for an identification of all strongly and all weakly relevant features as well as their type of relevance (strong or weak); we achieve this goal by determining feature relevance bounds, which correspond to the minimum and maximum feature relevance, respectively, if searched over all equivalent models. In addition, we discuss how this setting enables us to substitute some of the features, e.g. due to their semantics, and how to extend the framework of feature relevance intervals to the setting of privileged information, i.e. potentially relevant information is available for training purposes only, but cannot be used for the prediction itself.

KW - Feature selection

KW - Global feature relevance

KW - Interpretability

KW - Ordinal regression

KW - Privileged information

UR - https://www.journals.elsevier.com/neurocomputing/

U2 - 10.1016/j.neucom.2019.12.133

DO - 10.1016/j.neucom.2019.12.133

M3 - Article

SN - 0925-2312

VL - 416

SP - 266

EP - 279

JO - Neurocomputing

JF - Neurocomputing

ER -
