Modeling recurrent distributions in streams using possible worlds

Michael Geilke; Andreas Karwath; Stefan Kramer

doi:10.1109/DSAA.2015.7344814

Cancer and Genomic Sciences

Research output: Contribution to conference (unpublished) › Paper › peer-review

1 Citation (Scopus)

Abstract

Discovering changes in the data distribution of streams and discovering recurrent data distributions are challenging problems in data mining and machine learning. Both have received a lot of attention in the context of classification. With the ever-increasing growth of data, however, there is high demand for compact and universal representations of data streams that enable the user to analyze current as well as historic data without having access to the raw data. As a first step in this direction, we propose a condensed representation that captures the various (possibly recurrent) data distributions of the stream by extending the notion of possible worlds. The representation enables queries concerning the whole stream and can hence serve as a tool for supporting decision-making processes or as a basis for implementing data mining and machine learning algorithms on top of it. We evaluate this condensed representation on synthetic and real-world data.

Original language	English
Pages	1-9
DOIs	https://doi.org/10.1109/DSAA.2015.7344814
Publication status	Published - 2015

Keywords

density estimation, machine learning, possible worlds, stream mining
