Robust Real-Time Music Transcription with a Compositional Hierarchical Model

Matevz Pesek; Ales Leonardis; Matija Marolt

doi:10.1371/journal.pone.0169411

Computer Science

Research output: Contribution to journal › Article › peer-review

Abstract

The paper presents a new compositional hierarchical model for robust music transcription. Its main features are unsupervised learning of a hierarchical representation of the input data; transparency, which enables insight into the learned representation; and robustness and speed, which make it suitable for real-world and real-time use. The model consists of multiple layers, each composed of a number of parts. The hierarchical nature of the model corresponds well to hierarchical structures in music. The parts in lower layers correspond to low-level concepts (e.g. tone partials), while the parts in higher layers combine lower-level representations into more complex concepts (tones, chords). The layers are learned in an unsupervised manner from music signals. Parts in each layer are compositions of parts from previous layers, with statistical co-occurrences as the driving force of the learning process. In the paper, we present the model’s structure and compare it to other hierarchical approaches in the field of music information retrieval. We evaluate the model’s performance on multiple fundamental frequency estimation. Finally, we elaborate on extensions of the model towards other music information retrieval tasks.
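The idea of building higher-layer parts from statistical co-occurrences of lower-layer parts can be illustrated with a toy sketch. This is not the authors' algorithm: the function names `learn_compositions` and `activate`, the integer part labels, and the simple pair-counting rule are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def learn_compositions(frames, top_k=3):
    """Toy layer learning (hypothetical): count how often two lower-layer
    parts are active in the same frame and keep the top_k most frequent
    pairs as parts of the next layer."""
    counts = Counter()
    for active in frames:
        # Each unordered pair of co-active parts is one co-occurrence.
        for pair in combinations(sorted(set(active)), 2):
            counts[pair] += 1
    # Rank by descending count; break ties by the pair itself so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pair for pair, _ in ranked[:top_k]]


def activate(compositions, active):
    """A composition is active when all of its sub-parts are active."""
    active = set(active)
    return [c for c in compositions if active.issuperset(c)]


# Each frame lists the lower-layer parts (e.g. partial indices) active in it.
frames = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 2, 4}]
layer2 = learn_compositions(frames, top_k=2)
print(layer2)                       # [(1, 2), (2, 3)]
print(activate(layer2, {2, 3, 5}))  # [(2, 3)]
```

Stacking the same step on its own output would give further layers; how the actual model selects, parameterises and activates its parts is described in the paper itself.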

Original language	English
Article number	e0169411
Number of pages	19
Journal	PLoS ONE
Volume	12
Issue number	1
DOIs	https://doi.org/10.1371/journal.pone.0169411
Publication status	Published - 3 Jan 2017

Access to Document

DOI: 10.1371/journal.pone.0169411
Licence: Creative Commons: Attribution (CC BY)

Pesek_et_al_Robust_real-time_music_transcription_Plos_One
Final published version, 1.9 MB (checked 6/1/2017)
Licence: Creative Commons: Attribution (CC BY)

Cite this

@article{da83fc13d87c4fea9e2b9fd3f1afd15b,
title = "Robust Real-Time Music Transcription with a Compositional Hierarchical Model",
abstract = "The paper presents a new compositional hierarchical model for robust music transcription. Its main features are unsupervised learning of a hierarchical representation of input data, transparency, which enables insights into the learned representation, as well as robustness and speed which make it suitable for real-world and real-time use. The model consists of multiple layers, each composed of a number of parts. The hierarchical nature of the model corresponds well to hierarchical structures in music. The parts in lower layers correspond to low-level concepts (e.g. tone partials), while the parts in higher layers combine lower-level representations into more complex concepts (tones, chords). The layers are learned in an unsupervised manner from music signals. Parts in each layer are compositions of parts from previous layers based on statistical co-occurrences as the driving force of the learning process. In the paper, we present the model{\textquoteright}s structure and compare it to other hierarchical approaches in the field of music information retrieval. We evaluate the model{\textquoteright}s performance for the multiple fundamental frequency estimation. Finally, we elaborate on extensions of the model towards other music information retrieval tasks.",
author = "Matevz Pesek and Ales Leonardis and Matija Marolt",
year = "2017",
month = jan,
day = "3",
doi = "10.1371/journal.pone.0169411",
language = "English",
volume = "12",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science (PLOS)",
number = "1",
}

TY - JOUR
T1 - Robust Real-Time Music Transcription with a Compositional Hierarchical Model
AU - Pesek, Matevz
AU - Leonardis, Ales
AU - Marolt, Matija
PY - 2017/1/3
Y1 - 2017/1/3
AB - The paper presents a new compositional hierarchical model for robust music transcription. Its main features are unsupervised learning of a hierarchical representation of input data, transparency, which enables insights into the learned representation, as well as robustness and speed which make it suitable for real-world and real-time use. The model consists of multiple layers, each composed of a number of parts. The hierarchical nature of the model corresponds well to hierarchical structures in music. The parts in lower layers correspond to low-level concepts (e.g. tone partials), while the parts in higher layers combine lower-level representations into more complex concepts (tones, chords). The layers are learned in an unsupervised manner from music signals. Parts in each layer are compositions of parts from previous layers based on statistical co-occurrences as the driving force of the learning process. In the paper, we present the model’s structure and compare it to other hierarchical approaches in the field of music information retrieval. We evaluate the model’s performance for the multiple fundamental frequency estimation. Finally, we elaborate on extensions of the model towards other music information retrieval tasks.
U2 - 10.1371/journal.pone.0169411
DO - 10.1371/journal.pone.0169411
M3 - Article
SN - 1932-6203
VL - 12
JO - PLoS ONE
JF - PLoS ONE
IS - 1
M1 - e0169411
ER -
