Recognition of Multiple Bird Species based on Penalised Maximum Likelihood and HMM-based Modelling of Individual Elements

Peter Jancovic; Munevver Kokuer Jancovic

doi:10.21437/Interspeech.2016-669

Electronic, Electrical and Systems Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents an extension of our recent work on recognition of multiple bird species from their vocalisations by incorporating an improved acoustic modelling. The acoustic scene is segmented into spectro-temporal isolated segments by employing a sinusoidal detection algorithm, which is able to handle multiple simultaneous bird vocalisations. Each segment is represented as a temporal sequence of frequencies of the detected sinusoid. Each bird species is represented by a set of hidden Markov models (HMMs), each HMM modelling a particular vocalisation element. A set of elements is discovered in an unsupervised manner using a partial dynamic time warping algorithm and agglomerative hierarchical clustering. Recognition of multiple bird species is performed based on maximising the likelihood of the set of detected segments on a subset of bird species models, with a penalisation applied for increasing the number of bird species. Experimental evaluations used audio field recordings containing 30 bird species. Detected segments from several bird species are joined to simulate the presence of multiple bird species. It is demonstrated that the use of improved acoustic modelling in conjunction with the maximum likelihood score combination method provides considerable improvements over previous results and the use of majority voting.
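The penalised maximum likelihood decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each detected segment already has a log-likelihood under each species' set of element HMMs (the hypothetical `loglik` matrix), that a segment is scored by its best-matching species within a candidate subset, and that the penalty grows linearly with subset size. The function name, the `penalty` value, and the `max_species` cap are all illustrative assumptions.

```python
from itertools import combinations


def penalised_ml_species(loglik, penalty, max_species=3):
    """Pick the species subset with the highest penalised log-likelihood.

    loglik[i][k] is the (hypothetical) log-likelihood of detected segment i
    under the models of species k. The linear penalty is an assumption.
    """
    n_species = len(loglik[0])
    best_subset, best_score = (), float("-inf")
    for size in range(1, max_species + 1):
        for subset in combinations(range(n_species), size):
            # Each segment is explained by its best-scoring species in the subset.
            total = sum(max(row[k] for k in subset) for row in loglik)
            # Adding species can only raise the likelihood, so penalise subset size.
            score = total - penalty * size
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy data: segments 0-1 fit species 0 well, segments 2-3 fit species 1 well.
loglik = [
    [-1.0, -5.0, -6.0],
    [-1.2, -4.0, -7.0],
    [-6.0, -0.8, -5.0],
    [-5.5, -1.1, -6.5],
]
subset, score = penalised_ml_species(loglik, penalty=2.0)
print(subset, round(score, 2))
```

With a small penalty the sketch selects species 0 and 1 together; raising the penalty far enough makes a single species the preferred explanation, which is the trade-off the penalisation term is meant to control.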

Original language	English
Title of host publication	Interspeech 2016
Editors	Nelson Morgan
Publisher	ISCA
Pages	2612-2616
DOIs	https://doi.org/10.21437/Interspeech.2016-669
Publication status	Published - 8 Sept 2016
Event	Interspeech 2016 - San Francisco, United States. Duration: 8 Sept 2016 → 12 Sept 2016. http://interspeech2016.org/

Publication series

Name	Interspeech Proceedings
ISSN (Electronic)	1990-9772

Conference

Conference	Interspeech 2016
Country/Territory	United States
City	San Francisco
Period	8/09/16 → 12/09/16
Internet address	http://interspeech2016.org/

Keywords

multiple bird species recognition
HMM
vocalisation
element
unsupervised training
sinusoid detection

Access to Document

10.21437/Interspeech.2016-669. Licence: None: All rights reserved

Jančovič, P., Köküer, M. (2016) Recognition of Multiple Bird Species Based on Penalised Maximum Likelihood and HMM-Based Modelling of Individual Vocalisation Elements. Proc. Interspeech 2016, 2612-2616. Copyright © 2016 ISCA
Final published version, 286 KB. Licence: Other (please specify with Rights Statement)

Cite this

@inproceedings{fa30679d90fb467096c2007b9e439993,

title = "Recognition of Multiple Bird Species based on Penalised Maximum Likelihood and HMM-based Modelling of Individual Elements",

abstract = "This paper presents an extension of our recent work on recognition of multiple bird species from their vocalisations by incorporating an improved acoustic modelling. The acoustic scene is segmented into spectro-temporal isolated segments by employing a sinusoidal detection algorithm, which is able to handle multiple simultaneous bird vocalisations. Each segment is represented as a temporal sequence of frequencies of the detected sinusoid. Each bird species is represented by a set of hidden Markov models (HMMs), each HMM modelling a particular vocalisation element. A set of elements is discovered in an unsupervised manner using a partial dynamic time warping algorithm and agglomerative hierarchical clustering. Recognition of multiple bird species is performed based on maximising the likelihood of the set of detected segments on a subset of bird species models, with a penalisation applied for increasing the number of bird species. Experimental evaluations used audio field recordings containing 30 bird species. Detected segments from several bird species are joined to simulate the presence of multiple bird species. It is demonstrated that the use of improved acoustic modelling in conjunction with the maximum likelihood score combination method provides considerable improvements over previous results and the use of majority voting.",

keywords = "multiple bird species recognition, HMM, vocalisation, element, unsupervised training, sinusoid detection",

author = "Peter Jancovic and {Kokuer Jancovic}, Munevver",

year = "2016",

month = sep,

day = "8",

doi = "10.21437/Interspeech.2016-669",

language = "English",

series = "Interspeech Proceedings",

publisher = "ISCA",

pages = "2612--2616",

editor = "Nelson Morgan",

booktitle = "Interspeech 2016",

note = "Interspeech 2016 ; Conference date: 08-09-2016 Through 12-09-2016",

url = "http://interspeech2016.org/",

}

TY - GEN

T1 - Recognition of Multiple Bird Species based on Penalised Maximum Likelihood and HMM-based Modelling of Individual Elements

AU - Jancovic, Peter

AU - Kokuer Jancovic, Munevver

PY - 2016/9/8

Y1 - 2016/9/8

AB - This paper presents an extension of our recent work on recognition of multiple bird species from their vocalisations by incorporating an improved acoustic modelling. The acoustic scene is segmented into spectro-temporal isolated segments by employing a sinusoidal detection algorithm, which is able to handle multiple simultaneous bird vocalisations. Each segment is represented as a temporal sequence of frequencies of the detected sinusoid. Each bird species is represented by a set of hidden Markov models (HMMs), each HMM modelling a particular vocalisation element. A set of elements is discovered in an unsupervised manner using a partial dynamic time warping algorithm and agglomerative hierarchical clustering. Recognition of multiple bird species is performed based on maximising the likelihood of the set of detected segments on a subset of bird species models, with a penalisation applied for increasing the number of bird species. Experimental evaluations used audio field recordings containing 30 bird species. Detected segments from several bird species are joined to simulate the presence of multiple bird species. It is demonstrated that the use of improved acoustic modelling in conjunction with the maximum likelihood score combination method provides considerable improvements over previous results and the use of majority voting.

KW - multiple bird species recognition

KW - HMM

KW - vocalisation

KW - element

KW - unsupervised training

KW - sinusoid detection

UR - https://doi.org/10.21437/Interspeech.2016-669

U2 - 10.21437/Interspeech.2016-669

DO - 10.21437/Interspeech.2016-669

M3 - Conference contribution

T3 - Interspeech Proceedings

SP - 2612

EP - 2616

BT - Interspeech 2016

A2 - Morgan, Nelson

PB - ISCA

T2 - Interspeech 2016

Y2 - 8 September 2016 through 12 September 2016

ER -
