Bird species recognition using HMM-based unsupervised modelling of individual syllables with incorporated duration modelling

Peter Jancovic; Munevver Kokuer; Masoud Zakeri; Martin Russell

doi:10.1109/ICASSP.2016.7471737

Electronic, Electrical and Systems Engineering

Research output: Contribution to conference (unpublished) › Paper › peer-review

7 Citations (Scopus)

Abstract

This paper presents an HMM-based automatic system for recognition of bird species from audio field recordings. It includes improved unsupervised modelling of individual bird syllables and duration modelling. The acoustic signal is decomposed into isolated segments, each segment containing the temporal evolution of a detected sinusoidal component. Modelling of bird syllables is performed using hidden Markov models (HMMs). A set of syllables of bird vocalisations is discovered in an unsupervised manner by employing dynamic time warping and agglomerative hierarchical clustering. A novel iterative maximum likelihood procedure is used to train individual HMMs for the syllables of each species. Modelling of the state duration is employed in a post-recognition stage by combining the likelihoods of the acoustic and duration models. Experiments are performed on over 33 hours of field recordings containing 30 bird species. Evaluations demonstrate that the proposed unsupervised iterative HMM training procedure and the duration modelling provide, on average, a 45% error rate reduction. The presented system recognises bird species with an accuracy of 97.8% using 3 seconds of the detected signal.
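The unsupervised syllable discovery step mentioned in the abstract (dynamic time warping followed by agglomerative hierarchical clustering) can be illustrated with a minimal sketch. The abstract does not state the features, the local distance, the linkage criterion or the stopping rule, so everything below is an assumption made for illustration: segments are represented as 1-D frequency tracks in Hz, the local distance is the absolute difference, linkage is average linkage, and merging stops at a hypothetical `threshold`. The function names and toy values are invented and are not taken from the paper.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Length-normalised DTW distance between two 1-D frequency tracks."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def agglomerative_clusters(segments, threshold):
    """Average-linkage agglomerative clustering on DTW distances.

    Merging stops once the closest pair of clusters is further apart
    than `threshold` (an assumed stopping rule).
    """
    dist = {
        (i, j): dtw_distance(segments[i], segments[j])
        for i, j in combinations(range(len(segments)), 2)
    }
    clusters = [[i] for i in range(len(segments))]

    def linkage(c1, c2):
        # Mean pairwise DTW distance between members of the two clusters.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return [sorted(c) for c in clusters]


# Toy frequency tracks in Hz (made-up values, not data from the paper).
segments = [
    [2000, 2100, 2200, 2300],        # rising sweep
    [2000, 2050, 2100, 2200, 2300],  # same sweep, slightly slower
    [4000, 3900, 3800, 3700],        # falling sweep
    [4000, 3950, 3900, 3800, 3700],  # same falling sweep, slower
]
clusters = agglomerative_clusters(segments, threshold=100.0)
print(clusters)  # [[0, 1], [2, 3]]
```

In this toy setting each resulting cluster would play the role of one discovered syllable type, whose member segments could then be used to train a syllable model. How the paper initialises and iteratively retrains its HMMs is not described in the abstract and is not covered here.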

Original language	English
Number of pages	5
DOIs	https://doi.org/10.1109/ICASSP.2016.7471737
Publication status	Published - 20 Mar 2016
Event	IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2016, Shanghai, China, 20 Mar 2016 → 25 Mar 2016, http://www.icassp2016.org

Conference

Conference	IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2016
Country/Territory	China
City	Shanghai
Period	20/03/16 → 25/03/16
Internet address	http://www.icassp2016.org

Access to Document

10.1109/ICASSP.2016.7471737

Cite this

@conference{41b9a91949bd436388bc6195d9888ff3,

title = "Bird species recognition using HMM-based unsupervised modelling of individual syllables with incorporated duration modelling",

abstract = "This paper presents an HMM-based automatic system for recognition of bird species from audio field recordings. It includes improved unsupervised modelling of individual bird syllables and duration modelling. The acoustic signal is decomposed into isolated segments, each segment containing the temporal evolution of a detected sinusoidal component. Modelling of bird syllables is performed using hidden Markov models (HMMs). A set of syllables of bird vocalisations is discovered in an unsupervised manner by employing dynamic time warping and agglomerative hierarchical clustering. A novel iterative maximum likelihood procedure is used to train individual HMMs for the syllables of each species. Modelling of the state duration is employed in a post-recognition stage by combining the likelihoods of the acoustic and duration models. Experiments are performed on over 33 hours of field recordings containing 30 bird species. Evaluations demonstrate that the proposed unsupervised iterative HMM training procedure and the duration modelling provide, on average, a 45% error rate reduction. The presented system recognises bird species with an accuracy of 97.8% using 3 seconds of the detected signal.",

author = "Peter Jancovic and Munevver Kokuer and Masoud Zakeri and Martin Russell",

year = "2016",

month = mar,

day = "20",

doi = "10.1109/ICASSP.2016.7471737",

language = "English",

note = "IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2016 ; Conference date: 20-03-2016 Through 25-03-2016",

url = "http://www.icassp2016.org",

}

Bird species recognition using HMM-based unsupervised modelling of individual syllables with incorporated duration modelling. / Jancovic, Peter; Kokuer, Munevver; Zakeri, Masoud et al.
2016. Paper presented at IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2016, Shanghai, China.

TY - CONF

T1 - Bird species recognition using HMM-based unsupervised modelling of individual syllables with incorporated duration modelling

AU - Jancovic, Peter

AU - Kokuer, Munevver

AU - Zakeri, Masoud

AU - Russell, Martin

PY - 2016/3/20

Y1 - 2016/3/20

N2 - This paper presents an HMM-based automatic system for recognition of bird species from audio field recordings. It includes improved unsupervised modelling of individual bird syllables and duration modelling. The acoustic signal is decomposed into isolated segments, each segment containing the temporal evolution of a detected sinusoidal component. Modelling of bird syllables is performed using hidden Markov models (HMMs). A set of syllables of bird vocalisations is discovered in an unsupervised manner by employing dynamic time warping and agglomerative hierarchical clustering. A novel iterative maximum likelihood procedure is used to train individual HMMs for the syllables of each species. Modelling of the state duration is employed in a post-recognition stage by combining the likelihoods of the acoustic and duration models. Experiments are performed on over 33 hours of field recordings containing 30 bird species. Evaluations demonstrate that the proposed unsupervised iterative HMM training procedure and the duration modelling provide, on average, a 45% error rate reduction. The presented system recognises bird species with an accuracy of 97.8% using 3 seconds of the detected signal.

U2 - 10.1109/ICASSP.2016.7471737

DO - 10.1109/ICASSP.2016.7471737

M3 - Paper

T2 - IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP) 2016

Y2 - 20 March 2016 through 25 March 2016

ER -
