Abstract
This paper extends our recent work on recognising multiple bird species from their vocalisations by incorporating improved acoustic modelling. The acoustic scene is segmented into isolated spectro-temporal segments by a sinusoidal detection algorithm, which can handle multiple simultaneous bird vocalisations. Each segment is represented as a temporal sequence of frequencies of the detected sinusoid. Each bird species is represented by a set of hidden Markov models (HMMs), with each HMM modelling a particular vocalisation element. The set of elements is discovered in an unsupervised manner using a partial dynamic time warping algorithm and agglomerative hierarchical clustering. Multiple bird species are recognised by maximising the likelihood of the set of detected segments over a subset of bird species models, with a penalty applied for increasing the number of bird species. Experimental evaluations used audio field recordings containing 30 bird species. Detected segments from several bird species are joined to simulate the presence of multiple bird species. It is demonstrated that the improved acoustic modelling, used in conjunction with the maximum likelihood score combination method, provides considerable improvements over both our previous results and majority voting.
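The penalised maximum likelihood selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a log-likelihood has already been computed for every detected segment under every species model, that a segment is scored by the best-matching species in the candidate subset, and that the penalty grows linearly with the number of species. The function name `best_species_subset`, the `penalty` and `max_species` parameters, the species names, and the numbers are all hypothetical.

```python
from itertools import combinations


def best_species_subset(loglik, penalty, max_species=3):
    """Pick the species subset with the highest penalised log-likelihood.

    loglik maps a species name to a list of per-segment log-likelihoods
    (assumed precomputed, e.g. from that species' HMMs).
    """
    species = sorted(loglik)
    n_segments = len(next(iter(loglik.values())))
    best_subset, best_score = None, float("-inf")

    # Exhaustive search over subsets of size 1..max_species (illustrative only).
    for k in range(1, min(max_species, len(species)) + 1):
        for subset in combinations(species, k):
            # Assumption: each segment is explained by its best-scoring species.
            total = sum(
                max(loglik[s][i] for s in subset) for i in range(n_segments)
            )
            # Assumption: linear penalty on the number of species in the subset.
            score = total - penalty * k
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical scores: four segments, three candidate species.
loglik = {
    "blackbird": [-10.0, -12.0, -40.0, -42.0],
    "robin": [-35.0, -38.0, -11.0, -13.0],
    "wren": [-30.0, -31.0, -29.0, -30.0],
}
subset, score = best_species_subset(loglik, penalty=5.0)
print(subset, score)
```

With these made-up scores, a small penalty favours the two species that each explain half of the segments, while a large penalty makes the search fall back to a single species.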
Original language | English |
---|---|
Title of host publication | Interspeech 2016 |
Editors | Nelson Morgan |
Publisher | ISCA |
Pages | 2612-2616 |
DOIs | |
Publication status | Published - 8 Sept 2016 |
Event | Interspeech 2016, San Francisco, United States. Duration: 8 Sept 2016 → 12 Sept 2016. http://interspeech2016.org/ |
Publication series
Name | Interspeech Proceedings |
---|---|
ISSN (Electronic) | 1990-9772 |
Conference
Conference | Interspeech 2016 |
---|---|
Country/Territory | United States |
City | San Francisco |
Period | 8/09/16 → 12/09/16 |
Internet address | http://interspeech2016.org/ |
Keywords
- multiple bird species recognition
- HMM
- vocalisation
- element
- unsupervised training
- sinusoid detection