Morpheme-based Language Models for Improving the Speech Recognition of Arabic Dialects

Khalid Almeman, Mark Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

This paper reports experiments to improve the language models (LMs) for three parallel Arabic dialects. For each dialect, two LMs were produced: a closed-domain LM and an open-domain LM. The second part of the multi-dialect morphology analyser's methodology, which retrieves web frequencies for different parts of a word, was modified and then used to extract the three suggested forms of each word: stem alone, prefix+stem and stem+suffix. Six results were then extracted per dialect, giving a total of eighteen results. All the experiments yielded positive results, with word error rate (WER) improvements of between 0.5% and 6.8%.
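
As a rough illustration of the three word forms named in the abstract, the sketch below builds "stem", "prefix+stem" and "stem+suffix" from a word that has already been segmented. It is not the authors' analyser: the `Segmentation` type, the `candidate_forms` function and the transliterated example are hypothetical, and the web-frequency retrieval step is not modelled.

```python
from typing import Dict, NamedTuple


class Segmentation(NamedTuple):
    """Hypothetical container for an already-segmented word."""
    prefix: str
    stem: str
    suffix: str


def candidate_forms(seg: Segmentation) -> Dict[str, str]:
    """Return the three forms named in the abstract.

    Assumption: a form is only emitted when the affix it needs is present.
    """
    forms = {"stem": seg.stem}
    if seg.prefix:
        forms["prefix+stem"] = seg.prefix + seg.stem
    if seg.suffix:
        forms["stem+suffix"] = seg.stem + seg.suffix
    return forms


# Illustrative Latin transliteration only, not data from the paper.
example = Segmentation(prefix="al", stem="kitab", suffix="ha")
print(candidate_forms(example))
```

Each form could then be used as the vocabulary unit for a separate LM, which is one reading of how several results per dialect arise; the abstract does not spell out the exact breakdown.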
Original language: English
Title of host publication: Proceedings of the 5th International Conference on Arabic Language Processing
Subtitle of host publication: (CITALA '14)
Publisher: CITALA
Pages: 49-56
Publication status: Published - Nov 2014
Event: 5th International Conference on Arabic Language Processing (CITALA '14) - Oujda, Morocco
Duration: 26 Nov 2014 to 27 Nov 2014

Conference

Conference: 5th International Conference on Arabic Language Processing (CITALA '14)
Country/Territory: Morocco
City: Oujda
Period: 26/11/14 to 27/11/14
