Morpheme-based Language Models for Improving the Speech Recognition of Arabic Dialects

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution




In this paper, innovative experiments were carried out to
improve the language models (LMs) for three parallel dialects. For
each dialect, two different LMs were produced: a closed-domain
LM and an open-domain LM. The methodology of the second
part of the multi-dialect morphology analyser, which involved retrieving
web frequencies for different parts of a word, was modified
and then used to extract the three suggested forms
of the word: stem alone, prefix+stem and stem+suffix. Six results
were then extracted per dialect, giving a total of eighteen results.
All the experiments yielded positive results, with improvements in
word error rate (WER) of between 0.5% and 6.8%.
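
The three word forms and the WER metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Segmentation` type, the function names and the transliterated example word are hypothetical, and the word is assumed to have already been split into prefix, stem and suffix by a morphology analyser. WER is computed here with the standard word-level edit distance divided by the reference length.

```python
from typing import Dict, NamedTuple


class Segmentation(NamedTuple):
    """Hypothetical output of a morphology analyser for one word."""
    prefix: str
    stem: str
    suffix: str


def candidate_forms(seg: Segmentation) -> Dict[str, str]:
    """Build the three forms listed in the abstract from one segmentation."""
    return {
        "stem": seg.stem,
        "prefix+stem": seg.prefix + seg.stem,
        "stem+suffix": seg.stem + seg.suffix,
    }


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative transliteration only, not taken from the paper.
    seg = Segmentation(prefix="wa", stem="kitab", suffix="hum")
    print(candidate_forms(seg))
    print(word_error_rate("a b c d", "a x c"))
```

In a setup like the one described, each of the three forms would be a candidate unit for the language model, and the WER of the recogniser would be compared before and after the change.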


Original language: English
Title of host publication: Proceedings of the 5th International Conference on Arabic Language Processing
Subtitle of host publication: (CITALA '14)
Publication status: Published - Nov 2014
Event: 5th International Conference on Arabic Language Processing (CITALA '14) - Oujda, Morocco
Duration: 26 Nov 2014 to 27 Nov 2014


Conference: 5th International Conference on Arabic Language Processing (CITALA '14)