Enhancing contextualised language models with static character and word embeddings for emotional intensity and sentiment strength detection in Arabic tweets

Research output: Contribution to journal › Conference article › peer-review

Authors

  • Abdullah I. Alharbi
  • Phillip Smith
  • Mark Lee

Colleges, School and Institutes

External organisations

  • King Abdulaziz University

Abstract

Many studies have focused on Arabic sentiment or emotion classification tasks. However, research on alternative aspects of affect, such as emotional intensity and sentiment strength tasks, has been somewhat limited. In this paper, we propose a method for enriching a contextualised language model that incorporates static character and word embeddings for emotional intensity and sentiment strength in Arabic tweets. We examine the assumption that models using static embeddings that are trained specifically on a corpus containing extensive Arabic affect-related words can boost the performance of language models. Through the development of character-level embeddings, we have found that our method is able to overcome the limitations associated with out-of-vocabulary words, which is a common problem when dealing with Arabic informal text. Given this, the method that we have developed achieves state-of-the-art results for the detection of the intensity of emotion and sentiment strength in Arabic social media.

Details

Original language: English
Pages (from-to): 258-265
Number of pages: 8
Journal: Procedia CIRP
Volume: 189
Publication status: Published - 14 Jul 2021
Event: 5th International Conference on Artificial Intelligence in Computational Linguistics, ACLing 2021 - Virtual, Online, United Arab Emirates
Duration: 4 Jun 2021 - 5 Jun 2021

Keywords

  • Arabic Emotional Intensity, Character, Contextualised Language Models, Word Embeddings