Enhancing contextualised language models with static character and word embeddings for emotional intensity and sentiment strength detection in Arabic tweets

Abdullah I. Alharbi*, Phillip Smith, Mark Lee

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

Many studies have focused on Arabic sentiment or emotion classification tasks. However, research on alternative aspects of affect, such as emotional intensity and sentiment strength tasks, has been somewhat limited. In this paper, we propose a method for enriching a contextualised language model that incorporates static character and word embeddings for emotional intensity and sentiment strength in Arabic tweets. We examine the assumption that models using static embeddings that are trained specifically on a corpus containing extensive Arabic affect-related words can boost the performance of language models. Through the development of character-level embeddings, we have found that our method is able to overcome the limitations associated with out-of-vocabulary words, which is a common problem when dealing with Arabic informal text. Given this, the method that we have developed achieves state-of-the-art results for the detection of the intensity of emotion and sentiment strength in Arabic social media.

Original language: English
Pages (from-to): 258-265
Number of pages: 8
Journal: Procedia CIRP
Volume: 189
Publication status: Published - 14 Jul 2021
Event: 5th International Conference on Artificial Intelligence in Computational Linguistics, ACLing 2021 - Virtual, Online, United Arab Emirates
Duration: 4 Jun 2021 - 5 Jun 2021

Keywords

  • Arabic Emotional Intensity
  • Character
  • Contextualised Language Models
  • Word Embeddings

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Industrial and Manufacturing Engineering
