Abstract
Social media (SM) platforms such as Twitter offer a rich source of real-time information about crises, from which useful content can be extracted to support situational awareness. Automatically identifying SM messages related to a specific event poses many challenges, including processing large volumes of short, noisy data in real time. This paper explored the problem of extracting crisis-related messages from Arabic Twitter data. We focused on high-risk floods, as they are one of the main hazards in the Middle East. We presented a gold-standard Arabic Twitter corpus for four high-risk floods that occurred in 2018. Using the annotated dataset, we investigated the performance of different classical machine learning (ML) and deep neural network (DNN) classifiers. The results showed that deep learning is promising for identifying flood-related posts.
Original language | English |
---|---|
Title of host publication | Proceedings of the 3rd Workshop on Arabic Corpus Linguistics |
Publisher | Association for Computational Linguistics, ACL |
Pages | 72-79 |
Number of pages | 8 |
ISBN (Electronic) | 978-1-950737-32-1 |
Publication status | Published - 22 Jun 2019 |
Event | The 3rd Workshop on Arabic Corpus Linguistics (WACL-3), Cardiff, United Kingdom. Duration: 22 Jul 2019 → 24 Jul 2019 |
Conference
Conference | The 3rd Workshop on Arabic Corpus Linguistics (WACL-3) |
---|---|
Abbreviated title | WACL-3 |
Country/Territory | United Kingdom |
City | Cardiff |
Period | 22/07/19 → 24/07/19 |