A software architecture for Twitter collection, search and geolocation services

Research output: Contribution to journal > Article > peer-review

Abstract

The substantial growth of social networks and their combination with mobile devices make rigorous analysis of the output of such systems of paramount importance for intelligence gathering and decision making. Since the introduction of Twitter in 2006, tweeting has emerged as an efficient open social network that has attracted interest from research, commercial and military communities. This paper investigates the current software architecture of the Twitter system and puts forward a new architecture dedicated to semantic and spatial analysis of Twitter data. Specifically, the Twitter Streaming API was used as the basis for tweet collection, with the collected data stored in a MySQL-like database, while the Lucene system, together with the WordNet lexical database linked to advanced natural language processing, and the PostGIS platform were used to support semantic and spatial analysis of the collected data. A functional diversity approach was implemented to enforce fault tolerance in the data collection part, and its performance was evaluated through comparison with alternative approaches. The proposal enables the discovery of spatial patterns within geo-located Twitter data and can provide the user or operator with useful unforeseen elements.
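As a rough illustration of the functional diversity idea mentioned in the abstract, the sketch below tries several independently implemented collectors in turn and falls back to the next one when a collector fails. It is not the authors' implementation: the names `collect_with_diversity`, `streaming_collector` and `search_collector` are hypothetical, the collectors are stubs that make no real Twitter API calls, and the abstract does not say which alternative collection paths the paper actually uses.

```python
from typing import Callable, Dict, Iterable, List

Tweet = Dict[str, object]
Collector = Callable[[], Iterable[Tweet]]


def collect_with_diversity(collectors: List[Collector]) -> List[Tweet]:
    """Return the batch from the first collector that succeeds.

    Each collector is assumed to reach the same data by a different
    route, so a failure in one route need not stop collection.
    """
    errors: List[Exception] = []
    for collector in collectors:
        try:
            return list(collector())
        except Exception as exc:  # record the failure, try the next route
            errors.append(exc)
    raise RuntimeError(f"all {len(collectors)} collectors failed: {errors}")


def streaming_collector() -> Iterable[Tweet]:
    # Hypothetical stub: simulates a dropped streaming connection.
    raise ConnectionError("stream dropped")


def search_collector() -> Iterable[Tweet]:
    # Hypothetical stub: simulates a fallback route returning one tweet.
    return [{"id": 1, "text": "hello", "lat": 52.45, "lon": -1.93}]


if __name__ == "__main__":
    tweets = collect_with_diversity([streaming_collector, search_collector])
    print(len(tweets))
```

Ordering the collectors by preference keeps the primary route in use whenever it is healthy; the fallback is consulted only after the primary raises.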

Details

Original language: English
Pages (from-to): 105-120
Journal: Knowledge-Based Systems
Volume: 37
Early online date: 7 Aug 2012
Publication status: Published - 1 Jan 2013

Keywords

  • Data mining, Tweet, Social network, Software architecture, Semantic analysis