Boosting computational effectiveness in big spatial flow data analysis with intelligent data reduction

Research output: Contribution to journal > Article > peer-review

External organisations

  • University of South Florida Tampa
  • Tsinghua University
  • North Carolina State University

Abstract

One of the enduring issues in spatial origin-destination (OD) flow data analysis is computational inefficiency, or even the outright inability to handle large datasets. Despite recent advances in high-performance computing (HPC) and the ready availability of powerful computing infrastructure, we argue that the best solutions are based on a thorough understanding of the fundamental properties of the data. This paper focuses on overcoming the computational challenge through data reduction that intelligently takes advantage of the heavy-tailed distribution of most flow datasets. Specifically, we propose using the head/tail breaks classification technique for this purpose. We test this approach with representative algorithms from three common method families: flowAMOEBA from flow clustering, Louvain from network community detection, and PageRank from network centrality. The experiments use a variety of flow datasets, including inter-city travel flows, cellphone call flows, and synthetic flows. We propose a standard framework for systematically evaluating the applicability of the approach, not only to the three selected algorithms but to any given method. The results show that head/tail breaks can significantly improve the computational capability and efficiency of flow data analyses while preserving result quality, provided that the analysis emphasizes the "head" part of the dataset, that is, the flows with high absolute values. We recommend considering this easy-to-implement data reduction technique before analyzing a large flow dataset.
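To make the data reduction idea concrete, the following is a minimal sketch of head/tail breaks applied to flow volumes. It is not the authors' implementation. It assumes that flows are given as (origin, destination, volume) tuples, that the recursion stops once the head is no longer a minority (the 40% limit used here is a commonly cited convention, not a value taken from this abstract), and that the function names head_tail_breaks and reduce_flows are hypothetical.

```python
from statistics import mean


def head_tail_breaks(values, head_limit=0.4):
    """Return mean-based break points, lowest first.

    At each step the values are split at their arithmetic mean; those
    above the mean form the "head". The split is recorded and repeated
    on the head as long as the head stays a minority of the current
    values (share <= head_limit, an assumed stopping rule).
    """
    breaks = []
    current = list(values)
    while len(current) > 1:
        m = mean(current)
        head = [v for v in current if v > m]
        # Stop when nothing lies above the mean or the head is too large.
        if not head or len(head) / len(current) > head_limit:
            break
        breaks.append(m)
        current = head
    return breaks


def reduce_flows(flows, level=1, head_limit=0.4):
    """Keep only flows whose volume exceeds the level-th break point.

    `flows` is an iterable of (origin, destination, volume) tuples
    (an assumed layout). A level beyond the last available break is
    clamped to the last break; if no break exists, all flows are kept.
    """
    flows = list(flows)
    breaks = head_tail_breaks([volume for _, _, volume in flows], head_limit)
    if not breaks:
        return flows
    threshold = breaks[min(level, len(breaks)) - 1]
    return [flow for flow in flows if flow[2] > threshold]
```

The reduced flow list can then be passed to a downstream method such as a clustering, community detection, or centrality algorithm. Choosing a higher level keeps fewer, larger flows, which is only appropriate when the analysis is concerned with the head of the distribution.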

Bibliographic note

Publisher Copyright: © 2020 by the authors. Licensee MDPI, Basel, Switzerland. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.

Details

Original language: English
Article number: 299
Journal: ISPRS International Journal of Geo-Information
Volume: 9
Issue number: 5
Publication status: Published - May 2020

Keywords

  • Big flow data, Data reduction, Geocomputation, Head/tail breaks, Network analysis