Road and travel time cross-validation for urban modelling

Henry Crosby*, Theodoros Damoulas, Stephen A. Jarvis

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The physical and social processes in urban systems are inherently spatial, and hence the data describing them contain spatial autocorrelation (SAC; a proximity-based interdependency in a variable) that needs to be accounted for. Standard k-fold cross-validation (KCV) techniques, which attempt to measure the generalisation performance of machine learning and statistical algorithms, are inappropriate in this setting because of their inherent i.i.d. assumption, which is violated by spatial dependency. As such, more appropriate validation methods have been considered, notably blocking and spatial k-fold cross-validation (SKCV). However, the physical barriers and complex network structures that make up a city's landscape mean that these methods are also inappropriate, largely because the travel patterns (and hence the SAC) in most urban spaces are rarely Euclidean in nature. To overcome this problem, we propose a new road distance and travel time k-fold cross-validation method, RT-KCV. We show that it outperforms the prior art by providing better estimates of the true generalisation performance on unseen data.
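The idea shared by SKCV-style methods can be illustrated with a minimal sketch: for each fold, training points that lie within a chosen radius of any test point are discarded, so that the model is not scored on points strongly correlated with its training data. The sketch below is not the authors' RT-KCV implementation; the function name `buffered_kfold`, its parameters and the use of a precomputed distance matrix are assumptions made for illustration. Supplying Euclidean distances gives a buffer in the spirit of SKCV, while supplying road distances or travel times (obtained, for instance, from a routing engine) gives a buffer in the spirit of the abstract's proposal.

```python
import numpy as np

def buffered_kfold(dist, k, radius, seed=0):
    """Yield (train_idx, test_idx) index arrays for k folds.

    Hypothetical helper, for illustration only.
    dist   : (n, n) array of pairwise distances (Euclidean, road distance
             or travel time), in the same units as `radius`. dist[i, j] is
             read as the distance from point i to point j.
    radius : training points whose distance to the nearest test point is
             at most `radius` are dropped from that fold's training set.
    """
    n = dist.shape[0]
    rng = np.random.default_rng(seed)
    # Random partition of the n points into k folds of near-equal size.
    folds = np.array_split(rng.permutation(n), k)
    for test_idx in folds:
        in_test = np.zeros(n, dtype=bool)
        in_test[test_idx] = True
        # Distance from every point to its nearest test point.
        nearest_test = dist[:, test_idx].min(axis=1)
        # Keep only non-test points that lie outside the buffer.
        keep = ~in_test & (nearest_test > radius)
        yield np.flatnonzero(keep), test_idx

# Toy usage: six points on a line, one unit apart.
pts = np.arange(6.0)
toy_dist = np.abs(pts[:, None] - pts[None, :])
for train_idx, test_idx in buffered_kfold(toy_dist, k=3, radius=1.0):
    print("test:", sorted(test_idx.tolist()),
          "train:", sorted(train_idx.tolist()))
```

A larger radius removes more training data per fold, so in practice the radius would be chosen from an estimate of the range over which the autocorrelation is non-negligible.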

Original language: English
Pages (from-to): 98-118
Number of pages: 21
Journal: International Journal of Geographical Information Science
Volume: 34
Issue number: 1
Publication status: Published - 2 Jan 2020

Bibliographical note

Funding Information:
This work was supported by the Engineering and Physical Sciences Research Council [EP/L016400/1, EP/N510129/1] and the Lloyd's Register Foundation programme on Data Centric Engineering. We would like to thank the Engineering and Physical Sciences Research Council (EPSRC) Centre for Doctoral Training in Urban Science (EP/L016400/1) and the Alan Turing Institute (EP/N510129/1) grants. Finally, our gratitude goes to all the open source mapping contributors.

Publisher Copyright:
© 2019 Informa UK Limited, trading as Taylor & Francis Group.

Keywords

  • cross-validation
  • GIS
  • road distance
  • travel time
  • urban science

ASJC Scopus subject areas

  • Information Systems
  • Geography, Planning and Development
  • Library and Information Sciences
