Data-driven region-of-interest selection without inflating Type I error rate

Research output: Contribution to journal › Article › peer-review


External organisations

  • University of Kent


In event-related potential (ERP) and other large multi-dimensional neuroscience datasets, researchers often select regions-of-interest (ROIs) for analysis. The method of ROI selection can critically affect the conclusions of a study by causing the researcher to miss effects in the data or to detect spurious effects. In practice, to avoid inflating the Type I error rate (i.e., false positives), ROIs are often based on a priori hypotheses or independent information. However, this approach can be insensitive to experiment-specific variations in effect location (e.g., latency shifts), reducing power to detect effects. Data-driven ROI selection, in contrast, is non-independent and uses the data under analysis to determine ROI positions. Therefore, it has the potential to select ROIs based on experiment-specific information and increase power for detecting effects. However, data-driven methods have been criticized because they can substantially inflate the Type I error rate. Here we demonstrate, using simulations of simple ERP experiments, that data-driven ROI selection can indeed be more powerful than a priori hypotheses or independent information. Furthermore, we show that data-driven ROI selection using the aggregate-grand-average from trials (AGAT), despite being based on the data at hand, can be safely used for ROI selection under many circumstances. However, when there is a noise difference between conditions, using the AGAT can inflate the Type I error rate and should be avoided. We identify critical assumptions for use of the AGAT and provide a basis for researchers to use, and reviewers to assess, data-driven methods of ROI localization in ERP and other studies.
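As a rough illustration of the idea in the abstract, the following Python sketch simulates a null experiment (both conditions share the same ERP) and compares two data-driven ways of centring an ROI window: on the peak of the AGAT, and on the peak of the condition difference wave. It is not the authors' simulation code. The function names, the Gaussian ERP shape, the white-noise model, the window width, and all parameter values are assumptions chosen for illustration, and the sketch assumes equal trial counts and equal noise in both conditions.

```python
import numpy as np

# Two-sided 5% critical t value for 19 degrees of freedom (20 subjects).
T_CRIT_DF19 = 2.093


def simulate_null_experiment(rng, n_subjects=20, n_trials=20, n_times=100):
    """Single-trial data, shape (subjects, conditions, trials, times).

    Both conditions contain the same hypothetical ERP, so any
    "significant" condition difference is a false positive.
    """
    t = np.arange(n_times)
    erp = 0.5 * np.exp(-0.5 * ((t - 50) / 5.0) ** 2)
    noise = rng.normal(0.0, 1.0, size=(n_subjects, 2, n_trials, n_times))
    return erp + noise


def window_around_peak(wave, half_width=2):
    """Slice of time points centred on the largest absolute amplitude."""
    peak = int(np.argmax(np.abs(wave)))
    return slice(max(peak - half_width, 0), min(peak + half_width + 1, wave.size))


def paired_t(diff):
    """One-sample t statistic of per-subject condition differences."""
    return diff.mean() / (diff.std(ddof=1) / np.sqrt(diff.size))


def false_positive_rates(n_sims=400, seed=0):
    """Estimated Type I error rate for each ROI selection rule."""
    rng = np.random.default_rng(seed)
    hits = {"agat": 0, "difference": 0}
    for _ in range(n_sims):
        data = simulate_null_experiment(rng)
        subj_avg = data.mean(axis=2)  # (subjects, conditions, times)
        # AGAT: pool every trial from both conditions, then average.
        agat = data.mean(axis=(0, 1, 2))
        # Grand-average difference wave (selection is not independent of the test).
        diff_wave = (subj_avg[:, 0] - subj_avg[:, 1]).mean(axis=0)
        for name, wave in (("agat", agat), ("difference", diff_wave)):
            win = window_around_peak(wave)
            per_subject = (subj_avg[:, 0, win].mean(axis=1)
                           - subj_avg[:, 1, win].mean(axis=1))
            hits[name] += int(abs(paired_t(per_subject)) > T_CRIT_DF19)
    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    print(false_positive_rates())
```

Under these assumptions, the AGAT-selected window should give a false positive rate close to the nominal 5%, while the window chosen from the difference wave should exceed it. Giving the two conditions different noise levels in `simulate_null_experiment` is one way to probe the case the abstract warns about.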


Original language: English
Pages (from-to): 100-113
Issue number: 1
Early online date: 20 Dec 2016
Publication status: Published - Jan 2017


  • EEG, ERPs, window selection, Type I error rate