Class imbalance evolution and verification latency in just-in-time software defect prediction

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Authors

Colleges, School and Institutes

External organisations

  • Federal Rural University of Pernambuco, Brazil
  • Concordia University, Canada

Abstract

Just-in-Time Software Defect Prediction (JIT-SDP) is an SDP approach that makes defect predictions at the software change level. Most existing JIT-SDP work assumes that the characteristics of the problem remain the same over time. However, JIT-SDP may suffer from class imbalance evolution. Specifically, the imbalance status of the problem (i.e., how underrepresented the defect-inducing changes are) may intensify or diminish over time. If this occurs, it could render existing JIT-SDP approaches unsuitable, including those that rebuild classifiers over time using only recent data. This work thus provides the first investigation of whether class imbalance evolution poses a threat to JIT-SDP. The investigation is performed in a realistic scenario by taking into account verification latency – the often overlooked fact that labeled training examples arrive with a delay. Based on 10 GitHub projects, we show that JIT-SDP suffers from class imbalance evolution, which significantly hinders the predictive performance of existing JIT-SDP approaches. Compared to state-of-the-art class imbalance evolution learning approaches, the predictive performance of existing JIT-SDP approaches was up to 97.2% lower in terms of g-mean. Hence, it is essential to tackle class imbalance evolution in JIT-SDP. We then propose a novel class imbalance evolution approach for the specific context of JIT-SDP. While maintaining top-ranked g-means, this approach produced up to 63.59% more balanced recalls on the defect-inducing and clean classes than state-of-the-art class imbalance evolution approaches. We thus recommend it to avoid overemphasizing one class over the other in JIT-SDP.
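
The abstract reports results as g-mean and as the balance between the recalls on the two classes. As a rough illustration only, the sketch below assumes the conventional definition of g-mean (the geometric mean of the recall on the defect-inducing class and the recall on the clean class) and takes the absolute difference between the two recalls as a simple balance measure. The function names, the label encoding (1 = defect-inducing, 0 = clean) and the toy data are assumptions made for this example; they are not taken from the paper.

```python
import math


def recalls(y_true, y_pred):
    """Return (recall on class 1, recall on class 0).

    Assumed encoding: 1 = defect-inducing change, 0 = clean change.
    A class with no examples gets recall 0.0 by convention here.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    recall1 = tp / pos if pos else 0.0
    recall0 = tn / neg if neg else 0.0
    return recall1, recall0


def g_mean(y_true, y_pred):
    """Geometric mean of the two per-class recalls."""
    recall1, recall0 = recalls(y_true, y_pred)
    return math.sqrt(recall1 * recall0)


def recall_gap(y_true, y_pred):
    """Absolute difference between the recalls (0 = perfectly balanced)."""
    recall1, recall0 = recalls(y_true, y_pred)
    return abs(recall1 - recall0)


# Toy data: 2 defect-inducing changes among 10 (an imbalanced stream).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that always predicts "clean" is 80% accurate here,
# yet it never detects a defect-inducing change.
always_clean = [0] * 10
print(g_mean(y_true, always_clean), recall_gap(y_true, always_clean))

# A classifier that catches one defect at the cost of two false alarms.
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
print(g_mean(y_true, mixed), recall_gap(y_true, mixed))
```

On the toy data, the always-clean classifier scores a g-mean of 0 despite its high accuracy, which is why a metric built from per-class recalls is more informative than accuracy when defect-inducing changes are rare.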

Details

Original language: English
Title of host publication: Proceedings of the 41st ACM/IEEE International Conference on Software Engineering (ICSE 2019)
Publication status: Published - 25 May 2019
Event: 41st ACM/IEEE International Conference on Software Engineering (ICSE 2019) - Montreal, Canada
Duration: 25 May 2019 to 31 May 2019

Conference

Conference: 41st ACM/IEEE International Conference on Software Engineering (ICSE 2019)
Country: Canada
City: Montreal
Period: 25/05/19 to 31/05/19