Software Effort Estimation (SEE) may suffer from changes over time in the relationship between the features describing software projects and their required effort, which hinders the predictive performance of machine learning models. To cope with this, most machine learning-based SEE approaches rely on receiving a large number of Within-Company (WC) projects for training over time, which is prohibitively expensive. The Dycom approach reduces the number of required WC training projects by transferring knowledge from Cross-Company (CC) projects. However, it assumes that CC projects have no chronology and are entirely available before WC projects start being estimated. Given the importance of chronology for coping with changes, it may be beneficial to also take the chronology of CC projects into account. This paper thus investigates whether, and under what circumstances, treating CC projects as multiple data streams to be learnt over time is useful for improving SEE. To this end, an extension of Dycom called OATES is proposed to enable multi-stream online learning, so that both incoming WC and CC data streams can be learnt over time. OATES is then compared against Dycom and five other approaches in a case study using four different scenarios derived from the ISBSG Repository. The results show that OATES improved predictive performance over the state-of-the-art when the number of CC projects available beforehand was small. Learning CC projects over time as multiple data streams is thus recommended for improving SEE in such a scenario. When the number of CC projects available beforehand was large, OATES obtained similar predictive performance to the state-of-the-art. Therefore, CC data streams are unnecessary in this scenario, but are not detrimental either.
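As an illustration of what learning several project streams in chronological order can look like, here is a minimal Python sketch. It is not the OATES or Dycom algorithm, whose internals the abstract does not describe. The learner (`RunningRatioModel`, a running mean of effort per size unit), the stream names, the plain averaging of per-stream estimates, and the sample data are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RunningRatioModel:
    """Toy online learner (assumed): mean effort per size unit seen so far."""
    total_ratio: float = 0.0
    count: int = 0

    def update(self, size: float, effort: float) -> None:
        # Incorporate one completed project into the running mean.
        self.total_ratio += effort / size
        self.count += 1

    def predict(self, size: float) -> float:
        if self.count == 0:
            raise ValueError("model has not seen any project yet")
        return size * self.total_ratio / self.count


# One project: (timestamp, stream name, size, actual effort).
Project = Tuple[int, str, float, float]


def run_streams(projects: List[Project],
                target: str = "WC") -> List[Tuple[int, float, float]]:
    """Process all streams in time order; estimate target-stream projects
    before their actual effort is used for training."""
    models: Dict[str, RunningRatioModel] = {}
    results = []
    for timestamp, stream, size, effort in sorted(projects, key=lambda p: p[0]):
        if stream == target and models:
            # Average the estimates of every stream model trained so far
            # (a simplification chosen for this sketch).
            estimate = sum(m.predict(size) for m in models.values()) / len(models)
            results.append((timestamp, estimate, effort))
        # Train the model of the stream this project belongs to.
        models.setdefault(stream, RunningRatioModel()).update(size, effort)
    return results


# Hypothetical data: two CC streams and one WC stream interleaved over time.
sample = [
    (1, "CC1", 100.0, 1000.0),
    (2, "CC2", 50.0, 1000.0),
    (3, "WC", 10.0, 150.0),
    (4, "CC1", 200.0, 2000.0),
    (5, "WC", 20.0, 300.0),
]
for timestamp, estimate, actual in run_streams(sample):
    print(timestamp, estimate, actual)
```

Each WC project is estimated using only projects that arrived earlier, from any stream, which mirrors the chronological evaluation setting the abstract describes. A real approach would replace the averaging step with a learnt mapping or weighting between companies.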
|Title of host publication||PROMISE 2021|
|Subtitle of host publication||Proceedings of the 17th ACM International Conference on Predictive Models and Data Analytics in Software Engineering|
|Editors||Shane McIntosh, Xin Xia, Sousuke Amasaki|
|Publisher||Association for Computing Machinery (ACM)|
|Number of pages||10|
|Publication status||Published - 19 Aug 2021|
|Event||The 17th International Conference on Predictive Models and Data Analytics in Software Engineering - Athens, Greece (Duration: 19 Aug 2021 → 20 Aug 2021)|
|Name||PROMISE: Predictor Models in Software Engineering|
|Conference||The 17th International Conference on Predictive Models and Data Analytics in Software Engineering|
|Period||19/08/21 → 20/08/21|
Bibliographical note
Funding Information:
This work was supported by EPSRC Grant No. EP/R006660/2.
© 2021 ACM.
Keywords
- Software effort estimation
- concept drift
- cross-company learning
- data streams
- transfer learning
ASJC Scopus subject areas
- Electrical and Electronic Engineering