TY - GEN
T1 - The effectiveness of a new negative correlation learning algorithm for classification ensembles
AU - Wang, Shuo
AU - Yao, Xin
PY - 2010
Y1 - 2010
N2 - In an earlier paper, we proposed a new negative correlation learning (NCL) algorithm for classification ensembles, called AdaBoost.NC, which performs significantly better than standard AdaBoost and other NCL algorithms on many benchmark data sets at low computational cost. In this paper, we give deeper insight into this algorithm from both theoretical and experimental perspectives to understand its effectiveness. We explain why AdaBoost.NC can reduce error correlation within the ensemble and improve classification performance. We also show the role of the amb (penalty) term in the training error. Finally, we examine the effectiveness of AdaBoost.NC by varying two pre-defined parameters: penalty strength λ and ensemble size T. Experiments on both artificial and real-world data sets show that AdaBoost.NC does produce smaller error correlation as training epochs progress, and a lower test error than standard AdaBoost. The optimal λ depends on the problem domain and the base learner. The performance of AdaBoost.NC becomes stable as T gets larger. AdaBoost.NC is more effective when T is comparatively small.
AB - In an earlier paper, we proposed a new negative correlation learning (NCL) algorithm for classification ensembles, called AdaBoost.NC, which performs significantly better than standard AdaBoost and other NCL algorithms on many benchmark data sets at low computational cost. In this paper, we give deeper insight into this algorithm from both theoretical and experimental perspectives to understand its effectiveness. We explain why AdaBoost.NC can reduce error correlation within the ensemble and improve classification performance. We also show the role of the amb (penalty) term in the training error. Finally, we examine the effectiveness of AdaBoost.NC by varying two pre-defined parameters: penalty strength λ and ensemble size T. Experiments on both artificial and real-world data sets show that AdaBoost.NC does produce smaller error correlation as training epochs progress, and a lower test error than standard AdaBoost. The optimal λ depends on the problem domain and the base learner. The performance of AdaBoost.NC becomes stable as T gets larger. AdaBoost.NC is more effective when T is comparatively small.
KW - Classification
KW - Diversity
KW - Ensemble learning
KW - Negative correlation learning
UR - http://www.scopus.com/inward/record.url?scp=79951730881&partnerID=8YFLogxK
U2 - 10.1109/ICDMW.2010.196
DO - 10.1109/ICDMW.2010.196
M3 - Conference contribution
AN - SCOPUS:79951730881
SN - 9780769542577
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 1013
EP - 1020
BT - Proceedings - 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010
T2 - 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010
Y2 - 14 December 2010 through 17 December 2010
ER -