A new minimum description length based pruning technique for rule induction algorithms

Duc Pham, AA Afify

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

When learning is based on noisy data, the induced rule sets have a tendency to overfit the training data, and this degrades the performance of the resulting classifier. Consequently, the ability to tolerate noise is a necessity for robust, practical learning methods. Pruning is a common way of handling noisy data. This paper presents a new pruning technique built on the sound foundation of the minimum description length principle. The proposed pruning technique has the advantage that it does not require the set of examples employed for pruning to be distinct from the set used to build the rule set. The new technique is designed to improve the performance of the RULe Extraction System (RULES) family of inductive learning algorithms, but can be used for pruning rule sets created by other learning algorithms. It was tested in RULES-6, the latest algorithm in the family, and showed significant performance improvements.
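
As a rough illustration of the general idea behind description-length-based pruning, the sketch below scores a rule set by the bits needed to encode the rules plus the bits needed to encode the training examples they misclassify, and greedily deletes any rule whose removal lowers that total. It is not the coding scheme or pruning procedure from the paper, and it is not RULES-6: the encoding, the helper names (`description_length`, `prune`), the rule representation, and the toy data are all assumptions made for illustration. Like the technique described in the abstract, it reuses the training set for pruning instead of a separate pruning set.

```python
import math

# A rule is (conditions, label): conditions maps attribute -> required value.
# Rules are tried in order; the default label applies when no rule fires.

def covers(rule, example):
    conditions, _ = rule
    return all(example[attr] == val for attr, val in conditions.items())

def predict(rules, default, example):
    for rule in rules:
        if covers(rule, example):
            return rule[1]
    return default

def log2_comb(n, k):
    """log2 of the binomial coefficient C(n, k)."""
    return math.log2(math.comb(n, k))

def description_length(rules, default, data, n_tests):
    """Total bits = model bits + exception bits (a crude, assumed coding)."""
    # Model: for each rule, send the number of conditions k, then which
    # k of the n_tests possible attribute-value tests it uses.
    model_bits = sum(
        math.log2(n_tests + 1) + log2_comb(n_tests, len(conds))
        for conds, _ in rules
    )
    # Exceptions: send the error count, then which examples are wrong.
    n = len(data)
    errors = sum(predict(rules, default, x) != y for x, y in data)
    data_bits = math.log2(n + 1) + log2_comb(n, errors)
    return model_bits + data_bits

def prune(rules, default, data, n_tests):
    """Greedily drop rules while the total description length decreases."""
    rules = list(rules)
    best = description_length(rules, default, data, n_tests)
    improved = True
    while improved:
        improved = False
        for i in range(len(rules)):
            candidate = rules[:i] + rules[i + 1:]
            dl = description_length(candidate, default, data, n_tests)
            if dl < best:
                rules, best, improved = candidate, dl, True
                break
    return rules

# Toy training set (hypothetical): one noisy "play" example among rainy days.
data = (
    [({"outlook": "sunny", "windy": "no"}, "play")] * 10
    + [({"outlook": "rainy", "windy": "no"}, "stay")] * 9
    + [({"outlook": "rainy", "windy": "yes"}, "play")]
)
rules = [
    ({"outlook": "sunny"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "play"),  # fits only the noisy case
]
pruned = prune(rules, default="stay", data=data, n_tests=4)
```

On this toy data the second rule is removed: encoding it costs more bits than encoding the single exception it explains, while the first rule is kept because dropping it would add ten exceptions.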
Original language: English
Pages (from-to): 1339-1352
Number of pages: 14
Journal: Institution of Mechanical Engineers. Proceedings. Part C: Journal of Mechanical Engineering Science
Volume: 222
Issue number: 7
Publication status: Published - 1 Jul 2008

Keywords

  • data mining
  • knowledge discovery
  • pruning
  • inductive learning
  • machine learning
  • rule induction
  • minimum description length principle
  • noise handling
