Abstract
Predictive graph mining approaches are popular and effective for chemical databases. Most of these approaches first extract frequent sub-graphs and then use them as features to build predictive models. The work presented here takes a similar approach, but derives frequent trees based on SMILES strings instead of frequent sub-graphs. To this end, the SMILES strings of chemical compounds are decomposed into fragment trees, which in turn are mined for interesting sub-trees. These tree-based patterns are then used as features by a classifier to build predictive models. The approach is experimentally evaluated on a real-world chemical data set.
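The pipeline described in the abstract (SMILES string, tree, frequent tree patterns, feature vector) can be illustrated with a minimal Python sketch. It is not the method of the paper: the parser is a toy that assumes ring-free SMILES with single-letter atoms, it uses downward label paths as a simple stand-in for mined sub-trees, and the names `parse_smiles_tree`, `downward_paths`, `frequent_patterns`, and `featurize`, as well as the `min_support` and `max_len` parameters, are hypothetical.

```python
from collections import Counter


def parse_smiles_tree(smiles):
    """Parse a ring-free SMILES string with single-letter atoms into a
    nested (atom, children) tree. Parentheses open and close branches.
    Toy assumption: bond symbols and digits are ignored."""
    root = None
    stack = []      # branch points saved at each "("
    current = None  # node the next atom attaches to
    for ch in smiles:
        if ch == "(":
            stack.append(current)
        elif ch == ")":
            current = stack.pop()
        elif ch.isalpha():
            node = (ch, [])
            if current is None:
                root = node
            else:
                current[1].append(node)
            current = node
    return root


def downward_paths(tree, max_len=3):
    """Collect label paths of up to max_len atoms that start at any node
    and run toward the leaves (a simplified kind of tree pattern)."""
    found = set()

    def walk(node, prefix):
        path = prefix + node[0]
        found.add(path)
        if len(path) < max_len:
            for child in node[1]:
                walk(child, path)

    def visit(node):
        walk(node, "")
        for child in node[1]:
            visit(child)

    visit(tree)
    return found


def frequent_patterns(smiles_list, min_support=2, max_len=3):
    """Keep patterns that occur in at least min_support molecules."""
    counts = Counter()
    for smiles in smiles_list:
        counts.update(downward_paths(parse_smiles_tree(smiles), max_len))
    return sorted(p for p, c in counts.items() if c >= min_support)


def featurize(smiles, patterns, max_len=3):
    """Binary feature vector: 1 if the pattern occurs in the molecule."""
    present = downward_paths(parse_smiles_tree(smiles), max_len)
    return [1 if p in present else 0 for p in patterns]


molecules = ["CCO", "CC(O)C", "CCN"]
patterns = frequent_patterns(molecules, min_support=2)
features = [featurize(m, patterns) for m in molecules]
```

The resulting binary vectors could then be passed to any standard classifier; the abstract does not name the one used, so none is assumed here.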
Original language | English |
---|---|
Publication status | Published - 2004 |
Bibliographical note
Berlin, Germany
Keywords
- cheminformatics, graph mining, machine learning