Predictive graph mining approaches are widely used on chemical databases and have proved effective. Most of these approaches first extract frequent sub-graphs and then use them as features to build predictive models. The approach presented here is similar. However, instead of frequent sub-graphs, it derives frequent trees based on SMILES strings. To this end, the SMILES strings of chemical compounds are decomposed into fragment trees, which in turn are mined for interesting sub-trees. These tree-based patterns are then used as features by a classifier to build predictive models. The approach is evaluated experimentally on a real-world chemical data set.
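The pipeline described in the abstract (SMILES string, fragment tree, frequent tree patterns, feature vector) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the decomposition or the mining procedure, so everything below is an assumption made for illustration. In particular, the hypothetical helpers `smiles_to_tree`, `downward_paths` and `frequent_patterns` use a simplified tokenizer that ignores ring closures, bond symbols and bracket atoms, and they mine only downward paths, which are the simplest kind of sub-tree.

```python
import re
from collections import Counter

# Simplified tokenizer (assumption): organic-subset atoms and branch
# parentheses only. Bond symbols, ring-closure digits and bracket atoms
# are skipped, so this is not a full SMILES parser.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]|\(|\)")


def smiles_to_tree(smiles):
    """Return a list of (label, child_indices) nodes; node 0 is the root."""
    nodes = []
    stack = []   # atoms at which a branch was opened
    prev = None  # index of the atom the next atom attaches to
    for tok in TOKEN.findall(smiles):
        if tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        else:
            nodes.append((tok, []))
            idx = len(nodes) - 1
            if prev is not None:
                nodes[prev][1].append(idx)
            prev = idx
    return nodes


def downward_paths(nodes, max_len=3):
    """Collect every parent-to-descendant label path of up to max_len atoms."""
    patterns = set()

    def walk(idx, prefix):
        label, children = nodes[idx]
        path = prefix + (label,)
        patterns.add("-".join(path))
        if len(path) < max_len:
            for child in children:
                walk(child, path)

    for i in range(len(nodes)):
        walk(i, ())
    return patterns


def frequent_patterns(smiles_list, min_support=2, max_len=3):
    """Keep patterns found in at least min_support compounds and
    encode each compound as a binary feature vector over them."""
    per_compound = [downward_paths(smiles_to_tree(s), max_len)
                    for s in smiles_list]
    support = Counter(p for pats in per_compound for p in pats)
    frequent = sorted(p for p, n in support.items() if n >= min_support)
    features = [[int(p in pats) for p in frequent] for pats in per_compound]
    return frequent, features


if __name__ == "__main__":
    # Ethanol, acetic acid, ethylamine
    patterns, vectors = frequent_patterns(["CCO", "CC(=O)O", "CCN"])
    print(patterns)  # ['C', 'C-C', 'C-C-O', 'C-O', 'O']
    print(vectors)   # [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 0, 0, 0]]
```

The resulting binary vectors could be passed, together with class labels, to any standard classifier. A realistic implementation would more likely rely on a cheminformatics toolkit for parsing and on a dedicated frequent sub-tree miner rather than on path enumeration.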
Publication status: Published - 2004
Bibliographical note: Berlin, Germany
- cheminformatics, graph mining, machine learning