Abstract
In chemical literature, much information is given in the form of diagrams depicting molecules. To access this information, the diagrams have to be recognised and translated into a processable format. We present an approach that models the principal recognition steps for molecule diagrams in a strictly rule-based system, providing rules to identify the main components (atoms and bonds) as well as to resolve possible ambiguities. The result of the process is a translation into a graph representation that can be used for further processing. We show the effectiveness of our approach by describing its embedding into a full recognition system, and we present an experimental evaluation that demonstrates how our current implementation outperforms the leading open-source system currently available.
Original language | English |
---|---|
Article number | 82970E |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 8297 |
Publication status | Published - 1 Jan 2012 |
Event | Document Recognition and Retrieval XIX - Burlingame, CA, United States; Duration: 22 Jan 2012 → … |
Keywords
- Chemical molecule recognition
- Diagram recognition