Abstract
We present a general method for integrating visual components into a multi-modal cognitive system. The integration approach is generic and can combine an arbitrary set of modalities. We illustrate it with a specific instantiation of the architecture schema that focuses on the integration of vision and language: a cognitive system able to collaborate with a human, learn, and display some understanding of its surroundings. As examples of cross-modal interaction, we describe mechanisms for clarification and visual learning.
Original language | English
---|---
Pages | 3140-3147
Number of pages | 8
DOIs |
Publication status | Published - 15 Dec 2009
Event | IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009 (IROS 2009) - Duration: 15 Dec 2009 → …
Conference

Conference | IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009 (IROS 2009)
---|---
Period | 15/12/09 → …
Bibliographical note
Accepted at the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.