In this paper, we present a probabilistic generative approach for constructing topographic maps of tree-structured data. Our model defines a low-dimensional manifold of local noise models, namely, (hidden) Markov tree models, induced by a smooth mapping from a low-dimensional latent space. We contrast our approach with topographic map formation using recursive neural-based techniques, namely, the self-organizing map for structured data (SOMSD) (Hagenbuchner et al., 2003). The probabilistic nature of our model brings a number of benefits: 1) a naturally defined cost function that drives the model optimization; 2) principled model comparison and testing for overfitting; 3) the potential for transparent interpretation of the map by inspecting the underlying local noise models; and 4) natural accommodation of alternative local noise models that implicitly express different notions of structured data similarity. Furthermore, in contrast with the recursive neural-based approaches, the smooth nature of the mapping from the latent space to the local model space allows for the calculation of magnification factors, a useful tool for the detection of data clusters. We demonstrate our approach on three data sets: a toy data set, an artificially generated data set, and a data set of images represented as quadtrees.
- topographic mapping
- structured data
- hidden Markov tree model (HMTM)