Evolutionary optimization of a hierarchical object recognition model
Research output: Contribution to journal › Article
A major problem in designing artificial neural networks is the proper choice of the network architecture. Especially for vision networks classifying three-dimensional (3-D) objects this problem is very challenging, as these networks are necessarily large and therefore the search space for defining the needed networks is of a very high dimensionality. This strongly increases the chances of obtaining only suboptimal structures from standard optimization algorithms. We tackle this problem in two ways. First, we use biologically inspired hierarchical vision models to narrow the space of possible architectures and to reduce the dimensionality of the search space. Second, we employ evolutionary optimization techniques to determine optimal features and nonlinearities of the visual hierarchy. Here, we especially focus on higher order complex features in higher hierarchical stages. We compare two different approaches to perform an evolutionary optimization of these features. In the first setting, we directly code the features into the genome. In the second setting, in analogy to an ontogenetical development process, we suggest the new method of an indirect coding of the features via an unsupervised learning process, which is embedded into the evolutionary optimization. In both cases the processing nonlinearities are encoded directly into the genome and are thus subject to optimization. The fitness of the individuals for the evolutionary selection process is computed by measuring the network classification performance on a benchmark image database. Here, we use a nearest-neighbor classification approach, based on the hierarchical feature output. We compare the found solutions with respect to their ability to generalize. We differentiate between a first- and a second-order generalization. 
The first-order generalization denotes how well the vision system, after evolutionary optimization of the features and nonlinearities using a database A, can classify previously unseen test views of objects from this database A. As second-order generalization, we denote the ability of the vision system to perform classification on a database B using the features and nonlinearities optimized on database A. We show that the direct feature coding approach leads to networks with a better first-order generalization, whereas the second-order generalization is on an equally high level for both direct and indirect coding. We also compare the second-order generalization results with other state-of-the-art recognition systems and show that both approaches lead to optimized recognition systems, which are highly competitive with recent recognition algorithms.
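The evaluation scheme described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not name the evolutionary algorithm, so a plain (mu, lambda) evolution strategy is swapped in here, and the toy 4-dimensional "views", the two-feature layer, the shared threshold, and all function names (`features`, `nn_accuracy`, `evolve`, `make_views`) are assumptions made for illustration. The sketch covers only the direct-coding setting, where feature weights and the nonlinearity parameter sit in the genome, and it reports first-order generalization (unseen views from database A) and second-order generalization (database B, using features optimized on A).

```python
import math
import random


def features(genome, x, n_features):
    """Direct coding: the genome holds n_features weight vectors plus one shared threshold."""
    d = len(x)
    theta = genome[-1]  # nonlinearity parameter, also subject to optimization
    out = []
    for k in range(n_features):
        w = genome[k * d:(k + 1) * d]
        act = sum(wi * xi for wi, xi in zip(w, x))
        out.append(max(0.0, act - theta))  # thresholded, rectifying nonlinearity
    return out


def nn_accuracy(genome, train, test, n_features):
    """Nearest-neighbor classification accuracy in the genome's feature space."""
    train_f = [(features(genome, x, n_features), y) for x, y in train]
    correct = 0
    for x, y in test:
        f = features(genome, x, n_features)
        _, pred = min(train_f, key=lambda item: math.dist(item[0], f))
        correct += int(pred == y)
    return correct / len(test)


def evolve(fitness, genome_len, rng, mu=3, lam=12, sigma=0.3, generations=15):
    """Plain (mu, lambda) evolution strategy; remembers the best genome seen."""
    parents = [[rng.gauss(0.0, 1.0) for _ in range(genome_len)] for _ in range(mu)]
    best, best_score = None, -1.0
    for _generation in range(generations):
        offspring = []
        for _child_index in range(lam):
            parent = rng.choice(parents)
            child = [g + rng.gauss(0.0, sigma) for g in parent]  # Gaussian mutation
            offspring.append((fitness(child), child))
        offspring.sort(key=lambda pair: pair[0], reverse=True)
        parents = [child for _, child in offspring[:mu]]
        if offspring[0][0] > best_score:
            best_score, best = offspring[0]
    return best, best_score


def make_views(rng, prototypes, n_per_class, noise=0.2):
    """Toy stand-in for an image database: noisy copies of class prototypes."""
    return [([p + rng.gauss(0.0, noise) for p in proto], label)
            for label, proto in enumerate(prototypes)
            for _ in range(n_per_class)]


rng = random.Random(0)
N_FEATURES, DIM = 2, 4
protos_a = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]  # hypothetical database A
protos_b = [[0, 0, 0, 1], [1, 1, 0, 0]]                # hypothetical database B

train_a = make_views(rng, protos_a, 10)
val_a = make_views(rng, protos_a, 10)    # views used to compute fitness
test_a = make_views(rng, protos_a, 10)   # unseen views of database A
train_b = make_views(rng, protos_b, 10)
test_b = make_views(rng, protos_b, 10)

best, val_score = evolve(
    lambda g: nn_accuracy(g, train_a, val_a, N_FEATURES),
    genome_len=N_FEATURES * DIM + 1,
    rng=rng,
)

first_order = nn_accuracy(best, train_a, test_a, N_FEATURES)   # same database, new views
second_order = nn_accuracy(best, train_b, test_b, N_FEATURES)  # new database, same features
print(f"fitness={val_score:.2f} first-order={first_order:.2f} second-order={second_order:.2f}")
```

In this sketch, only the genome (features and threshold) carries over to database B; the nearest-neighbor memory is rebuilt from `train_b`, which is one plausible reading of how second-order generalization would be measured.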
|Number of pages|12|
|Journal|IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)|
|Publication status|Published - 1 Jun 2005|