TY - JOUR

T1 - Evolving Efficient Learning Algorithms for Binary Mappings

AU - Bullinaria, John

PY - 2003/7/1

Y1 - 2003/7/1

AB - Gradient descent training of sigmoidal feed-forward neural networks on binary mappings often gets stuck with some outputs totally wrong. This is because a sum-squared-error cost function leads to weight updates that depend on the derivative of the output sigmoid, which goes to zero as the output approaches maximal error. Although it is easy to understand the cause, the best remedy is not so obvious. Common solutions involve modifying the training data, deviating from true gradient descent, or changing the cost function. In general, finding the best learning procedures for particular classes of problem is difficult because each usually depends on a number of interacting parameters that need to be set to optimal values for a fair comparison. In this paper I shall use simulated evolution to optimise all the relevant parameters, and come to a clear conclusion concerning the most efficient approach for learning binary mappings.

UR - http://www.scopus.com/inward/record.url?scp=0038355078&partnerID=8YFLogxK

U2 - 10.1016/S0893-6080(03)00093-5

DO - 10.1016/S0893-6080(03)00093-5

M3 - Article

C2 - 12850036

VL - 16

SP - 793

EP - 800

JO - Neural Networks

JF - Neural Networks

SN - 0893-6080

IS - 5-6

ER -