Fusion of Neural Networks, Fuzzy Systems and Genetic Algorithms: Industrial Applications
by Lakhmi C. Jain; N.M. Martin
CRC Press LLC
ISBN: 0849398045   Pub Date: 11/01/98
  



Several learning algorithms for RBF-networks have emerged in the last few years. They are all similar in that the weights are adapted with the well-known gradient-descent method or the recursive least-squares algorithm, but they differ considerably in how the centers and widths are determined. The positioning of the centers is especially critical for convergence and approximation precision [38]. Often the number of neurons and the positions of their centers are determined empirically, e.g., by spacing them evenly. A self-organized clustering algorithm for the input data was proposed in [29]. The centers of the RBF-functions are then placed at the centers of the clusters, and their standard deviations are set equal to the distance to the nearest neighboring center. Since this method does not attempt to minimize the modeling error, it may not be appropriate for nonlinear system modeling. An adaptive learning algorithm for RBF-networks which optimizes the output error with respect to the weights, the centers, and the widths was developed in [26]. A gradient-descent method is used to adapt all three vectors,

w(k+1) = w(k) − η_w ∂ε(k)/∂w
c(k+1) = c(k) − η_c ∂ε(k)/∂c
σ(k+1) = σ(k) − η_σ ∂ε(k)/∂σ

with the learning rates η_x, x ∈ {w, c, σ}, and the instantaneous output error index ε(k). By employing a one-dimensional optimization for the learning rates, convergence can be sped up in terms of the number of iterations, but the additional computation slows down each training step.
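As an illustration, the following minimal sketch implements these update equations for a Gaussian RBF-network with scalar output. All dimensions, initial values, and learning rates are assumptions chosen for illustration; they are not taken from [26].

    import numpy as np

    # Assumed sizes: N RBF-neurons, d-dimensional input, scalar output.
    rng = np.random.default_rng(0)
    N, d = 10, 2
    c = rng.uniform(-1.0, 1.0, (N, d))      # centers (could also be set by clustering [29])
    s = np.full(N, 0.5)                     # widths (standard deviations)
    w = np.zeros(N)                         # output weights
    eta_w, eta_c, eta_s = 0.1, 0.01, 0.01   # learning rates eta_w, eta_c, eta_sigma

    def rbf_outputs(x):
        """Gaussian RBF-node outputs and squared distances to the centers."""
        d2 = ((x - c) ** 2).sum(axis=1)
        return np.exp(-d2 / (2.0 * s ** 2)), d2

    def adapt(x, y_target):
        """One gradient-descent step on eps(k) = 0.5*e(k)**2 for w, c, and s."""
        global w, c, s
        phi, d2 = rbf_outputs(x)
        e = w @ phi - y_target                                    # output error e(k)
        grad_w = e * phi                                          # d eps / d w
        grad_c = ((e * w * phi) / s**2)[:, None] * (x - c)        # d eps / d c
        grad_s = e * w * phi * d2 / s**3                          # d eps / d sigma
        w -= eta_w * grad_w
        c -= eta_c * grad_c
        s -= eta_s * grad_s

A one-dimensional line search over the three learning rates could be placed around adapt(), at the cost of the extra function evaluations noted above.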

The localization property of the RBF-function is utilized by defining active and inactive neurons. If the output of an RBF-node exceeds a certain threshold, i.e., the distance between the input and the neuron’s center stays below a certain value, the neuron is called active. Otherwise, it hardly contributes to the function approximation and is therefore called inactive. Only the active neurons are included in the adaptation process. A minimal and a maximal number of active neurons are required. Additionally, the region where RBF-neurons can be placed should be limited, depending on the range of input values to be expected. Otherwise, an excessive number of neurons is placed that do not contribute greatly to the approximation of the presented data [23]. Normally, randomly distributed input sequences are used in order to minimize the global approximation error and not only the current output error. This can be avoided by implementing a multistep learning algorithm as proposed in [26]. A sliding window of size μ ∈ ℕ is used, in which all μ input-output data vectors contribute to the error calculation. Thus, the data coming from a process can be used as it becomes available, which saves off-line work and enhances the convergence properties.
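Continuing the sketch above, the active/inactive distinction and the sliding window might be realized as follows; the threshold value and the window size μ = 20 are assumptions for illustration, and only the weight update is shown.

    from collections import deque

    phi_min = 0.1                  # activity threshold on the RBF-node output (assumed)
    window = deque(maxlen=20)      # sliding window of size mu = 20 (assumed)

    def multistep_update(x, y_target):
        """Adapt only the active neurons, using all mu input-output pairs
        in the window instead of the current sample alone."""
        window.append((np.asarray(x), y_target))
        for xw, yw in window:
            phi, d2 = rbf_outputs(xw)
            act = phi > phi_min                 # mask of active neurons
            e = w @ phi - yw
            w[act] -= eta_w * e * phi[act]      # weight update restricted to active neurons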


Figure 12  RBF-network structure for identifying the input-output form [19] © 1995 IEEE.

For static neural networks, such as RBF-networks, the input-output form of the nonlinear system description is considered here in order to perform the nonlinear system modeling [32], [26].
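A common reading of this input-output form, with the delay orders n and m assumed here for illustration, is

    ŷ(k) = f_RBF( y(k−1), . . ., y(k−n), u(k−1), . . ., u(k−m) )

so that the static RBF-network captures the system dynamics only through the delayed inputs and outputs fed to it (cf. Figure 12).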

A different approach to nonlinear system modeling, using dynamic neural networks such as recurrent networks, will be presented in the following section.

4.1.2 Recurrent Neural Networks (RNN)

In this chapter, recurrent neural networks, the second type of neural networks proposed for nonlinear system identification, will be employed for residual generation. This type of neural network offers the possibility of modeling dynamic nonlinear systems as given by the nonlinear state space description (Figure 13) [22], [23].


Figure 13  Recurrent neural network in vector form.

with the vector of the states x(t), the input vector u(t), the output vector y(t), the activation a(t), the combined input vector (x1, . . ., xn, u1, . . ., um)T, and the combined state vector (x1, . . ., xn, u1, . . ., um, 1)T. W and C are known as the weight and the output matrices, respectively. For the transfer function f(·), the tanh(·) function has been used here. The outputs of the n neurons represent the n states.
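For concreteness, a minimal sketch of this structure follows, assuming a discrete-time reading x(t+1) = f(W·x̄(t)) and y(t) = C·x(t), where x̄(t) denotes the combined state vector; the dimensions, the discrete-time formulation, and all names are assumptions for illustration.

    import numpy as np

    n, m = 3, 1                                     # assumed numbers of states and inputs
    rng = np.random.default_rng(1)
    W = 0.1 * rng.standard_normal((n, n + m + 1))   # weight matrix W
    C = 0.1 * rng.standard_normal((1, n))           # output matrix C

    def step(x, u):
        """One update of the recurrent network in vector form."""
        x_bar = np.concatenate([x, u, [1.0]])  # combined state vector (x1..xn, u1..um, 1)^T
        a = W @ x_bar                          # activation a(t)
        x_next = np.tanh(a)                    # transfer function f = tanh; neuron outputs = states
        y = C @ x_next                         # output vector y(t)
        return x_next, y

    x = np.zeros(n)
    for u_t in ([0.5], [-0.2], [0.1]):         # short input sequence
        x, y = step(x, np.array(u_t))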

The objective of a learning process is to adapt the elements of the weight matrix W and the output matrix C such that the network reproduces the desired input-output behaviour of the given dynamic system. This training can be performed by applying the delta rule, as shown below for the weight matrix [22],

W(k+1) = W(k) − η ∂E(k)/∂W

which means for a single component

w_ij(k+1) = w_ij(k) − η ∂E(k)/∂w_ij

with the learning rate η and the output error E(k).
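Building on the forward-pass sketch above, one step of such a delta-rule update might look as follows. For brevity the gradient is truncated after a single time step (the full gradient through time is addressed in item 1 of the list below); the quadratic error E(k) = 0.5·e(k)ᵀe(k) and the learning rate value are assumptions for illustration.

    eta = 0.01                                  # learning rate (its choice is critical, see below)

    def train_step(W, C, x, u, y_target):
        """Delta-rule update of W and C with the gradient truncated
        to one time step (dependence of x(t) on W is ignored)."""
        x_bar = np.concatenate([x, u, [1.0]])
        x_next = np.tanh(W @ x_bar)
        y = C @ x_next
        e = y - y_target                        # output error e(k)
        dC = np.outer(e, x_next)                # dE/dC for E = 0.5*e^T e
        delta = (C.T @ e) * (1.0 - x_next**2)   # backpropagation through tanh
        dW = np.outer(delta, x_bar)             # dE/dW
        return W - eta * dW, C - eta * dC, x_next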

The application of this learning rule involves the following major problems [22], [23]:

1.  In recurrent network structures, the computational effort needed to calculate the gradient is much higher than for networks without feedback. By employing a combined algorithm consisting of Backpropagation Through Time and Real-Time Recurrent Learning [44], the computational effort can be reduced to the order of n³.
2.  The learning algorithm is very sensitive to the choice of the learning rate η. If the selected value is too large, existing minima might be missed during learning, while if it is too small, the training time becomes excessively long.
3.  Since the delta rule belongs to the class of recursive learning rules, initial values for W and C have to be chosen. This choice is critical, since for this type of network the likelihood of local minima is very high [2].

In the following, solutions to the above problems are presented; they have been applied to the actuator benchmark problem described in Section 4.3 of this chapter.



Copyright © CRC Press LLC