Abstract
1- Introduction
2- Convolutional neural networks for system modeling
3- Frequency domain analysis of random weights
4- CNN training with random weights
5- Simulations
6- Conclusion
References
Abstract
Randomized algorithms have been successfully applied in modelling dynamic system. How do random weights affect system identification and why do they sometimes work well? In this paper, we use the convolutional neural network (CNN) as an identification model to answer these questions. Since the convolution operation is an important property of the dynamic system and in the frequency domain it becomes the product, the CNN model is analyzed in the frequency domain. We first modify the CNN model, so that it can model both the input and the output series. Then we analyze the impact of the random weights of CNN in the frequency domain. We prove the existence of optimal weights and analyze the modeling accuracy under optimal weights and random weights. Through theoretical analysis, we propose a two-step training method and compare it with the random weight algorithm. The proposed CNN model with random weights is validated with three benchmark problems.
Introduction
To predict the future behavior of the dynamic system, to apply model-based control or to understand the physical process, system identification is needed. Neural networks are black box models, which only use input and output data. These data-based modeling methods still have many problems. The hyper-parameters must be defined before they are applied, for example, how many hidden layers and hidden nodes are needed [21]. The universal approximation theorem guarantees that the neural model can approach almost all continuous systems with enough hidden nodes, however, the fact of increasing the hidden nodes causes the over-fit problem [34]. The alternative method to increase the hidden nodes is to use more hidden layers. It is a basic idea of deep neural networks. Theory and application show that deep learning models achieve better training precision [2], and get impressive results on many difficult tasks [14]. The convolutional neural network (CNN) is one of the most important deep learning models [22][33]. The main difference between CNN and normal neural networks is the convolution operation. The first successful CNN is at early 90s for the classification of digits [23], which includes the invariances in two-dimensional forms using local connections and weight restrictions. The training uses the maximum likelihood estimation and the modified backpropagation algorithm [26]. To give a better understanding of how convolutional networks work, [39] proposed the de-convolutional technique to visualize the operations of the hidden layer. By using GPU (Graphic Processing Unit), faster learning and testing are obtained by [20]. It has been proven that CNNs have great performances in classification tasks, such as image processing [30]. CNNs are also applied for data regression and time series modeling. In [4], the time series is formed in the autoregressive model. In [12], CNN is applied to classify the time series.