Abstract
A growing number of seismological studies rely on artificial neural networks. However, neural networks with one hidden layer have almost reached the limit of their capabilities. In the last few years, neuroinformatics has seen a new boom associated with the development of third-generation networks: deep neural networks. These networks operate on data at a higher level. Unlabeled data can be used to pretrain the network, i.e., an expert does not need to determine in advance the phenomenon to which these data correspond. Final training then requires only a small amount of labeled data. Deep networks achieve a higher level of abstraction and produce fewer errors. The same network can solve several tasks simultaneously, or it can easily be retrained from one task to another. This paper discusses the possibility of applying deep networks in seismology. We describe what deep networks are, what their advantages are, how they are trained, how they can be adapted to the features of seismic data, and what prospects their use opens up.
INTRODUCTION
Artificial neural networks (NNs) are widely used for the processing of seismic data (Böse et al., 2008; Gravirov et al., 2012; Lin et al., 2012; Kislov and Gravirov, 2017). An NN can be trained to solve complex tasks such as pattern recognition, signal detection, nonlinear modeling, classification, and regression, using a training sample in which the correct answer is known for each example (supervised learning). However, the expansion of neural network technologies is constrained by the large number of heuristic rules for network design and training. Above all, it is never known whether the architecture of the constructed network is optimal or whether it has been trained in the best possible way, i.e., whether the global minimum of the error function has been found. Although an NN with one hidden layer can approximate any function with any accuracy, it can be regarded as a lookup table for the training sample, with more or less correct interpolation of intermediate values and extrapolation at the edges (Cybenko, 1989). The applicability of NNs is further limited by the problems of overfitting, the stability-plasticity tradeoff, and the curse of dimensionality (Friedman, 1994). Various methods have been developed to circumvent these difficulties, but most of them are heuristic. A trained NN runs fast, but the training process can take an indeterminate amount of time. In addition, preparing the training sample is usually a time-consuming process in itself (Gravirov and Kislov, 2015). Partial solutions to this problem have also been found; for example, some NNs can cluster examples for which the answer is unknown (unsupervised learning), which reduces the preparatory work (Köhler et al., 2010).
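To make the supervised setting concrete, the following is a minimal sketch in Python/NumPy of the kind of one-hidden-layer network discussed above, trained to separate "event" windows from noise windows. The synthetic data generator, layer size, and learning rate are illustrative assumptions, not choices taken from this paper; real inputs would be labeled waveform segments.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for labeled seismic windows: class 1 is a damped
# sinusoid ("event") added to noise, class 0 is pure noise.
n, length = 400, 64
t = np.arange(length)
X = rng.normal(0.0, 1.0, (n, length))
y = (rng.random(n) < 0.5).astype(float)
X[y == 1] += 3.0 * np.sin(0.4 * t) * np.exp(-t / 20.0)

# One hidden layer with a sigmoid output: the classical
# universal-approximator architecture (cf. Cybenko, 1989).
hidden = 16
W1 = rng.normal(0.0, 0.1, (length, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.1, hidden);           b2 = 0.0

def forward(X):
    h = np.tanh(X @ W1 + b1)                   # hidden activations
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # event probability
    return h, p

lr = 0.05
for epoch in range(200):          # plain gradient descent on cross-entropy
    h, p = forward(X)
    g = (p - y) / n                            # gradient w.r.t. the logit
    gh = np.outer(g, W2) * (1.0 - h**2)        # backpropagate through tanh
    W2 -= lr * (h.T @ g); b2 -= lr * g.sum()
    W1 -= lr * (X.T @ gh); b1 -= lr * gh.sum(axis=0)

_, p = forward(X)
print("training accuracy:", ((p > 0.5) == y).mean())

The sketch also illustrates the limitations noted above: every design choice (hidden-layer width, learning rate, number of epochs) is heuristic, and nothing guarantees that gradient descent reaches the global minimum of the error function.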