Abstract
1- Introduction
2- Preliminaries
3- Multi-projection deep computation model (MPDCM)
4- Learning algorithm
5- Experiments
6- Related work
7- Conclusion
References
Abstract
The double-projection deep computation model (DPDCM) has proved effective for big data feature learning. However, DPDCM cannot sufficiently capture the underlying correlations across different modalities, since it projects the input data into only two subspaces. To tackle this problem, this paper presents a multi-projection deep computation model (MPDCM) that generalizes DPDCM for smart data in the Internet of Things. Specifically, MPDCM maps the input data into multiple nonlinear subspaces to learn the interactive features of IoT big data by substituting a multi-projection layer for each hidden layer. Furthermore, a learning algorithm based on back-propagation and gradient descent is designed to train the parameters of the presented model. Finally, extensive experiments are conducted on two representative datasets, i.e., Animal-20 and NUS-WIDE-14, to validate the presented model by comparison with DPDCM. Results show that the presented model achieves higher classification accuracy than DPDCM, demonstrating its potential for mining smart data in the Internet of Things.
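The multi-projection layer described above can be sketched as follows. This is a minimal illustration under stated assumptions: the tanh activation, the layer dimensions, and the concatenation of the subspace outputs are hypothetical choices for exposition, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_projection_layer(x, weights, biases):
    """Map input x into K nonlinear subspaces, one per projection.

    weights: list of K matrices of shape (d_out, d_in);
    biases:  list of K vectors of length d_out.
    The K projected representations are concatenated here to form the
    layer output (an illustrative choice for combining the subspaces).
    """
    projections = [np.tanh(W @ x + b) for W, b in zip(weights, biases)]
    return np.concatenate(projections)

# Toy example: K = 3 projections of a 4-dimensional input,
# each into a 2-dimensional subspace. Setting K = 2 would
# correspond to DPDCM's two subspaces.
K, d_in, d_out = 3, 4, 2
weights = [rng.standard_normal((d_out, d_in)) for _ in range(K)]
biases = [rng.standard_normal(d_out) for _ in range(K)]
x = rng.standard_normal(d_in)
h = multi_projection_layer(x, weights, biases)
print(h.shape)  # (6,): K * d_out features passed to the next layer
```

Stacking several such layers, each feeding its concatenated output to the next, yields the deep model trained by back-propagation as in Section 4.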
Introduction
Recently, the Internet of Things (IoT) has achieved great progress by integrating advanced sensing devices such as sensors and RFIDs into communication networks [1]. In particular, big data processing techniques such as data compression, deep learning, correlation analysis and clustering are playing a remarkable role in the Internet of Things [2], [3]. For example, deep learning, a recently advanced artificial intelligence technique, is used to extract valuable information, i.e., smart data, from IoT big data for smart market analysis in industrial manufacturing. A unique property of IoT big data is its high variety, i.e., the data comes from various sources such as cameras and sensors, in different formats such as text, image and audio [4]. Typically, each heterogeneous data object has more than one modality, implying that heterogeneous data is typically multi-modal [5]. For instance, a piece of video usually contains two modalities, i.e., image and audio, or three modalities, i.e., image, audio and text.