Abstract
Web anomaly detection aims to find deviations from the normal behaviour that a system exhibits most of the time. As the Internet grows, detecting web-based anomalies has become vital to its security. Clustering based on manually extracted features has proven to be an effective way to detect new anomalies, but the representations of these features cannot express the semantic information of URLs. In addition, few studies try to cluster the anomalies into specific types such as SQL injection. To solve these two problems, we propose a weighted deep learning enabled subspace spectral ensemble clustering approach for web anomaly detection, called WDL-SSEC. The approach has three steps. First, an ensemble clustering model is applied to separate anomalies from normal samples. Second, we use word2vec to obtain semantic representations of tokens and concatenate the weighted token vectors to obtain vectors for the URLs. Finally, a second ensemble clustering model, based on subspace clustering and locally adaptive clustering (LAC), clusters the anomalies into specific types. We evaluate our approach on a real-life data set. It achieves better performance than existing approaches, which demonstrates that our model is able to cluster anomalies into appropriate types.
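The second step (weighted token vectors concatenated into a URL vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the function names `tokenize_url` and `url_vector`, the fixed-length padding, and the toy embedding and weight tables are all assumptions made for illustration. In practice the embeddings would come from a trained word2vec model and the weights from whatever weighting scheme the method defines.

```python
import re
from urllib.parse import urlsplit, unquote


def tokenize_url(url):
    """Split the decoded path and query of a URL into lowercase tokens.

    Hypothetical tokenizer: alphanumeric runs become word tokens and every
    other non-space character becomes a single-character token.
    """
    parts = urlsplit(url)
    text = unquote(parts.path + "?" + parts.query).lower()
    return re.findall(r"[a-z0-9_]+|[^\sa-z0-9_]", text)


def url_vector(tokens, embeddings, weights, max_tokens, dim):
    """Concatenate weighted token embeddings into one fixed-length vector.

    embeddings: token -> list of `dim` floats (stand-in for word2vec output).
    weights:    token -> float (assumed per-token weight, default 1.0).
    Unknown tokens map to a zero vector; short URLs are zero-padded and long
    ones are truncated to `max_tokens` tokens (both are assumptions).
    """
    vector = []
    for token in tokens[:max_tokens]:
        emb = embeddings.get(token, [0.0] * dim)
        w = weights.get(token, 1.0)
        vector.extend(w * x for x in emb)
    vector.extend([0.0] * (max_tokens * dim - len(vector)))
    return vector


# Toy tables, purely illustrative.
toy_embeddings = {"or": [1.0, 2.0], "id": [0.5, 0.5]}
toy_weights = {"or": 2.0}

tokens = tokenize_url("http://example.com/item.php?id=1%27%20or%20%271")
vec = url_vector(tokens, toy_embeddings, toy_weights, max_tokens=16, dim=2)
```

The resulting vectors all have length `max_tokens * dim`, so they can be passed directly to a clustering stage. Concatenation, unlike averaging, preserves token order, at the cost of requiring padding or truncation.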