Abstract
The Internet contains both structured and unstructured data, and the enormous volume of this data makes effective information retrieval challenging. Semantic Web Mining explores Web addresses using ontological and semantic structures. For effective information retrieval in Web Mining and Text Mining, text feature extraction plays an important role. The effectiveness of text processing depends on the complexity of the feature vector and on how far its dimensionality can be reduced. In this paper, a new approach is proposed based on the semantic structure of Web data. It combines feature extraction and feature selection techniques for data mapping and retrieval, using standard features for effective text mapping. This process reduces the dimensionality of the feature vector for effective information retrieval.
Introduction
The basic problem in information retrieval is extracting features from text sentences for text classification [1]. Unprocessed text cannot be used directly for similarity measurement, because the results do not reach the desired accuracy. Effective text processing first requires exploring the semantic structure of the Web efficiently: the relationships between data must be identified and clearly defined, and tagging data with appropriate techniques plays a key role. In this article, we present our own data description approach. Exploring the semantic structures efficiently helps to extract and select features that represent the vector space properly with reduced dimensions, which matters because the vector space must remain manageable. Dimensionality reduction algorithms [2] are generally classified into two types: feature extraction and feature selection. Feature extraction algorithms [3,4] reduce the vector space through algebraic transformations, creating a new feature set from the base set. Feature selection algorithms reduce the vector space by keeping a subset of the features in the base set. In this paper, we combine feature extraction and feature selection to reduce the dimensions of the feature vector space.
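The difference between the two families of dimensionality reduction can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the proposed approach: the toy documents, the helper names (`bag_of_words`, `select_features`, `extract_features`), the use of document frequency as the selection criterion, and the use of a seeded random projection as a stand-in for the algebraic transformation. The order shown (selection first, then extraction) is likewise only one possible way to combine the two steps.

```python
import random
from collections import Counter

# Toy corpus (hypothetical, for illustration only).
docs = [
    "semantic web mining uses ontologies",
    "text mining extracts features from text",
    "feature selection keeps a subset of features",
]

def bag_of_words(docs):
    """Return (vocabulary, matrix) with one term-count row per document."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[t] for t in vocab])
    return vocab, rows

def select_features(vocab, rows, k):
    """Feature selection: keep a subset of the base features.

    Assumed criterion: the k terms with the highest document frequency,
    ties broken alphabetically.
    """
    df = [sum(1 for r in rows if r[j] > 0) for j in range(len(vocab))]
    keep = sorted(range(len(vocab)), key=lambda j: (-df[j], vocab[j]))[:k]
    keep.sort()  # preserve the original column order
    return [vocab[j] for j in keep], [[r[j] for j in keep] for r in rows]

def extract_features(rows, k, seed=0):
    """Feature extraction: build k new features by a linear transformation.

    Assumed transformation: a seeded Gaussian random projection, so each
    new feature is a weighted combination of all base features.
    """
    rng = random.Random(seed)
    n = len(rows[0])
    proj = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]
    return [
        [sum(r[i] * proj[i][j] for i in range(n)) for j in range(k)]
        for r in rows
    ]

vocab, X = bag_of_words(docs)                       # 3 documents x 15 terms
sel_vocab, X_sel = select_features(vocab, X, k=6)   # 3 x 6, original terms
X_red = extract_features(X_sel, k=3)                # 3 x 3, new features
```

After selection, the columns are still interpretable terms from the base set; after extraction, each column is a mixture of terms and no longer corresponds to a single word. That trade-off between interpretability and compactness is the usual reason for considering the two steps together.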