Abstract
1. Introduction
2. Related work
3. The social information discovery system
4. Sentiment analysis
5. Implementation
6. Evaluation
7. Conclusion
References
Abstract
In recent years, the massive diffusion of social networks has made available a large amount of user-generated content, for the most part in the form of textual data that contain people’s thoughts and emotions about a great variety of topics. In order to exploit these publicly available information, in this work we introduce a social information discovery system which elaborates simultaneously over more-than-one social network in an integrated scenario. The system is designed to ensure flexibility and scalability, thus enabling for (near-)real-time analysis even in case of high rates of content’s creation and large amounts of heterogeneous data. Furthermore, a noise detection technique ensures a high relevance of analyzed posts/tweets to the domain of interest. We also propose a lexicon-based sentiment analysis algorithm to extract and measure users’ opinion, in order to support collaboration and open innovation. Polysemous words and negations are typically challenging for lexicon-based approaches: for this reason, we introduce both a word sense disambiguation algorithm and a negation handling technique. Experiments on several datasets have proven that the combined use of both techniques improves the classification accuracy on 3-class sentiment analysis.