Abstract
1- Introduction
2- Related works
3- Methodology
4- Dataset statistics and preprocessing
5- Feature sets
6- Experimental results and insights
7- Conclusion
References
Abstract
The last decade witnessed enormous growth in the popularity of several social media platforms. Although these platforms are generally meant for sharing information and opinions, their ubiquity is increasingly exploited for spreading news and events in real time. Hence these platforms have become a natural choice for news agencies seeking updates, comments and expert opinions on ongoing events, which are crucial for understanding societal impact and for writing reports and editorials. However, with a plethora of platforms available, each unique in its content presentation, spreading patterns and user interests, a comparative study of the efficacy of these platforms for different journalistic purposes would be useful. In this paper, we perform a comparative study of two leading social media platforms, Reddit and Twitter. We analyze Reddit comments and Twitter feeds from six news categories to establish the efficacy of these platforms in terms of different journalistic requirements. Observations reveal significant differences across these platforms that can be suitably exploited depending upon the scope of the requirements; for example, while Twitter is a better choice for the evolutionary study of events, Reddit is the more natural choice for exploration during the initial phase of any event. While the availability and spread of updated information on Twitter can be key in emergency and disaster situations, critical analysis of posts on Reddit can be important for editorials.
Introduction
Increasing use of smartphones and high penetration of data connectivity have led to enormous growth in the popularity of social media platforms. The propensity of users to share news updates and opinions over social media platforms makes them an important source for obtaining news and reactions in real time. According to a Pew Research Center survey report from 2017,¹ 67% of adults get news from social media. News agencies consider these platforms a rich source of information and use them for mining news and user opinions [1]. The recently released Cision 2017 Global Social Journalism study revealed that nearly half of journalists cannot do their work without the help of social media.² Journalists use social media to meet various journalistic requirements, including obtaining breaking news and real-time news updates, finding opinions, enriching arguments, confirming facts and deriving information sources [2–5]. A large variety of social media platforms is currently in place, including microblogs, discussion forums, question-answer sites and social news aggregators, to name a few [6]. Although most of these platforms are used for sharing facts and updates, sharing users' views and propagating news events, the microblog and news aggregator sites currently play a predominant role in event identification, news propagation and opinion sharing [7–9]. Twitter has traditionally been recognized as the major microblogging platform for obtaining such updates; however, issues like short content lengths, large vocabulary gaps and the inherently noisy nature of tweets pose subtle challenges in mining the required information [10,11]. On the other hand, with the increasing popularity of news aggregator sites like Reddit, journalists are exploring these platforms for meeting the journalistic requirements mentioned above.
Thus, considering the diverse features, functionalities and user behavior of the microblog and news aggregation platforms, one of the current needs is to determine how suitable each platform is for mining relevant information, depending on the specific requirement.