TwitterNews: Real time event detection from the Twitter data stream
- Subject Areas
- Artificial Intelligence, Data Mining and Machine Learning, Social Computing, World Wide Web and Web Science
- Keywords
- Event detection, Twitter, Microblog, Incremental clustering, Locality sensitive hashing, Random indexing
- Copyright
- © 2016 Hasan et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2016. TwitterNews: Real time event detection from the Twitter data stream. PeerJ Preprints 4:e2297v1 https://doi.org/10.7287/peerj.preprints.2297v1
Abstract
Research on event detection from the Twitter data stream has gained momentum in the last couple of years. Although such data is noisy and often contains misleading information, Twitter can be a rich source of information if harnessed properly. In this paper, we propose a scalable event detection system, TwitterNews, to detect and track newsworthy events in real time from Twitter. TwitterNews takes a novel approach: it combines a random indexing based term vector model with locality sensitive hashing, which allows tweets related to various events to be clustered incrementally within a fixed time. TwitterNews also incorporates an effective strategy to deal with the cluster fragmentation issue prevalent in incremental clustering. The set of candidate events generated by TwitterNews is then filtered to report the newsworthy events, along with an automatically selected representative tweet from each event cluster. Finally, we evaluate the effectiveness of TwitterNews in terms of recall and precision, using a publicly available corpus.
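The combination of random indexing and locality sensitive hashing described in the abstract can be illustrated with a minimal Python sketch. It is not the TwitterNews implementation: the class and function names, the vector dimensionality, the number of hash tables, the similarity threshold, and the rule of joining the cluster of the most similar earlier tweet are all assumptions chosen for illustration. The cluster fragmentation strategy, the newsworthiness filter, and the selection of a representative tweet are not modelled.

```python
import hashlib
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
DIM = 64          # dimensionality of the random index vectors
NONZERO = 6       # number of +1/-1 entries in each term's index vector
NUM_BITS = 4      # random hyperplanes (bits) per LSH signature
NUM_TABLES = 4    # independent LSH hash tables
THRESHOLD = 0.5   # minimum cosine similarity needed to join a cluster


def _seed(term):
    # Stable seed derived from the term, independent of PYTHONHASHSEED.
    return int.from_bytes(hashlib.md5(term.encode("utf-8")).digest()[:8], "big")


def index_vector(term):
    """Sparse ternary random index vector: NONZERO entries set to +1 or -1."""
    rng = random.Random(_seed(term))
    vec = [0.0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def tweet_vector(text):
    """Represent a tweet as the sum of the index vectors of its terms."""
    vec = [0.0] * DIM
    for term in text.lower().split():
        for i, value in enumerate(index_vector(term)):
            vec[i] += value
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalClusterer:
    """Hypothetical single-pass clusterer using random-hyperplane LSH."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(NUM_BITS)]
            for _ in range(NUM_TABLES)
        ]
        self.tables = [{} for _ in range(NUM_TABLES)]
        self.vectors = []   # vector of every tweet seen so far
        self.labels = []    # cluster label of every tweet seen so far
        self.next_label = 0

    @staticmethod
    def _signature(vec, planes):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in planes
        )

    def add(self, text):
        """Assign the tweet to an existing cluster or open a new one."""
        vec = tweet_vector(text)
        sigs = [self._signature(vec, planes) for planes in self.planes]

        # Only tweets sharing a bucket in some table are compared exactly.
        candidates = set()
        for table, sig in zip(self.tables, sigs):
            candidates.update(table.get(sig, ()))

        best_idx, best_sim = None, THRESHOLD
        for idx in candidates:
            sim = cosine(vec, self.vectors[idx])
            if sim >= best_sim:
                best_idx, best_sim = idx, sim

        if best_idx is None:
            label = self.next_label
            self.next_label += 1
        else:
            label = self.labels[best_idx]

        new_idx = len(self.vectors)
        self.vectors.append(vec)
        self.labels.append(label)
        for table, sig in zip(self.tables, sigs):
            table.setdefault(sig, []).append(new_idx)
        return label
```

Calling `add` once per incoming tweet returns a cluster label. A tweet is compared only against earlier tweets that share at least one hash bucket with it, which is what bounds the per-tweet work. Because the hashing is randomised, two similar but non-identical tweets can occasionally miss each other, which is one way fragmented clusters arise in a scheme like this.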
Author Comment
This is a preprint submission to PeerJ Preprints.