Evaluating social network extraction for classic and modern fiction literature
- Subject Areas
- Computational Linguistics, Digital Libraries, Network Science and Online Social Networks
- Keywords
- social networks, named entity recognition, evaluation, digital humanities, classic and modern literature
- Copyright
- © 2018 Dekker et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2018. Evaluating social network extraction for classic and modern fiction literature. PeerJ Preprints 6:e27263v1 https://doi.org/10.7287/peerj.preprints.27263v1
Abstract
The analysis of literary works has experienced a surge in computer-assisted processing. To obtain insights into the community structures and social interactions portrayed in novels, the creation of social networks from novels has gained popularity. Many methods rely on identifying named entities and relations to construct these networks, but the tools they use are often not specifically created for the literary domain. Furthermore, many studies on information extraction from literature focus on 19th century source material. Because of this, it is unclear whether these techniques are as suitable for modern-day science fiction and fantasy literature as they are for those 19th century classics. We present a study that compares classic literature to modern literature in terms of both the performance of natural language processing tools for the automatic extraction of social networks and the structure of the resulting networks. We find that there are no significant differences between the two sets of novels, but that both are subject to high variance. Furthermore, we identify several issues that complicate named entity recognition in modern novels and present methods to remedy them.
Author Comment
This is a submission to PeerJ Computer Science for review.