Character-Distinguishing Features in Fictional Dialogue: Quantifying Verbal Identities in Tolstoy’s War and Peace
PII
S207987840001649-2-1
Publication type
Article
Status
Published
Authors
Anastasiya Bonch-Osmolovskaya 
Affiliation: Higher School of Economics
Address: Russian Federation, Moscow
Daniil Skorinkin
Affiliation: Higher School of Economics
Address: Russian Federation, Moscow
Abstract
This paper presents a quantitative study of spoken dialogue in Leo Tolstoy’s War and Peace. Tolstoy was known to put a lot of emphasis on the language in which fictional characters express themselves, and conscious modification of their speech is acknowledged by critics as part of his literary technique. Our goal was to try and find some formal markers that would help us distinguish the characters, measure some sort of speech-based similarity between them, and cluster them into meaningful groups. At the first stage we applied some well-established approaches of stylometry (computational stylistics) that were originally developed for real-world authorship attribution and rely mainly on word and n-gram frequencies. Then we tried our own alternative method based on more formal and structure-oriented features independent of actual word choice. Both approaches produced meaningful and interpretable results, which indicate overall applicability of quantitative methods to literary studies in general and to the analysis of specific characters in particular. At the same time, the difference between the two sets of results helped us demonstrate that sometimes more formal and structure-oriented features could be more revealing and ‘noise-resistant’ than word and n-gram frequencies.
Keywords
digital literary studies, quantitative methods in literary studies, stylometry, Delta, Russian literature, Leo Tolstoy
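As a rough illustration of the frequency-based stylometry the abstract refers to, the sketch below computes a Delta-style distance (the measure named in the keywords) between the pooled speech of several speakers. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the `n_mfw` parameter, the use of the population standard deviation for z-scores, and the whitespace-tokenized input format are all choices made here for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in one token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total for w in vocab]


def burrows_delta(texts, n_mfw=100):
    """Delta-style distances between every pair of texts.

    texts: dict mapping a speaker label to a list of word tokens
           (hypothetical input format, assumed for this sketch).
    n_mfw: how many of the corpus's most frequent words to use.
    Returns a dict mapping (label_a, label_b) to the mean absolute
    difference of z-scored word frequencies.
    """
    # Pick the most frequent words over the whole corpus.
    corpus = Counter()
    for tokens in texts.values():
        corpus.update(tokens)
    vocab = [w for w, _ in corpus.most_common(n_mfw)]

    names = sorted(texts)
    freqs = {n: relative_freqs(texts[n], vocab) for n in names}

    # Z-score each word's frequency across the texts.
    z = {n: [] for n in names}
    for i in range(len(vocab)):
        column = [freqs[n][i] for n in names]
        mu, sd = mean(column), pstdev(column)
        for n in names:
            # A word with no variation carries no signal; score it 0.
            z[n].append((freqs[n][i] - mu) / sd if sd > 0 else 0.0)

    # Delta is the mean absolute z-score difference for each pair.
    delta = {}
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            delta[(a, b)] = mean(abs(x - y) for x, y in zip(z[a], z[b]))
    return delta
```

Calling `burrows_delta` on a dictionary of per-speaker token lists yields a pairwise distance table that could then be passed to any clustering routine. Smaller values mean the two speakers use the most frequent words in more similar proportions.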
Received
04.11.2016
Publication date
01.12.2016
Number of characters
61867

 
