The Computer Analysis of Latin Texts: Topic Modeling of “Historia de Regibus Gothorum, Vandalorum et Suevorum” by Isidore of Seville
PII
S207987840009681-8-1
DOI
10.18254/S207987840009681-8
Publication type
Article
Status
Published
Authors
Aleksey Kuznetsov 
Affiliation: Institute of World History RAS
Address: Russian Federation, Moscow
Abstract

The article applies modern text-mining methods to the Latin text of “Historia de regibus Gothorum, Vandalorum et Suevorum” by Isidore of Seville, in particular topic modeling, which extracts hidden semantic structures from the text. The main task of the study was to clarify Isidore's attitude toward the three barbarian peoples. The analysis was performed in the R programming language. Latent Dirichlet Allocation, a probabilistic topic model, was chosen as the topic modeling method. The main research tool was the UDPipe package for R. Topic modeling relied on a pre-trained model created as part of the Universal Dependencies project and based on the Index Thomisticus treebank. In building the topic model, particular attention was paid to the quality of text preprocessing and to selecting the optimal number of topics using a topic coherence metric. The article concludes by analyzing how the identified topics are distributed across the sections of Isidore's text.
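
The abstract does not say which topic coherence metric was used, so the following is only a minimal sketch of one common choice, a UMass-style score based on how often a topic's top words occur together in the same section. It is written in Python for illustration, although the study itself used R. The function name `umass_coherence`, the toy lemmatized sections, and the word lists are all invented for this example and are not taken from the article.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass-style coherence of one topic.

    Sums, over ordered pairs of top words, the value
    log((co-document frequency + 1) / document frequency of the earlier word).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for earlier, later in combinations(top_words, 2):
        denom = doc_freq(earlier)
        if denom == 0:
            # Assumption of this sketch: ignore words absent from the corpus.
            continue
        score += math.log((doc_freq(earlier, later) + 1) / denom)
    return score


# Hypothetical, already lemmatized "sections" of a Latin text.
sections = [
    ["rex", "gothus", "bellum", "regnum"],
    ["rex", "gothus", "regnum", "annus"],
    ["vandalus", "africa", "rex", "bellum"],
    ["suevus", "gallaecia", "regnum", "annus"],
]

# Words that share sections score higher than words that never co-occur.
coherent = umass_coherence(["rex", "gothus", "regnum"], sections)
incoherent = umass_coherence(["gothus", "africa", "gallaecia"], sections)
print(coherent > incoherent)
```

When choosing the number of topics, a score of this kind would typically be averaged over all topics of each candidate model, and the candidate with the highest average preferred.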

 

Keywords
Isidore of Seville, early medieval historiography, computational text analysis, topic modeling, Latent Dirichlet Allocation, topic coherence
Received
12.11.2019
Publication date
12.05.2020
Number of characters
29393
