Methods of data mining in the task of distinguishing between folklore and author’s texts
Liudmila V. Shchegoleva
Aleksandr A. Lebedev
Nikolai D. Moskin
Petrozavodsk State University, Petrozavodsk, Russia; perevodchik88@yandex.ru
Abstract:
The main problem of the study is distinguishing folklore texts from texts stylized as folklore by means of mathematical methods and computer technologies. Five groups of texts were considered: folklore songs from Zaonezhie of the 19th to early 20th century, Luga songs from the repertoire of the Gorodensky folk choir, and poems by N. A. Klyuev, A. K. Tolstoy and S. A. Yesenin stylized as folklore. Eight parameters were used to compare the texts on the basis of their graph-theoretical models. These parameters were then used in a series of experiments carried out in the R environment and involving five methods of data mining. All methods showed a fairly high average recognition accuracy (more than 80%).
For citation:
Shchegoleva L. V., Lebedev A. A., Moskin N. D. Methods of data mining in the task of distinguishing between folklore and author’s texts. Voprosy Jazykoznanija, 2020, 2: 61–74.