Referential choice in spoken and written stories: A comparative study based on the corpus “Funny life stories”
Marina A. Shumilina
Lomonosov Moscow State University, Moscow, Russia; Institute of Linguistics, Russian Academy of Sciences, Moscow, Russia; mari.an.shum@iling-ran.ru
Abstract:
The study examines referential choice in spoken and written Russian discourse. I treat referential choice as a three-way opposition between full noun phrases, pronouns, and zero noun phrases. The data are stories, each told by its narrator twice: once in spoken and once in written form. In each story, all noun phrases were identified and annotated for 29 parameters. I trained logistic regression models and decision trees on the collected samples and analyzed factor importance diagrams derived from the decision trees. The models and diagrams show that some factors affect referential choice differently in spoken and written discourse, for instance grammatical role, semantic hyperrole, and sloppy identity between the anaphor and the antecedent. The models also show that the sets of significant factors for the two samples are not identical: in particular, the referent’s animacy and the anaphor’s semantic hyperrole appear only in the decision tree for written discourse.
For citation:
Shumilina M. A. Referential choice in spoken and written stories: A comparative study based on the corpus “Funny life stories”. Voprosy Jazykoznanija, 2024, 6: 133–159.
Acknowledgements:
I am very grateful to A. A. Kibrik and the two anonymous reviewers for their extremely helpful suggestions and remarks on this work, and to G. Dobrov, A. Bolshina, and K. Studenikina for their recommendations on preliminary data processing and on the choice and application of machine-learning algorithms.