Morphology and word order in Slavic languages: Insights from annotated corpora


2021. №4, 131–159

Yan Jianwei (a)
Liu Haitao (a, b; contact: htliu@163.com)
(a) Zhejiang University, Hangzhou, China
(b) Guangdong University of Foreign Studies, Guangzhou, China

Abstract:

Slavic languages are generally assumed to combine rich morphology with free word order. Exploring this trade-off in complexity can help us better understand the relationship between morphology and syntax in natural languages. However, few quantitative investigations of this relationship have been carried out for Slavic languages. Based on 34 annotated corpora from Universal Dependencies, this paper examines the correlation between morphology and syntax within Slavic languages by applying two metrics of morphological richness and two metrics of word order freedom. Our findings are as follows. First, the quantitative metrics adopted capture the distributions of morphological richness and word order freedom across languages well. Second, the metrics corroborate the correlation between morphological richness and word order freedom: within Slavic languages, this correlation is moderate and statistically significant, such that the richer the morphology, the less strict the word order. Third, Slavic languages can be clustered into three subgroups based on classification models; most importantly, ancient Slavic languages are characterized by richer morphology and more flexible word order than modern ones. Fourth, of the two possible confounding factors considered, corpus size does not greatly affect the results of the metrics, whereas corpus genre plays an important part in the measurement of word order freedom: the word order of formal written genres tends to be more rigid than that of informal written and spoken ones. Overall, the results from annotated corpora confirm the negative correlation between morphological richness and word order rigidity within Slavic languages. This finding may shed light on the dynamic relationship between the morphology and syntax of natural languages and provide quantitative illustrations of how languages encode lexical and syntactic information for efficient communication.

For citation:

Yan J., Liu H. Morphology and word order in Slavic languages: Insights from annotated corpora. Voprosy Jazykoznanija, 2021, 4: 131–159.