Multilingual parallel corpora: Alternative source of language data for typological studies, applying perspectives and problems


2019. №2, 111–125

Lyubov V. Nesterenko
National Research University Higher School of Economics, Moscow, Russia; lyu.klimenchenko@gmail.com

Abstract:

In this paper, we discuss the prospects of using multilingual parallel corpora as a source of language data for cross-linguistic studies. Multilingual parallel corpora make it possible to apply quantitative methods to cross-linguistic data, yet they have not become popular among researchers. The reasons are the lack of multilingual parallel corpora suitable for linguistic studies and the absence of unified guidelines for their development. We analyse the factors that make multilingual parallel corpora difficult to use in linguistic experiments and present some ideas about the features one should take into account when building such corpora for typological studies.

For citation:

Nesterenko L. V. Multilingual parallel corpora: Alternative source of language data for typological studies, applying perspectives and problems. Voprosy Jazykoznanija, 2019, 2: 111–125.