Corpus of Spoken Russian

The Corpus of Spoken Russian includes recordings of public and spontaneous spoken Russian and transcripts of Russian movies. The spoken material is transcribed in standard orthography. Lexical, morphological, and semantic queries are supported. Users can also build their own subcorpora, optionally selecting texts by sociological parameters. The corpus contains samples of different genres and types and of different geographic origins (Moscow, Saint Petersburg, Saratov, Ulyanovsk, Taganrog, Ekaterinburg, and others).


A list of scientific publications on the Corpus of Spoken Russian is available via the link. To find other types of publications related to the corpus, use the filters in the "Publications" section.

Updated on 16.09.2024