RNC News
The collections in the Accentological and Spoken corpora have been updated. We added transcripts of expert talks, oral memories, and everyday dialogic speech. These texts were recorded in different regions, including Moscow, Tomsk, and Voronezh Oblasts and the Republics of Buryatia and Mari El.
We would like to thank the following for collecting and processing the texts: students and staff of Voronezh State University, students of Lomonosov Moscow State University, Grigori Korotkikh (Ilshat association, Tomsk), and Egor Kashkin (Group for the Study of Contact Interaction of Russian with Indigenous Languages of Russia, Vinogradov Russian Language Institute).
The size of the Spoken Corpus amounts to 14.8 million tokens, and the total size of the Accentological Corpus, including naive poetry, is 134.8 million tokens.
In both corpora, it is now possible to select texts by number of word forms. In the subcorpus selection form, regions in the Spoken Corpus are now grouped by country for easier searching.