The Russian National Corpus is a representative collection of Russian texts containing more than 2 billion tokens, equipped with linguistic annotation and search tools.
Search in corpora
News
We continue to develop the functionality of the corpus for teaching Russian in schools. Six new rules related to vowel alternation in the root have been added to the Practice Example Generator:
- The letters А and О in the root -зар-/-зор-
- The letters А and О in the root -кас-/-кос(н)-
- The letters А and О in the root -мак-/-мок-
- The letters А and О in the root -плав-/-плов-/-плы(в)-
- The letters А and О in the root -рав-/-ров-
- The letters А and О in the root -твар-/-твор-
You can access the generator page from the RNC for school page by clicking on the corresponding banner.
The Social Media corpus database has been expanded with a new collection of texts from regional online sources in the Astrakhan, Vologda, Leningrad, and Sakhalin Oblasts, as well as the Republics of Karelia and Mordovia, covering the period from 2005 to 2023. The additions include posts by bloggers, discussions in local online communities, and content from regional groups on platforms such as VK, Telegram, LiveJournal, Zen, and others. The collection was prepared with the participation of staff from Voronezh State University. The update totals 5 million tokens.
The Dialect Corpus has been expanded with large new collections of Vyatka and Voronezh texts, along with other materials. The update adds 124,000 tokens and more than 16 hours of audio recordings. The corpus now also supports searching for several new grammatical features, including dialectal perfect (у меня не плачено долг) and pluperfect (наши туды были уехали), pluralia tantum, and toponyms (с Воронежа).