The Russian National Corpus is a representative collection of Russian texts, comprising more than 2 billion tokens and equipped with linguistic annotation and search tools.
News
A new Word Sketch Difference feature has been introduced in the "Word at a glance" section of the Main, Educational, and Media corpora, as well as the "From 2 to 15" and "Russian Classics" collections!
This new functionality allows users to see similarities and differences in the usage of two words. For example, you can explore what время ‘time’ and деньги ‘money’ have in common or analyze what can be колючий versus колкий (both ≈ ‘sharp, prickly’).
Word Sketch Difference is available for nouns, adjectives, verbs, and adverbs. You can compare two lemmas belonging to the same part of speech. However, sketches are not generated for words that appear in fewer than three different texts, or for proper names, abbreviations, and words with non-standard spellings.
For comparison, the top 6 collocates are selected for each keyword. The comparative table may contain fewer than 12 collocates if one or both keywords have fewer than six collocates or if there is an overlap in their top 6 lists.
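The arithmetic behind the table size can be illustrated with a minimal Python sketch. It assumes each keyword already has a ranked list of collocates. The function name `comparison_collocates` and the sample lists are hypothetical and are not taken from the corpus or its actual implementation.

```python
def comparison_collocates(first, second, top_n=6):
    """Merge the top-N collocates of two keywords without duplicates.

    `first` and `second` are ranked collocate lists (best first).
    The result keeps first-seen order, so shared collocates appear once.
    """
    merged = []
    for collocate in first[:top_n] + second[:top_n]:
        if collocate not in merged:
            merged.append(collocate)
    return merged


# Hypothetical ranked lists, invented for illustration (not corpus data).
time_collocates = ["free", "long", "short", "spend", "lose", "save", "waste"]
money_collocates = ["big", "spend", "earn", "lose", "save", "borrow"]

table = comparison_collocates(time_collocates, money_collocates)
print(len(table))  # 9: three collocates (spend, lose, save) are shared
```

In this invented case the two top-6 lists share three collocates, so the table holds 9 rows instead of 12. A keyword with fewer than six collocates would shrink it further.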
The Multimedia corpus has been expanded by 107,000 tokens. The following additions have been made: a collection of recordings of dramatic readings of short stories by Anton Chekhov, performed by renowned actors such as Alexander Borisov, Leonid Bronevoy, Igor Ilyinsky, and Rostislav Plyatt; two theatrical productions; and recordings of television talk shows. The collection of regional speech recordings has been significantly enriched. It now includes conversations and interviews with residents of the Nizhny Novgorod, Murmansk, Ryazan, Sverdlovsk, and Tver regions, as well as the Krasnodar Krai, Yakutia, and other territories. The speakers appear in documentary films from the series "Letters from the Provinces" and in video blogs.
The corpus now supports filtering subcorpora by region.
The Birchbark Letters and Inscriptions corpora now feature photographs and drawings of the original historical texts.
By default, preview images are displayed in the concordance: photographs are shown on the left, and drawings are on the right. Clicking on an image opens it in full-screen mode, where users can zoom in or out on the drawings and photographs and download them as needed.
In KWIC mode and when selecting a subcorpus, images can only be viewed in full-screen mode by clicking the icon to the right of the text header.
There is a setting to hide images. The choice is saved in the user’s browser, so on returning to the corpus, results will still be displayed without images.
This new functionality was made possible through collaboration with the development teams of gramoty.ru and epigrafika.ru. These platforms provide more detailed information about the letters and inscriptions. We extend our gratitude to our colleagues and look forward to continued successful collaboration.
The "Russian Classics" corpus has been updated with the academic editions of complete works by Alexander Griboyedov and Fyodor Tyutchev. Their written legacy is relatively small in size (with Tyutchev having written even less in Russian than the "one-book author" Griboyedov). However, their language is of significant interest from various perspectives. The corpus also features variant readings in different revisions of the same texts. All the texts within the corpus have been re-annotated, incorporating improvements in the Rubic language model.