The Russian National Corpus is a representative collection of Russian texts containing more than 2 billion tokens and equipped with linguistic annotation and search tools.
Search in corpora
News
Personal accounts are now available on the Corpus website.
Their main purpose is to support users' individual workflows. You can now save queries (in any corpus) or comparisons between them (in the corpora where this function is implemented) to your personal account and return to them when necessary.
To save a query or comparison, click the “Save query” button in the search result or “Save comparison” on the query comparison page. The corresponding tabs of your personal profile enable viewing the saved queries and comparisons, assigning names to them, copying short links for sharing, and deleting the queries. The number of queries and comparisons that can be saved is unlimited.
The profile settings have been expanded. Users can fill in their personal information (this data can only be seen by the user), change their password or delete their account. In the future, with the consent of the user, some of their data, such as name and affiliation, will become visible to others.
The personal account is available in both the desktop and mobile versions of the website.
The Old East Slavic corpus has been expanded with new texts and has grown by 43 thousand tokens. On the one hand, it now includes later texts of the 14th century (e.g., Ukrainian and Moscow business charters, the Pskovian Tale of Dovmont); on the other hand, the annotation of some early texts (the Tale of Bygone Years according to the Laurentian manuscript, hagiographies) has been extended. The vocabulary of the corpus now includes the ancestors of such familiar words as naprasno ‘unexpectedly; (Modern Russian) in vain’, peremolvit'sa ‘exchange words’, šapka ‘headwear’, or raznoglasie ‘discord’.
Users can select a subcorpus and get statistics by standard criteria (including the date of the text and of the copy, text genre, and text size) and find out, for example, whether the protagonists of chronicles go somewhere more often than those of charters and tales. It is now possible to search for Greek lemmas and word forms in translated texts. Greek words can be typed on the virtual keyboard. For example, the word δόγμα (‘dogma’) is rendered by the Slavic translators not only as a direct borrowing but also as ‘command’, ‘doctrine’, or ‘statute’.
A new feature, “Word Forms”, is available in the Word at a glance function of the OES corpus. For Old East Slavic nouns in different orthographies, the paradigm of all the number and case forms encountered in the corpus is given, and it is possible to find out the frequency of these forms and follow the links to the word. For example, you can find out what forms the word drug ‘friend’ had. Some forms of the rarely used dual number are not yet attested in our corpus, so you should also consult grammars for the full paradigms.
Technical maintenance will be conducted on our servers between September 12, 20:00 and September 13, 18:00 (Moscow time).
The Corpus website may be intermittently unavailable during this period.
The Poetry corpus within the Russian National Corpus is one of the most representative poetry corpora in the world today. All major trends in Russian poetry of the 18th-20th centuries are represented in it.
A unique feature of the RNC Poetry Corpus is its verse annotation. It captures both the properties of the text as a whole and the structure of the verse and of its individual lines.
Thanks to the verse annotation, it is possible to solve a variety of tasks. For example, one can get information about the frequency of fables as a poetic genre in Russian literature: there are 940 fables in the corpus out of 101,521 texts as of August 2024. Using sorting tools, one can find out that the first fable in the corpus was written in 1731 by the satirical poet Antioch Kantemir, who was also a diplomat in the Russian service.
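To put the fable counts above into perspective, the share of fables among all texts can be worked out directly. The snippet below is only an illustrative sketch: the two counts are taken from the paragraph above, while the function and variable names are hypothetical and are not part of any Corpus tool.

```python
# Counts reported above (as of August 2024).
FABLES = 940
TOTAL_TEXTS = 101_521


def share_percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


# Hypothetical helper name; plain arithmetic on the published counts.
fable_share = share_percent(FABLES, TOTAL_TEXTS)
print(f"Fables make up {fable_share:.2f}% of the texts")  # about 0.93%
```

In other words, fewer than one text in a hundred in the Poetry Corpus is a fable.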
Verse scholars will be more interested in the annotation of verse structure, which makes it possible, for example, to study the history of Russian word stress and of pronunciation in general.
More information about the structure and practical applications of the Poetry Corpus can be found in the Publications section.