RNC News
In the Regional Corpus, keyword annotations in texts have been updated. The use of keywords facilitates the analysis of narrow thematic categories and helps navigate texts
New texts totaling about 100,000 tokens have been added to the Dialect corpus. The new texts represent the dialects of the north (Arkhangelsk Oblast, Karelian and Komi
New features in the Parallel Corpus make it easier to work with.
In the Japanese texts, the Semantics search field has been added to the bilingual
Texts by four poets (Vadim Shefner, Robert Rozhdestvensky, Lev Loseff and Maria Stepanova) have been added to the Poetry corpus. The size of the update is
The Main corpus of the RNC has been expanded by 15 million tokens, representing several thematic collections: plays of different periods, official texts and legalese, academic journals,
The Russian MultiPARC has been expanded to almost 300 thousand tokens. It now features Chekhov's play “Three Sisters” as staged by four different theaters: Gorky Moscow Art
The Parallel Corpus has been expanded by 3 million tokens. Half of this amount consists of English-language non-fiction texts (popular science and journalism). In addition, the