RNC News

The interface of the Middle Russian corpus has been significantly updated, and the corpus now supports the Overview feature.

The Regional corpus has a new output type: Frequency. It shows the statistical distribution of search results by lemma, by word form, and by set of grammatical features. Frequencies are calculated on texts with automatically resolved homonymy, over a random subsample of 1 million search results. Users can set the confidence level used to compare frequency confidence intervals.
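To illustrate how a frequency with a confidence interval can be estimated from a sample of search results, here is a minimal Python sketch. It is not the RNC implementation: the announcement does not say which interval method the corpus uses, so the Wilson score interval is an assumption, and the function names `wilson_interval` and `frequency_table` are hypothetical.

```python
import math
from collections import Counter
from statistics import NormalDist


def wilson_interval(count, total, confidence=0.95):
    """Wilson score interval for the proportion count / total.

    Assumption: the corpus may use a different interval method;
    Wilson is chosen here only as a common, well-behaved option.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = count / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def frequency_table(lemmas, confidence=0.95):
    """Map each lemma to (count, lower bound, upper bound) of its share in the sample."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    return {
        lemma: (n, *wilson_interval(n, total, confidence))
        for lemma, n in counts.most_common()
    }


if __name__ == "__main__":
    # Toy sample standing in for the lemmas of a subsample of search results.
    sample = ["a"] * 30 + ["b"] * 70
    for lemma, (n, low, high) in frequency_table(sample, confidence=0.95).items():
        print(f"{lemma}: n={n}, share in [{low:.3f}, {high:.3f}]")
```

Raising the confidence level widens each interval. When the intervals of two items do not overlap, that suggests their frequencies differ at the chosen level.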

The Dialect corpus has been updated and now contains 604 thousand tokens.
SynTagRus has grown by 30 thousand tokens.

The corpus and subcorpus frequency dictionaries now list the top 500 lemmas rather than the top 100.
