Corpus portrait

The corpus portrait is a tool for analysing the characteristics of a corpus and assessing whether it suits your academic or educational purposes.

Here you can also learn how the corpus was created, who worked on it, and which ideas underlie it.

Corpus header

The corpus header is always shown at the top. It provides high-level information about the corpus.

The header shows the corpus size, both in texts and in words.

All RNC corpora are annotated with tags that classify them by historical period, text type, orthography, special annotation, etc.

You can open any corpus portrait by clicking the "About the corpus" (i) button in its header.

Portrait structure

The corpus portrait includes:

  • description
  • actual statistics (currently in the Main, Educational and Media corpora, some historical corpora, as well as "Russian classics" and "From 2 to 15"; to be added to all other corpora later)
  • diachronic statistics (currently only in the Main and Regional & International media corpora; to be added to other corpora later)
  • frequency dictionary (currently in the Main, Educational, Old East Slavic, Middle Russian and Media corpora, as well as "Russian classics" and "From 2 to 15"; to be added to all other corpora later)
  • list of authors (only in the Poetry Corpus)

To open the search page of the corpus, click the Magnifier button on the blue bar.

You can generate a short link to the corpus portrait by clicking the Copy link button.

Use the corpus selection menu near the corpus title to see the portraits of other corpora.

To compare your own subcorpus with the whole corpus, use the Subcorpus portrait.

Updated on 28.05.2024