Word at a glance

Word at a glance is a word portrait that shows how a word behaves in a given corpus. It shows the word's grammatical and semantic features, similar words, typical combinations with other words in sentences, usage examples from the corpus, and the distribution of examples by year and by text properties.

To see Word at a glance for the Main corpus, click the corresponding banner on the RNC main page.

You can switch to Word at a glance from the pages of the other corpora through the Corpus selection menu.

This feature is only available for the corpora in the new RNC interface.

How to search

To show a word's portrait in the Word at a glance mode, specify:

  • its lemma (its basic word form). You can use the suggest tool or the virtual keyboard to type in the lemma. If you specify a word form that does not match any lemma, information about grammar, semantics, and similar words will not be displayed.
  • its part of speech. If you do not select any part of speech, information about all parts of speech for this word that occur more than 5 times in the corpus will be shown. 

Click Show the portrait to see detailed information about the word.

If the word you searched for can belong to more than one part of speech, you will be able to switch between these parts of speech to see different portraits.

The portrait of the word is built based on the full corpus without taking into account the user's subcorpus. However, when switching from the portrait of the word to the search results, the subcorpus selection will be applied, so the examples of the word's usage that are given in the portrait may not coincide with the first examples in the search results.

Word sketches

The Sketches widget shows how the query word interacts with other Russian words. It lists the word's collocations, grouped by syntactic relation and by the collocate's part of speech, which together cover the bulk of its uses. The most representative set of syntactic relations differs for each part of speech:

For nouns:

  • adjectives defining the noun
  • verbs for which the noun is the subject
  • verbs for which the noun is the direct object
  • verbs for which the noun is an indirect object without a preposition
  • verbs for which the noun is an indirect object with a preposition

For verbs:

  • nouns acting as subjects
  • nouns acting as direct objects
  • nouns acting as indirect objects without a preposition
  • nouns acting as indirect objects with a preposition

For adjectives:

  • nouns defined by the adjective
  • adverbial modifiers

For adverbs:

  • verbs modified by the adverb
  • adjectives modified by the adverb

The widget shows up to 10 collocates for each sketch, ranked by the Dice metric. The list of collocates may be empty if the search for collocations of a noun, adjective, verb or adverb with a given syntactic relation has not yielded results. Sketches are not displayed for other parts of speech.

To see more sketches, use the scrollbar, or the slider on mobile devices.
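As a rough illustration of ranking by the Dice metric, the sketch below uses the standard Dice coefficient, 2·f(x, y) / (f(x) + f(y)), and keeps at most 10 collocates. The function names, the input layout and the counts are assumptions made for this example; the exact variant of the metric and the counting rules used by the RNC are not described here.

```python
def dice(pair_freq: int, word_freq: int, collocate_freq: int) -> float:
    """Standard Dice coefficient: 2 * f(x, y) / (f(x) + f(y))."""
    total = word_freq + collocate_freq
    return 2 * pair_freq / total if total else 0.0


def top_collocates(word_freq, candidates, limit=10):
    """Rank candidate collocates by Dice and keep at most `limit` of them.

    `candidates` maps a collocate to (pair_freq, collocate_freq).
    Candidates that never co-occur with the word are dropped, so the
    result can be an empty list.
    """
    scored = [
        (collocate, dice(pair_freq, word_freq, collocate_freq))
        for collocate, (pair_freq, collocate_freq) in candidates.items()
        if pair_freq > 0
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]


# Made-up counts, for illustration only.
candidates = {"strong": (30, 70), "hot": (40, 360), "green": (5, 95)}
ranking = top_collocates(100, candidates)
for collocate, score in ranking:
    print(f"{collocate}: {score:.3f}")
```

Note that a collocate with a lower joint frequency ("strong") can outrank a more frequent pair ("hot") when it is more exclusive to the query word.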

Currently the sketches are only available in the Main corpus. 

About the word

The About the word widget shows the grammatical and semantic features of the word.

For nouns, adjectives, verbs and adverbs, you can get the most complete grammatical information.

Similar words

The Similar words widget displays the closest semantic associates of the word. The proximity coefficient, which you can see by hovering the mouse over a word in the Word cloud, is calculated using distributional semantics models built on the actual materials of the Main corpus of the RNC. The closer the coefficient is to 1, the larger the word appears in the Word cloud, and the more similar its contexts should be to the contexts of the keyword.
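In distributional semantics models, proximity between two words is commonly measured as the cosine similarity of their vectors. The sketch below assumes that measure; this page does not state which measure the RNC models actually use, and the vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; 0.0 by convention here
    return dot / (norm_a * norm_b)


# Made-up three-dimensional vectors; real models use hundreds of dimensions.
keyword = [0.9, 0.1, 0.3]
associate = [0.8, 0.2, 0.4]
unrelated = [0.0, 1.0, 0.0]

print(cosine_similarity(keyword, associate))  # close to 1
print(cosine_similarity(keyword, unrelated))  # much lower
```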

The current version of Similar words works only in the Main corpus and only shows semantic associates of the same part of speech for nouns, verbs, adjectives and adverbs. For proper names, toponyms, abbreviations and words that have non-standard spellings or are rarely found in the corpus, similar words are not displayed.

Because the selection of associates is fully automatic, the lists may contain errors, for example, incorrectly formed words or word associations that are not intuitively clear.

Distribution by year

In several corpora, a graph of the word's usage frequency by year is available. Frequency is measured in ipm (instances per million word forms).

You can use a ready-made graph that includes examples of the word usage for all years, or refine the displayed results by changing the time period.

Smoothing the graph allows you to see the overall trend behind random frequency fluctuations. For example, smoothing at 10 years averages the word frequency over the preceding and subsequent 5 years. To get accurate data for each year, you can set smoothing to 0.
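The smoothing described above can be sketched as a centered moving average. The sketch assumes one value per consecutive year and a window that is simply truncated at the ends of the series; how the RNC treats the edges or years without data is not specified here.

```python
def smooth(values, smoothing):
    """Centered moving average over per-year values.

    `smoothing` is the full span in years: half of it is taken before
    and half after each year. A smoothing of 0 returns the raw values.
    """
    half = smoothing // 2
    result = []
    for i in range(len(values)):
        # The window is truncated at the start and end of the series.
        window = values[max(0, i - half): i + half + 1]
        result.append(sum(window) / len(window))
    return result


# Made-up ipm values for five consecutive years.
raw = [0, 10, 0, 10, 0]
print(smooth(raw, 0))  # raw values, unchanged
print(smooth(raw, 2))  # each year averaged with its neighbours
```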

By moving the mouse to any point on the line, you can see the relative usage frequency (ipm) for a particular year. The ipm value is defined as the number of occurrences of a word in a year divided by the size of the corpus in that year and multiplied by one million.
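The ipm definition above translates directly into code. The function name and the counts below are illustrative.

```python
def ipm(occurrences: int, corpus_size: int) -> float:
    """Occurrences per million word forms for one year."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return occurrences * 1_000_000 / corpus_size


# Made-up example: 150 occurrences in a year that contributes
# 3,000,000 word forms to the corpus.
print(ipm(150, 3_000_000))
```

Because the value is normalised by the corpus size for each year, years with very different amounts of text can be compared on the same graph.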

Distribution of texts

The pie chart shows in which types of corpus texts the query word occurs. You can select a meta-attribute for which to build a chart from the list of the most representative attributes of the corpus, as well as the unit of size measurement: texts or words. When switching the meta-attribute and/or unit of measurement, the chart is redrawn.

The chart shows the distribution of the top 10 meta-attribute values. The remaining values are merged into the Other category. To the right of the chart is the list of values and their percentages. When you hover the mouse over a sector of the chart, you can see the name of the value and the corresponding number of words or texts that include the query lemma.
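The grouping into the top 10 values plus Other can be sketched as follows. The function name, the input layout and the counts are assumptions made for this example.

```python
def chart_slices(counts, top_n=10):
    """Return (value, count, percentage) for the top values plus "Other".

    `counts` maps a meta-attribute value to the number of texts (or words)
    that include the query lemma.
    """
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    slices = ordered[:top_n]
    rest = ordered[top_n:]
    if rest:
        # Everything beyond the top values is merged into one sector.
        slices.append(("Other", sum(count for _, count in rest)))
    total = sum(counts.values())
    return [
        (name, count, 100 * count / total if total else 0.0)
        for name, count in slices
    ]


# Made-up counts for twelve values of some meta-attribute.
counts = {f"value {i + 1}": 12 - i for i in range(12)}
for name, count, share in chart_slices(counts):
    print(f"{name}: {count} ({share:.1f}%)")
```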

Text distribution visualization is not yet available in all corpora.

Usage examples

The widget contains five usage examples of the word in the corpus. To select the examples, the lexical and grammatical search by lemma and part of speech is used. Display settings:

  • sorting as set in the corpus by default
  • no more than one example from each document
  • user's subcorpus is not taken into account
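The selection rules above can be sketched as a single pass over search hits that are already in the corpus's default order. The tuple layout and the sample hits are assumptions made for this example.

```python
def pick_examples(hits, limit=5):
    """Keep the first `limit` hits, taking at most one per document.

    `hits` is an iterable of (document_id, sentence) pairs in the
    default sort order of the corpus.
    """
    seen_documents = set()
    picked = []
    for document_id, sentence in hits:
        if document_id in seen_documents:
            continue  # this document has already contributed an example
        seen_documents.add(document_id)
        picked.append((document_id, sentence))
        if len(picked) == limit:
            break
    return picked


# Made-up hits: the second hit repeats document "d1" and is skipped.
hits = [
    ("d1", "first sentence"),
    ("d1", "second sentence"),
    ("d2", "third sentence"),
    ("d3", "fourth sentence"),
]
print(pick_examples(hits))
```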

By clicking Show more examples, you can switch to the full search results (Concordance mode).

The word portrait is built based on the full corpus without taking into account the user's subcorpus. However, if you specified a subcorpus earlier, it is applied when you click the Show all examples button. In this case, the usage examples given in the portrait may not match the first examples in the search results.

The Word at a glance feature (word portraits) is available in all corpora in the new interface. In some corpora, some of the widgets are not available.
