Search results distribution graph

The graph of search results by date shows how often examples occur within a given subcorpus. Frequencies and their smoothing are always calculated across the entire chronological span of the corpus. The user sees the segment of the graph bounded by the dates specified for the subcorpus, with smoothing set to 3.

You can set other dates within the subcorpus. To do this, enter new time boundaries, for instance from 1900 to 2000. In the Regional Media corpus, you can opt for a detailed view by day and month, such as from March 1, 1920, to March 1, 1950. When you switch from the Regional Media corpus to another corpus, the graph is plotted by year.

Smoothing the graph helps you discern the overall trend behind random frequency fluctuations. For example, smoothing at 10 years averages the frequency of a word over the preceding and following five years. To see the exact, unsmoothed value for each date, set smoothing to 0.
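The smoothing described above can be pictured as a centered moving average. The following is only a minimal sketch, not the corpus's actual implementation: the function name smooth, the rule that the window extends smoothing // 2 dates to each side, and the way the window is shortened at the edges of the data are all assumptions made for illustration.

```python
def smooth(freq_by_year, smoothing):
    """Centered moving average over a dict {year: frequency}.

    Assumption: a smoothing value of N averages each year with the
    N // 2 years before it and the N // 2 years after it. Years that
    are missing from the data are skipped rather than counted as zero.
    """
    if smoothing == 0:
        # No smoothing: return the exact values for each date.
        return dict(freq_by_year)

    half = smoothing // 2
    smoothed = {}
    for year in freq_by_year:
        # Collect the values that fall inside the window around this year.
        window = [
            freq_by_year[y]
            for y in range(year - half, year + half + 1)
            if y in freq_by_year
        ]
        smoothed[year] = sum(window) / len(window)
    return smoothed


raw = {1900: 10.0, 1901: 0.0, 1902: 20.0, 1903: 0.0, 1904: 30.0}
print(smooth(raw, 0))  # unchanged values
print(smooth(raw, 2))  # each year averaged with its neighbours
```

With smoothing set to 0 the function returns the input unchanged, which matches the advice to switch smoothing off when you need the value for one specific date.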

Clicking the Show button generates the updated graph.

Hovering the mouse over any point on the line reveals the relative usage frequency for a specific date, expressed in ipm (instances per million). The ipm frequency is the number of occurrences of a word for a date (e.g., for a year), divided by the size of the corpus for that date and multiplied by one million.
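As a worked illustration of this definition, the calculation can be sketched as follows. The function name ipm and the sample numbers are hypothetical and are not taken from the corpus.

```python
def ipm(occurrences, corpus_size):
    """Relative frequency in instances per million.

    occurrences: number of hits for one date (e.g. one year)
    corpus_size: size of the corpus for that same date
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    # Multiply first so that whole-number results stay exact.
    return occurrences * 1_000_000 / corpus_size


# Hypothetical example: 150 occurrences in a year with 3,000,000 words.
print(ipm(150, 3_000_000))
```

Because the result is normalized by the corpus size for each date, years with very different amounts of text can be compared on the same scale.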

Using the windows that display dates and frequencies on the graph, you can zoom in on or out of particular sections of the graph and navigate through the values on the axes. This is useful when you have a large amount of data and want to focus on a narrower date or frequency range.

Click Download to save the graph as a picture file.

Below the graph, on the right, there is a link to the Google Ngram Viewer service, which operates on the Russian-language collection of Google Books texts. Please note that, although the underlying idea is similar, the formulas for calculating relative frequency in the RNC and in Google Ngram Viewer differ.

Count of texts

In some corpora, beneath the graph, you will find warming stripes that show the number of texts in which examples were found in the given subcorpus. It is important to consider how evenly the texts are distributed when analyzing the graph. An uneven distribution of texts can affect the appearance of the graph when smoothing is applied.

In such instances, to verify results, we recommend considering a graph without smoothing.

Graph comparisons

In the Query Comparison option, separate warming stripes are displayed below the graph for each query involved in the comparison.

Beneath the graph, you will find tables containing the data used to generate the graph. Clicking the Show tables button reveals tables with the absolute usage frequencies for each period.

Hyperlinks in the left table let you explore examples from the corpus. The year you select is stored as a temporary parameter of your custom subcorpus. You can remove it by clicking the Reset button.

The right table provides the total number of words in the texts for the specified period.

If a text was written over a long period of time, such as from 2015 to 2021, the absolute frequency of the word is distributed evenly over the entire period (in this example, 1/7 for each year). In the tables, frequencies for such periods are shown on a separate row.
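The even distribution over a multi-year period can be sketched as below. This is only an illustration under stated assumptions: the function name spread_evenly is hypothetical, and the period is assumed to include both its first and last year.

```python
def spread_evenly(count, start_year, end_year):
    """Split an absolute frequency evenly across an inclusive year range."""
    if end_year < start_year:
        raise ValueError("end_year must not be earlier than start_year")
    n_years = end_year - start_year + 1  # 2015 to 2021 is 7 years
    share = count / n_years
    return {year: share for year in range(start_year, end_year + 1)}


# One occurrence in a text dated 2015 to 2021 contributes 1/7 to each year.
for year, value in spread_evenly(1, 2015, 2021).items():
    print(year, value)
```

The shares always add back up to the original count, so the total absolute frequency for the whole period is unchanged.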

If you open the Graph from search results that you reached through a link in the table, you cannot change the distribution, because in this case there is not enough data to plot the graph in other units. To use all of the graph's features, set up a subcorpus with a full range of dates.

Updated on 04.12.2023