Using MURCO



0. Shortlist of terms

Clixt — the pair clip + text, i.e. the transcript of the speech segment of the clip; a clixt is the main unit of the MURCO data. If a clip does not contain speech, but only non-verbal words and gestures, the unit of the data is a clip.

Accessory — all objects which are involved in this or that gesture (the extenders, the spoilers and the neutral objects).

Active organ — the part of the main organ which executes a gesture; usually an active organ is the most mobile part of the human body during the gesture execution. When the extreme reduction of a gesture takes place, the gesture comes exactly to the movement of the active organ. Every gesture is executed with this or that active organ, so to describe a gesture from the objective point of view the active organ is sure to be specified.

Adaptor — an object which is not a part of the speaker's body, but is an absolutely necessary component of a gesture (for instance, the adaptor external object for the deictic gestures). One of the most important adaptors is the interlocutor.

Alienation — a speech act and the meaning of one of the gestures of inner state, which reflect the speaker's dissociation from something or somebody or the speaker's reluctance to have something in common with some speech or event.

Authenticity — 1) the degree of the speaker's ownership or the speaker's possession of a gesture; 2) the degree of a gesture genuineness.

Autointerruption — the communicational situation when a speech act or a gesture is stopped by the speaking or gesticulating person themselves.

Cited gestures — the situation when the speaker cites the other's speech and applies to themselves the gestures which have accompanied the cited remark.

Closed questions — the questions which may be answered only with "yes" or "no".

Critical speech acts, critical gestures — the speech acts and the gestures whose meaning includes any kind of criticism.

Double beginning — the speech manner: two attempts to pronounce a word.

Echo — the repetition of the part of the interlocutor1's remark by the interlocutor2 (— It's very bad. — Very bad).

Envelope repetition — the repetition which sets off the phrase (I hate this town, I hate it!).

Etiquette question — the indirect question, which is used in the etiquette communicational situations; any etiquette speech act, which is presented in the form of a question (for example, Perhaps I may know what to call you? in the situation of acquaintance).

Extender — an object which takes part in a gesture and substitutes the active organ (totally or partially).

Feedback question — the interlocutor2's question which makes the interlocutor1 acknowledge 1) his or her previous remark, or 2) the fact that the interlocutor1 is in touch with the interlocutor2.

Gesture instead of word — the speech act which ends in a gesture replacing the words.

Inarticulate question — a question which is put by means of an inarticulate non-verbal word or of a mumbling.

Incomplete utterance — a speech act which is not finished by the speaker because of the speaker's certitude that the termination of the speech act is absolutely predictable.

Indirect questions — any speech act which is a question in form, but not in substance (for example, the request Would you be so kind as to let me pass?).

Information feedforward — the speech act which notifies an interlocutor about the immediately consequent speech act.

Intercepted utterance — the communication situation when the interlocutor2 finishes the remark which has been begun by the interlocutor1.

Interlocutor (in gesticulation) — an adaptor; any person who is the addressee of this very gesture; consequently, an interlocutor may be silent and may not be the participant of this very communication act at all.

Interruption — the communication situation when a speech act or a gesture is stopped under the influence of the external circumstances.

Main organ — the zone of the human body where the gestures take place: head, torso, one arm, both arms, one leg, both legs. Each main organ has its own set of the active organs.

Mirror gesture — the gesture which is the interlocutor1's repetition of the gesture of the interlocutor2.

Modal utterances — the speech acts which reflect the speaker's attitude to any event (for example, complaint, opinion).

Open questions — the questions which need substantial answers and cannot be answered with "yes" or "no".

Parcelling — the speech manner which is characterized by the change of the inter-verbal pauses into the inter-phrasal ones.

Passive organ — any organ or any part of the human body, which plays the passive role in gesticulation, i.e. which is immobile or moves restrictedly. A gesture may be executed without participation of any passive organ.

Performatives — the speech acts, the pronunciation of which is the action (for example, to swear).

Performed gesture — any gesture which accompanies musical or poetic composition.

Reduced gesture — a gesture, the execution of which is not full.

Relay — the repetition of the end of the previous phrase at the beginning of the next phrase (I'm afraid. I'm afraid of loneliness).

Speaking in measured tones — the manner of speech which is characterized by longer pauses between words.

Spoiler — any object, which prevents the execution of a gesture.

Syllabification — the manner of speech which is characterized by the partitioning of the words into the syllables.

Transformed gesture — a gesture which changes into another gesture or into any other body movement while executing.

Ventriloquism — the manner of speech which is characterized by immobility of the speaker's lips.

How to customize your own subcorpus?

At the main search page click the bookmark “customize subcorpus” («задать подкорпус»):

The subcorpus customization window will open. Here you may choose the main parameters of the required text (the author, the title, the date of creation, the author’s gender and the date of his/her birth). You may also select the theme, the type and the genre of the texts (for example, the different types of spoken speech):

You can also customize the subcorpus according to the sociological characteristics of the speakers who take part in a clixt (the name of the actor, his/her age, the date of his/her birth, and his/her gender):

Once your subcorpus is customized, click one of the buttons at the bottom of the page:

If you click the button "Clear", all the previously specified parameter values will be removed, and you can customize your subcorpus from the very beginning.

If you click the button "Clear subcorpus", the command "Clear" will be executed, and you will be returned to the main search page. In this case, you can search the MURCO as a whole, not the customized subcorpus.

If you click the button "Next", the parameters of your subcorpus will be saved and you will be taken to the next page. Here you can review the clixts you have selected:

If the list of the clixts is satisfactory, click the button "Save subcorpora and go to the search page"; if it is not, click the bookmark "Select subcorpus" and return to the page "customize subcorpus". Once your subcorpus has been saved, subsequent searches will be run on this subcorpus only.
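Conceptually, customizing a subcorpus amounts to filtering clixts by their metadata. The Python sketch below illustrates only that idea. The `Clixt` record, its field names, the function `customize_subcorpus` and the sample values are all hypothetical; they are not the MURCO data format or any real interface of the corpus.

```python
from dataclasses import dataclass


@dataclass
class Clixt:
    # Hypothetical metadata record; the field names are illustrative only.
    title: str
    year: int
    actor: str
    actor_gender: str  # "m", "f" or "unknown"


def customize_subcorpus(clixts, **criteria):
    """Keep the clixts whose fields equal every given criterion.

    With no criteria the whole list is returned, which mirrors
    searching the corpus as a whole.
    """
    return [
        c for c in clixts
        if all(getattr(c, field) == value for field, value in criteria.items())
    ]


# Invented sample data, for illustration only.
corpus = [
    Clixt("Film A", 1965, "Actor One", "m"),
    Clixt("Film A", 1965, "Actor Two", "f"),
    Clixt("Film B", 1980, "Actor Two", "f"),
]

subcorpus = customize_subcorpus(corpus, actor_gender="f", year=1980)
print([c.title for c in subcorpus])
```

Each additional criterion narrows the selection, just as each filled-in field of the customization form does.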

Main search page in MURCO

The main search page of the Multimodal Russian Corpus has the same structure as all the other search pages in the Russian National Corpus, but includes some additional parameters which expand the user’s search possibilities.

Search by exact form

The row “Search by exact form” in the MURCO has the same structure as the corresponding row in the Accentological subcorpus of the RNC. Here you may search for the exact forms of a word or a phrase, taking into account the position of the stress mark and the difference between the letters “е” and “ё”. So, the query “polu’chite” ‘you will receive’ returns only the future tense of the verb “poluchit’” ‘to receive’, while the query “poluchi’te” ‘receive, 2nd person, Pl’ returns only the imperative. In the same way, the forms “chert” (= черт) ‘a feature, GenPl’ and “ch’ort” (= чёрт) ‘devil, NomSg’ will be distinguished:

After typing the required form, click the button “Search”.

If you don’t have a Russian keyboard layout to type the required form, you may use the virtual keyboard, which is located next to the search row and looks like this: АБВ.
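The difference between an exact-form search and a looser comparison can be sketched in Python. This is only an illustration of the principle, not the corpus's implementation: the function names are hypothetical, and the stress mark is written here as a plain ASCII apostrophe after the stressed vowel.

```python
def loose_key(form):
    """Ignore stress marks and fold ё into е (a stress-insensitive comparison)."""
    return form.lower().replace("'", "").replace("ё", "е")


def same_exact_form(query, token):
    """Exact-form comparison: the stress position and е/ё both matter."""
    return query.lower() == token.lower()


# The two forms of "poluchit'" differ only in stress position.
print(loose_key("polu'chite") == loose_key("poluchi'te"))
print(same_exact_form("polu'chite", "poluchi'te"))

# "черт" and "чёрт" differ only in е/ё.
print(loose_key("чёрт") == loose_key("черт"))
print(same_exact_form("чёрт", "черт"))
```

Under the loose comparison both pairs coincide; under the exact comparison they are kept apart, which is the behaviour the exact-form row is described as providing.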

Parameters of lexico-grammatical and semantic search

To find all the forms of a lemma, type the vocabulary form of the lemma in the word cell within the zone “Lexico-grammatical search”:

If you want to find a word-combination, type the 1st element in the 1st row, and the 2nd one in the 2nd row. If you need to find a three-element construction, click the down arrow near the 2nd row. The 3rd row appears:

The distance between the words within a word-combination is customized with the parameter “Distance”. By default the distance is zero:

To find a wordform with certain morphological features, click the button “select”. In the dialog box that opens, select the required characteristics:

Select the desired parameter point and click the button OK.

Similarly, to customize the semantic features, click the button “select”, and after marking the required parameter points in the dialog box, click the button OK:

Using the additional features (punctuation marks, capital letters, repetitions), you may refine your query:

If necessary, the additional features may be selected for every word of a word-combination.

Finally, click the button “Search”:

Retrieval of orthoepic data

Every token in the MURCO is supplied with a list of the combinations of vowel and consonant letters which occur in this token and may be important for research on normative and non-normative articulation. Within a word, the combinations of 1) two or more consonants, and 2) two or more vowels are marked up. At the word boundaries, the combinations 1) consonants + consonants, 2) vowels + vowels, 3) consonants + vowels, and also 4) a vowel combination + any word beginning are marked up.
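The within-word part of this markup can be illustrated with a short Python sketch that collects runs of two or more vowel letters and runs of two or more consonant letters. It is a simplified approximation under stated assumptions: the function name `letter_runs` is hypothetical, and treating ь and ъ as neither vowels nor consonants (so that they break a run) is an assumption of this sketch, not a documented rule of the corpus.

```python
VOWELS = set("аеёиоуыэюя")
SIGNS = set("ьъ")  # assumption: the soft and hard signs break a run


def letter_runs(word):
    """Return (vowel_runs, consonant_runs), keeping only runs of length >= 2."""
    vowel_runs, consonant_runs = [], []
    run, run_kind = "", None
    # The trailing space is a sentinel that flushes the last run.
    for ch in word.lower() + " ":
        if ch in VOWELS:
            kind = "V"
        elif ch.isalpha() and ch not in SIGNS:
            kind = "C"
        else:
            kind = None
        if kind != run_kind:
            if len(run) >= 2:
                (vowel_runs if run_kind == "V" else consonant_runs).append(run)
            run, run_kind = "", kind
        if kind is not None:
            run += ch
    return vowel_runs, consonant_runs


print(letter_runs("страна"))
print(letter_runs("аэропорт"))
```

For "страна" the sketch finds the consonant run "стр"; for "аэропорт" it finds the vowel run "аэ" and the consonant run "рт".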

To find the required combination of letters, type it in the cell “Orthoepic structure” and click the button “Search”:

To find any combination of letters at the word boundaries, use the underscore character (_) as a sign of a word boundary:

If you want to find a combination of the final vowels of a token before any beginning of the next word, use a query of this form:

Retrieval of tokens of the same accentual structure

Since the positions of the real (not normative) stress are marked up in the MURCO, a user may find tokens with the same accentual contour regardless of their morphological and semantic characteristics. That is to say, a query may take into account 1) the number of syllables, 2) the quality and the number of the stressed syllable, 3) the quality and the number of the pretonic syllable, 4) the quality and the number of the post-tonic syllable. To form your query, click the bookmark “select” near the cell “Accentual structure” and select the required parameters in the dialog box. For example, the query ‘3rd stressed syllable + 1st pretonic vowel o’ is formed below:

After clicking the button OK in the dialog box, the search cell should look like this:

It means that 1) the stressed syllable is the third one, counting from the beginning of a token, 2) the first pretonic syllable (i.e. the first syllable to the left of the stressed one) contains the vowel o, 3) neither the quality nor the number of the post-tonic syllables is specified, and neither is the total number of syllables. In response to the query, you will retrieve the clixts which include the tokens georgi’ny (noun), potomu’ (conj), horosho’ (adv), zaodno’ (adv), storone’ (noun), gastrono’m (noun), razvora’chivajetsja (verb) and so on.

If you need to specify more than one pretonic or post-tonic vowel, add the required parameter directly in the search cell:

A query of this form means that you want to repeat the previous search, but the second pretonic vowel must be a. In response to the query, you will retrieve the tokens zaodno’, razvora’chivajetsja, Krasnore’chensk, nahozhu’ and so on.
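The logic of the two accentual queries above can be checked with a small Python sketch that works on the transliterated tokens quoted in this section. It is an illustration under stated assumptions, not the corpus's query engine: the function names are hypothetical, the stress mark is a plain ASCII apostrophe placed after the stressed vowel, and the transliterated vowels are taken to be a, e, i, o, u, y.

```python
VOWELS = set("aeiouy")  # assumption: vowel letters of the transliteration


def accent_profile(token):
    """Return (list of vowels, 1-based number of the stressed syllable)."""
    vowels, stressed = [], None
    for ch in token.lower():
        if ch in VOWELS:
            vowels.append(ch)
        elif ch == "'" and vowels and stressed is None:
            # The apostrophe follows the stressed vowel.
            stressed = len(vowels)
    return vowels, stressed


def matches(token, stressed_syllable, pretonic=None):
    """Check the stressed syllable number and any pretonic vowels.

    `pretonic` maps a distance to the left of the stress (1 = first
    pretonic syllable) to the required vowel.
    """
    vowels, stressed = accent_profile(token)
    if stressed != stressed_syllable:
        return False
    for distance, vowel in (pretonic or {}).items():
        index = stressed - 1 - distance  # 0-based position in `vowels`
        if index < 0 or vowels[index] != vowel:
            return False
    return True


tokens = ["georgi'ny", "potomu'", "zaodno'", "Krasnore'chensk", "poluchi'te"]

# Query 1: 3rd syllable stressed, 1st pretonic vowel o.
print([t for t in tokens if matches(t, 3, {1: "o"})])

# Query 2: the same, but the 2nd pretonic vowel must be a.
print([t for t in tokens if matches(t, 3, {1: "o", 2: "a"})])
```

The second query is a strict narrowing of the first: every token it returns also satisfies the first query.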

How can we find a speech act?

In a small part of the MURCO (the so-called “deeply annotated” MURCO, or DA-MURCO), the different types of speech acts and gestures are marked up. A special interface has been developed for searching them. Because of its size, it is collapsed by default and looks like this:

To begin the search for a speech act, click the bookmark “customize” and select the required parameters in the search window:

The initial four points characterize the clixt situation from the point of view of sociology (the number of participants, their gender, the language used in a clixt, and the type of sociological situation).

If you want to search through all types of clixts, don’t select any limitation in the sociological zone of the annotation.

The next zone of the annotation includes about 150 varieties of speech acts, which are organized in 13 groups. To find out what types of speech acts are marked up in the MURCO and to which group a given speech act belongs, consult the help list by clicking the question mark near the first row:

In this window, the marked-up speech acts are printed in bold, and the groups of speech acts are given in parentheses (so call, prohibition and statement should be searched for in the groups appeals, negations and affirmations, respectively).

If you click the button near the name of any group of speech acts, for example, the group “The other’s speech”, a dialog box will appear which contains the list of the corresponding speech acts. If you click the button “Invert” (= “to invert the selection”) in the dialog box, you can find all cases of the other’s speech without distinguishing its different kinds:

Besides, a user may select the clixts which include the utterances of different degree of completeness (parameter “Completeness of speech act”), the different types of repetitions (parameter “Types of repetitions”), the utterances characterized by this or that manner of speech (“Manner of speech”), containing the variety of non-verbal words, vocal gestures, physiological activities, and interjections:

Select the required point and click the OK button. Then click the button “Search” on the main search page:

How can we find a gesture?

In a small part of the MURCO (the so-called “deeply annotated” MURCO, or DA-MURCO), the different types of speech acts and gestures are marked up. A special interface has been developed for searching them. Because of its size, it is collapsed by default and looks like this:

To begin the search for a gesture, click the bookmark “customize” and select the required parameters in the search window:

Select the required point. Then click the button “Search” on the main search page:

Sociological characteristics of gesture

The parameter “Speaker’s (actor’s) name” lets you select the clixts in which a certain actor is gesticulating (the name of an actor can be selected from the list):

The gender of a speaker (an actor), naturally, may be masculine or feminine, but in special cases (for example, if the gesticulating person cannot be seen) the gender may be marked up as unknown:

The gender of a character is more complicated than the gender of an actor, because there might be a situation when the real gender of an actor does not coincide with the gender of a character.

The age of an actor and of a character are described conventionally (a child, a teenager, an adult, an aged person):

Recall that the exact age of a speaker (if it is known) may be selected while customizing the subcorpus by means of the sociological characteristics (see How to customize your own subcorpus?). In addition, the age of a speaker and the date of his/her birth are specified in every remark.

Objective characteristics of gesture

1. A user may define the main organ which forms a gesture (for example, a user may select clixts/clips which include head movements). A certain active organ which correlates with a given main organ may also be defined (for example, it is possible to select clixts/clips which include gestures executed with the forefinger). To find out which main and active organs are annotated in the MURCO, click the bookmark “Main organ” or the bookmark “Active organ”.

Once the main/active organ is specified, click the OK button (naturally, you can select more than one active organ). You may also select a passive organ

or an adaptor

2. The objective characteristic of a gesture includes also its spatial and temporal features. The temporal characteristics of a gesture describe it from the point of view of the multiplication factor:

It should be taken into account that the names of the multiple gestures are formed by means of the imperfectives, while the single gestures are named by the use of the perfectives (kivat’ ‘to nod many times’ vs. kivnut’ ‘to nod once’). The spatial characteristic of every gesture means the indication of the movement directions of an active organ. It should be mentioned that every movement direction is supplied with the list of the active organs which are specific for this very movement direction:

For the hand/arm gestures, the annotation also includes such spatial characteristics as “palm orientation” and “hand orientation”:

Evidently, the former largely determines the latter, so the hand orientation should be selected in line with the palm orientation.

3. Finally, the objective characteristic of a gesture specifies the objects which are used during the gesture execution, and their role in the gesticulating act. Consequently, a user can select the extenders, spoilers and accessories which take part in a gesture:

Subjective characteristics of gesture

The gestures in the MURCO are described from the subjective (or functional-pragmatic) point of view, which means the definition of the triads gesture type / gesture meaning / gesture name. At the moment, 14 gesture types are defined in accordance with the functions and the provenance of the gestures (the provenance is an important parameter for the adopted gestures, which would be distributed among the other 13 types if they had not been adopted). To find out what gestures are marked up in the MURCO and to which type a given gesture belongs, consult the help list by clicking the question mark near the row “Gestures”:

In the help list, the gesture meanings are listed in alphabetical order and printed in bold; the corresponding gesture types are indicated in parentheses after each gesture meaning, followed by all the gesture names which correlate with this meaning, in italics.

If you want to analyze all the gestures of a given type or some individual gesture meaning, click the bookmark “Gesture type” or “Gesture meaning”:

If you want to collect the clixts/clips which include gestures with the same Russian name, click the bookmark “Gesture name”:

The dialog box that opens lets you define (if necessary) the meaning of the required gesture more precisely (for example, for the gesture otmahnut’sja ‘to wave away’ you may select the meaning prenebrezhenije ‘neglect’ (gesture of inner state) or ne stoit blagodarnosti ‘not at all’ (etiquette gesture)).
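The triad gesture type / gesture meaning / gesture name, and the way one gesture name can correspond to several meanings, can be modelled in a few lines of Python. This is a minimal sketch: the class `GestureTriad` and the function `find_by_name` are hypothetical names, and the only data used are the two readings of otmahnut'sja mentioned above (the stress-free transliteration is written with an ASCII apostrophe).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GestureTriad:
    gesture_type: str
    meaning: str
    name: str


# The two readings of the gesture named in the text above.
TRIADS = [
    # prenebrezhenije = 'neglect'
    GestureTriad("gesture of inner state", "prenebrezhenije", "otmahnut'sja"),
    # ne stoit blagodarnosti = 'not at all'
    GestureTriad("etiquette gesture", "ne stoit blagodarnosti", "otmahnut'sja"),
]


def find_by_name(name, meaning=None):
    """Select triads by gesture name, optionally narrowed by meaning."""
    return [
        t for t in TRIADS
        if t.name == name and (meaning is None or t.meaning == meaning)
    ]


for triad in find_by_name("otmahnut'sja"):
    print(triad.meaning, "/", triad.gesture_type)
```

Searching by name alone returns both readings; adding a meaning narrows the result to one triad, which is what the dialog box is described as allowing.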

Additional characteristics of gesture

You may specify what manifestation of emotions accompanies a given gesture. To do so, click the bookmark “Emotions” and select the required point from the list in the dialog box that opens:

A gesture may also be characterized from the point of view of its completeness. To select the required parameter, click the bookmark “Completeness of gesture” and mark it in the dialog box:

Finally, the gestures are described from the point of view of their authenticity. Click the bookmark “Authenticity of gesture” and select one of the values in the dialog box:

Output interface

Every query opens an output window with the following interface:

  1. To start a video, click the big yellow arrow in the middle of the video.

  2. To choose a better video quality, click the button “Quality” (upper left corner of the video; outlined with the white and black circles).

  3. To download a video, click the bookmark “Download mp4” in the upper right corner of the video (outlined with the white and black circles).

  4. The video duration is shown in the lower left corner of the video (the red triangle).

  5. The lower row (the yellow rectangle) is intended for the unique name of the video in the database of the MURCO.

  6. The green arrow indicates the table of the gestures which are marked up in the clixt/clip (naturally, in the case that the clip has been annotated from the gesticulation point of view); the table contains the name of an actor (if it is known), the gender (it is important in the cases when the name of an actor is unknown: it facilitates the reference marks within the video), the active organ, the name and the meaning of a gesture. The aim of the table is to help a user to understand what the marked-up gestures in the clip are and how they have been interpreted.

  7. The orange pentagon designates the bookmark (three dots) which leads you to this very clixt and lets you get its unique address on the Web:



Russian National Corpus
© 2003–2017
info@ruscorpora.ru