The Dialect Corpus has been in development since 2005. It includes recordings of dialect speech from various regions of Russia and from neighbouring countries such as Belarus and Azerbaijan, representing both early-settlement dialects (North, Center, West, South) and later-settlement dialects (Volga region, Caucasus, Urals, Siberia, Far East). The corpus covers spontaneous speech and personal narratives as well as folklore texts in prose and verse. About one third of the texts are accompanied by audio and video recordings that correspond to the entire text, not just the excerpt shown in the search results.
The Dialect Corpus grows mainly from two sources: previously published regional dialect readers, which were typically issued in small print runs as dialectology teaching aids for students at local universities, and fieldwork materials from dialectological expeditions submitted to the Corpus. Field materials are accepted in phonetic transcription, in a phonologized (orthography-based) format, or even in near-standard orthography, preferably with stress marks and dialect-specific grammatical features preserved. Submissions accompanied by audio recordings are encouraged.