RNC News

The parallel corpus has been expanded by 3 million tokens. Half of the new material consists of English-language non-fiction texts (popular science and journalism). In addition, the German and Spanish language pairs have been updated, mainly with works of fiction.

Subcorpus selection by dialect is available for the three language pairs that include field transcripts of spoken texts: Veps, Karelian, and Khakas.