Corpus of Contemporary Latgalian Speech

The corpus consists of audio recordings and their transcripts. It documents natural, spontaneous speech drawn from field research recordings, interviews, and TV and radio broadcasts.

Citation
Publication
A. Juško-Štekele and A. Kļavinska
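Mūsdienu latgaliešu valodas runas korpusa izveide mazāk lietoto valodu dokumentēšanas kontekstā [Creating the Corpus of Contemporary Latgalian Speech in the context of documenting lesser-used languages]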
Letonica, 226-242, 2022
Data
S. Martena, N. Nau, A. Kļavinska, A. Juško-Štekele, A. Kociņš-Kūceņš, A. Sprukte, A. Briška, I. Gusāns, L. Mazure
Corpus of Contemporary Latgalian Speech (MuLaR)
CLARIN-LV digital library, 2024
http://hdl.handle.net/20.500.12574/118
Corpus size: 27 hours (200k tokens)
Data period: 2009–2021
Development period: 2021–2024
Developers: Rezekne Academy of Technologies
Funding: State Research Programme "Digital Resources for Humanities" (VPP-IZM-DH-2020/1-0001); State Research Programme "Diversity of Latvian in Time and Space" (VPP-LETONIKA-2021/4-0003)
Homepage: https://mularkorpuss.rta.lv/#!/
CLARIN: http://hdl.handle.net/20.500.12574/118