Corpora with tag latgalian (3)

MuLa2022

Corpus of Contemporary Latgalian Texts 2022

1988–2021, 2M words (2.8M tokens)
Developers: RAT, IMCS UL

BolsuTolka

Bolsutolka.lv Speech Corpus (Common Voice 17.0)

2023–2024, 24 hours (130k tokens)
Developers: RATA, IMCS UL, ILFA UL, LATA

MuLa2012

Corpus of Contemporary Latgalian Texts 2012

1988–2012, 1M words (1.3M tokens)
Developers: IMCS UL, RAT
B. Saulīte, R. Darģis, N. Grūzītis, I. Auziņa, K. Levāne-Petrova, L. Pretkalniņa, L. Rituma, P. Paikens, A. Znotiņš, L. Strankale, K. Pokratniece, I. Poikāns, G. Bārzdiņš, I. Skadiņa, A. Baklāne, V. Saulespurēns, J. Ziediņš.
Latvian National Corpora Collection – Korpuss.lv
Proceedings of the 13th Language Resources and Evaluation Conference (LREC), 2022, pp. 5123–5129