text (30)
speech (7)
general (11)
specialised (26)
morphology (31)
syntax (3)
semantics (1)
error annotation (2)
manually annotated (5)
web (2)
learner (2)
literary (4)
parallel (1)
parliamentary (1)
diachronic (2)
newspapers (5)
representative (9)
latgalian (3)
blog (2)
Corpora with tag manually annotated (5)
BolsuTolka
Bolsutolka.lv Speech Corpus (Common Voice 16.1)
2024, 15 hours (85k tokens)
Developers: RATA, IMCS UL, ILFA UL, LATA
FullStack-LV
Full Stack of Latvian Language Resources
2017–2019, 13691 sentences
Developers: IMCS UL
B. Saulīte, R. Darģis, N. Grūzītis, I. Auziņa, K. Levāne-Petrova, L. Pretkalniņa, L. Rituma, P. Paikens, A. Znotiņš, L. Strankale, K. Pokratniece, I. Poikāns, G. Bārzdiņš, I. Skadiņa, A. Baklāne, V. Saulespurēns, J. Ziediņš.
Latvian National Corpora Collection – Korpuss.lv
Proceedings of the 13th Language Resources and Evaluation Conference (LREC), 2022, pp. 5123–5129