text (36)
speech (10)
general (11)
specialised (35)
morphology (41)
syntax (3)
semantics (1)
error annotation (2)
manually annotated (9)
diachronic (7)
web (3)
learner (2)
literary (5)
parallel (1)
parliamentary (1)
historical (2)
newspapers (5)
representative (9)
latgalian (5)
blog (3)
folklore (3)
Corpora with tag manually annotated (9)
LATE-sarunas
LATE Conversational Speech Corpus
2012–2024, 44 hours (429k tokens)
Developers: IMCS UL, ILFA UL
BolsuTolka
Bolsutolka.lv Speech Corpus (Common Voice 19.0)
2023–2024, 29 hours (160k tokens)
Developers: IMCS UL, RTU Rezekne, ILFA UL, LATA
FullStack-LV
Full Stack of Latvian Language Resources
1991–2018, 13691 sentences
Developers: IMCS UL
R. Darģis, B. Saulīte
Korpuss.lv – a Versatile Platform for Digital Humanities
Baltic Journal of Modern Computing, 12(4), 2024, pp. 636–645
B. Saulīte, I. Auziņa, R. Darģis
Latvian National Corpora Collection Korpuss.lv | Nacionālā korpusu kolekcija Korpuss.lv
Linguistica Lettica, 31(1), 2023, pp. 202–223
B. Saulīte, R. Darģis, N. Grūzītis, I. Auziņa, K. Levāne-Petrova, L. Pretkalniņa, L. Rituma, P. Paikens, A. Znotiņš, L. Strankale, K. Pokratniece, I. Poikāns, G. Bārzdiņš, I. Skadiņa, A. Baklāne, V. Saulespurēns, J. Ziediņš
Latvian National Corpora Collection – Korpuss.lv
Proceedings of the 13th Language Resources and Evaluation Conference (LREC), 2022, pp. 5123–5129