UDLV-LVTB
Latvian UD Treebank
The corpus is annotated using the Universal Dependencies (UD) dependency grammar. The data is converted from the manually annotated Latvian Treebank.
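As a rough illustration of working with UD-annotated data, the sketch below counts sentences, tokens, and lemma frequencies. It assumes the corpus is read in the standard CoNLL-U layout used for UD releases (ten tab-separated columns, blank line between sentences). The `SAMPLE` text and the `corpus_stats` helper are hypothetical and written by hand for this example; they are not taken from the treebank or its tooling, and the counting rules may differ from those behind the published corpus size.

```python
from collections import Counter

# Hand-written illustrative sample in CoNLL-U layout (NOT taken from the treebank).
SAMPLE = """\
# sent_id = example-1
# text = Suns guļ.
1\tSuns\tsuns\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tguļ\tgulēt\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

# sent_id = example-2
# text = Kaķis guļ.
1\tKaķis\tkaķis\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tguļ\tgulēt\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def corpus_stats(conllu_text):
    """Return (sentence count, token count, lemma Counter) for CoNLL-U text."""
    sentences = 0
    tokens = 0
    lemmas = Counter()
    in_sentence = False
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comments (sent_id, text, ...) carry no tokens.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            # Skip anything that is not a well-formed token line.
            continue
        token_id = cols[0]
        if "-" in token_id or "." in token_id:
            # Skip multiword-token ranges (e.g. 1-2) and empty nodes (e.g. 3.1).
            continue
        if not in_sentence:
            sentences += 1
            in_sentence = True
        tokens += 1
        lemmas[cols[2]] += 1  # column 3 is LEMMA
    return sentences, tokens, lemmas


if __name__ == "__main__":
    n_sent, n_tok, lemma_freq = corpus_stats(SAMPLE)
    print(n_sent, n_tok, lemma_freq.most_common(3))
```

To run this on real data, the same function could be given the contents of a downloaded `.conllu` file in place of `SAMPLE`.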
Citation
Publication
L. Pretkalnina, L. Rituma, B. Saulite. Deriving Enhanced Universal Dependencies from a Hybrid Dependency-Constituency Treebank. Springer, 2018.
Corpus size | 19368 sentences (328K tokens) (v2.15) |
Data period | 1991–2023 |
Development period | 2015–2024 |
Developers | Institute of Mathematics and Computer Science, University of Latvia |
Funding | European Regional Development Fund, "Full Stack of Language Resources for Natural Language Understanding and Generation in Latvian" (1.1.1.1/16/A/219); PostDoc grant No. 1.1.1.2/VIAA/1/16/118; State Research Programme "Digital Resources of the Humanities" (VPP-IZM-DH-2020/1-0001); State Research Programme "Research on Modern Latvian Language and Development of Language Technology" (VPP-LETONIKA-2021/1-0006) |
Homepage | http://sintakse.korpuss.lv/ |
CLARIN | http://hdl.handle.net/11234/1-5787 |
Other publications | L. Pretkalnina. Formāls latviešu valodas gramatikas modelis un tā realizācija mašīnlasāmā sintakses korpusā [A Formal Model of Latvian Grammar and Its Implementation in a Machine-Readable Syntax Corpus]. 2023; N. Gruzitis, L. Pretkalnina, B. Saulite, L. Rituma, G. Nespore-Berzkalne, A. Znotins, P. Paikens. Creation of a Balanced State-of-the-Art Multilayer Corpus for NLU. 2018 |