Ziņas

Articles from Latvian news portals

The corpus includes articles from various Latvian news portals. The paragraphs of the articles are sorted alphabetically. If a paragraph occurs more than once, only its oldest occurrence is included in the corpus.
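The paragraph handling described above can be illustrated with a minimal sketch. It is not the developers' actual pipeline: the function name `build_corpus_paragraphs`, the input shape (a publication date plus a list of paragraph strings per article), and the use of plain code-point ordering instead of Latvian collation are all assumptions made for illustration.

```python
from datetime import date


def build_corpus_paragraphs(articles):
    """Deduplicate paragraphs, keeping the oldest occurrence, then sort by text.

    `articles` is an iterable of (published, paragraphs) pairs, where
    `published` is a datetime.date and `paragraphs` is a list of strings.
    This input shape is hypothetical.
    """
    oldest = {}  # paragraph text -> earliest publication date seen so far
    for published, paragraphs in articles:
        for text in paragraphs:
            text = text.strip()
            if not text:
                continue  # skip empty paragraphs
            if text not in oldest or published < oldest[text]:
                oldest[text] = published
    # Assumption: plain string ordering stands in for "alphabetical";
    # a real Latvian sort would need locale-aware collation.
    return sorted(oldest.items(), key=lambda item: item[0])


# Hypothetical sample data, not taken from the corpus.
articles = [
    (date(2021, 5, 1), ["Zebra.", "Alpha."]),
    (date(2020, 3, 2), ["Alpha.", "Middle."]),
]
result = build_corpus_paragraphs(articles)
```

One consequence of such a scheme is that the original order of paragraphs within an article is not preserved, so the corpus is suited to sentence- and paragraph-level queries rather than to reading whole articles.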

Corpus size: 357.2M words (513.5M tokens)
Data period: 2020–2022
Development period: 2022
Developers: Institute of Mathematics and Computer Science, University of Latvia
Funding: State Research Programme "Digital Resources of the Humanities" (VPP-IZM-DH-2020/1-0001)