Tīmeklis2007
Latvian Web Corpus 2007
The Latvian Web Corpus 2007 contains 700,000 Latvian webpages published before 2005. The corpus is automatically annotated.
Citation
Publication
J. Dzerins and K. Dzonsons
Harvesting national language text corpora from the Web
2007
Data
J. Džeriņš, K. Džonsons
Latvian Web Corpus 2007 (Tīmeklis2007)
CLARIN-LV digital library, 2007
http://hdl.handle.net/20.500.12574/46
Corpus size | 99M words (123M tokens) |
Data period | 1991–2005 |
Development period | 2006–2007 |
Developers | Institute of Mathematics and Computer Science, University of Latvia |
Funding | Research and Development of the Semantic Web Technologies for Latvia (SemTi-Kamols) |
CLARIN | http://hdl.handle.net/20.500.12574/46 |