LATE-conversational
The corpus contains recordings of informal conversations, interviews, and public speeches, together with their orthographic transcripts. Each audio recording is annotated with metadata: the speaker's gender and age group, and the form of speech (dialogue or monologue, spontaneous or prepared, etc.).
Corpus size | 35 hours (347k tokens) |
Data period | 2012–2024 |
Development period | 2021–2024 |
Developers | Institute of Mathematics and Computer Science, University of Latvia; Institute of Literature, Folklore and Art, University of Latvia |
Funding | State Research Programme "Letonika – Fostering a Latvian and European Society" (VPP-LETONIKA-2021/1-0006) |