LATE-sarunas

LATE-conversational

The corpus contains recordings of informal conversations, interviews, and public speeches, together with their orthographic transcripts. Each audio recording is annotated with metadata: the speaker's gender and age group, and the form of speech (dialogue or monologue, spontaneous or prepared, etc.).

Citation
Publication
I. Auzina, N. Gruzitis, R. Dargis, G. Rabante-Busa, D. Gosko, J. Vempers, R. Kivkucans, A. Znotins
Recent Latvian Speech Corpora for Linguistic Research and Technology Development
Baltic Journal of Modern Computing, 12(4), 646-658, 2024
Data
I. Auziņa, R. Darģis, G. Rābante-Buša, I. Timinska-Ļaksa, E. Gailīte, A. Auziņa
LATE-conversational (LATE-sarunas)
CLARIN-LV digital library, 2024
http://hdl.handle.net/20.500.12574/113
Corpus size: 44 hours (429k tokens)
Data period: 2012–2024
Development period: 2021–2024
Developers: Institute of Mathematics and Computer Science UL; Institute of Literature, Folklore and Art UL
Funding: State Research Programme "Letonika – Fostering a Latvian and European Society" (VPP-LETONIKA-2021/1-0006)
CLARIN: http://hdl.handle.net/20.500.12574/113
Other publications
I. Auzina and G. Rabante-Busa
Sarunvalodai tipiskie fonētiskie līdzekļi: runas korpusa datu analīze [Phonetic devices typical of colloquial language: an analysis of speech corpus data]
Valoda: nozīme un forma, 15, 7-23, 2024