LATE Conversational Speech Corpus
The corpus contains recordings of informal conversations, interviews, and public speeches, together with their orthographic transcripts. Each audio recording is annotated with metadata: the speaker's gender and age group, and the form of speech (dialogue or monologue, spontaneous or prepared, etc.).
Citation
Publication
I. Auzina, N. Gruzitis, R. Dargis, G. Rabante-Busa, D. Gosko, J. Vempers, R. Kivkucans, A. Znotins
Recent Latvian Speech Corpora for Linguistic Research and Technology Development
Baltic Journal of Modern Computing, 12(4), 646-658, 2024
Data
I. Auziņa, R. Darģis, G. Rābante-Buša, I. Timinska-Ļaksa, E. Gailīte, A. Auziņa
LATE Conversational Speech Corpus (LATE-sarunas)
CLARIN-LV digital library, 2024
http://hdl.handle.net/20.500.12574/113
| Corpus size | 44 hours (429k tokens) |
| Data period | 2012–2024 |
| Development period | 2021–2024 |
| Developers | Institute of Mathematics and Computer Science UL, Institute of Literature, Folklore and Art UL |
| Funding | State Research Programme "Letonika – Fostering a Latvian and European Society" (VPP-LETONIKA-2021/1-0006) |
| CLARIN | http://hdl.handle.net/20.500.12574/113 |
| Other publications |