Linguistic Corpora at the HZSK Repository
Corpus type: general corpus
A selection of short audio and video recordings in various languages, intended for instruction in or demonstration of the EXMARaLDA system.
Language: German, English, French, Spanish, Turkish, Polish, Vietnamese, Swedish, Norwegian, Italian, Russian, Afrikaans, Portuguese
License: HZSK-PUB (public)
Subcorpus 1 presents part of the euroWiss-Corpus, which covers communication in teaching and learning discourse at German and Italian universities, in the humanities as well as in the technical and natural sciences. It offers access to transcriptions of lectures and seminars aligned with audio recordings, together with the text types used for instruction. The corpus comprises 18 communications, 24 audio recordings, 24 transcriptions, 140,000 transcribed words, 19 identified speakers, 18 students' notes, 2 lecture scripts, 24 chalkboard presentations, 2 PowerPoint presentations, 3 overhead slides, 3 handouts, and 14 schedules/descriptions of the recorded lectures and seminars.
Language: German, Italian
License: HZSK-ACA (academic)
Audio recordings (semi-spontaneous interviews) of German/Italian and German/French bilingual speakers aged approximately 15 to 55 at the time of recording. The simultaneous bilinguals, with German and French/Italian as L1s, were recorded twice, once in each language. The successive bilinguals, with German as L1 and French/Italian as L2 or with French/Italian as L1 and German as L2, all have AOAs between 11 and 38 years and were recorded using their L2.
Language: German, French, Italian
License: HZSK-RES (restricted)