Linguistic Corpora at the HZSK Repository
EXMARaLDA Demo Corpus 1.0
A selection of short audio and video recordings in various languages to be used for instruction or demonstration of the EXMARaLDA system.
Language: German, English, French, Spanish, Turkish, Polish, Vietnamese, Swedish, Norwegian, Italian, Russian, Afrikaans, Portuguese
License: HZSK-PUB (public)
Community Interpreting Database Pilot Corpus (ComInDat)
Audio and video recordings of various types of community-interpreted discourse with varying community languages: simulated and authentic doctor-patient communication in German institutions, and courtroom communication in US institutions. Video recordings exist only for the simulated communication. For the authentic interpreted doctor-patient communication, no audio files will be made available.
Language: German, English, Spanish, Turkish, Polish, Portuguese, Romanian, Russian, Haitian
License: HZSK-RES (restricted)
Consecutive and Simultaneous Interpreting (CoSi)
Audio and video recordings of three lectures in Portuguese, professionally interpreted into German: one simultaneously and two consecutively. For the simultaneously interpreted lecture, there are different recordings and transcriptions for the participants.
Language: German, Portuguese
License: HZSK-RES (restricted)
Dolmetschen im Krankenhaus (DiK)
Audio recordings of various kinds of doctor-patient communication in hospitals. There are both monolingual conversations in German, Portuguese and Turkish, recorded in the respective country, and interpreted conversations recorded in Germany (German-Turkish, German-Portuguese, and German-Portuguese/Spanish), with about 15-20 recordings of each kind. The persons interpreting are bilingual hospital employees or relatives of the patients, who are all adults living in Germany but with varying knowledge of German.
Language: German, Portuguese, Spanish, Turkish
License: HZSK-RES (restricted)