Linguistic Corpora at the HZSK Repository
EXMARaLDA Demo Corpus 1.0
A selection of short audio and video recordings in various languages to be used for instruction or demonstration of the EXMARaLDA system.
Language: German, English, French, Spanish, Turkish, Polish, Vietnamese, Swedish, Norwegian, Italian, Russian, Afrikaans, Portuguese
License: HZSK-PUB (public)
ALCEBLA
Audio recordings in Spanish with 23 German/Spanish simultaneous bilingual children living in Germany and attending the Spanish complementary school at the first level. There are 1-6 recordings per child; for 11 children, recordings were also made before they began attending the Spanish complementary school. All recordings feature elicited speech: a picture naming task, a story telling task, a morphosyntactic test, a lexical test, and the HAVAS 5. Rich metadata on language use and attitudes in the family was submitted by the parents.
Language: German, Spanish
License: HZSK-RES (restricted)
Community Interpreting Database Pilot Corpus (ComInDat)
Audio and video recordings of various types of community interpreted discourse with varying community languages: simulated and authentic doctor-patient communication in German institutions, and courtroom communication in US institutions. Video recordings exist only for the simulated communication. For the authentic interpreted doctor-patient communication, no audio files will be made available.
Language: German, English, Spanish, Turkish, Polish, Portuguese, Romanian, Russian, Haitian
License: HZSK-RES (restricted)
Dolmetschen im Krankenhaus (DiK)
Audio recordings of various kinds of doctor-patient communication in hospitals. There are both monolingual conversations in German, Portuguese and Turkish, recorded in the respective country, and interpreted conversations recorded in Germany (i.e. in German-Turkish, German-Portuguese, and German-Portuguese/Spanish), about 15-20 recordings of each kind. The persons interpreting are bilingual hospital employees or relatives of the patients, who are all adults living in Germany but with varying knowledge of German.
Language: German, Portuguese, Spanish, Turkish
License: HZSK-RES (restricted)
Hamburg Corpus of Argentinean Spanish (HaCASpa)
Audio and video recordings of experimental/read and spontaneous speech from adult speakers of Porteño Spanish in Argentina. Speakers are 18-69 years old and from two geographic areas. For the intonational experiments, there are audio recordings only, whereas some of the free interviews and map tasks feature video recordings. The material used as stimuli in the experiments is available with references encoded in the transcriptions.
Language: Spanish
License: HZSK-RES (restricted)
Parameterfixierung im Deutschen und Spanischen (PAIDUS)
Audio recordings of five German-speaking and five Spanish-speaking monolingual children. For the German children there are about 30 recordings (interviewer/child interaction) per child, on average starting at 9 months and ending at 3 years; for the Spanish children there are on average 15 recordings per child, ending at 2 years.
Language: German, Spanish
License: HZSK-RES (restricted)
PhonBLA Longitudinalstudie Hamburg
Audio and video recordings of four German/Spanish bilingual children, starting at approx. 1 year and 6 months and ending at age 6-7 years, with about 100 recordings (interviewer/child interaction) of each child, half of them in each language.
Language: German, Spanish
License: HZSK-RES (restricted)
Phonologie-Erwerb Deutsch-Spanisch als Erste Sprachen (PEDSES)
Audio recordings of three German/Spanish simultaneous bilingual children starting at approx. 1 year and ending at 2 or 3 years. There are 20-50 recording sessions (interviewer/child interaction) per child, half of them conducted in German and half in Spanish.
Language: German, Spanish
License: HZSK-RES (restricted)
Phon-CL2
Audio recordings of 15 German subjects in Spain (5 to 36 years old) with Spanish as L2 and AOA > 2 years. Recording sessions in Spanish are based on picture naming, story telling, etc. Rich metadata on language use and attitudes in the family was submitted by the parents.
Language: German, Spanish
License: HZSK-RES (restricted)