Linguistic Corpora at the HZSK Repository

The digital repository of the Hamburger Zentrum für Sprachkorpora stores and disseminates linguistic resources and tools.

Hits: 3
http://hdl.handle.net/11022/0000-0000-4F70-A
general corpus / spoken / discourse

EXMARaLDA Demo Corpus 1.0

A selection of short audio and video recordings in various languages to be used for instruction or demonstration of the EXMARaLDA system.

Language: German, English, French, Spanish, Turkish, Polish, Vietnamese, Swedish, Norwegian, Italian, Russian, Afrikaans, Portuguese

License: HZSK-PUB (public)

Open lock icon indicates accessible resource
CLARIN icon indicates integration into CLARIN
Eye icon indicates online browsable resource
Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0007-C641-0
general corpus / spoken / encyclopedia

The Spoken Wikipedia Corpora

The Spoken Wikipedia project unites volunteer readers of Wikipedia articles. Hundreds of spoken articles in multiple languages are available to users who are – for one reason or another – unable or unwilling to consume the written version of the article. Our resource, the Spoken Wikipedia Corpus, consolidates the Spoken Wikipediae, adding text segmentation, normalization, time-alignment and further annotations, making it accessible for research and fostering new ways of interacting with the material.

Language: English, German, Dutch

License: Creative Commons Attribution-ShareAlike 4.0 International (public)

Open lock icon indicates accessible resource
Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0000-51E4-3
general corpus / spoken / discourse

Community Interpreting Database Pilot Corpus (ComInDat)

Audio and video recordings of several types of community-interpreted discourse: authentic and simulated doctor-patient communication recorded in German institutions, and courtroom communication recorded in US institutions, with varying community languages. Video recordings exist only for the simulated communication. No audio files will be made available for the authentic interpreted doctor-patient communication.

Language: German, English, Spanish, Turkish, Polish, Portuguese, Romanian, Russian, Haitian

License: HZSK-RES (restricted)

Closed lock icon indicates restricted resource
CLARIN icon indicates integration into CLARIN
Eye icon indicates online browsable resource