Linguistic Corpora at the HZSK Repository

The digital repository of the Hamburger Zentrum für Sprachkorpora (HZSK) stores and disseminates linguistic resources and tools.

Hits: 3
http://hdl.handle.net/11022/0000-0007-E1D5-A
general corpus / spoken / conv: conversations

INEL Selkup Corpus

Selkup is an endangered Samoyedic language (Uralic family). The INEL Selkup corpus is composed of texts from the archive of Angelina Ivanovna Kuzmina (1924–2002), who gathered a large amount of material on Selkup in almost all regions where the Selkup people lived in 1962–1977. Most texts in the corpus originate from the handwritten part of the archive; the others come from sound recordings made by A.I. Kuzmina, which were transcribed and translated within the INEL project. Each text in the corpus is provided with morphological glossing, translations into English, Russian and German, and annotation of borrowings. Some texts also have annotations for syntactic structure, semantic roles and information status.

Language: Selkup

License: CC BY-NC-SA 4.0 (public)

Availability: open access; integrated into CLARIN; browsable online; downloads available
http://hdl.handle.net/11022/0000-0000-772F-7
general corpus / spoken / discourse

Catalan in a bilingual context (PhonCAT)

Audio recordings of prompted, read and spontaneous speech data from L1 Catalan speakers from Barcelona. The data is stratified according to three different city districts and three age groups. Speakers' ages range from approximately 5 to 45 years.

Language: Catalan

License: HZSK-RES (restricted)

Availability: restricted access; integrated into CLARIN; browsable online
http://hdl.handle.net/11022/0000-0000-A0D3-C
general corpus / spoken / discourse

Faroese Danish Corpus Hamburg 0.2.dan (FADAC-0.2.dan Hamburg)

Audio recordings of semi-structured interviews with bilingual speakers (aged 16–89 years) from various geographical areas on the Faroe Islands. For 37 of the 56 subjects there are recordings in both their L1 Faroese and their L2 Danish. Only the Danish data is available.

Language: Danish

License: HZSK-RES (restricted)

Availability: restricted access; integrated into CLARIN; browsable online