Linguistic Corpora at the HZSK Repository
INEL Selkup Corpus
Selkup is an endangered Samoyedic language of the Uralic family. The INEL Selkup corpus is composed of texts from the archive of Angelina Ivanovna Kuzmina (1924–2002), who between 1962 and 1977 gathered a large amount of material on Selkup in almost all regions where the Selkup people lived. Most texts in the corpus originate from the handwritten part of the archive; the others come from sound recordings made by A. I. Kuzmina, which were transcribed and translated within the INEL project. Each text in the corpus is provided with morphological glossing, translations into English, Russian and German, and annotation of borrowings. Some texts also have annotations for syntactic structure, semantic roles and information status.
Language: Selkup
License: CC BY-NC-SA 4.0 (public)
Covert translation: Business Communication (old)
Translation corpora of original texts with translations and comparable texts from the genre of external business communication.
Language: German, English
License: HZSK-ACA (academic)
Covert translation: popular science
Translation corpora of original texts with translations and comparable texts from the genre of popular scientific prose.
Language: German, English
License: HZSK-ACA (academic)
Covert translation: Business Communication (new)
Translation corpora of original texts with translations and comparable texts from the genre of external business communication.
Language: German, English
License: HZSK-ACA (academic)
Hamburg Corpus of Old Swedish with Syntactic Annotations (HaCOSSA)
Religious and secular prose, law texts, non-fiction literature (geographical, theological, historical, natural science), and diplomas.
Language: English, German, Latin, Old Swedish, Swedish
License: FID-AKA (restricted)