Linguistic Corpora at the HZSK Repository
B4 Tatian Corpus of Deviating Examples 2.1
The present corpus, the Tatian Corpus of Deviating Examples (T-CODEX 2.1), provides morphosyntactic and information-structural annotation of parts of the Old High German translation attested in MS St. Gallen Cod. 56, traditionally called the OHG Tatian, one of the largest prose texts from the classical OHG period. The corpus was designed and annotated by Project B4 of the Collaborative Research Center on Information Structure at Humboldt University Berlin. It compiles ca. 2,000 deviating examples found in the text portions of the scribes α, β, γ and ε. Each clause is stored as a separate file, annotated with the annotation tool EXMARaLDA and searchable via ANNIS, a general-purpose tool for the publication, visualisation and querying of linguistic data collections, developed by Project D1 of the Collaborative Research Center on Information Structure at Potsdam University.
Language: Latin, Old High German
License: Creative Commons Attribution 3.0 Unported License (public)
B4 Heliand
Heliand 1, 4 and 5: complete text, status: final, digitization, translation into Modern German, manually annotated with parts of speech, syntactic categories, grammatical functions, clause status, number of syllables (per constituent), alliteration, information status, topic/comment, position of phrase in sentence, definiteness, focus/background, focus marker, comments on context, source (bibliography).
Language: Old Saxon
License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)
B4 Muspilli
Complete text, status: work in progress, digitization, translation into English, manually annotated with parts of speech, syntactic category, grammatical function, clause status, number of syllables (per constituent), information status, topic/comment, position of constituent in sentence, definiteness, focus/background, focus marker, comments, source (bibliography).
Language: Old High German
License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)
B4 Otfrid
The Old German reference corpus contains (annotated) data from the oldest written records of German, from the beginning of continuous written transmission around 750 until 1050, comprising approx. 650,000 text words.
Language: Old High German
License: Creative Commons Attribution 3.0 Unported License (academic)
B4 Sächsische Weltchronik
The corpus contains a 13th-century chronicle in Middle Low German.
Language: Middle Low German
License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)