Linguistic Corpora at the HZSK Repository

The digital repository of the Hamburger Zentrum für Sprachkorpora stores and disseminates linguistic resources and tools. Further information can be found here:

Hits: 7
http://hdl.handle.net/11022/0000-0000-9B1E-1
general corpus / written / religious text

B4 Tatian Corpus of Deviating Examples 2.1

The present corpus, the Tatian Corpus of Deviating Examples T-CODEX 2.1, provides morpho-syntactic and information structural annotation of parts of the Old High German translation attested in the MS St. Gallen Cod. 56, traditionally called the OHG Tatian, one of the largest prose texts from the classical OHG period. This corpus was designed and annotated by Project B4 of Collaborative Research Center on Information Structure at Humboldt University Berlin. The present corpus compiles ca. 2.000 deviating examples found in the text portions of the scribes α, β, γ and ε. Each clause structure represents an extra file annotated with the annotation tool EXMARaLDA and searchable via ANNIS, a general-purpose tool for the publication, visualisation and querying of linguistic data collections, developed by Project D1 of the Collaborative Research Center on Information Structure at Potsdam University.

Language: Latin, Old High German

License: Creative Commons Attribution 3.0 Unported License (public)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0000-9B24-9
general corpus / written / historic manuscript

B4 Heliand

Heliand 1, 4 and 5: complete text, status: final, digitalization, translation to Modern German, manually annotated with parts of speech, syntactic categories, grammatical functions, clause status, numbers of syllables (per constituent), alliteration, information status, topic/comment, position of phrase in sentence, definiteness, focus/background, focus-marker, comments on context, source (bibliography).

Language: Old High German

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0000-9B22-B
general corpus / written / historic manuscript

B4 Ludolf

The texts of this corpus, Ludolf von Sudheims Reise ins Heilige Land (Ludolf of Sudheim’s Journey to the Holy Land), is a journey diary describing the adventures of a group of pilgrims, written in Middle Low German and dated back to 1350. For information on the properties of the text, including the manuscripts, see Blust-Thiele (1985). This corpus uses the text edition by Stapelmohr (1937). The first 20 pages of it are tagged for clause type and grammatical function. The corpus includes 6,690 tokens.

Language: German Middle Low

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0000-9B21-C
general corpus / written / historic manuscript

B4 Muspilli

Complete text, status: work in progress, digitalization, translation to English, manually annotated with parts of speech, syntactic category, grammatical function, clause status, numbers of syllables (per constituent), information status, topic/comment, position of constituent in sentence, definiteness, focus/background, focus marker, comments, source (bibliography).

Language: Old High German

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0000-9B20-D
general corpus / written / historic manuscript

B4 Otfrid

The reference corpus Old German contains (annotated) data from the oldest language monuments of German before the continuous written transduction around 750 until 1050 with approx. 650,000 text words.

Language: Old High German

License: Creative Commons Attribution 3.0 Unported License (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0000-9B1F-0
general corpus / written / historic manuscript

B4 Sächsische Weltchronik

The corpus contains a chronic from the 13th century in Middle Low German.

Language: Old High German

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0000-9B23-A
general corpus / written / historic manuscript

B4 Historisches Predigtenkorpus zum Nachfeld

HIPKON is the first corpus based on only one text type (sermons) and on one dialect area, Upper German (Bavarian-Alemannic). The sermons cover the time from Middle High German to the beginning of the New High German period. They were accurately selected so that each of them is representative of one century. Among others, syntax, information structure and discourse structure were annotated in the corpus.

Language: New High German

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource Download icon indicates downloads available for this resource