Linguistic Corpora at the HZSK Repository

The digital repository of the Hamburger Zentrum für Sprachkorpora (HZSK) stores and disseminates linguistic resources and tools.

Hits: 11
http://hdl.handle.net/11022/0000-0001-7DBA-2
general corpus / spoken / discourse

euroWiss - Linguistic Profiling of European Academic Education (Subcorpus 1)

Subcorpus 1 presents part of the euroWiss corpus, covering teaching and learning discourse at German and Italian universities in the humanities as well as the technical and natural sciences. It offers access to transcriptions of lectures and seminars aligned with audio recordings, along with the text types used for instruction. The corpus comprises 18 communications, 24 audio recordings, 24 transcriptions, 140,000 transcribed words, 19 identified speakers, 18 students' notes, 2 lecture scripts, 24 chalkboard presentations, 2 PowerPoint presentations, 3 overhead slides, 3 handouts, and 14 schedules/descriptions of recorded lectures/seminars.

Language: German, Italian

License: HZSK-ACA (academic)

Access: restricted resource; single sign-on; integrated into CLARIN; browsable online
http://hdl.handle.net/11022/0000-0007-D0D3-F
general corpus / spoken / discourse

The Hamburg MapTask Corpus (HAMATAC)

Audio recordings of map tasks with adult L2 users of German. The speakers' L1 and their L2 proficiencies vary. The maps used for the tasks are available.

Language: German

License: HZSK-ACA (academic)

Access: restricted resource; single sign-on; integrated into CLARIN; browsable online
http://hdl.handle.net/11022/0000-0000-6973-9
general corpus / spoken / discourse

Hamburg Modern Times Corpus (HaMoTiC)

Audio recordings of a film retelling task with adult L2 users of German. The speakers' L1 and their L2 proficiencies vary. The corpus contains 24 communications plus 1 German reference communication, each lasting between 2 and 16 minutes. For each speaker, a language learner biography (audio and freely transcribed) is available.

Language: German

License: HZSK-ACA (academic)

Access: restricted resource; single sign-on; integrated into CLARIN; browsable online
http://hdl.handle.net/11022/0000-0000-82AD-A
unknown / spoken / discourse

A5 Hausa Umarnin Uwa

This corpus of Umarnin Uwa film transcripts contains 47 transcripts with a total of 10,194 tokens. It provides automatic POS tagging, speaker and extralinguistic information, and markup of foreign words and code-switching.

Language: Hausa

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available
http://hdl.handle.net/11022/0000-0000-9B1C-3
general corpus / spoken / discourse

B1 Aja

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.) in order to get a basic set of utterances for comparison between the languages dealt with in the project.

Language: Aja

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available
http://hdl.handle.net/11022/0000-0000-9B2C-1
general corpus / spoken / discourse

B1 Fon

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.) in order to get a basic set of utterances for comparison between the languages dealt with in the project.

Language: Fon

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available
http://hdl.handle.net/11022/0000-0000-9B2B-2
general corpus / spoken / discourse

B1 Foodo

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.) in order to get a basic set of utterances for comparison between the languages dealt with in the project.

Language: Foodo

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available
http://hdl.handle.net/11022/0000-0000-9B2A-3
general corpus / spoken / discourse

B1 Yom

The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.) in order to get a basic set of utterances for comparison between the languages dealt with in the project.

Language: Yom

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available
http://hdl.handle.net/11022/0000-0000-9B29-4
general corpus / spoken / discourse

B2 Bura

Full set: all focus-related experiments, status: work in progress, large parts elicited, most of the data transcribed, partly annotated.

Language: Bura

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available
http://hdl.handle.net/11022/0000-0000-9B26-7
general corpus / spoken / discourse

B2 Marghi

Full set: all focus-related experiments, status: work in progress, large parts elicited, most of the data transcribed, partly annotated.

Language: Marghi

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available
http://hdl.handle.net/11022/0000-0000-9B25-8
general corpus / spoken / discourse

B2 Tangale

Tangale sample, status: final, manually transcribed, glossed and translated into English, annotated in EXMARaLDA for morphology, parts of speech, syntax, grammatical function, semantic roles, focus and focus position (e.g. ex situ).

Language: Tangale

License: Creative Commons Attribution-NonCommercial 3.0 Unported License (academic)

Access: restricted resource; single sign-on; downloads available