Linguistic Corpora at the HZSK Repository

The digital repository of the Hamburger Zentrum für Sprachkorpora stores and disseminates linguistic resources and tools. Further information can be found here:

Keyword

9EXMARaLDA
9L1 data
9L1-Daten
5Erwerb der Wissenschaftssprache
5L2 data
...
Searched: L1-Daten
X
Hits: 9
http://hdl.handle.net/11022/0000-0007-CAE6-2
general corpus / spoken / flk: folklore texts

INEL Kamas Corpus 0.1

Kamas is an extinct Samoyedic language (Uralic family). The INEL Kamas corpus comprises folklore texts collected by Kai Donner in 1912–1914, before the language shift, and transcribed audio recordings of the last speaker, Klavdiya Plotnikova made between 1964 and 1970. Each text in the corpus is provided with morphological glossing, translation into English, Russian and German, annotation of borrowings. Some texts also have annotations for syntactic structure, semantic roles and information status.

Language: Kamas

License: CC BY-NC-SA 4.0 (public)

Open lock icon indicates accessible resource
CLARIN icon indicates integration into CLARIN Eye icon indicates online browsable resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0007-E1D5-A
general corpus / spoken / conv: conversations

INEL Selkup Corpus

Selkup is an endangered Samoyedic language (Uralic family). The INEL Selkup corpus is composed of texts from the archive of Angelina Ivanovna Kuzmina (1924–2002), who gathered a large amount of material on Selkup in almost all regions where the Selkup people lived in 1962–1977. Most texts in the corpus originate from the handwritten part of the archive, the others come from sound recordings made by A.I. Kuzmina, transcribed and translated within the INEL project. Each text in the corpus is provided with morphological glossing, translation into English, Russian and German, annotation of borrowings. Some texts also have annotations for syntactic structure, semantic roles and information status.

Language: Selkup

License: CC BY-NC-SA 4.0 (public)

Open lock icon indicates accessible resource
CLARIN icon indicates integration into CLARIN Eye icon indicates online browsable resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0007-CAE5-3
general corpus / spoken / flk: folklore texts

INEL Selkup Corpus 0.1

Selkup is an endangered Southern Samoyedic language (Uralic family). The INEL Selkup corpus is composed of texts from the archive of Angelina Ivanovna Kuzmina (1924–2002), who gathered a large amount of material on Selkup in almost all regions where the Selkup people lived in 1962–1977. Most texts in the corpus originate from the handwritten part of the archive, the others come from sound recordings made by A.I. Kuzmina, transcribed and translated within the INEL project. Each text in the corpus is provided with morphological glossing, translation into English, Russian and German, annotation of borrowings. Some texts also have annotations for syntactic structure, semantic roles and information status.

Language: Selkup

License: CC BY-NC-SA 4.0 (public)

Open lock icon indicates accessible resource
CLARIN icon indicates integration into CLARIN Eye icon indicates online browsable resource Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0006-CD41-A
learner corpus / written / academic writing

Commented Learner Corpus Academic Writing

Authentic texts written by students of the University of Hamburg as part of their studies, the students have various L1 languages and study various subjects, all of the texts were subject of a writing counseling at the Writing Center Multilingualism (Schreibwerkstatt Mehrsprachigkeit), for some of the texts comments by peer tutors and several versions are available.

Language: German

License: HZSK-ACA (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource CLARIN icon indicates integration into CLARIN Download icon indicates downloads available for this resource
http://hdl.handle.net/11022/0000-0001-7DBA-2
general corpus / spoken / discourse

euroWiss - Linguistic Profiling of European Academic Education (Subcorpus 1)

Subcorpus 1 presents part of the euroWiss-Corpus covering communication in teaching/learning discourses in instruction at German and Italian universities, in the humanities as well as the technical and natural sciences; it offers access to transcriptions of lectures and seminars aligned with audio recordings and the text types used for instruction. The corpus comprises 18 Communications, 24 audio recordings, 24 transcriptions, 140,000 transcribed words, 19 identified speakers, 18 students' notes, 2 lecture scripts, 24 chalkboard presentions, 2 powerpoint presentations, 3 overhead slides, 3 handouts, 14 schedules/descriptions of recorded lecture/seminar

Language: German, Italian

License: HZSK-ACA (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource CLARIN icon indicates integration into CLARIN Eye icon indicates online browsable resource
http://hdl.handle.net/11022/0000-0001-B735-5
learner corpus / written / academic writing

Commented Learner Corpus Academic Writing (KoLaS 1.1)

Authentic texts written by students of the University of Hamburg as part of their studies, the students have various L1 languages and study various subjects, all of the texts were subject of a writing counseling at the Writing Center Multilingualism (Schreibwerkstatt Mehrsprachigkeit), for some of the texts comments by peer tutors and several versions are available.

Language: German

License: HZSK-ACA (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource
http://hdl.handle.net/11022/0000-0001-B734-6
learner corpus / written / academic writing

Commented Learner Corpus Academic Writing (KoLaS 1.0)

Authentic texts written by students of the University of Hamburg as part of their studies, the students have various L1 languages and study various subjects, all of the texts were subject of a writing counseling at the Writing Center Multilingualism (Schreibwerkstatt Mehrsprachigkeit), for some of the texts comments by peer tutors and several versions are available.

Language: German

License: HZSK-ACA (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource
http://hdl.handle.net/11022/0000-0001-B732-8
learner corpus / written / academic writing

Commented Learner Corpus Academic Writing (KoLaS 2.0)

Authentic texts written by students of the University of Hamburg as part of their studies, the students have various L1 languages and study various subjects, all of the texts were subject of a writing counseling at the Writing Center Multilingualism (Schreibwerkstatt Mehrsprachigkeit), for some of the texts comments by peer tutors and several versions are available.

Language: German

License: HZSK-ACA (academic)

Closed lock icon indicates restricted resource
SSO icon indicates single sign-on resource
http://hdl.handle.net/11022/0000-0000-6ECE-E
general corpus / spoken / discourse

Phonologie-Erwerb Deutsch-Spanisch als Erste Sprachen (PEDSES)

Audio recordings of three German/Spanish simultaneous bilingual children starting at approx. 1 year and ending at 2 or 3 years. There are 20-50 recording sessions (interviewer/child interaction) per child, half of them conducted in German and half in Spanish.

Language: German, Spanish

License: HZSK-RES (restricted)

Closed lock icon indicates restricted resource
CLARIN icon indicates integration into CLARIN Eye icon indicates online browsable resource