Nganasan Spoken Language Corpus (NSLC)

PID: http://hdl.handle.net/11022/0000-0007-C6F2-8

This is the latest corpus version.
Corpus versions: 0.2, 0.1


Summary

Title (eng) Nganasan Spoken Language Corpus (NSLC)
Description The Nganasan Spoken Language Corpus (NSLC) was created as part of the project "Corpus based grammatical studies on Nganasan" (supported by the German Research Grant; WA3153/2-1). The corpus contains the same text samples in at least three languages: the original text in Nganasan with translations mostly into Russian and English, and sometimes also into German. The corpus contains 55 communications from 15 different speakers. The bulk of the language material to be integrated, glossed and annotated was collected by several researchers and is available in audio format. The transcription data and the corpus metadata are processed and stored in EXMARaLDA format.
Publication date 2018-06-12
Data owner Beáta Wagner-Nagy, Institut für Finnougristik/Uralistik / Überseering 35 / D-22297 Hamburg, beata.wagner-nagy@uni-hamburg.de
Keywords annotated, bilingual society, code-switching, indigenous language, information status, endangered language, folklore, language contact, language documentation, EXMARaLDA
Languages Nganasan (nio), Russian (rus)
Size 1962 min; 56 speakers (29 female, 26 male, 1 unknown)
Annotation types transcription (manual): Orthographic transcription. Partly based on archive materials without audio files.
tx: Tier for interlinearization
mb: Morpheme break
mp: Morphophonemes, underlying forms
gr: Morphological annotation: Russian gloss of each morpheme
ge: Morphological annotation: English gloss of each morpheme
mc: Part of speech of each morpheme
ps: Part of speech of each word
ser: Annotation of semantic roles
syf: Annotation of syntactic function
ist: Annotation of information status
ref: Name of the communication
st: Source texts: normally in Cyrillic transliteration
ts: Transcription (what is heard)
fr: Russian free translation
fe: English free translation
nt: Notes on the text unit
so: Source origin
cw: Annotation of code switching
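
The tier abbreviations above correspond to tier categories in the EXMARaLDA transcription files. As a minimal sketch, assuming the transcriptions follow the usual EXMARaLDA basic transcription layout (`tier` elements carrying a `category` attribute and containing `event` elements with `start` and `end` references), the events can be grouped by tier category as follows. The sample XML and the function name `events_by_category` are hypothetical and only illustrate the structure; they are not taken from the corpus.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily simplified sample; real NSLC files contain more
# tiers, a head section, and actual Nganasan data.
SAMPLE_EXB = """<basic-transcription>
  <basic-body>
    <common-timeline>
      <tli id="T0"/>
      <tli id="T1"/>
      <tli id="T2"/>
    </common-timeline>
    <tier id="tx" speaker="SPK0" category="tx" type="t">
      <event start="T0" end="T1">word1 </event>
      <event start="T1" end="T2">word2</event>
    </tier>
    <tier id="ps" speaker="SPK0" category="ps" type="a">
      <event start="T0" end="T1">n</event>
      <event start="T1" end="T2">v</event>
    </tier>
    <tier id="fe" speaker="SPK0" category="fe" type="a">
      <event start="T0" end="T2">Placeholder free translation.</event>
    </tier>
  </basic-body>
</basic-transcription>
"""


def events_by_category(exb_xml):
    """Map each tier category to a list of (start, end, text) tuples."""
    root = ET.fromstring(exb_xml)
    grouped = defaultdict(list)
    # iter("tier") finds tier elements at any depth in the document.
    for tier in root.iter("tier"):
        category = tier.get("category")
        for event in tier.findall("event"):
            text = (event.text or "").strip()
            grouped[category].append((event.get("start"), event.get("end"), text))
    return dict(grouped)


if __name__ == "__main__":
    for category, events in events_by_category(SAMPLE_EXB).items():
        print(category, events)
```

To try this on a real file, the file contents could be read as text and passed to `events_by_category`; the attribute names should be checked against the actual corpus files first.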
License CC BY-NC-SA 4.0 (public, non-commercial only)
PIDs  This corpus: http://hdl.handle.net/11022/0000-0007-C6F2-8

CMDI metadata: http://hdl.handle.net/11022/0000-0007-C9C5-8

Details

Header

Resources

Components

SpokenCorpusProfile

Name (eng): Nganasan Spoken Language Corpus

Title (eng): Nganasan Spoken Language Corpus (NSLC)

PID: http://hdl.handle.net/11022/0000-0007-C6F2-8

Description (eng): The Nganasan Spoken Language Corpus (NSLC) was created as part of the project "Corpus based grammatical studies on Nganasan" (supported by the German Research Grant; WA3153/2-1). The corpus contains the same text samples in at least three languages: the original text in Nganasan with translations mostly into Russian and English, and sometimes also into German. The corpus contains 55 communications from 15 different speakers. The bulk of the language material to be integrated, glossed and annotated was collected by several researchers and is available in audio format. The transcription data and the corpus metadata are processed and stored in EXMARaLDA format.

Keyword (eng): annotated

Keyword (eng): bilingual society

Keyword (eng): code-switching

Keyword (eng): indigenous language

Keyword (eng): information status

Keyword (eng): endangered language

Keyword (eng): folklore

Keyword (eng): language contact

Keyword (eng): language documentation

Keyword (eng): EXMARaLDA

ResourceClass: corpus

PublicationDate: 2018-06-12

LifeCycleStatus: released

LegalOwner: Beáta Wagner-Nagy, Institut für Finnougristik/Uralistik / Überseering 35 / D-22297 Hamburg, beata.wagner-nagy@uni-hamburg.de

Availability (eng): Free to use for research and teaching purposes.

DistributionType: public

LicenseName: CC BY-NC-SA 4.0

LicenseURL: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode

NonCommercialUsageOnly: true

COMA file for Nganasan Spoken Language Corpus (NSLC)
nslc.coma (text/x-exmaralda-coma+xml)

COMA overview for Nganasan Spoken Language Corpus (NSLC)
coma-overview.html (text/html)

NSLCGuidelines
NSLCGuidelines.pdf (application/pdf)

ZIP file (incl. MP3) for Nganasan Spoken Language Corpus (NSLC)
mp3-zip.zip (application/zip)

ZIP file for Nganasan Spoken Language Corpus (NSLC)
nslc.zip (application/zip)

Note: The actual corpus data can be found under Sessions.

Terms of use

By using Nganasan Spoken Language Corpus (NSLC), you agree:

  • to use the corpus for non-commercial research and teaching purposes only
  • to cite the following sources in any published work which is based on the corpus

Beáta Wagner-Nagy, Maria Brykina, Valentin Gusev, Sándor Szeverényi. 2018: "Nganasan Spoken Language Corpus (NSLC)". Archived in the Hamburger Zentrum für Sprachkorpora. Version 0.2. Publication date 2018-06-12. http://hdl.handle.net/11022/0000-0007-C6F2-8.