The Spoken Wikipedia Corpora

PID: http://hdl.handle.net/11022/0000-0007-C641-0

Summary

Title (en) The Spoken Wikipedia Corpora
Description The Spoken Wikipedia project unites volunteer readers of Wikipedia articles. Hundreds of spoken articles in multiple languages are available to users who are – for one reason or another – unable or unwilling to consume the written version of the article. Our resource, the Spoken Wikipedia Corpus, consolidates the Spoken Wikipediae, adding text segmentation, normalization, time-alignment and further annotations, making it accessible for research and fostering new ways of interacting with the material.
Publication date 2017
Data owner Timo Baumann - Universität Hamburg
Languages English (eng), German (deu), Dutch (nld)
License Creative Commons Attribution-ShareAlike 4.0 International (public)
PIDs  This corpus: http://hdl.handle.net/11022/0000-0007-C641-0

CMDI metadata: http://hdl.handle.net/11022/0000-0007-C640-1

Details

Header

Resources

Components

SpokenCorpusProfile

Name (en): swc

Title (en): The Spoken Wikipedia Corpora

PID: http://hdl.handle.net/11022/0000-0007-C641-0

Description (en): The Spoken Wikipedia project unites volunteer readers of Wikipedia articles. Hundreds of spoken articles in multiple languages are available to users who are – for one reason or another – unable or unwilling to consume the written version of the article. Our resource, the Spoken Wikipedia Corpus, consolidates the Spoken Wikipediae, adding text segmentation, normalization, time-alignment and further annotations, making it accessible for research and fostering new ways of interacting with the material.

ResourceClass: corpus

PublicationDate: 2017

LifeCycleStatus: released

LegalOwner: Timo Baumann - Universität Hamburg

Availability (en): Creative Commons Attribution-ShareAlike 4.0 International

DistributionType: public

LicenseName: Creative Commons Attribution-ShareAlike 4.0 International

LicenseURL: https://creativecommons.org/licenses/by-sa/4.0/

NonCommercialUsageOnly: false

UsageReportRequired: false

ModificationsRequireRedeposition: true

BibliographicCitations

Dutch spoken Wikipedia with audio (SWC 2.0)
nl-with-audio.tar (application/x-tar)

Dutch spoken Wikipedia (SWC 2.0)
nl.txz

English spoken Wikipedia with audio (SWC 2.0)
en-with-audio.tar (application/x-tar)

English spoken Wikipedia (SWC 2.0)
en.txz

German spoken Wikipedia with audio (SWC 2.0)
de-with-audio.tar (application/x-tar)

German spoken Wikipedia (SWC 2.0)
de.txz
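
The files listed above are tar archives; the .txz ones are tar archives compressed with xz. As a minimal sketch for inspecting a downloaded archive before unpacking it, the following Python snippet lists the first few member names using the standard-library tarfile module. The helper name list_members and the local path de.txz are assumptions made for illustration; the internal layout of the archives is not described on this page.

```python
import tarfile
from pathlib import Path


def list_members(archive_path, limit=10):
    """Return the names of up to `limit` members of a tar archive.

    Mode "r:*" lets tarfile detect the compression itself, so the same
    call covers plain .tar files and xz-compressed .txz files.
    """
    names = []
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive:
            if len(names) >= limit:
                break
            names.append(member.name)
    return names


if __name__ == "__main__":
    # Hypothetical local path: assumes de.txz was downloaded
    # into the current working directory.
    path = Path("de.txz")
    if path.exists():
        for name in list_members(path):
            print(name)
```

Iterating over the archive instead of calling getnames() stops reading as soon as the limit is reached, which may help with the larger with-audio archives.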

Terms of use

By using The Spoken Wikipedia Corpora, you agree:

  • to use the corpus for non-commercial research and teaching purposes only
  • to cite the following sources in any published work which is based on the corpus

2017: "The Spoken Wikipedia Corpora". Archived in the Hamburger Zentrum für Sprachkorpora. Version 2.0. Publication date 2017. http://hdl.handle.net/11022/0000-0007-C641-0.