The Danish Gigaword Project

Publication: Working paper, Research

Standard

The Danish Gigaword Project. / Strømberg-Derczynski, Leon; Baglini, Rebekah; Christiansen, Morten H.; Ciosici, Manuel R.; Dalsgaard, Jacob Aarup; Fusaroli, Riccardo; Henrichsen, Peter Juel; Hvingelby, Rasmus; Kirkedal, Andreas; Kjeldsen, Alex Speed; Ladefoged, Claus; Nielsen, Finn Årup; Petersen, Malte Lau; Rystrøm, Jonathan Hvithamar; Varab, Daniel.

2020.

Harvard

Strømberg-Derczynski, L, Baglini, R, Christiansen, MH, Ciosici, MR, Dalsgaard, JA, Fusaroli, R, Henrichsen, PJ, Hvingelby, R, Kirkedal, A, Kjeldsen, AS, Ladefoged, C, Nielsen, FÅ, Petersen, ML, Rystrøm, JH & Varab, D 2020 'The Danish Gigaword Project'. <https://arxiv.org/pdf/2005.03521.pdf>

APA

Strømberg-Derczynski, L., Baglini, R., Christiansen, M. H., Ciosici, M. R., Dalsgaard, J. A., Fusaroli, R., Henrichsen, P. J., Hvingelby, R., Kirkedal, A., Kjeldsen, A. S., Ladefoged, C., Nielsen, F. Å., Petersen, M. L., Rystrøm, J. H., & Varab, D. (2020). The Danish Gigaword Project. https://arxiv.org/pdf/2005.03521.pdf

Vancouver

Strømberg-Derczynski L, Baglini R, Christiansen MH, Ciosici MR, Dalsgaard JA, Fusaroli R et al. The Danish Gigaword Project. 2020.

Author

Strømberg-Derczynski, Leon ; Baglini, Rebekah ; Christiansen, Morten H. ; Ciosici, Manuel R. ; Dalsgaard, Jacob Aarup ; Fusaroli, Riccardo ; Henrichsen, Peter Juel ; Hvingelby, Rasmus ; Kirkedal, Andreas ; Kjeldsen, Alex Speed ; Ladefoged, Claus ; Nielsen, Finn Årup ; Petersen, Malte Lau ; Rystrøm, Jonathan Hvithamar ; Varab, Daniel. / The Danish Gigaword Project. 2020.

Bibtex

@techreport{a0eeecbd71d344b7b2cb0b5b850010a9,
title = "The Danish Gigaword Project",
abstract = "Danish is a North Germanic/Scandinavian language spoken primarily in Denmark, a country with a tradition of technological and scientific innovation. However, from a technological perspective, the Danish language has received relatively little attention and, as a result, Danish language technology is hard to develop, in part due to a lack of large or broad-coverage Danish corpora. This paper describes the Danish Gigaword project, which aims to construct a freely-available one billion word corpus of Danish text that represents the breadth of the written language.",
author = "Leon Str{\o}mberg-Derczynski and Rebekah Baglini and Christiansen, {Morten H.} and Ciosici, {Manuel R.} and Dalsgaard, {Jacob Aarup} and Riccardo Fusaroli and Henrichsen, {Peter Juel} and Rasmus Hvingelby and Andreas Kirkedal and Kjeldsen, {Alex Speed} and Claus Ladefoged and Nielsen, {Finn {\AA}rup} and Petersen, {Malte Lau} and Rystr{\o}m, {Jonathan Hvithamar} and Daniel Varab",
year = "2020",
language = "Danish",
type = "WorkingPaper",

}

RIS

TY  - UNPB
T1  - The Danish Gigaword Project
AU  - Strømberg-Derczynski, Leon
AU  - Baglini, Rebekah
AU  - Christiansen, Morten H.
AU  - Ciosici, Manuel R.
AU  - Dalsgaard, Jacob Aarup
AU  - Fusaroli, Riccardo
AU  - Henrichsen, Peter Juel
AU  - Hvingelby, Rasmus
AU  - Kirkedal, Andreas
AU  - Kjeldsen, Alex Speed
AU  - Ladefoged, Claus
AU  - Nielsen, Finn Årup
AU  - Petersen, Malte Lau
AU  - Rystrøm, Jonathan Hvithamar
AU  - Varab, Daniel
PY  - 2020
Y1  - 2020
N2  - Danish is a North Germanic/Scandinavian language spoken primarily in Denmark, a country with a tradition of technological and scientific innovation. However, from a technological perspective, the Danish language has received relatively little attention and, as a result, Danish language technology is hard to develop, in part due to a lack of large or broad-coverage Danish corpora. This paper describes the Danish Gigaword project, which aims to construct a freely-available one billion word corpus of Danish text that represents the breadth of the written language.
AB  - Danish is a North Germanic/Scandinavian language spoken primarily in Denmark, a country with a tradition of technological and scientific innovation. However, from a technological perspective, the Danish language has received relatively little attention and, as a result, Danish language technology is hard to develop, in part due to a lack of large or broad-coverage Danish corpora. This paper describes the Danish Gigaword project, which aims to construct a freely-available one billion word corpus of Danish text that represents the breadth of the written language.
UR  - https://arxiv.org/abs/2005.03521
M3  - Working paper
BT  - The Danish Gigaword Project
ER  -

ID: 240835080