Standard
The Danish Gigaword Corpus. / Strømberg-Derczynski, Leon; Ciosici, Manuel Rafael; Christiansen, Morten H.; Baglini, Rebekah Brita; Dalsgaard, Jacob Aarup; Fusaroli, Riccardo; Henrichsen, Peter Juel; Hvingelby, Rasmus; Kirkedal, Andreas; Kjeldsen, Alex Speed; Ladefoged, Claus; Nielsen, Finn Arup; Madsen, Jens; Petersen, Malte Lau; Rystrøm, Jonathan Hvithamar; Varab, Daniel.
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa). Linköping University Electronic Press, 2021. pp. 413-421.
Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed
Harvard
Strømberg-Derczynski, L, Ciosici, MR, Christiansen, MH, Baglini, RB, Dalsgaard, JA, Fusaroli, R, Henrichsen, PJ, Hvingelby, R, Kirkedal, A, Kjeldsen, AS, Ladefoged, C, Nielsen, FA, Madsen, J, Petersen, ML, Rystrøm, JH & Varab, D 2021, The Danish Gigaword Corpus. in Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa). Linköping University Electronic Press, pp. 413-421. <https://www.aclweb.org/anthology/2021.nodalida-main.46>
APA
Strømberg-Derczynski, L., Ciosici, M. R., Christiansen, M. H., Baglini, R. B., Dalsgaard, J. A., Fusaroli, R., Henrichsen, P. J., Hvingelby, R., Kirkedal, A., Kjeldsen, A. S., Ladefoged, C., Nielsen, F. A., Madsen, J., Petersen, M. L., Rystrøm, J. H., & Varab, D. (2021). The Danish Gigaword Corpus. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa) (pp. 413-421). Linköping University Electronic Press. https://www.aclweb.org/anthology/2021.nodalida-main.46
Vancouver
Strømberg-Derczynski L, Ciosici MR, Christiansen MH, Baglini RB, Dalsgaard JA, Fusaroli R et al. The Danish Gigaword Corpus. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa). Linköping University Electronic Press. 2021. p. 413-421
Author
Strømberg-Derczynski, Leon ; Ciosici, Manuel Rafael ; Christiansen, Morten H. ; Baglini, Rebekah Brita ; Dalsgaard, Jacob Aarup ; Fusaroli, Riccardo ; Henrichsen, Peter Juel ; Hvingelby, Rasmus ; Kirkedal, Andreas ; Kjeldsen, Alex Speed ; Ladefoged, Claus ; Nielsen, Finn Arup ; Madsen, Jens ; Petersen, Malte Lau ; Rystrøm, Jonathan Hvithamar ; Varab, Daniel. / The Danish Gigaword Corpus. Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa). Linköping University Electronic Press, 2021. pp. 413-421
Bibtex
@inproceedings{da3cde90ac0d4296b8da1a51f43c2351,
title = "The Danish Gigaword Corpus",
abstract = "Danish language technology has been hindered by a lack of broad-coverage corpora at the scale modern NLP prefers. This paper describes the Danish Gigaword Corpus, the result of a focused effort to provide a diverse and freely-available one billion word corpus of Danish text. The Danish Gigaword corpus covers a wide array of time periods, domains, speakers{\textquoteright} socio-economic status, and Danish dialects.",
author = "Leon Str{\o}mberg-Derczynski and Ciosici, {Manuel Rafael} and Christiansen, {Morten H.} and Baglini, {Rebekah Brita} and Dalsgaard, {Jacob Aarup} and Riccardo Fusaroli and Henrichsen, {Peter Juel} and Rasmus Hvingelby and Andreas Kirkedal and Kjeldsen, {Alex Speed} and Claus Ladefoged and Nielsen, {Finn Arup} and Jens Madsen and Petersen, {Malte Lau} and Rystr{\o}m, {Jonathan Hvithamar} and Daniel Varab",
year = "2021",
language = "English",
pages = "413--421",
booktitle = "Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)",
publisher = "Link{\"o}ping University Electronic Press",
}
RIS
TY - GEN
T1 - The Danish Gigaword Corpus
AU - Strømberg-Derczynski, Leon
AU - Ciosici, Manuel Rafael
AU - Christiansen, Morten H.
AU - Baglini, Rebekah Brita
AU - Dalsgaard, Jacob Aarup
AU - Fusaroli, Riccardo
AU - Henrichsen, Peter Juel
AU - Hvingelby, Rasmus
AU - Kirkedal, Andreas
AU - Kjeldsen, Alex Speed
AU - Ladefoged, Claus
AU - Nielsen, Finn Arup
AU - Madsen, Jens
AU - Petersen, Malte Lau
AU - Rystrøm, Jonathan Hvithamar
AU - Varab, Daniel
PY - 2021
Y1 - 2021
N2 - Danish language technology has been hindered by a lack of broad-coverage corpora at the scale modern NLP prefers. This paper describes the Danish Gigaword Corpus, the result of a focused effort to provide a diverse and freely-available one billion word corpus of Danish text. The Danish Gigaword corpus covers a wide array of time periods, domains, speakers’ socio-economic status, and Danish dialects.
AB - Danish language technology has been hindered by a lack of broad-coverage corpora at the scale modern NLP prefers. This paper describes the Danish Gigaword Corpus, the result of a focused effort to provide a diverse and freely-available one billion word corpus of Danish text. The Danish Gigaword corpus covers a wide array of time periods, domains, speakers’ socio-economic status, and Danish dialects.
M3 - Article in proceedings
SP - 413
EP - 421
BT - Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
PB - Linköping University Electronic Press
ER -