Towards a Gold Standard for Evaluating Danish Word Embeddings

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › Peer-reviewed

Standard

Towards a Gold Standard for Evaluating Danish Word Embeddings. / Schneidermann, Nina; Hvingelby, Rasmus; Pedersen, Bolette Sandford.

Proceedings of the 12th Language Resources and Evaluation Conference. Marseille, France : European Language Resources Association, 2020. pp. 4756-4765.

Harvard

Schneidermann, N, Hvingelby, R & Pedersen, BS 2020, Towards a Gold Standard for Evaluating Danish Word Embeddings. in Proceedings of the 12th Language Resources and Evaluation Conference. European Language Resources Association, Marseille, France, pp. 4756-4765, Language Resources and Evaluation Conference (LREC) 2020, Marseille, France, 13/05/2020. <http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.585.pdf>

APA

Schneidermann, N., Hvingelby, R., & Pedersen, B. S. (2020). Towards a Gold Standard for Evaluating Danish Word Embeddings. In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 4756-4765). European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.585.pdf

Vancouver

Schneidermann N, Hvingelby R, Pedersen BS. Towards a Gold Standard for Evaluating Danish Word Embeddings. In Proceedings of the 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association. 2020. p. 4756-4765

Author

Schneidermann, Nina ; Hvingelby, Rasmus ; Pedersen, Bolette Sandford. / Towards a Gold Standard for Evaluating Danish Word Embeddings. Proceedings of the 12th Language Resources and Evaluation Conference. Marseille, France : European Language Resources Association, 2020. pp. 4756-4765

Bibtex

@inproceedings{5e64e09100fb473b9ef3049783f418a9,
title = "Towards a Gold Standard for Evaluating Danish Word Embeddings",
abstract = "This paper presents the process of compiling a model-agnostic similarity gold standard for evaluating Danish word embeddings based on human judgments made by 42 native speakers of Danish. Word embeddings resemble semantic similarity solely by distribution (meaning that word vectors do not reflect relatedness as differing from similarity), and we argue that this generalisation poses a problem in most intrinsic evaluation scenarios. In order to be able to evaluate on both dimensions, our human-generated dataset is therefore designed to reflect the distinction between relatedness and similarity. The gold standard is applied for evaluating the {"}goodness{"} of six existing word embedding models for Danish, and it is discussed how a relatively low correlation can be explained by the fact that semantic similarity is substantially more challenging to model than relatedness, and that there seems to be a need for future human judgements to measure similarity in full context and along more than a single spectrum.",
author = "Nina Schneidermann and Rasmus Hvingelby and Pedersen, {Bolette Sandford}",
year = "2020",
language = "English",
pages = "4756--4765",
booktitle = "Proceedings of the 12th Language Resources and Evaluation Conference",
publisher = "European Language Resources Association",
note = "Conference date: 13-05-2020 through 15-05-2020",
url = "https://lrec2020.lrec-conf.org/en/",

}

RIS

TY - GEN

T1 - Towards a Gold Standard for Evaluating Danish Word Embeddings

AU - Schneidermann, Nina

AU - Hvingelby, Rasmus

AU - Pedersen, Bolette Sandford

PY - 2020

Y1 - 2020

N2 - This paper presents the process of compiling a model-agnostic similarity gold standard for evaluating Danish word embeddings based on human judgments made by 42 native speakers of Danish. Word embeddings resemble semantic similarity solely by distribution (meaning that word vectors do not reflect relatedness as differing from similarity), and we argue that this generalisation poses a problem in most intrinsic evaluation scenarios. In order to be able to evaluate on both dimensions, our human-generated dataset is therefore designed to reflect the distinction between relatedness and similarity. The gold standard is applied for evaluating the "goodness" of six existing word embedding models for Danish, and it is discussed how a relatively low correlation can be explained by the fact that semantic similarity is substantially more challenging to model than relatedness, and that there seems to be a need for future human judgements to measure similarity in full context and along more than a single spectrum.

AB - This paper presents the process of compiling a model-agnostic similarity gold standard for evaluating Danish word embeddings based on human judgments made by 42 native speakers of Danish. Word embeddings resemble semantic similarity solely by distribution (meaning that word vectors do not reflect relatedness as differing from similarity), and we argue that this generalisation poses a problem in most intrinsic evaluation scenarios. In order to be able to evaluate on both dimensions, our human-generated dataset is therefore designed to reflect the distinction between relatedness and similarity. The gold standard is applied for evaluating the "goodness" of six existing word embedding models for Danish, and it is discussed how a relatively low correlation can be explained by the fact that semantic similarity is substantially more challenging to model than relatedness, and that there seems to be a need for future human judgements to measure similarity in full context and along more than a single spectrum.

M3 - Article in proceedings

SP - 4756

EP - 4765

BT - Proceedings of the 12th Language Resources and Evaluation Conference

PB - European Language Resources Association

CY - Marseille, France

Y2 - 13 May 2020 through 15 May 2020

ER -

ID: 241358594