Towards a Gold Standard for Evaluating Danish Word Embeddings
Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed
Documents
- 2020.lrec-1.585
Publisher's published version, 310 KB, PDF document
This paper presents the process of compiling a model-agnostic similarity gold standard for evaluating Danish word embeddings based
on human judgments made by 42 native speakers of Danish. Word embeddings resemble semantic similarity solely by distribution
(meaning that word vectors do not reflect relatedness as differing from similarity), and we argue that this generalisation poses a problem
in most intrinsic evaluation scenarios. In order to be able to evaluate on both dimensions, our human-generated dataset is therefore
designed to reflect the distinction between relatedness and similarity. The gold standard is applied to evaluate the "goodness" of
six existing word embedding models for Danish, and it is discussed how a relatively low correlation can be explained by the fact that
semantic similarity is substantially more challenging to model than relatedness, and that there seems to be a need for future human
judgements to measure similarity in full context and along more than a single spectrum.
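As a rough illustration of the kind of intrinsic evaluation the abstract describes, the sketch below correlates model similarity scores with human ratings for a list of word pairs. It assumes cosine similarity as the model score and Spearman rank correlation as the agreement measure; the abstract does not specify either, so both are assumptions. The function names, the toy vectors, and the toy ratings are hypothetical and are not taken from the dataset or the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(embeddings, gold):
    """Correlate model similarities with human ratings.

    embeddings: dict mapping word -> vector (hypothetical format).
    gold: dict mapping (word1, word2) -> mean human rating (hypothetical format).
    Pairs with an out-of-vocabulary word are skipped.
    """
    model_scores, human_scores = [], []
    for (w1, w2), rating in gold.items():
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Toy data for illustration only; not taken from the gold standard.
toy_embeddings = {
    "hund": [1.0, 0.0],
    "kat": [0.9, 0.1],
    "bil": [0.0, 1.0],
    "vej": [0.2, 0.9],
}
toy_gold = {
    ("hund", "kat"): 4.0,
    ("bil", "vej"): 3.0,
    ("kat", "vej"): 2.0,
    ("hund", "bil"): 1.0,
}

toy_correlation = evaluate(toy_embeddings, toy_gold)
```

In this toy setup the model and the human ratings order the four pairs identically, so `toy_correlation` equals 1 up to floating-point rounding. A gold standard that separates similarity from relatedness would be evaluated the same way, once per rating dimension.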
Original language | English |
---|---|
Title | Proceedings of the 12th Language Resources and Evaluation Conference |
Number of pages | 10 |
Place of publication | Marseille, France |
Publisher | European Language Resources Association |
Publication date | 2020 |
Pages | 4756-4765 |
ISBN (Electronic) | 9791095546344 |
Status | Published - 2020 |
Event | Language Resources and Evaluation Conference (LREC) 2020 - Marseille, France. Duration: 13 May 2020 → 15 May 2020 https://lrec2020.lrec-conf.org/en/ |
Conference
Conference | Language Resources and Evaluation Conference (LREC) 2020 |
---|---|
Location | Marseille |
Country | France |
City | Marseille |
Period | 13/05/2020 → 15/05/2020 |
Internet address | https://lrec2020.lrec-conf.org/en/ |
Links
- http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.585.pdf
Publisher's published version
ID: 241358594