Building Sense Representations in Danish by Combining Word Embeddings with Lexical Resources

Publication: Contribution to book/anthology/report > Conference abstract in proceedings > Research > peer-reviewed

Our aim is to identify suitable sense representations for NLP in Danish. We investigate sense inventories that correlate with human interpretations of word meaning and ambiguity, as typically described in dictionaries and wordnets, and that are well reflected distributionally, as expressed in word embeddings. To this end, we study a number of highly ambiguous Danish nouns and examine the effectiveness of sense representations constructed by combining vectors from a distributional model with the information from a wordnet. We establish representations based on centroids obtained from wordnet synsets and example sentences, as well as representations established via a clustering approach; these representations are tested in a word sense disambiguation task. We conclude that the more information is extracted from the wordnet entries (example sentence, definition, semantic relations), the more successful the sense representation vector is.
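
The centroid idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embedding values, the English stand-in tokens, the sense labels, and the helper names `centroid`, `cosine`, and `disambiguate` are all hypothetical. Each sense vector is assumed to be the plain average of the embeddings of words attached to that sense (for example synset members or example-sentence words), and a context is assigned to the sense whose vector has the highest cosine similarity to the context centroid.

```python
import math

# Toy 3-dimensional embeddings; every value here is invented for illustration.
EMB = {
    "money": [1.0, 0.1, 0.0],
    "loan":  [0.9, 0.2, 0.1],
    "pay":   [0.8, 0.0, 0.1],
    "river": [0.0, 1.0, 0.1],
    "shore": [0.1, 0.9, 0.2],
    "water": [0.0, 0.8, 0.3],
}

# Hypothetical sense inventory: each sense lists words taken from its
# (imagined) wordnet entry, such as synset members or example-sentence words.
SENSES = {
    "bank_finance": ["money", "loan", "pay"],
    "bank_river":   ["river", "shore", "water"],
}


def centroid(words, emb):
    """Average the embeddings of all words that have a vector."""
    vecs = [emb[w] for w in words if w in emb]
    if not vecs:
        raise ValueError("none of the words has an embedding")
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def disambiguate(context_words, sense_vecs, emb):
    """Return the sense whose centroid is closest to the context centroid."""
    ctx = centroid(context_words, emb)
    return max(sense_vecs, key=lambda s: cosine(ctx, sense_vecs[s]))


# One centroid per sense, built from the words in its entry.
sense_vecs = {sense: centroid(words, EMB) for sense, words in SENSES.items()}

print(disambiguate(["loan", "money"], sense_vecs, EMB))
print(disambiguate(["water", "shore"], sense_vecs, EMB))
```

Under this sketch, adding more words from a sense's entry (definition, example sentence, related synsets) simply enlarges the word list that feeds `centroid`, which is one simple way to read the abstract's point about using more wordnet information.
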
Original language: English
Title: Globalex Workshop on Linked Lexicography : LREC 2020 Workshop Language Resources and Evaluation Conference
Number of pages: 7
Place of publication: Marseille, France
Publisher: European Language Resources Association
Publication date: 2020
Pages: 45-52
ISBN (Electronic): 979-10-95546-46-7
Status: Published - 2020

ID: 241359613