Lexical and morpho-syntactic features in word embeddings: a case study of nouns in Swedish
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Lexical and morpho-syntactic features in word embeddings: a case study of nouns in Swedish. / Basirat, Ali; Tang, Marc.
ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence. ed. / Ana Paula Rocha; Jaap van den Herik. SCITEPRESS (Science and Technology Publications, Lda.), 2018. p. 663-674 (ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence, Vol. 2).
RIS
TY - GEN
T1 - Lexical and morpho-syntactic features in word embeddings: a case study of nouns in Swedish
AU - Basirat, Ali
AU - Tang, Marc
N1 - Publisher Copyright: Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
PY - 2018
Y1 - 2018
N2 - We apply real-valued word vectors combined with two types of classifiers (linear discriminant analysis and a feed-forward neural network) to examine whether basic nominal categories can be captured by simple word embedding models. We also provide a linguistic analysis of the errors generated by the classifiers. The target language is Swedish, in which we investigate three nominal distinctions: uter/neuter, common/proper, and count/mass. These represent, respectively, grammatical, semantic, and mixed types of nominal classification within languages. Our results show that word embeddings can capture typical grammatical and semantic features such as the uter/neuter and common/proper distinctions. Nevertheless, the model has difficulty identifying classes such as count/mass, which not only combine grammatical and semantic properties but are also subject to conversion and shift. Hence, we answer the call of the Special Session on Natural Language Processing in Artificial Intelligence by approaching the topic of interfaces between morphology, lexicon, semantics, and syntax via interdisciplinary methods combining machine learning of language and general linguistics.
KW - Neural Network
KW - Nominal Classification
KW - Swedish
KW - Word Embedding
UR - http://www.scopus.com/inward/record.url?scp=85046649696&partnerID=8YFLogxK
U2 - 10.5220/0006729606630674
DO - 10.5220/0006729606630674
M3 - Article in proceedings
AN - SCOPUS:85046649696
T3 - ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence
SP - 663
EP - 674
BT - ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence
A2 - Rocha, Ana Paula
A2 - van den Herik, Jaap
PB - SCITEPRESS (Science and Technology Publications, Lda.)
T2 - 10th International Conference on Agents and Artificial Intelligence, ICAART 2018
Y2 - 16 January 2018 through 18 January 2018
ER -
ID: 366046241