Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish

Publication: Contribution to book/anthology/report; Conference contribution in proceedings; Research; Peer-reviewed

Standard

Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish. / Basirat, Ali; Tang, Marc.

ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence. ed. / Ana Paula Rocha; Jaap van den Herik. SCITEPRESS (Science and Technology Publications, Lda.), 2018. pp. 663-674 (ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence, Vol. 2).

Harvard

Basirat, A & Tang, M 2018, Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish. in AP Rocha & J van den Herik (eds), ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence. SCITEPRESS (Science and Technology Publications, Lda.), ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence, vol. 2, pp. 663-674, 10th International Conference on Agents and Artificial Intelligence, ICAART 2018, Funchal, Madeira, Portugal, 16/01/2018. https://doi.org/10.5220/0006729606630674

APA

Basirat, A., & Tang, M. (2018). Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish. In A. P. Rocha, & J. van den Herik (Eds.), ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence (pp. 663-674). SCITEPRESS (Science and Technology Publications, Lda.). ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence Vol. 2 https://doi.org/10.5220/0006729606630674

Vancouver

Basirat A, Tang M. Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish. In Rocha AP, van den Herik J, editors, ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence. SCITEPRESS (Science and Technology Publications, Lda.). 2018. p. 663-674. (ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence, Vol. 2). https://doi.org/10.5220/0006729606630674

Author

Basirat, Ali ; Tang, Marc. / Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish. ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence. ed. / Ana Paula Rocha ; Jaap van den Herik. SCITEPRESS (Science and Technology Publications, Lda.), 2018. pp. 663-674 (ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence, Vol. 2).

Bibtex

@inproceedings{a8c826436a9248d4a38b2849e93e3be6,
title = "Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish",
abstract = "We apply real-valued word vectors combined with two different types of classifiers (linear discriminant analysis and feed-forward neural network) to scrutinize whether basic nominal categories can be captured by simple word embedding models. We also provide a linguistic analysis of the errors generated by the classifiers. The targeted language is Swedish, in which we investigate three nominal aspects: uter/neuter, common/proper, and count/mass. They represent respectively grammatical, semantic, and mixed types of nominal classification within languages. Our results show that word embeddings can capture typical grammatical and semantic features such as uter/neuter and common/proper nouns. Nevertheless, the model encounters difficulties to identify classes such as count/mass which not only combine both grammatical and semantic properties, but are also subject to conversion and shift. Hence, we answer the call of the Special Session on Natural Language Processing in Artificial Intelligence by approaching the topic of interfaces between morphology, lexicon, semantics, and syntax via interdisciplinary methods combining machine learning of language and general linguistics.",
keywords = "Neural Network, Nominal Classification, Swedish, Word Embedding",
author = "Ali Basirat and Marc Tang",
note = "Publisher Copyright: Copyright {\textcopyright} 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved; 10th International Conference on Agents and Artificial Intelligence, ICAART 2018 ; Conference date: 16-01-2018 Through 18-01-2018",
year = "2018",
doi = "10.5220/0006729606630674",
language = "English",
series = "ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence",
pages = "663--674",
editor = "Rocha, {Ana Paula} and {van den Herik}, Jaap",
booktitle = "ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence",
publisher = "SCITEPRESS (Science and Technology Publications, Lda.)",

}

RIS

TY - GEN

T1 - Lexical and morpho-syntactic features in word embeddings a case study of nouns in Swedish

AU - Basirat, Ali

AU - Tang, Marc

N1 - Publisher Copyright: Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved

PY - 2018

Y1 - 2018

N2 - We apply real-valued word vectors combined with two different types of classifiers (linear discriminant analysis and feed-forward neural network) to scrutinize whether basic nominal categories can be captured by simple word embedding models. We also provide a linguistic analysis of the errors generated by the classifiers. The targeted language is Swedish, in which we investigate three nominal aspects: uter/neuter, common/proper, and count/mass. They represent respectively grammatical, semantic, and mixed types of nominal classification within languages. Our results show that word embeddings can capture typical grammatical and semantic features such as uter/neuter and common/proper nouns. Nevertheless, the model encounters difficulties to identify classes such as count/mass which not only combine both grammatical and semantic properties, but are also subject to conversion and shift. Hence, we answer the call of the Special Session on Natural Language Processing in Artificial Intelligence by approaching the topic of interfaces between morphology, lexicon, semantics, and syntax via interdisciplinary methods combining machine learning of language and general linguistics.

AB - We apply real-valued word vectors combined with two different types of classifiers (linear discriminant analysis and feed-forward neural network) to scrutinize whether basic nominal categories can be captured by simple word embedding models. We also provide a linguistic analysis of the errors generated by the classifiers. The targeted language is Swedish, in which we investigate three nominal aspects: uter/neuter, common/proper, and count/mass. They represent respectively grammatical, semantic, and mixed types of nominal classification within languages. Our results show that word embeddings can capture typical grammatical and semantic features such as uter/neuter and common/proper nouns. Nevertheless, the model encounters difficulties to identify classes such as count/mass which not only combine both grammatical and semantic properties, but are also subject to conversion and shift. Hence, we answer the call of the Special Session on Natural Language Processing in Artificial Intelligence by approaching the topic of interfaces between morphology, lexicon, semantics, and syntax via interdisciplinary methods combining machine learning of language and general linguistics.

KW - Neural Network

KW - Nominal Classification

KW - Swedish

KW - Word Embedding

UR - http://www.scopus.com/inward/record.url?scp=85046649696&partnerID=8YFLogxK

U2 - 10.5220/0006729606630674

DO - 10.5220/0006729606630674

M3 - Article in proceedings

AN - SCOPUS:85046649696

T3 - ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence

SP - 663

EP - 674

BT - ICAART 2018 - Proceedings of the 10th International Conference on Agents and Artificial Intelligence

A2 - Rocha, Ana Paula

A2 - van den Herik, Jaap

PB - SCITEPRESS (Science and Technology Publications, Lda.)

T2 - 10th International Conference on Agents and Artificial Intelligence, ICAART 2018

Y2 - 16 January 2018 through 18 January 2018

ER -

ID: 366046241