Standard
Linguistic information in word embeddings. / Basirat, Ali; Tang, Marc.
Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers. ed. / Jaap van den Herik; Ana Paula Rocha. Cham : Springer Verlag, 2019. p. 492-513 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11352 LNAI).
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Harvard
Basirat, A & Tang, M 2019,
Linguistic information in word embeddings. in J van den Herik & AP Rocha (eds),
Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers. Springer Verlag, Cham, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11352 LNAI, pp. 492-513, 10th International Conference on Agents and Artificial Intelligence, ICAART 2018, Funchal, Madeira, Portugal,
16/01/2018.
https://doi.org/10.1007/978-3-030-05453-3_23
APA
Basirat, A., & Tang, M. (2019).
Linguistic information in word embeddings. In J. van den Herik, & A. P. Rocha (Eds.),
Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers (pp. 492-513). Springer Verlag. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 11352 LNAI
https://doi.org/10.1007/978-3-030-05453-3_23
Vancouver
Basirat A, Tang M.
Linguistic information in word embeddings. In van den Herik J, Rocha AP, editors, Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers. Cham: Springer Verlag. 2019. p. 492-513. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11352 LNAI).
https://doi.org/10.1007/978-3-030-05453-3_23
Author
Basirat, Ali ; Tang, Marc. / Linguistic information in word embeddings. Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers. editor / Jaap van den Herik ; Ana Paula Rocha. Cham : Springer Verlag, 2019. pp. 492-513 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11352 LNAI).
Bibtex
@inproceedings{d7958b465036498ab81ccc842e364def,
title = "Linguistic information in word embeddings",
abstract = "We study the presence of linguistically motivated information in the word embeddings generated with statistical methods. The nominal aspects of uter/neuter, common/proper, and count/mass in Swedish are selected to represent respectively grammatical, semantic, and mixed types of nominal categories within languages. Our results indicate that typical grammatical and semantic features are easily captured by word embeddings. The classification of semantic features required significantly less neurons than grammatical features in our experiments based on a single layer feed-forward neural network. However, semantic features also generated higher entropy in the classification output despite its high accuracy. Furthermore, the count/mass distinction resulted in difficulties to the model, even though the quantity of neurons was almost tuned to its maximum.",
keywords = "Neural network, Nominal classification, Swedish, Word embedding",
author = "Ali Basirat and Marc Tang",
note = "Publisher Copyright: {\textcopyright} Springer Nature Switzerland AG 2019.; 10th International Conference on Agents and Artificial Intelligence, ICAART 2018 ; Conference date: 16-01-2018 Through 18-01-2018",
year = "2019",
doi = "10.1007/978-3-030-05453-3_23",
language = "English",
isbn = "9783030054526",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
volume = "11352",
publisher = "Springer Verlag",
pages = "492--513",
editor = "{van den Herik}, Jaap and Rocha, {Ana Paula}",
booktitle = "Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers",
address = "Cham",
}
RIS
TY - CONF
T1 - Linguistic information in word embeddings
AU - Basirat, Ali
AU - Tang, Marc
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
AB - We study the presence of linguistically motivated information in the word embeddings generated with statistical methods. The nominal aspects of uter/neuter, common/proper, and count/mass in Swedish are selected to represent respectively grammatical, semantic, and mixed types of nominal categories within languages. Our results indicate that typical grammatical and semantic features are easily captured by word embeddings. The classification of semantic features required significantly less neurons than grammatical features in our experiments based on a single layer feed-forward neural network. However, semantic features also generated higher entropy in the classification output despite its high accuracy. Furthermore, the count/mass distinction resulted in difficulties to the model, even though the quantity of neurons was almost tuned to its maximum.
KW - Neural network
KW - Nominal classification
KW - Swedish
KW - Word embedding
UR - http://www.scopus.com/inward/record.url?scp=85059677023&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-05453-3_23
DO - 10.1007/978-3-030-05453-3_23
M3 - Article in proceedings
AN - SCOPUS:85059677023
SN - 9783030054526
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
VL - 11352 LNAI
SP - 492
EP - 513
BT - Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers
A2 - van den Herik, Jaap
A2 - Rocha, Ana Paula
PB - Springer Verlag
CY - Cham
T2 - 10th International Conference on Agents and Artificial Intelligence, ICAART 2018
Y2 - 16 January 2018 through 18 January 2018
ER -