Augmenting the automated extracted tree adjoining grammars by semantic representation

Publication: Contribution to book/anthology/report; Conference contribution in proceedings; Research; Peer-reviewed

MICA [1] is a fast and accurate dependency parser for English that uses an LTAG automatically extracted from the Penn Treebank (PTB) using Chen's approach [7]. However, its grammar has no associated semantic representation. The XTAG grammar [20], on the other hand, is a hand-crafted LTAG whose elementary trees were enriched with semantic representations by experts. The linguistic knowledge embedded in the XTAG grammar has led to its use in a wide variety of natural language applications. However, the current XTAG parser is neither as fast nor as accurate as the MICA parser. Generating an XTAG derivation tree from a MICA dependency structure could bridge the two and obtain the benefits of both models. With this conversion, applications that use the XTAG parser could also benefit from the MICA parser. In addition, it can enrich MICA's grammar with the semantic representation of the XTAG grammar. In this paper, an unsupervised sequence tagger that maps any sequence of MICA elementary trees onto a sequence of XTAG elementary trees is presented. The proposed sequence tagger is based on a Hidden Markov Model (HMM), preceded by an EM-based algorithm for setting its initial parameter values. The trained model is tested on a part of the PTB and achieves about 82% accuracy on the detected sequences.
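As a rough illustration of the tagging step described in the abstract, the sketch below decodes the most probable hidden sequence of an HMM with the Viterbi algorithm. It assumes that the hidden states stand for XTAG elementary trees and the observations for MICA elementary trees. The labels (`X_NP`, `X_V`, `m_noun`, `m_verb`) and all probabilities are invented for this example and are not taken from the paper. The EM-based initialisation is not shown, and the paper may decode differently.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    if not observations:
        return []

    def log(p):
        # Log space avoids numeric underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: best log-probability of any path ending in state s at step t.
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: two "XTAG" states, two "MICA" observation symbols.
states = ["X_NP", "X_V"]
start_p = {"X_NP": 0.7, "X_V": 0.3}
trans_p = {
    "X_NP": {"X_NP": 0.3, "X_V": 0.7},
    "X_V": {"X_NP": 0.8, "X_V": 0.2},
}
emit_p = {
    "X_NP": {"m_noun": 0.9, "m_verb": 0.1},
    "X_V": {"m_noun": 0.2, "m_verb": 0.8},
}

tagged = viterbi(["m_noun", "m_verb", "m_noun"], states, start_p, trans_p, emit_p)
print(tagged)  # ['X_NP', 'X_V', 'X_NP']
```

In an unsupervised setting such as the one the abstract describes, the transition and emission tables would be estimated from unlabeled sequences rather than written by hand as they are here.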

Title: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE, 2010
ISBN (Print): 9781424468966
Status: Published - 2010
Event: 6th International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2010 - Beijing, China
Duration: 21 Aug 2010 to 23 Aug 2010


Conference: 6th International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2010
Name: Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2010

ID: 366048038