Automatic enhancement of LTAG Treebanks
Publication: Contribution to journal › Conference article › Research › peer-reviewed
Treebanks, as sets of syntactically annotated sentences, are the most widely used language resources in Natural Language Processing applications. Errors in automatically created treebanks are one of the main obstacles limiting the use of these resources in real-world applications. This paper introduces a statistical method for reducing the number of errors in a specific English LTAG treebank proposed in Basirat and Faili (2013). The problem is formulated as a classification problem and tackled with several classifiers. The experiments show that, when the AdaBoost classifier is used, about 95% of the errors can be detected and more than 77% of them can be corrected successfully. In addition, the new treebank reaches an F-measure of 76%, which is 8% higher than that of the original treebank.
Original language | English |
---|---|
Journal | International Conference Recent Advances in Natural Language Processing, RANLP |
Pages (from-to) | 733-739 |
Number of pages | 7 |
ISSN | 1313-8502 |
Status | Published - 2013 |
Event | 9th International Conference on Recent Advances in Natural Language Processing, RANLP 2013 - Hissar, Bulgaria. Duration: 9 Sep 2013 → 11 Sep 2013 |
Conference
Conference | 9th International Conference on Recent Advances in Natural Language Processing, RANLP 2013 |
---|---|
Country | Bulgaria |
City | Hissar |
Period | 09/09/2013 → 11/09/2013 |
ID: 366047604