About: Eustagger

Attributes   Values
type
label
  • Eustagger
Description
  • Lemmatizer. Eustagger is a robust, wide-coverage morphological analyser and part-of-speech (PoS) tagger for Basque. The analyser is based on the two-level formalism and was designed incrementally around three main modules: the standard analyser, the analyser of linguistic variants, and the analyser without lexicon, which can recognize word-forms whose lemmas are not in the lexicon. Implementing the analyser with lexical transducers improved both the performance of the system's components and the description itself. For each token, the analyser provides the possible lemmas, PoS and other morphological information; it also recognizes numbers and date/time expressions. The tagger combines rule-based and stochastic disambiguation methods for Basque: the Constraint Grammar (CG) formalism and an HMM-based tagger. CG rules are applied first, using all the morphological features, which reduces the morphological ambiguity of the text. The stochastic tool then selects a single tag from those that remain. Using only the stochastic method, the error rate is about 14%, although accuracy may be increased by about 2% by enriching the lexicon with the unknown words. When both methods are combined, the error rate of the whole process is 3.5%. Eustagger covers tokenization, morphological analysis, lemmatization and tagging for Basque. There is a web service.
http://lodserver.i.../services/contact
  • n.ezeiza@ehu.es
http://lodserver.i...es/demoInvocation
http://lodserver.i...s/serviceProvider
http://lodserver.i...serviceTechnology
http://lodserver.i...are/services/task
http://lodserver.i...ogy/documentation
http://lodserver.i...logy/languageCode
http://lodserver.i...logy/languageName
  • Basque
http://lodserver.i...y/resourceCreator
http://lodserver.i...logy/resourceName
  • Eustagger
http://lodserver.i...hare/ontology/url
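
The two-stage disambiguation outlined in the Description (rules first to prune candidate tags, then a stochastic model to pick one of the remaining tags) can be illustrated with a minimal Python sketch. This is not Eustagger's implementation: real Constraint Grammar rules are far richer, and the function names (`apply_rules`, `viterbi`), the placeholder tokens `w1`..`w3`, the tag names, the single rule and all probabilities are invented for illustration only.

```python
import math

def apply_rules(tokens, candidates, rules):
    """Rule-based stage: each rule returns tags to discard for token i.
    A rule is ignored if it would leave a token with no tag at all."""
    result = [set(c) for c in candidates]
    for i in range(len(tokens)):
        for rule in rules:
            remaining = result[i] - rule(tokens, result, i)
            if remaining:
                result[i] = remaining
    return result

def lp(table, key):
    """Log-probability lookup with a small floor for unseen events (assumption)."""
    return math.log(table.get(key, 1e-6))

def viterbi(tokens, candidates, trans, emit, start="<s>"):
    """Stochastic stage: most likely tag sequence under a bigram HMM,
    restricted to the tags that survived the rule-based stage."""
    # best[i][tag] = (log-score of best path ending in tag, previous tag)
    best = [{t: (lp(trans, (start, t)) + lp(emit, (t, tokens[0])), None)
             for t in candidates[0]}]
    for i in range(1, len(tokens)):
        column = {}
        for tag in candidates[i]:
            prev, score = max(
                ((p, s + lp(trans, (p, tag))) for p, (s, _) in best[i - 1].items()),
                key=lambda pair: pair[1],
            )
            column[tag] = (score + lp(emit, (tag, tokens[i])), prev)
        best.append(column)
    # Backtrace from the best final tag.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(tokens) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Hypothetical rule: discard VERB right after an unambiguous DET.
def no_verb_after_det(tokens, cands, i):
    if i > 0 and cands[i - 1] == {"DET"}:
        return {"VERB"}
    return set()

# Toy input: placeholder tokens with made-up candidate tags and probabilities.
tokens = ["w1", "w2", "w3"]
candidates = [{"DET"}, {"NOUN", "VERB"}, {"NOUN", "VERB"}]
trans = {("<s>", "DET"): 0.8, ("DET", "NOUN"): 0.9,
         ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.2}
emit = {("DET", "w1"): 0.5, ("NOUN", "w2"): 0.1,
        ("NOUN", "w3"): 0.1, ("VERB", "w3"): 0.1}

reduced = apply_rules(tokens, candidates, [no_verb_after_det])
tags = viterbi(tokens, reduced, trans, emit)
```

In this toy run the rule removes the ambiguity of the second token, and the HMM stage resolves the ambiguity that the rule leaves on the third, mirroring the division of labour the Description reports.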
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.