Question

I have read on many sites that GATE supports Spanish, but I have not found out how to analyze Spanish text with GATE. I tried the TreeTagger, but I get this error: TreeTagger\tree-tagger-spanish-gate": CreateProcess error=193, %1 no es una aplicación Win32 válida (in English: "%1 is not a valid Win32 application").

I also tried OpenNLP, but I did not find the models for Spanish (tokenizer, chunker, etc.); I only found Dutch, German and English.

I also need to identify the subject and the predicate of a sentence. Correct me if I'm wrong, but I think I can do this with OpenNLP, because it is possible to identify the NP (noun phrase) and the VP (verb phrase) with the treebank parser or with the MuNPEx plugin.

In summary, is there any way to set GATE's language to Spanish?

Thanks.

Solution

You need Cygwin to be able to run the TreeTagger scripts on Windows. The error message

CreateProcess error=193, %1 no es una aplicación Win32 válida

suggests to me that you have not set the shell.path system property to point to your Cygwin sh.exe, as explained in the TaggerFramework section of the user guide.
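
As a quick sanity check, something like the following could be run with the same JVM options you use for GATE. It is only a sketch: the class name ShellPathCheck and its describe helper are made up for illustration, and it does nothing GATE-specific. It just reads the shell.path system property and reports whether it points to an existing file. Java system properties are normally passed with -D on the command line; the path C:\cygwin\bin\sh.exe used in the comments is an assumption about a typical Cygwin install, so adjust it to yours.

```java
import java.io.File;

// Hypothetical helper, not part of GATE: reports whether the
// shell.path system property points to an existing file.
class ShellPathCheck {

    static String describe(String shellPath) {
        if (shellPath == null || shellPath.trim().isEmpty()) {
            // Assumed fix: start the JVM with -Dshell.path=C:\cygwin\bin\sh.exe
            return "shell.path is not set; pass it with -Dshell.path=<path to sh.exe>";
        }
        File shell = new File(shellPath);
        if (!shell.isFile()) {
            return "shell.path points to '" + shellPath + "', which does not exist";
        }
        return "shell.path looks usable: " + shell.getAbsolutePath();
    }

    public static void main(String[] args) {
        // Reads the property from the running JVM, e.g.
        //   java -Dshell.path=C:\cygwin\bin\sh.exe ShellPathCheck
        System.out.println(describe(System.getProperty("shell.path")));
    }
}
```

If this reports that the property is not set, the tagger wrapper is probably trying to execute the shell script directly, which is consistent with error 193.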

For OpenNLP there are some name finder models available at http://opennlp.sourceforge.net/models-1.5/ and POS tagger models at https://github.com/utcompling/OpenNLP-Models/tree/master/models/es but I can't see tokeniser or chunker models anywhere. For tokenisation and sentence splitting I suspect that the default GATE Unicode Tokeniser (not the "ANNIE English tokeniser") and either of the default sentence splitters will do a reasonable job.
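
To illustrate why a Unicode-based tokeniser copes with Spanish without language-specific rules, here is a rough sketch in plain java.util.regex. It is not GATE's tokeniser and does not use any GATE API; the class name UnicodeTokenSketch and the pattern are my own assumptions. It splits text by Unicode character category, so accented letters and the inverted question mark are handled like any other letter or punctuation mark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a category-based splitter, NOT the GATE Unicode Tokeniser.
class UnicodeTokenSketch {

    // A token is a run of letters (plus combining marks), a run of digits,
    // or a single character that is none of those and not ASCII whitespace.
    private static final Pattern TOKEN =
        Pattern.compile("[\\p{L}\\p{M}]+|\\p{N}+|[^\\p{L}\\p{M}\\p{N}\\s]");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Spanish sample sentence, with the non-ASCII characters written as
        // Unicode escapes so the source compiles under any file encoding.
        String sample = "\u00bfD\u00f3nde est\u00e1 el ni\u00f1o?";
        for (String token : tokenize(sample)) {
            System.out.println(token);
        }
    }
}
```

For the sample sentence "¿Dónde está el niño?", tokenize returns six tokens: the opening "¿", the four words, and the closing "?".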

It might be worth subscribing to the gate-users mailing list and asking on there whether anyone else has any Spanish resources they would be willing to share.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow