Question

I am trying to set up the OpenNLP NameFinder in a project using an XML feature generator descriptor and some non-standard features. The XML descriptor supports custom feature generators:

<generators>
  <cache>
    <generators>
      ...
      <custom class="com.example.MyFeatureGenerator"/>
    </generators>
  </cache>
</generators>

However, the documentation does not describe how to pass parameters to the feature generator. Creating a new class for every slightly different configuration of the feature generator is not desirable. On the other hand, creating the feature generators programmatically would likely mean duplicating much of the OpenNLP code that handles feature generator setup. What is the recommended way to use custom feature generators in OpenNLP?


Solution 2

There is no proper solution yet, but I worked around the issue by registering a new feature generator factory in OpenNLP. Unfortunately, this requires reflective access to private members of the OpenNLP class GeneratorFactory. Here is a working solution.

First, define a new class, named XmlDescriptorUtil:

import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;

import opennlp.tools.util.InvalidFormatException;
import opennlp.tools.util.featuregen.AdaptiveFeatureGenerator;
import opennlp.tools.util.featuregen.FeatureGeneratorResourceProvider;
import opennlp.tools.util.featuregen.GeneratorFactory;

import org.w3c.dom.Element;

public final class XmlDescriptorUtil {
  // Utility class, not meant to be instantiated.
  private XmlDescriptorUtil() {}

  public static abstract class XmlDescriptorFactory implements InvocationHandler
  {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
      return create((Element)args[0], (FeatureGeneratorResourceProvider)args[1]);
    }

    public abstract AdaptiveFeatureGenerator create(Element generatorElement, FeatureGeneratorResourceProvider resourceManager)
      throws InvalidFormatException;
  }

  public static void register(String name, XmlDescriptorFactory factory) throws Exception
  {
    Class<?> factoryInterface = Class.forName(GeneratorFactory.class.getName()+"$XmlFeatureGeneratorFactory");
    Object proxy = Proxy.newProxyInstance(GeneratorFactory.class.getClassLoader(), new Class[]{factoryInterface}, factory);
    registerByProxy(name, proxy);
  }

  private static void registerByProxy(String name, Object proxy) throws Exception
  {
    Field f = GeneratorFactory.class.getDeclaredField("factories");
    f.setAccessible(true);
    @SuppressWarnings("unchecked")
    Map<String, Object> factories = (Map<String, Object>) f.get(null);
    factories.put(name, proxy);
  }

}

Then, create a feature generator factory by extending the public abstract class XmlDescriptorUtil.XmlDescriptorFactory, and register it under the element name you want to use:

public static void main(String[] args) throws Exception {
  // Register the factory under the element name "myCustom".
  XmlDescriptorUtil.register("myCustom", new XmlDescriptorUtil.XmlDescriptorFactory() {
    @Override
    public AdaptiveFeatureGenerator create(Element generatorElement, FeatureGeneratorResourceProvider resourceManager) throws InvalidFormatException {
      return new MyFeatureGenerator();
    }
  });
}

Now the feature generator can be used in the XML descriptor under its registered name:

<generators>
  <cache>
    <generators>
      ...
      <myCustom/>
    </generators>
  </cache> 
</generators>

If the feature generator needs parameters, the factory can read them from generatorElement, for example from attributes on the element.
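As a rough sketch of that last step, the helper below reads optional attributes from a DOM Element using only the standard org.w3c.dom API. The class name GeneratorParams and the attribute names window and prefix are made up for illustration; they are not part of OpenNLP. The parse method only stands in for the element that OpenNLP would hand to create(...).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical helper, not part of OpenNLP: reads optional parameters
// from the attributes of a descriptor element.
public class GeneratorParams {

  // Returns the attribute as an int, or defaultValue when it is absent.
  public static int intAttr(Element e, String name, int defaultValue) {
    // Element.getAttribute returns an empty string for a missing attribute.
    String raw = e.getAttribute(name);
    return raw.isEmpty() ? defaultValue : Integer.parseInt(raw.trim());
  }

  // Returns the attribute as a string, or defaultValue when it is absent.
  public static String stringAttr(Element e, String name, String defaultValue) {
    String raw = e.getAttribute(name);
    return raw.isEmpty() ? defaultValue : raw;
  }

  // Parses a single-element snippet; used here only to simulate the
  // generatorElement argument that the factory would receive.
  public static Element parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
    } catch (Exception ex) {
      throw new IllegalArgumentException("Could not parse: " + xml, ex);
    }
  }

  public static void main(String[] args) {
    // "window" and "prefix" are assumed attribute names for this example.
    Element e = parse("<myCustom window=\"3\" prefix=\"pfx\"/>");
    System.out.println(intAttr(e, "window", 1));
    System.out.println(stringAttr(e, "prefix", "none"));
  }
}
```

Inside the factory's create method, the same calls could be made on generatorElement and the results passed to the MyFeatureGenerator constructor, assuming that class has a constructor accepting them.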

Other tips

If you don't mind, please open a Jira issue at Apache OpenNLP and request a fix. It should be possible for the custom element to pass in parameters and external resources.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow