(copy of what I wrote in https://github.com/misja/python-boilerpipe/issues/17)
OK, I've reproduced the bug: the thread that calls the JVM is not attached to it, so the calls to JVM internals fail. The bug comes from boilerpipe (see below).
First, the monkey patch: in the code you posted on Stack Overflow, you just have to add the following code before the creation of the extractor:

    import jpype
    from boilerpipe.extract import Extractor

    class ExtractingContent:
        @classmethod
        def processingContent(cls, sourceUrl, extractorType="DefaultExtractor"):
            # The calling thread must be attached to the JVM before any JPype call.
            print "State=", jpype.isThreadAttachedToJVM()
            if not jpype.isThreadAttachedToJVM():
                print "Needs to attach..."
                jpype.attachThreadToJVM()
                print "Check Attached=", jpype.isThreadAttachedToJVM()
            extractor = Extractor(extractor=extractorType, url=sourceUrl)

About boilerpipe: the check if threading.activeCount() > 1 in boilerpipe/extractor/__init__.py, line 50, is wrong.
The calling thread must always be attached to the JVM, even if there is only one.
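
For illustration, here is a minimal sketch of what an unconditional attach could look like. This is not the actual boilerpipe source; the helper name ensure_thread_attached and its jvm parameter are made up for this example, and jvm is assumed to be the jpype module (only the two jpype functions already used above are called).

```python
def ensure_thread_attached(jvm):
    """Attach the calling thread to the JVM if it is not attached yet.

    `jvm` is assumed to be the jpype module, or any object exposing the
    same two functions. It is passed in as an argument so this sketch
    has no import-time dependency on JPype.
    """
    # Deliberately no threading.activeCount() check here:
    # even a lone thread has to be attached before it calls into the JVM.
    if not jvm.isThreadAttachedToJVM():
        jvm.attachThreadToJVM()
```

With something like this, calling ensure_thread_attached(jpype) at the top of every function that touches the JVM would cover both the single-threaded and the multi-threaded case.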