I think Boris's answer is reasonable from a Prolog developer's point of view, but I didn't want to spend too much time investigating how to call Java functions from Prolog and vice versa, not to mention the problem of creating a standalone executable from the Prolog scripts, which is a requirement for me.
So I was a bit stubborn and implemented my original idea of building an inverted index from the Prolog terms, and it works flawlessly. Now, when I want to search for a certain Prolog goal, I first filter the "database" for the relevant theories: I compute the intersection of the occurrences of the goal's terms and run the Prolog engine with only those theories. Another big performance optimization was sharing a singleton TuProlog engine between the individual searches, which also decreased the memory overhead.
I also refactored the rules. For example, I now write this:
rel(nsubjpass, "seen","It", 1).
instead of this:
rel("nsubjpass","seen","It",S):-S is 1.
I didn't notice any big performance boost from this change, but at least it didn't get slower :)
With these changes, it is now clear that the real performance bottleneck is not running the Prolog engine but using the NLP libraries... but that's another issue. :)
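For anyone curious, the inverted-index filtering step can be sketched roughly as below. This is only a minimal illustration, not my actual code: the class name `TheoryIndex`, the methods `add` and `candidates`, and the theory IDs `sentence1` and `sentence2` are all made up for the example, and it assumes each theory has already been reduced to the set of atoms and strings it mentions. The TuProlog engine itself is left out; the returned IDs are simply the theories you would then load into the shared engine.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: maps each term to the IDs of the theories mentioning it.
class TheoryIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Register a theory under every term that occurs in it.
    void add(String theoryId, Collection<String> terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(theoryId);
        }
    }

    // Intersect the posting sets of all ground terms of a goal.
    // Unbound variables should be dropped by the caller before this call.
    // Design choice of this sketch: an empty term list yields no candidates.
    Set<String> candidates(Collection<String> goalTerms) {
        Set<String> result = null;
        for (String term : goalTerms) {
            Set<String> ids = postings.get(term);
            if (ids == null) {
                return new HashSet<>(); // unknown term: no theory can match
            }
            if (result == null) {
                result = new HashSet<>(ids); // copy so the index stays intact
            } else {
                result.retainAll(ids);
            }
            if (result.isEmpty()) {
                break; // intersection cannot grow again
            }
        }
        return result == null ? new HashSet<>() : result;
    }

    public static void main(String[] args) {
        TheoryIndex index = new TheoryIndex();
        // Illustrative data only.
        index.add("sentence1", Arrays.asList("nsubjpass", "seen", "It"));
        index.add("sentence2", Arrays.asList("dobj", "seen", "It"));

        // Only theories containing both terms are handed to the Prolog engine.
        System.out.println(index.candidates(Arrays.asList("nsubjpass", "seen")));
    }
}
```

The point of the intersection is that the engine only ever sees theories that could possibly satisfy the goal, so the cost of a search depends on the number of candidates rather than on the size of the whole "database".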