Question

Here's what I intend to do:

doc = xapian.Document()
doc.set_data(somedata)
..
..
doc.add_term("Ajohn doe")

Assume the prefix "author" is available for document author.

Now I want to be able to run this search: "searchterm AND author:john doe"

This doesn't work, because "doe" is not treated as part of the author: the QueryParser translates the query to "searchterm AND author:john OR doe". Should I do this instead:

doc.add_term("Ajohn_doe")

and search by "searchterm AND author:john_doe"? Are there any alternatives for searching text with spaces in general?

Solution

The most common way of doing this would be to add terms Ajohn and Adoe (probably using Xapian's TermGenerator, which will do word splitting and term creation for you). Having done this, you can then run a search author:"john doe" (a prefixed phrase search, which will be able to search across multiple terms). Something like the following:

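import xapian

# Open (or create) the database that will hold the indexed documents.
db = xapian.WritableDatabase("my-db", xapian.DB_CREATE_OR_OPEN)
tg = xapian.TermGenerator()

doc = xapian.Document()
tg.set_document(doc)
# Split the author name into words and add each as an "A"-prefixed term
# (Ajohn, Adoe), with positional information for phrase searches.
tg.index_text("John Doe", 1, "A")
db.add_document(doc)

qp = xapian.QueryParser()
# Map the user-facing "author:" prefix to the "A" term prefix.
qp.add_prefix("author", "A")
q = qp.parse_query('author:"John Doe"')

enq = xapian.Enquire(db)
enq.set_query(q)
for match in enq.get_mset(0, 10):
    print("%8.8i: %f" % (match.docid, match.weight))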

(Tested against a semi-recent Xapian trunk, although I don't believe anything is particularly new here.)
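
To make the idea behind the prefixed phrase search more concrete, here is a minimal pure-Python sketch of the mechanism. It is not Xapian's implementation and does not use the Xapian API: the helper names index_text and phrase_match are hypothetical, and the sketch assumes simple whitespace splitting and lowercasing. It only illustrates that each word becomes its own prefixed term with a position, and that a phrase matches when those positions are consecutive.

```python
# Illustrative sketch only: plain Python, not the Xapian API.

def index_text(text, prefix):
    """Split text on whitespace, lowercase it, and map each prefixed term to its positions."""
    postings = {}
    for position, word in enumerate(text.lower().split(), start=1):
        postings.setdefault(prefix + word, []).append(position)
    return postings


def phrase_match(postings, words, prefix):
    """Return True if the prefixed words occur at consecutive positions, in order."""
    terms = [prefix + w.lower() for w in words]
    if not terms or any(t not in postings for t in terms):
        return False
    # Try every position of the first term as the start of the phrase.
    return any(
        all(start + offset in postings[term] for offset, term in enumerate(terms))
        for start in postings[terms[0]]
    )


postings = index_text("John Doe", "A")
print(sorted(postings.items()))
print(phrase_match(postings, ["john", "doe"], "A"))
print(phrase_match(postings, ["doe", "john"], "A"))
```

Because the words are stored as separate terms, a search for either word alone under the author prefix would also find the document, which a single "Ajohn_doe" term would not allow.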

Licensed under: CC-BY-SA with attribution