Question

I am using Oracle 11g and Oracle Text for a web search engine.

I have created and text-indexed a CLOB column, Keywords, which contains space-separated words. This lets me extend the search, because Oracle Text returns the rows that have one or more of the searched keywords stored in that column. The contents of the column are hidden from the user and are used only to "extend" the search. This is working as intended.

But now I need to support multiple words or even complete sentences. With the current configuration, Oracle Text searches only for individual keywords. How should I store the phrases and configure Oracle Text so that it searches for whole phrases (exact match is preferred, but fuzzy matching is fine too)?

Example column content for two rows (semicolon-separated values):

"hello, hello; is there anybody out there?; nope;"
"just the; basic facts;"

I found a similar question, "Searching a column with comma separated values", except that I need a solution for Oracle 11g with its free-text search functionality.

Possible solutions:

1st solution: I was thinking of redesigning the DB as follows. I'd make a new table Keywords(pkID NUMBER, nonUniqueID NUMBER, singlePhrase VARCHAR2(100 BYTE)), and I'd change the existing Keywords column to KeywordNonUniqueID, which would hold the ID (instead of a list of values). At search time I'd INNER JOIN with the new Keywords table. The problem with this solution is that I'll get multiple rows that contain the same data except for the phrase. I assume this will destroy the ranking?

2nd solution: Is it possible to store the phrases as XML in the original Keywords column, and somehow tell Oracle Text to search within the XML?

3rd solution: ?

Note that, generally, there won't be many phrases (fewer than 100), nor will they be long (a single phrase will have up to 5 words).

Also note that I am currently using CONTAINS, and a few of its operators, for my full-text searching needs.

EDIT: This discussion, https://forums.oracle.com/forums/thread.jspa?messageID=10791361, almost solves my problem, but it also matches the individual words, not just the whole phrase (I need exact matching).


Solution

Oracle Text supports phrase searches by default. The documentation says:

4.1.4.1 CONTAINS Phrase Queries

If multiple words are contained in a query expression, separated only by blank spaces (no operators), the string of words is considered a phrase and Oracle Text searches for the entire string during a query.

For example, to find all documents that contain the phrase international law, enter your query with the phrase international law.
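As a minimal sketch of how an application might build such a phrase query, the Python helper below wraps each word in braces (the Oracle Text escape syntax) and joins the words with single spaces so that CONTAINS treats them as one phrase. The function name `phrase_query`, the table `documents`, and the columns `id` and `keywords` are hypothetical names chosen for illustration, and dropping punctuation before building the query is an assumption made here, not something the documentation requires.

```python
import re


def phrase_query(phrase: str) -> str:
    """Build an Oracle Text phrase query with every word escaped."""
    # Keep only runs of word characters. Punctuation such as ? , ; could
    # otherwise be read as query operators by the CONTAINS parser.
    words = re.findall(r"\w+", phrase)
    if not words:
        raise ValueError("phrase contains no searchable words")
    # {word} escapes a single term; terms separated only by blanks
    # (no operators) are searched as a phrase.
    return " ".join("{" + word + "}" for word in words)


# Hypothetical table and column names; :q is a bind variable that
# receives the string returned by phrase_query().
SEARCH_SQL = (
    "SELECT id, SCORE(1) AS relevance "
    "FROM documents "
    "WHERE CONTAINS(keywords, :q, 1) > 0 "
    "ORDER BY SCORE(1) DESC"
)

if __name__ == "__main__":
    print(phrase_query("international law"))
```

Passing the query as a bind variable rather than concatenating it into the statement keeps user input out of the SQL text itself.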

Did I answer your question or misunderstand you?

P.S. It seems to me that the solution is to convert

"hello, hello; is there anybody out there?; nope;" "just the; basic facts;"

to

"hello, hello aa is there anybody out there? aa nope aa" "just the aa basic facts aa"

and search with CONTAINS for the phrase "is there anybody out there? aa"
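The conversion described above could be sketched in Python as follows. This is only an illustration under stated assumptions: `to_indexed_text` and `exact_phrase_query` are hypothetical helper names, the delimiter token "aa" is taken from the example, and it is assumed that "aa" never occurs in a real keyword and is not in the index's stoplist.

```python
import re

DELIMITER = "aa"  # assumed never to appear as a real keyword or stopword


def to_indexed_text(raw: str, delimiter: str = DELIMITER) -> str:
    """Replace the ';' separators with a delimiter word after each phrase."""
    # Split on ';', trim whitespace, and drop the empty piece that follows
    # the trailing ';'.
    phrases = [part.strip() for part in raw.split(";") if part.strip()]
    return " ".join(f"{phrase} {delimiter}" for phrase in phrases)


def exact_phrase_query(phrase: str, delimiter: str = DELIMITER) -> str:
    """Build a phrase query that must be followed by the delimiter word."""
    # Punctuation is dropped (an assumption of this sketch) and each word
    # is wrapped in braces so it is not parsed as an operator.
    words = re.findall(r"\w+", phrase) + [delimiter]
    return " ".join("{" + word + "}" for word in words)


if __name__ == "__main__":
    print(to_indexed_text("hello, hello; is there anybody out there?; nope;"))
    print(exact_phrase_query("is there anybody out there?"))
```

Because the delimiter is appended only after each phrase, a query such as "out there aa" would still match the tail of a longer stored phrase. Writing the delimiter before the first phrase as well, and including it at both ends of the query, is one possible way to require a match on the whole phrase.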

Licensed under: CC-BY-SA with attribution