I'm attempting to implement a search function on two tables with a one-to-many relationship. Think of it as a post with multiple tags. Each tag has its own row in the tag table.

I'd like to retrieve a post if all of the search terms can be found in either a) the post text, b) the post tags or c) both.

Let's say I've created my tables like this:

CREATE TABLE post (
    id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    text VARCHAR(100) NOT NULL
);

CREATE TABLE tag (
    id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(30) NOT NULL,
    post MEDIUMINT NOT NULL
);

And I create indexes like this:

CREATE FULLTEXT INDEX post_idx ON post(text);
CREATE FULLTEXT INDEX tag_idx ON tag(name);

If my search query were "TermA TermB" and I wanted to search just in the post text, I'd formulate my SQL query like this:

SELECT * FROM post WHERE MATCH(text) AGAINST('+TermA +TermB' IN BOOLEAN MODE);

Is there a way to add tags into the mix? My previous attempt was this:

SELECT * FROM post 
RIGHT JOIN tag ON tag.post = post.id 
WHERE MATCH(post.text) AGAINST('TermA TermB' IN BOOLEAN MODE)
OR MATCH(tag.name) AGAINST('TermA TermB' IN BOOLEAN MODE);

The problem is, this is only an any words query and not an all words query. By this I mean, I'd like to retrieve the post if TermA is in the text and TermB is in the tags.

What am I missing here? Is this even possible using a fulltext search? Is there a better way to approach this?


Solution

Try this one:

SELECT post.* 
FROM post 
INNER JOIN (SELECT post, GROUP_CONCAT(name SEPARATOR ' ') tags FROM tag GROUP BY post) tag ON post.id=tag.post
WHERE MATCH(post.text) AGAINST('+TermA +TermB' IN BOOLEAN MODE)
OR MATCH(tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)

This might also work for getting results that match in either the content or the tags, but it didn't work in MySQL 5.1:

SELECT post.*, GROUP_CONCAT(tag.name SEPARATOR ' ') tags 
FROM post 
LEFT JOIN tag ON post.id=tag.post
GROUP BY post.id
HAVING MATCH(post.text,tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)

so I rewrote it as:

SELECT post.*, tags
FROM post 
LEFT JOIN (SELECT post, GROUP_CONCAT(tag.name SEPARATOR ' ') tags FROM tag GROUP BY post) tags ON post.id=tags.post
WHERE MATCH(post.text, tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)

Other tips

This is possible, but I'm guessing that in your Tags table, you have one row for each tag per post. So one row containing the tag 'TermA' for post 1 and another record with the tag 'TermB', right?

The all words query (with +) only returns rows where the searched field contains all the specified words. For the tags table, that is never the case.
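To make that concrete, here is a tiny Python simulation. It is not MySQL and only mimics the "every word must be in the same field" rule; the helper name matches_all is made up for this illustration. Checked row by row, no tag row contains both words, but the same tags joined into one string (which is what GROUP_CONCAT or a single tags column gives you) do match.

```python
def matches_all(text, terms):
    """Rough stand-in for an all-words match: every term must occur in this one text."""
    words = set(text.lower().split())
    return all(term.lower() in words for term in terms)


# One row per tag, as in the question's tag table (assumed data for post 1).
tag_rows = ["TermA", "TermB"]
terms = ["TermA", "TermB"]

# Checked row by row, no single row holds both words.
per_row = [matches_all(name, terms) for name in tag_rows]

# Checked against all tags joined into one string, both words are found.
combined = matches_all(" ".join(tag_rows), terms)

print(per_row)   # [False, False]
print(combined)  # True
```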

One possible solution would be to store all tags in a single field in the posts table itself. Then it would be easy to do advanced matching on the tags as well.

Another possibility is to change the condition for tags altogether. That is, use an all query for the text and an any query for the tags. To do that, you'll have to modify the search query yourself, which can fortunately be as easy as removing the plusses from the query.

You can also query for an exact match, like this:

SELECT * FROM post p
WHERE
  MATCH(p.text) AGAINST('TermA TermB' IN BOOLEAN MODE)
  AND
     /* Number of matching tags .. */
     (SELECT COUNT(*) FROM tag t
      WHERE
        t.post = p.id
        AND t.name IN ('TermA', 'TermB'))
     = /* .. must be .. */
     2 /* .. number of searched tags */;

In this query, I count the number of matching tags. In this case I want it to be exactly 2, meaning that both tags match (provided that tags are unique per post). You could also check for >= 1 to see if any tags match.

But as you can see, this also requires parsing of the search string. You will have to remove the plusses (or even check their existence to understand whether you want 'any' or 'all'). And you will have to split it as well to get the number of searched words, and get the separate words themselves.
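As a rough sketch of that parsing step: assuming the application side is Python and the MySQL driver uses %s placeholders (both are assumptions, the question doesn't say), something like the following could split the search string, decide between 'any' and 'all', and build the tag-count condition. The function names parse_search and tag_count_condition are hypothetical, and the table and column names are taken from the question's schema.

```python
def parse_search(query):
    """Split a search string into unique terms and decide 'all' vs 'any'.

    The mode is 'all' only when every token starts with a plus.
    """
    tokens = query.split()
    mode = "all" if tokens and all(t.startswith("+") for t in tokens) else "any"
    terms = []
    for token in tokens:
        word = token.lstrip("+")
        if word and word not in terms:
            terms.append(word)
    return terms, mode


def tag_count_condition(terms, mode):
    """Build the correlated tag-count condition plus its parameter list.

    Assumes tags are unique per post, as the answer above does.
    """
    if not terms:
        raise ValueError("no search terms")
    placeholders = ", ".join(["%s"] * len(terms))
    # 'all' needs every term to be a tag; 'any' needs at least one.
    comparison = "= " + str(len(terms)) if mode == "all" else ">= 1"
    sql = (
        "(SELECT COUNT(*) FROM tag t "
        "WHERE t.post = p.id AND t.name IN (" + placeholders + ")) "
        + comparison
    )
    return sql, list(terms)


terms, mode = parse_search("+TermA +TermB")
condition, params = tag_count_condition(terms, mode)
print(terms, mode)  # ['TermA', 'TermB'] all
print(condition)
```

The condition string is meant to be dropped into the WHERE clause of a query that aliases post as p, with params passed to the driver separately so the terms are never concatenated into the SQL.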

All in all, adding all tags to a 'tags' field in post is the easiest way. Not ideal from a normalisation point of view, but that is manageable, I think.

You can search on both text and tags.

SELECT * 
  FROM post 
 WHERE MATCH(text,tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)

To get this to work, you'll need a FULLTEXT index that covers both columns together (this assumes the tags are stored in a tags column on post):

CREATE FULLTEXT INDEX keywords ON post(text,tags);

In Boolean search mode this should do what you want.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow