Question

I'm trying to build a search in MySQL where the user has just a single input field. The table looks like this:

ID          BIGINT
TITLE       TEXT
DESCRIPTION TEXT
FILENAME    TEXT
TAGS        TEXT
ACTIVE      TINYINT

Now if the user inputs just blah blubber, the search must check whether each word appears in the fields TITLE, DESCRIPTION, FILENAME or TAGS. The result should be ordered by relevance, that is, by how often the search words appear in the record. I have this example data:

ID   | TITLE   | DESCRIPTION  | FILENAME | TAGS | ACTIVE
1    | blah    | blah         | bdsai    | bdha | 1
2    | blubber | blah         | blah     | adsb | 1
3    | blah    | dsabsadsab   | dnsa     | dsa  | 1

In this example, ID 2 must be at the top (2x blah, 1x blubber), then 1 (2x blah) and then 3 (1x blah). This should be dynamic, so the user can also enter more words and the relevance works the same way for one word as for several.

Is it possible to do this in MySQL alone, or do I have to use some PHP? How exactly would this work?

Thank you very much for your help! Regards, Florian

EDIT: Here is the result after I tried the answer of Tom Mac:

I have four records which look like this:

ID  | TITLE | DESCRIPTION | FILENAME | TAGS                          | ACTIVE
1   | s     | s           | s        | s                             | 1
2   | 0     | fdsadf      | sdfs     | a,b,c,d,e,f,s,a,a,s,s,as,sada | 1
3   | 0     | s           | s        | s                             | 1
4   | a     | a           | a        | a                             | 1

Now, if I search for the string s, I should only get the top three records, ordered by the relevance of s. This means the records should be ordered like this:

ID | TITLE | DESCRIPTION | FILENAME | TAGS                          | ACTIVE
2  | 0     | fdsadf      | sdfs     | a,b,c,d,e,f,s,a,a,s,s,as,sada | 1        <== 8x s
1  | s     | s           | s        | s                             | 1        <== 4x s
3  | 0     | s           | s        | s                             | 1        <== 3x s

Now, I tried my query like this (the table's name is PAGES):

select t.*
  from (
        select
              match(title) against('*s*' in boolean mode)
            + match(description) against('*s*' in boolean mode)
            + match(filename) against('*s*' in boolean mode)
            + match(tags) against('*s*' in boolean mode)
            as matchrank,
              bb.*
          from pages bb) t
 where t.matchrank > 0
 order by t.matchrank desc

This query returns this:

matchRank | ID  | TITLE | DESCRIPTION | FILENAME | TAGS                          | ACTIVE
2         | 2   | 0     | fdsadf      | sdfs     | a,b,c,d,e,f,s,a,a,s,s,as,sada | 1

Is this because of the wildcards? I would expect the string *s* to also match a value that is exactly s ...


Solution

This might help you out. It does kinda assume that your MySQL table uses the MyISAM engine though:

create table blubberBlah (id int unsigned not null primary key auto_increment,
title varchar(50) not null,
description varchar(50) not null,
filename varchar(50) not null,
tags varchar(50) not null,
active tinyint not null
) engine=MyISAM;

insert into blubberBlah (title,description,filename,tags,active) 
values ('blah','blah','bdsai','bdha',1);
insert into blubberBlah (title,description,filename,tags,active) 
values ('blubber','blah','blah','adsb',1);
insert into blubberBlah (title,description,filename,tags,active) 
values ('blah','dsabsadsab','dnsa','dsa',1);

select t.*
from
(
  select MATCH (title) AGAINST ('blubber blah' IN BOOLEAN MODE)
       + MATCH (description) AGAINST ('blubber blah' IN BOOLEAN MODE)
       + MATCH (filename) AGAINST ('blubber blah' IN BOOLEAN MODE)
       + MATCH (tags) AGAINST ('blubber blah' IN BOOLEAN MODE) as matchRank,
         bb.*
  from blubberBlah bb
) t
order by t.matchRank desc;
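
Since the question asks for this to work with any number of words, the query above could be assembled by the application rather than typed by hand. The following is only a sketch under stated assumptions: it uses Python instead of the PHP mentioned in the question, build_search and COLUMNS are hypothetical names, the %s placeholder style is assumed to be what your MySQL driver expects, and the matchRank > 0 filter is borrowed from the question's edit rather than from the query above.

```python
import re

# Columns searched by the query in this answer.
COLUMNS = ("title", "description", "filename", "tags")


def build_search(user_input, table="blubberBlah"):
    """Return (sql, params) for a summed MATCH ... AGAINST query.

    Only word characters (letters, digits, underscore) are kept from the
    user's input, so boolean-mode operators such as + - * " are dropped
    before the search string is passed to MySQL.
    """
    words = re.findall(r"\w+", user_input)
    if not words:
        raise ValueError("no searchable words in input")
    search_string = " ".join(words)

    # One MATCH term per column, each with its own placeholder.
    rank = "\n       + ".join(
        f"MATCH ({col}) AGAINST (%s IN BOOLEAN MODE)" for col in COLUMNS
    )
    # The table name is interpolated directly, so it must come from your
    # own code, never from user input.
    sql = (
        "SELECT t.*\n"
        "FROM (\n"
        f"  SELECT {rank} AS matchRank,\n"
        "         bb.*\n"
        f"  FROM {table} bb\n"
        ") t\n"
        "WHERE t.matchRank > 0\n"
        "ORDER BY t.matchRank DESC"
    )
    # The same search string is bound once per column.
    params = (search_string,) * len(COLUMNS)
    return sql, params
```

With a DB-API style cursor this would be run as cursor.execute(sql, params); the search string is sent as a bound parameter, so it never has to be quoted by hand.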

EDIT

Another assumption this solution makes is that every word you're searching for is at least 4 characters long, which is the default minimum word length for MyISAM full-text indexes. If a search word such as 'blubber' or 'blah' might be only 1, 2 or 3 characters long, you can add ft_min_word_len=1 under the [mysqld] section of your my.cnf file and restart MySQL. Any FULLTEXT indexes that already exist must be rebuilt afterwards (for example with REPAIR TABLE blubberBlah QUICK) before the new setting applies to them.

One final thing: if you use this approach, you should add a FULLTEXT index on each of the columns. Hence:

ALTER TABLE blubberBlah add fulltext index `blubberBlahFtIdx1`(`title`);
ALTER TABLE blubberBlah add fulltext index `blubberBlahFtIdx2`(`description`);
ALTER TABLE blubberBlah add fulltext index `blubberBlahFtIdx3`(`filename`);
ALTER TABLE blubberBlah add fulltext index `blubberBlahFtIdx4`(`tags`);

You can find more details on BOOLEAN FULLTEXT searching in the MySQL Docs.
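
One caveat regarding the question's edit: MySQL full-text search matches whole indexed words (in boolean mode optionally with a trailing * for prefixes), so it does not count substring occurrences such as the s inside sdfs. If that exact count is the relevance you want, one option is to score the rows in application code. The sketch below is an illustration only, written in Python rather than the PHP mentioned in the question; FIELDS, PAGES, count_occurrences and rank_rows are hypothetical names, and the rows are the example data from the question's edit.

```python
FIELDS = ("title", "description", "filename", "tags")

# Example rows from the question's edit (table PAGES).
PAGES = [
    {"id": 1, "title": "s", "description": "s", "filename": "s", "tags": "s"},
    {"id": 2, "title": "0", "description": "fdsadf", "filename": "sdfs",
     "tags": "a,b,c,d,e,f,s,a,a,s,s,as,sada"},
    {"id": 3, "title": "0", "description": "s", "filename": "s", "tags": "s"},
    {"id": 4, "title": "a", "description": "a", "filename": "a", "tags": "a"},
]


def count_occurrences(row, words):
    """Sum the non-overlapping, case-insensitive occurrences of every
    search word in every searched field of one row."""
    return sum(
        row[field].lower().count(word.lower())
        for field in FIELDS
        for word in words
    )


def rank_rows(rows, query):
    """Return (id, score) pairs for rows with at least one hit,
    highest score first."""
    words = query.split()
    scored = [(row["id"], count_occurrences(row, words)) for row in rows]
    hits = [pair for pair in scored if pair[1] > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print(rank_rows(PAGES, "s"))  # [(2, 8), (1, 4), (3, 3)]
```

This reproduces the ordering the question asks for (ID 2, then 1, then 3, with ID 4 left out), at the cost of reading every candidate row into the application, so it is only practical for small tables or after a cheaper pre-filter.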

OTHER TIPS

Rather than searching IN BOOLEAN MODE, use MATCH() ... AGAINST() in the default natural-language mode to get a score for each column. Add those scores up to get the relevance.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow