Question

I need something in between a full text search and an index search:
I want to search for text in one column of my table (probably there will be an index on the column, too, if that matters).

Problem is, I want to search for words in the column, but I don't want to match parts.

For example, my column might contain business names:
Mighty Muck Miller and Partners Inc.
Boy & Butter Breakfast company

Now if I search for "Miller" I want to find the first line. But if I search for "iller" I don't want to find it, because there is no word starting with "iller". Searching for "Break" should find "Boy & Butter Breakfast company", though, since one word is starting with "Break".

So if I try and use

WHERE BusinessName LIKE %Break%

it will find too many hits.

Is there any way to Search for Words separated by whitespace or other delimiters?

(LINQ would be best, plain SQL would do, too)

Important: Spaces are by far not the only delimiters! Slashes, colons, dots, all non-alphanumerical characters should be considered for this to work!

Was it helpful?

Solution

SQL Server 2000 or above.

SELECT *
  FROM dbo.TblBusinessNames
 WHERE BusinessName like '%[^A-z^0-9]Break%' -- In the middle of a sentence
    OR BusinessName like 'Break%'            -- At the beginning of a sentence

Keyword Reference for LIKE: http://msdn.microsoft.com/en-us/library/aa933232(SQL.80).aspx

OTHER TIPS

Your word delimiters are going to be many: space, tab, beginning of line, parentheses, periods, commas, exclamation/question marks etc. So, a pretty simple solution is to use a regex in your WHERE clause. (And it's going to be a lot more efficient than just ORing every possible delimiter you can think of.)

Since you mentioned LINQ, here's an article that describes how to do efficient regex querying with SQL Server.

Complicated WHERE clauses like this always raise a red flag with me as far as performance is concerned, so I definitely suggest benchmarking whatever you end up with, you may decide to build a search index for the column after all.

EDIT: Saw you edited your question. When writing your regex, it's easy to just have it use any non-alphanum character as a delimiter, i.e. [^0-9a-zA-Z], or \W for any non-word character, \b for any word boundary and \B for any non-word boundary. Or, instead of matching delimiters, just match any word, i.e. \w+. Here's another example of someone doing regex searches with SQL Server (more complicated than what you'd need).

where BusinessName like 'Break%' -- to find if it is beginning with the word
or BusinessName like '% Break%' -- to find if it contains the word anywhere but the beginning
WHERE BusinessName LIKE '% Break%'

You mentioned LINQ - you could do something like...

string myPattern = "% Break%";

var query =
      from b in Business
      where SqlMethods.Like(b.BusinessName, myPattern) 
      select b;

Note that this uses the System.Linq.Data.SqlClient namespace which translates directly to the LIKE operator with no additional processing.

Try this:

declare @vSearch nvarchar(100)

set @vSearch = 'About'

select * from btTab where ' ' + vText + ' ' LIKE '%[^A-z^0-9]' + @vSearch + '[^A-z^0-9]%'
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top