Find results in .doc files stored in a varbinary(max) column
21-09-2019
Question
I want to write a full-text search query against a varbinary(max) column that stores .doc/.docx (MS Word) files. The query must return the records whose stored file contains a given word.
Is this possible?
If yes, how? (Please write an example.)
If yes, can it also be done for other languages (e.g. Arabic, Persian, or other Unicode text)?
Thank you in advance.
Solution
What you're looking for is fulltext indexing, which has been greatly improved in SQL Server 2008.
For an introduction, I would recommend checking out these articles here:
- SQL Server 2008 - Creating Full Text Catalog and Search
- Understanding Full-Text Indexing in SQL Server
- Fulltext-Indexing Workbench
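As a rough sketch of the setup those articles walk through: a varbinary(max) column can only be full-text indexed together with a type column that holds each file's extension, so that SQL Server knows which filter to apply. The table and document column names below come from the query in this answer; `FileExtension`, `PK_YourTable`, and `DocumentCatalog` are assumed names for illustration only.

```sql
-- Sketch only. dbo.YourTable and DocumentColumn are taken from the answer's query;
-- FileExtension, PK_YourTable, and DocumentCatalog are hypothetical names.

-- 1. Create a full-text catalog to hold the index.
CREATE FULLTEXT CATALOG DocumentCatalog;

-- 2. Create the full-text index on the binary column.
--    FileExtension is assumed to be a character column containing values
--    such as '.doc' or '.docx' for each row.
CREATE FULLTEXT INDEX ON dbo.YourTable
(
    DocumentColumn TYPE COLUMN FileExtension LANGUAGE 1033  -- 1033 = English (US)
)
KEY INDEX PK_YourTable   -- must be a unique, single-column, non-nullable index
ON DocumentCatalog
WITH CHANGE_TRACKING AUTO;
```

With `CHANGE_TRACKING AUTO`, the index is populated in the background, so newly inserted documents may take a moment to become searchable.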
Once you understand this and have created your own fulltext catalog, you should be able to search something like this:
SELECT ID, (other fields), DocumentColumn
FROM dbo.YourTable
-- A multi-word phrase must be wrapped in double quotes inside the search string
WHERE CONTAINS(*, '"Microsoft Word"')
And yes, full-text indexing and searching support many languages. Check out the links above and the SQL Server 2008 Books Online for details!
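To illustrate the language point, here is a hedged sketch. Which languages have a word breaker depends on the instance, so it is worth listing them first; Arabic (LCID 1025) is normally present, while Persian may not be, in which case the neutral word breaker (LCID 0) is a possible fallback. The search word is just an example, and the table and column names are the placeholders used above.

```sql
-- List the full-text languages available on this instance.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;

-- Search for an Arabic word. The N prefix keeps the literal as Unicode,
-- and LANGUAGE selects the word breaker used to parse the search term.
SELECT ID, DocumentColumn
FROM dbo.YourTable
WHERE CONTAINS(DocumentColumn, N'"كتاب"', LANGUAGE 1025);  -- 1025 = Arabic
```

For best results the language given in the query should match the language the column was indexed with.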
Marc
OTHER TIPS
If you have SQL Server 2005 or later, yes; you just need the matching filters (IFilters) installed on the server.
If you have SQL Server 2000, doc files can be indexed, but not the newer Office 2007 format as far as I know (I've heard you may be able to borrow the IFilter by installing Word 2007 on the server).
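On SQL Server 2005 or later, one way to check which filters the full-text engine can actually see is to query the `sys.fulltext_document_types` catalog view. The second statement is an assumption about a typical setup: it is commonly needed after installing an additional filter pack at the operating-system level, and the full-text service usually has to be restarted afterwards.

```sql
-- Check whether filters for Word documents are registered with full-text search.
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type IN ('.doc', '.docx');

-- Assumption: a filter was installed in the OS but is not listed above.
-- This asks SQL Server to load filters and word breakers registered with the OS.
EXEC sp_fulltext_service 'load_os_resources', 1;
```

If a row for '.docx' is missing, documents with that extension will not be indexed even though the rows are stored without error.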