Question

We have tables Site and Content in our database.

Each site is operated by a different client, and each site has its own content.

On the front end of each site we offer a search box that runs a full-text (FREETEXT) search against the Content table. Each site must only return results from its own content, never from other sites in the database.

The SQL Server query optimizer is behaving badly here: if it caches a plan optimized for a site with little content, that same plan performs horribly for sites with lots of content, causing timeouts.

We understand that we can add OPTION (RECOMPILE) to the end of the query to fix this, but my question is this:

Would it be better to create a cache table for each site, so that each site's content is cached periodically, and have the search stored procedure pick the appropriate cache table based on a parameter?

The cache would only be updated / refreshed whenever content is added/changed.
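To make the idea concrete, here is a minimal sketch of what a per-site cache refresh could look like. All object names (ContentCache_<SiteId>, RefreshSiteCache, the ContentId/Title/Body columns) are hypothetical, not taken from the actual schema:

```sql
-- Sketch only: assumes one cache table per site (e.g. dbo.ContentCache_42),
-- rebuilt whenever that site's content is added or changed.
CREATE PROCEDURE dbo.RefreshSiteCache
    @SiteId INT
AS
BEGIN
    DECLARE @table SYSNAME = N'ContentCache_' + CAST(@SiteId AS NVARCHAR(10));
    DECLARE @sql NVARCHAR(MAX) =
        N'TRUNCATE TABLE dbo.' + QUOTENAME(@table) + N';'
      + N' INSERT INTO dbo.' + QUOTENAME(@table) + N' (ContentId, Title, Body)'
      + N' SELECT ContentId, Title, Body FROM dbo.Content WHERE SiteId = @site;';

    -- QUOTENAME guards the generated table name; the data values are
    -- passed as a real parameter via sp_executesql.
    EXEC sp_executesql @sql, N'@site INT', @site = @SiteId;
END;
```

Note that after a refresh, the full-text index on the cache table still has to be re-populated by the full-text crawl before new rows become searchable.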

My thinking is that this would:

a) Reduce the size of the table being searched to only contain the records for the correct site

b) Allow the FullText search to generate a more accurate index of the content for each site

c) Allow the query optimizer to cache an optimized plan for each site independently

Is this correct? Am I right in doing it this way?

Solution

You are asking the right questions. It's a trade-off, and you have to decide which approach is better or worse for your situation.

Will you be adding sites frequently? How many rows do you expect in total and for each site? In general, SQL Server 2008 Full-Text Search will do fine up to tens of millions of rows. If you expect more than that, I'd split the sites out into individual tables.

Keep in mind that even if you split out to multiple tables, your query plans could still vary greatly due to the number of words matched by a given search term. You may still want to consider using OPTION (RECOMPILE).
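As an illustration, here is a sketch of what the search procedure might look like with OPTION (RECOMPILE). The object names (Content, SiteId, Body, SearchSiteContent) are assumptions, not taken from your schema:

```sql
-- Sketch only: assumes a Content table with a SiteId column and a
-- full-text indexed Body column. OPTION (RECOMPILE) compiles a fresh
-- plan on every call, so estimates reflect the actual site searched.
CREATE PROCEDURE dbo.SearchSiteContent
    @SiteId INT,
    @SearchTerm NVARCHAR(200)
AS
BEGIN
    SELECT c.ContentId, c.Title
    FROM dbo.Content AS c
    WHERE c.SiteId = @SiteId
      AND FREETEXT(c.Body, @SearchTerm)
    OPTION (RECOMPILE);  -- trades per-call compile CPU for a tailored plan
END;
```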

Here are some advantages of each route:

Single Table

  • No schema changes to add additional sites.
  • Easier to manage.
  • Don't need to worry about separate stored procedures or dynamic SQL to handle multiple tables.

Multiple Tables

  • Smaller tables and indexes (no SiteId column needed).
  • Better full-text performance due to smaller catalogs.
  • Potentially better separation of data.
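If you do split sites into separate tables, the search stored procedure has to pick the right table at runtime, which generally means dynamic SQL. A minimal sketch, assuming a Content_<SiteId> naming convention (the names and columns are hypothetical):

```sql
-- Sketch only: assumes per-site tables dbo.Content_1, dbo.Content_2, ...
-- QUOTENAME guards against injection in the generated table name;
-- the search term is passed as a real parameter, never concatenated.
CREATE PROCEDURE dbo.SearchSiteContentSplit
    @SiteId INT,
    @SearchTerm NVARCHAR(200)
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) =
        N'SELECT ContentId, Title FROM dbo.'
      + QUOTENAME(N'Content_' + CAST(@SiteId AS NVARCHAR(10)))
      + N' WHERE FREETEXT(Body, @term);';

    EXEC sp_executesql @sql, N'@term NVARCHAR(200)', @term = @SearchTerm;
END;
```

A side benefit of this pattern is that each distinct generated statement gets its own cached plan, so small and large sites no longer share one plan.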

Other suggestions

Have you tried OPTION (OPTIMIZE FOR UNKNOWN)? This generates one generic execution plan for all your inputs, regardless of how many rows are expected. It will cost more for smaller sites than before, but should be fine for the larger ones while still being acceptable for the smaller ones. Here's a blog post detailing the inner workings: http://www.benjaminnevarez.com/2010/06/how-optimize-for-unknown-works/.
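For example, applied to a hypothetical search query (Content, SiteId, and Body are assumed names, not from the actual schema):

```sql
-- Sketch only: OPTIMIZE FOR UNKNOWN tells the optimizer to use average
-- density statistics instead of sniffing the first @SiteId passed in,
-- producing one middle-of-the-road plan for all sites.
SELECT c.ContentId, c.Title
FROM dbo.Content AS c
WHERE c.SiteId = @SiteId
  AND FREETEXT(c.Body, @SearchTerm)
OPTION (OPTIMIZE FOR UNKNOWN);
```

Unlike OPTION (RECOMPILE), this keeps a single cached plan, so there is no per-execution compile cost.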

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow