سؤال

Using SQL Server 2012, I have a table with 7 million rows. PK column is a GUID (COMB GUID). I am trying to test the performance of a query and first need to update a random sampling of data, I want to change a column value (not the PK) of 50,000 rows.

Selecting Top 50,000 Order by NEWID() takes way too long, I think SQL Server is scanning the whole table. I cannot seem to get the syntax right for TABLESAMPLE, it returns an empty set.

What is the best way to get this to work?

هل كانت مفيدة؟

المحلول

And to treat it as an update:

;WITH x AS
(
  SELECT TOP (50000) col 
  FROM dbo.table TABLESAMPLE (50000 ROWS)
)
UPDATE x SET col = 'something else';

But a couple of notes:

  1. You probably won't see a huge performance improvement over ORDER BY NEWID(). On a table with 1MM rows this took over a minute on my machine.
  2. The TOP is there because TABLESAMPLE doesn't guarantee the exact number of rows - it's based on a rough calculation of how many pages might contain 50,000 rows. You may end up with less or more depending on your fillfactor, how many variable-length columns, how many NULL values, etc. The TOP above will help limit it to 50,000 when the estimate leads to a larger number of pages being read, but it won't help if the estimate is under.

There is some discussion of this going on in another question right now.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top