Question

I have a table with a column of values. The following sample data has been pulled for one user:

ID  |  Data
5      Record1
12     NULL
13     NULL
15     Record1
20     Record12
28     NULL
31     NULL
35     Record12
37     Record23
42     Record34
51     NULL
53     Record34 
58     Record5
61     Record17
63     NULL
69     Record17

What I would like to do is delete any value in the Data column that does not have both a start and a finish record. In the sample above, Record23 and Record5 would be deleted.

Please note that a Record(n) value may appear more than once, so it's not as straightforward as doing a count on the Data value. It needs to be incremental: a record should always start and finish before another one starts, and if it starts and doesn't finish then I want to remove it.

Solution

Sadly, SQL Server 2008 does not have LAG or LEAD (they arrived in SQL Server 2012), which would make the operation simpler.
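
For comparison, on SQL Server 2012 or later the neighbour check could be sketched with LAG and LEAD roughly as follows. This will not run on SQL Server 2008. It assumes the table is called table1, with a unique id column and a nullable data column (the names used in the rest of this answer).

```sql
-- Sketch for SQL Server 2012+ only; assumes table1(id, data) with unique id values.
WITH cte AS (
  SELECT id,
         data,
         LAG(data)  OVER (ORDER BY id) AS prev_data,  -- previous non-NULL value
         LEAD(data) OVER (ORDER BY id) AS next_data   -- next non-NULL value
  FROM table1
  WHERE data IS NOT NULL
)
-- A row is unpaired when neither neighbour carries the same value.
-- The IS NULL checks cover the first and last rows, which have only one neighbour.
DELETE FROM table1
WHERE id IN (
  SELECT id
  FROM cte
  WHERE (prev_data IS NULL OR prev_data <> data)
    AND (next_data IS NULL OR next_data <> data)
);
```

The rule is the same as in the SQL Server 2008 approach: a value survives only if the previous or the next non-NULL row holds the same value.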

On SQL Server 2008 you can use a common table expression (CTE) to number the non-NULL rows, find the values that have no matching neighbour, and delete them:

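-- Number only the non-NULL rows so that start/finish pairs become neighbours.
WITH cte AS (
  SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
  FROM table1
  WHERE data IS NOT NULL
)
-- Delete every row with no adjacent row (rn - 1 or rn + 1) holding the same data.
DELETE c1 FROM cte c1
LEFT JOIN cte c2
  ON (c1.rn = c2.rn + 1 OR c1.rn = c2.rn - 1) AND c1.data = c2.data
WHERE c2.id IS NULL;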

An SQLfiddle to test with.
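
To reproduce the result locally, the sample data from the question could be loaded with something like the script below. The column types and the primary key are assumptions, since the question does not show the schema. The table and column names follow the query above.

```sql
-- Hypothetical schema: the types and the primary key are assumed, not taken from the question.
CREATE TABLE table1 (
  id   INT         NOT NULL PRIMARY KEY,
  data VARCHAR(20) NULL
);

-- Sample rows copied from the question (multi-row VALUES needs SQL Server 2008 or later).
INSERT INTO table1 (id, data) VALUES
  (5,  'Record1'),
  (12, NULL),
  (13, NULL),
  (15, 'Record1'),
  (20, 'Record12'),
  (28, NULL),
  (31, NULL),
  (35, 'Record12'),
  (37, 'Record23'),
  (42, 'Record34'),
  (51, NULL),
  (53, 'Record34'),
  (58, 'Record5'),
  (61, 'Record17'),
  (63, NULL),
  (69, 'Record17');
```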

If you just want to see which rows would be deleted, replace DELETE c1 with SELECT c1.*.
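
Written out in full, the preview version might look like this (same assumed table1 as above). Against the sample data in the question it should list the rows with id 37 (Record23) and id 58 (Record5).

```sql
-- Preview only: lists the rows the DELETE would remove, without changing anything.
WITH cte AS (
  SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
  FROM table1
  WHERE data IS NOT NULL
)
SELECT c1.*
FROM cte c1
LEFT JOIN cte c2
  ON (c1.rn = c2.rn + 1 OR c1.rn = c2.rn - 1) AND c1.data = c2.data
WHERE c2.id IS NULL
ORDER BY c1.id;
```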

...and as always, remember to back up before running potentially destructive SQL from random people on the Internet.
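
In addition to a backup (not instead of one), a cautious way to try the delete is to wrap it in a transaction and roll it back after inspecting the result. This is only a sketch, again assuming the table1 layout used above.

```sql
BEGIN TRANSACTION;

-- Run the delete inside the open transaction.
WITH cte AS (
  SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
  FROM table1
  WHERE data IS NOT NULL
)
DELETE c1 FROM cte c1
LEFT JOIN cte c2
  ON (c1.rn = c2.rn + 1 OR c1.rn = c2.rn - 1) AND c1.data = c2.data
WHERE c2.id IS NULL;

-- Inspect what is left while the change is still uncommitted.
SELECT id, data FROM table1 ORDER BY id;

-- Undo the delete; swap for COMMIT TRANSACTION once the result looks right.
ROLLBACK TRANSACTION;
```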

Licensed under: CC-BY-SA with attribution