Question

Based on the great solution provided by Sergey for my c# question here: Filter out path strings beginning with strings, i went away to craft a similar thing for T-SQL. I can refactor this to use in memory table as opposed to cursors, but i would really like to see if there is a set-based approach that i can take to achieve the same goal.

Scenario: I have paths in the db, like so: C:\Users, C:\Users\cheese, D:\Shadow\stuff ,D:\Shadow. I need to filter the paths leaving the rootmost ones only (e.g. out of the 4 above, leave only C:\Users, D:\Shadow)

This is what i have right now:

ALTER PROCEDURE [dbo].[GetPaths]
@guy NVARCHAR(MAX)
AS
DECLARE 
@tempPath NVARCHAR(MAX) = '',
@Path NVARCHAR (MAX),
@filteredPath TABLE (FilteredPath NVARCHAR(MAX))
BEGIN
    SET NOCOUNT ON;
    IF (@guy IS NOT NULL)
    BEGIN
        DECLARE C Cursor FOR
    SELECT Paths.Path 
            WHERE
                Paths.Username = @guy 
            ORDER BY Paths.Path DESC
            OPEN C
                FETCH NEXT FROM C INTO @Path 
            WHILE @@Fetch_Status=0 BEGIN
                IF (CHARINDEX(@tempPath, @Path = 0)
                BEGIN
                INSERT INTO @filteredPath(FilteredPath)
                        VALUES (@Path)
                END
                SET @tempPath = @Path 
                FETCH NEXT FROM C into @Path 
            END
            CLOSE C
            DEALLOCATE C
                SELECT * FROM @filteredPath
END
END
Was it helpful?

Solution

It is generally better to do things in the database as a select query rather than using cursors. I believe the following will make efficient use of an index on Paths(path):

select p.*
from Paths p
where not exists (select 1
                  from Paths p2
                  where p.path like p2.path + '%' and
                        p.path <> p2.path
                 );

(Unfortunately, I cannot test this today.)

By the way, your cursor expression is missing the from clause and that might have to do with why it doesn't work.

OTHER TIPS

Here's a thought.

If you take the length of the path and subtract the length of the path with the back-slash removed, you only want the ones with a value of 1. Something like this:

SELECT Paths.Path 
FROM   Paths
WHERE  Paths.Username = @guy 
       And Len(Path) - Len(Replace(Path, '\', '')) = 1
ORDER BY Paths.Path DESC

This assumes that your paths are all consistently formatted. If you have root paths that also include a trailing back-slash, then this solution will not work for you. Basically, this query will only returns paths with a single back-slash.

Can you not simply rewrite this as a query using a CASE statement to differentiate between those paths at the 'root' level and those that have sub-directories:

SELECT  DISTINCT
        CASE (CHARINDEX('\', Path, CHARINDEX('\', Path) + 1))
          WHEN 0 THEN Path
          ELSE SUBSTRING(Path, 1, (CHARINDEX('\', Path, CHARINDEX('\', Path) + 1) - 1))
        END
  FROM  Paths
  WHERE Username = @guy
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top