Question

I have two tables, UserBackup and Requests.

Below are the scripts for both tables.

UserBackup

CREATE TABLE UserBackup(
           FileName varchar(70) NOT NULL
        )

The file name is a GUID. Sometimes there is additional information related to the file, so the table also contains entries like guid_Add.

Requests

CREATE TABLE Requests(
           RequestId UNIQUEIDENTIFIER NOT NULL,
           Status int NOT NULL
        )

Here are some sample rows :

UserBackup table:

FileName
15b993cc-e8be-405d-bb9f-0c58b66dcdfe 
4cffe724-3f68-4710-b785-30afde5d52f8 
4cffe724-3f68-4710-b785-30afde5d52f8_Add
7ad22838-ddee-4043-8d1f-6656d2953545

Requests table:

RequestId                              Status
15b993cc-e8be-405d-bb9f-0c58b66dcdfe    1
4cffe724-3f68-4710-b785-30afde5d52f8    1
7ad22838-ddee-4043-8d1f-6656d2953545    2

What I need is to return all the rows from the UserBackup table whose name (the GUID) matches a RequestId in the Requests table where the status is 1. Here is the query I wrote:

Select * 
from UserBackup
inner join Requests on UserBackup.FileName = Requests.RequestId
where Requests.Status = 1

This works fine and returns the following result:

FileName                                      RequestId                              Status
15b993cc-e8be-405d-bb9f-0c58b66dcdfe          15b993cc-e8be-405d-bb9f-0c58b66dcdfe     1
4cffe724-3f68-4710-b785-30afde5d52f8          4cffe724-3f68-4710-b785-30afde5d52f8     1
4cffe724-3f68-4710-b785-30afde5d52f8_Add      4cffe724-3f68-4710-b785-30afde5d52f8     1

This is exactly what I want, but I don't understand how it works. Notice that the result includes the 4cffe724-3f68-4710-b785-30afde5d52f8_Add row as well. The inner join compares a varchar with a uniqueidentifier, and instead of behaving like an "equals" comparison it behaves like a "contains" comparison. I want to know how this works so that I can use this code without running into unexpected scenarios.


Solution 2

When you compare two columns of different data types, SQL Server attempts an implicit conversion of the type with lower precedence to the type with higher precedence.
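As a minimal sketch of that rule, the batch below compares the same two values twice. The variables `@RequestId` and `@FileName` are hypothetical and only hold values from the question's sample rows. The first comparison leaves the conversion to SQL Server; the second forces both sides to varchar.

```sql
-- Hypothetical variables for illustration; values come from the question's sample rows.
DECLARE @RequestId uniqueidentifier = '4cffe724-3f68-4710-b785-30afde5d52f8';
DECLARE @FileName  varchar(70)      = '4cffe724-3f68-4710-b785-30afde5d52f8_Add';

SELECT
    -- varchar has lower precedence, so @FileName is implicitly converted to uniqueidentifier.
    CASE WHEN @FileName = @RequestId
         THEN 'equal' ELSE 'not equal' END AS ImplicitComparison,
    -- Both sides are varchar, so the '_Add' suffix takes part in the comparison.
    CASE WHEN @FileName = CONVERT(varchar(70), @RequestId)
         THEN 'equal' ELSE 'not equal' END AS StringComparison;
```

Given the truncation behaviour discussed in this thread, the first column should come back as 'equal' and the second as 'not equal'.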

The following comes from the MSDN documentation on uniqueidentifier:

The following example demonstrates the truncation of data when the value is too long for the data type being converted to. Because the uniqueidentifier type is limited to 36 characters, the characters that exceed that length are truncated.

DECLARE @ID nvarchar(max) = N'0E984725-C51C-4BF4-9960-E1C80E27ABA0wrong'; 
SELECT @ID, CONVERT(uniqueidentifier, @ID) AS TruncatedValue;

http://msdn.microsoft.com/en-us/library/ms187942.aspx

The documentation is clear that the data is truncated.

Whenever you are unsure about a join operation, you can check the actual execution plan.

Here is a test sample that you can run inside SSMS or SQL Sentry Plan Explorer:

DECLARE @userbackup TABLE ( _FILENAME VARCHAR(70) )

INSERT INTO @userbackup
    VALUES  ( '15b993cc-e8be-405d-bb9f-0c58b66dcdfe' ),
            ( '4cffe724-3f68-4710-b785-30afde5d52f8' ),
            ( '4cffe724-3f68-4710-b785-30afde5d52f8_Add' ),
            ( '7ad22838-ddee-4043-8d1f-6656d2953545' )

DECLARE @Requests TABLE
    (
     requestID UNIQUEIDENTIFIER,
     _Status INT
    )

INSERT INTO @Requests
    VALUES  ( '15b993cc-e8be-405d-bb9f-0c58b66dcdfe', 1 ),
            ( '4cffe724-3f68-4710-b785-30afde5d52f8', 1 ),
            ( '7ad22838-ddee-4043-8d1f-6656d2953545', 2 )

SELECT *
    FROM @userbackup u
    JOIN @Requests r
        ON u.[_FILENAME] = r.requestID
    WHERE r.[_Status] = 1

Instead of comparing the two columns directly, SQL Server performs a Hash Match on a computed expression, Expr1006. In the SSMS graphical plan it is hard to see what that expression does, but if you open the plan XML you will find this:

<ColumnReference Column="Expr1006" />
<ScalarOperator ScalarString="CONVERT_IMPLICIT(uniqueidentifier,@userbackup.[_FILENAME] as [u].[_FILENAME],0)">

Whenever in doubt, check the execution plan, and always make sure the data types match when comparing.
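One scenario worth checking before relying on the implicit conversion is a file name that does not start with a valid GUID. The conversion then fails with an error instead of simply not matching. The sketch below assumes SQL Server 2012 or later (for `TRY_CONVERT`) and adds a made-up value, 'not-a-guid', to show how such rows could be located.

```sql
DECLARE @userbackup TABLE ( _FILENAME VARCHAR(70) );

INSERT INTO @userbackup
VALUES ( '4cffe724-3f68-4710-b785-30afde5d52f8' ),
       ( '4cffe724-3f68-4710-b785-30afde5d52f8_Add' ),
       ( 'not-a-guid' );  -- hypothetical bad value, not from the question

-- TRY_CONVERT returns NULL where CONVERT would raise a conversion error,
-- so this lists the rows that would make the mixed-type join fail.
SELECT u.[_FILENAME]
FROM @userbackup u
WHERE TRY_CONVERT(uniqueidentifier, u.[_FILENAME]) IS NULL;
```

Any row returned here is one the implicit conversion in the join would not be able to handle.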

The blog post "Data Mismatch on WHERE Clause might Cause Serious Performance Problems", written by a Microsoft engineer, covers this exact problem.

OTHER TIPS

The values on both sides of a comparison have to be of the same data type. There's no such thing as, say, comparing a uniqueidentifier and a varchar.

uniqueidentifier has a higher precedence than varchar so the varchars will be converted to uniqueidentifiers before the comparison occurs.

Unfortunately, you get no error or warning if the string contains more characters than are needed:

select CONVERT(uniqueidentifier,'4cffe724-3f68-4710-b785-30afde5d52f8_Add')

Result:

4CFFE724-3F68-4710-B785-30AFDE5D52F8

If you want to force the comparison to occur between strings, you'll have to perform an explicit conversion:

Select * 
from UserBackup
inner join Requests
on UserBackup.FileName = CONVERT(varchar(70),Requests.RequestId)
where Requests.Status = 1
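If the prefix-matching behaviour is what you actually want, as in the question, one option is to spell it out instead of depending on silent truncation. This is only a sketch: it rebuilds the question's sample data in table variables, assumes the GUID always occupies the first 36 characters of FileName, and assumes SQL Server 2012 or later for `TRY_CONVERT`.

```sql
DECLARE @UserBackup TABLE (FileName varchar(70) NOT NULL);
DECLARE @Requests   TABLE (RequestId uniqueidentifier NOT NULL, Status int NOT NULL);

INSERT INTO @UserBackup (FileName) VALUES
    ('15b993cc-e8be-405d-bb9f-0c58b66dcdfe'),
    ('4cffe724-3f68-4710-b785-30afde5d52f8'),
    ('4cffe724-3f68-4710-b785-30afde5d52f8_Add'),
    ('7ad22838-ddee-4043-8d1f-6656d2953545');

INSERT INTO @Requests (RequestId, Status) VALUES
    ('15b993cc-e8be-405d-bb9f-0c58b66dcdfe', 1),
    ('4cffe724-3f68-4710-b785-30afde5d52f8', 1),
    ('7ad22838-ddee-4043-8d1f-6656d2953545', 2);

SELECT b.FileName, r.RequestId, r.Status
FROM @UserBackup AS b
INNER JOIN @Requests AS r
    -- Take the GUID part explicitly; TRY_CONVERT yields NULL (no match) for malformed values.
    ON TRY_CONVERT(uniqueidentifier, LEFT(b.FileName, 36)) = r.RequestId
WHERE r.Status = 1;
```

Written this way, the intent to match on the leading GUID is visible in the query itself, and a malformed file name drops out of the join rather than raising an error.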

What is happening here is that FileName is being converted from varchar to uniqueidentifier, and during that conversion anything after the first 36 characters is ignored.

You can see it in action here:

Select convert(uniqueidentifier, UserBackup.FileName), FileName
  from UserBackup

It works, but to reduce confusion for the next person to come along, you might want to store the RequestId associated with the UserBackup as a GUID in the UserBackup table and join on that.

At the very least put a comment in ;)
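The suggestion of storing the RequestId as a GUID could look roughly like the following. The extra `RequestId` column on the backup table is hypothetical (it is not in the original schema), and table variables are used only so the sketch runs on its own.

```sql
DECLARE @UserBackup TABLE (
    FileName  varchar(70)      NOT NULL,
    RequestId uniqueidentifier NOT NULL  -- hypothetical new column
);
DECLARE @Requests TABLE (RequestId uniqueidentifier NOT NULL, Status int NOT NULL);

INSERT INTO @UserBackup (FileName, RequestId) VALUES
    ('4cffe724-3f68-4710-b785-30afde5d52f8',     '4cffe724-3f68-4710-b785-30afde5d52f8'),
    ('4cffe724-3f68-4710-b785-30afde5d52f8_Add', '4cffe724-3f68-4710-b785-30afde5d52f8');

INSERT INTO @Requests (RequestId, Status) VALUES
    ('4cffe724-3f68-4710-b785-30afde5d52f8', 1);

-- Both sides are uniqueidentifier, so no implicit conversion is involved.
SELECT b.FileName, r.RequestId, r.Status
FROM @UserBackup AS b
INNER JOIN @Requests AS r
    ON b.RequestId = r.RequestId
WHERE r.Status = 1;
```

With matching types on both sides, the join no longer depends on how a string happens to convert.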

Licensed under: CC-BY-SA with attribution