Question

I'm trying to join two tables on a common column, 'NAME', but the data is stored inconsistently, like this:


TABLE A

NAME
B C Corporations
Tefal Inc.
West, Tom
Anne Zagabi
(C) NamyangSoy

TABLE B

NAME
BC Corporations
Tefal Inc
Tom West
AnneZagabi
( C ) NamyangSoy

The above are the cases that I came across. It's really ugly, but the one thing that makes me think it may be possible in SQL is that the spelling of at least one word is the same in both tables.

I've tried SOUNDEX, but the data is actually not in English, so it didn't work (the above is just an example I made up in English). I've also tried the DIFFERENCE function, but it didn't work either (everything returns 4, I guess because the data is not in English? I'm not sure).

I tried joining letter by letter, but that didn't work either. I was hoping there might be some other way to do this. I'm using sqlcanvas, and the database is Sybase. Both tables have nearly 30 columns and ~12,000 rows each.

Solution

Will something like this work for you?

select *
from [Table A] a
join [Table B] b
  on REPLACE(a.Name, ' ', '') = REPLACE(b.Name, ' ', '')

Use the Replace function to remove all spaces and compare the results.

For example, run this:

 select
   CASE
     WHEN REPLACE('T  E  S  T', ' ', '') = REPLACE('TE  ST', ' ', '') THEN 'TRUE'
     ELSE 'FALSE'
   END

To strip punctuation as well, use a function that removes all non-alphanumeric characters:

CREATE FUNCTION [dbo].[fncRemoveNonAlphanumericChars](@Temp VarChar(1000))
RETURNS VarChar(1000)
AS
BEGIN
    -- Delete one non-alphanumeric character per pass until none remain
    WHILE PatIndex('%[^A-Za-z0-9]%', @Temp) > 0
        SET @Temp = Stuff(@Temp, PatIndex('%[^A-Za-z0-9]%', @Temp), 1, '')

    RETURN @Temp
END

Example:

SELECT dbo.fncRemoveNonAlphanumericChars('abc...DEF,,,GHI(((123)))456jklmn')

Result:

abcDEFGHI123456jklmn

(This was from here: http://jayhollingum.blogspot.com/2011/01/sql-server-remove-non-alphanumeric.html)
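
To tie this back to the question, the function can be applied to both sides of the join condition. The following is only a sketch: it assumes the function above has already been created, and it reuses the [Table A], [Table B] and NAME names from the question. It is written in the same SQL Server style as the function, so the bracket quoting and function syntax may need adapting for Sybase.

```sql
-- Sketch: join on the cleaned-up names (assumes dbo.fncRemoveNonAlphanumericChars exists)
SELECT a.NAME AS name_a,
       b.NAME AS name_b
FROM [Table A] a
JOIN [Table B] b
  -- Both sides are reduced to letters and digits only before comparing
  ON dbo.fncRemoveNonAlphanumericChars(a.NAME) = dbo.fncRemoveNonAlphanumericChars(b.NAME)
```

With the sample data, 'B C Corporations' and 'BC Corporations' both reduce to 'BCCorporations', so they match. 'West, Tom' and 'Tom West' reduce to 'WestTom' and 'TomWest', so rows where the word order differs will still not join. Note also that the pattern [^A-Za-z0-9] keeps only unaccented Latin letters and digits; for names that are not in English it would probably need to be changed so that it removes only spaces and punctuation.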

Licensed under: CC-BY-SA with attribution