Question

I'm trying to find a way to match a column of clean data in table 1 to a column of dirty data in table2 without making any changes to the dirty data. I was thinking a fuzzy match, but there are too many entries in the clean table to allow for CDEs to be used. So, for example:

Table 1
GroupID    CompanyName
123        CompanyA
445        CompanyB
556        CompanyC

Table 2
GroupID    Patientname
AE123789   PatientA
123987     PatientB
445111     PatientC

And I'm trying to match the insurance company to the patient using the group number. Is there a matching method out there? (Fortunately the group numbers are actually much longer and when looking for a single group's worth of patients, fuzzy matching works really well, so they seem to be unique enough to be applied here).

Working in SQL server 2008.

Was it helpful?

Solution

This changes slightly depending on which database you are using, but it looks like you're looking for something like this:

MSSQL

select *
from table1 t1
  join table2 t2 on t2.groupid like '%'+cast(t1.groupid as varchar(max))+'%' 

MySQL - use Concat():

select *
from table1 t1
  join table2 t2 on t2.groupid like concat('%',t1.groupid,'%') 
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top