Question

One table has "John Doe <jdoe@aol.com>" while another has "jdoe@aol.com". Is there a UDF or alternative method that'll match the email address from the first field against the second field?

This isn't going to be production code, I just need it for running an ad-hoc analysis. It's a shame the DB doesn't store both friendly and non-friendly email addresses.

Update: Fixed the formatting, should be < and > on the first one.

Solution

You could do a join using the LOCATE function (available in MySQL, for example), something like:

 SELECT * FROM table1 JOIN table2 ON (LOCATE(table2.real_email, table1.friend_email) > 0) 
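To see what that join does on sample data, here is a small sketch using Python's built-in sqlite3 module. SQLite has no LOCATE, so the sketch assumes instr(haystack, needle) as a stand-in (note the reversed argument order), and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical tables and rows mirroring the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (friend_email TEXT);
    CREATE TABLE table2 (real_email TEXT);
    INSERT INTO table1 VALUES ('John Doe <jdoe@aol.com>'), ('Jane Roe <jroe@aol.com>');
    INSERT INTO table2 VALUES ('jdoe@aol.com'), ('doe@aol.com');
""")

# SQLite's instr(haystack, needle) stands in for MySQL's LOCATE(needle, haystack).
rows = conn.execute("""
    SELECT table1.friend_email, table2.real_email
    FROM table1
    JOIN table2 ON instr(table1.friend_email, table2.real_email) > 0
    ORDER BY table2.real_email
""").fetchall()

for friendly, real in rows:
    print(friendly, "->", real)
```

Because this is a substring test, the shorter address doe@aol.com also pairs with the John Doe row. That may be acceptable for an ad-hoc analysis, but it is worth knowing before trusting the counts.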

OTHER TIPS

I would split each value on the last space; the final piece should be the email address. The exact code depends on your database, but as rough pseudocode:

email = row.email
parts = email.split(" ")
real_email = parts[-1].strip("<>")   # last piece, minus the < > from the question's update

If the "friendly" email addresses follow a consistent pattern, you should be able to use the LIKE operator:

SELECT
     T1.nonfriendly_email_address,
     T2.friendly_email_address
FROM
     My_Table T1
INNER JOIN My_Table T2 ON
     T2.friendly_email_address LIKE '%<' + T1.nonfriendly_email_address + '>'
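One thing to watch with the LIKE approach is that _ and % are wildcards, and underscores are common in email addresses. The sketch below is only an illustration, run through Python's sqlite3 module: the table names friendly and plain, the sample rows, and the choice of ! as the escape character are all assumptions, and SQLite concatenates with || rather than +.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friendly (addr TEXT);
    CREATE TABLE plain (addr TEXT);
    INSERT INTO friendly VALUES ('John Doe <jdoe@aol.com>'), ('J X Doe <jxdoe@aol.com>');
    INSERT INTO plain VALUES ('jdoe@aol.com'), ('doe@aol.com'), ('j_doe@aol.com');
""")

# Direct translation of the LIKE join: the address is used as a pattern as-is.
PLAIN_JOIN = """
    SELECT f.addr, p.addr FROM friendly f
    JOIN plain p ON f.addr LIKE ('%<' || p.addr || '>')
    ORDER BY p.addr
"""

# Same join, but with !, % and _ inside the address escaped so they match literally.
ESCAPED_JOIN = """
    SELECT f.addr, p.addr FROM friendly f
    JOIN plain p ON f.addr LIKE
        ('%<' || replace(replace(replace(p.addr, '!', '!!'), '%', '!%'), '_', '!_') || '>')
        ESCAPE '!'
    ORDER BY p.addr
"""

naive = conn.execute(PLAIN_JOIN).fetchall()
escaped = conn.execute(ESCAPED_JOIN).fetchall()

print(naive)    # includes j_doe@aol.com paired with the jxdoe row
print(escaped)  # only the exact jdoe@aol.com pair
```

Anchoring the pattern on < and > already keeps doe@aol.com from matching the jdoe row; the escaping only matters when addresses contain wildcard characters.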

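Perhaps the following T-SQL code can help you:

DECLARE @email varchar(200)
SELECT @email = 'John Doe jdoe@aol.com'

-- Take everything before the first space of the reversed string, then reverse it back.
-- Starting SUBSTRING at 0 rather than 1 returns one character fewer, which drops the space itself.
SELECT REVERSE(SUBSTRING(REVERSE(@email), 0, CHARINDEX(' ', REVERSE(@email))))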

That statement returns:

jdoe@aol.com

Talking through the logic:

  1. Reverse the email column
  2. Find the index of the first ' ' character; everything before it is your actual email address (still reversed)
  3. Substring the column, from the start of the (reversed) string, to the index found at step 2.
  4. Reverse the string again, putting it in the right order.

There might be more elegant ways of doing it, but this works, and you can use the expression on one side of your JOIN. It relies on email addresses not containing spaces in practice, so the last space (the first one once the string is reversed) separates the friendly name from the actual address. As far as I know, T-SQL does not have a LastIndexOf() function, which would have avoided the double REVERSE() calls.
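For comparison, here is a rough Python sketch of the same four steps next to a last-index version, which is roughly what a LastIndexOf() would allow. The function names are made up for illustration, and returning the input unchanged when there is no space is my own choice rather than what the T-SQL statement does.

```python
def email_via_double_reverse(friendly):
    """Mirror the four steps: reverse, find first space, substring, reverse."""
    reversed_text = friendly[::-1]          # step 1
    space_at = reversed_text.find(" ")      # step 2 (-1 if there is no space)
    if space_at == -1:
        return friendly                     # assumed: already a bare address
    return reversed_text[:space_at][::-1]   # steps 3 and 4


def email_via_last_space(friendly):
    """Same result with a last-index search instead of two reversals."""
    # rfind returns -1 when there is no space, so the slice then starts at 0.
    return friendly[friendly.rfind(" ") + 1:]


sample = "John Doe jdoe@aol.com"
print(email_via_double_reverse(sample))
print(email_via_last_space(sample))

# With the angle brackets from the question's update, strip them afterwards.
print(email_via_last_space("John Doe <jdoe@aol.com>").strip("<>"))
```

All three calls print jdoe@aol.com for these inputs.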

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow