Question

I have a table that stores names that are input by the user into a text edit box.

There is no selection list or input mask on this box, so the database ends up with inconsistent data such as:

John Doe
 John Dow
Jonh doe
johh Doe

when the name is supposed to be John Doe.

I am wondering how to write a single query that maps the names back to what they are supposed to be (I know all of the misspelled variants) and that handles multiple user names at once.

Example:

select name from tableName
set
(
if(name = 'John Dow' or name = 'Johh Doe') name = 'John Doe'
)

(
if(name = 'Janee Doe' or name = 'Jaan Doe') name = 'Jane Doe'
)

I am using SQL with Oracle Developer.


Solution

You can use a case expression:

update tableName
    set name = (case when name in ('John Dow', 'Johh Doe') then 'John Doe'
                     when name in ('Janee Doe', 'Jaan Doe') then 'Jane Doe'
                     else name
                end)

You might want to add a where clause so that only the rows that actually need correcting are updated:

    where name in ('John Dow', 'Johh Doe', 'Janee Doe', 'Jaan Doe')
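
Before running an update like this, it can help to preview which rows would change. The query below is only a sketch: it assumes the table and column names used above and the misspellings listed in the question, and it changes no data.

```sql
-- Preview only: show each known misspelling, the value it would become,
-- and how many rows carry it. Nothing is modified.
select name as current_name,
       case when name in ('John Dow', 'Johh Doe') then 'John Doe'
            when name in ('Janee Doe', 'Jaan Doe') then 'Jane Doe'
       end as corrected_name,
       count(*) as row_count
  from tableName
 where name in ('John Dow', 'Johh Doe', 'Janee Doe', 'Jaan Doe')
 group by name;
```

If the counts look right, the update can be run with the same list of values in its where clause.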

OTHER TIPS

It's a simple LEFT OUTER JOIN, assuming that you have all your mis-spellings in a table that is unique on the mis-spelling itself, i.e. each mis-spelling is associated with one and only one name. Use NVL so that the corrected name is taken when there is a match and the original name is kept otherwise:

select nvl(b.name, a.name) as name
  from tablename a
  left outer join misspellings b
    on a.name = b.misspelling

I'm not certain I'd update the table (it's not clear whether that's what you're attempting). What happens if the mis-spelling is actually correct?
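
As a rough sketch of how this could be set up without touching the stored data: the misspellings table definition, its column sizes, and the view name tablename_clean below are all assumptions made for illustration, and the sample rows come from the question.

```sql
-- Hypothetical lookup table: one row per known mis-spelling.
create table misspellings (
    misspelling varchar2(100) primary key,  -- unique, as assumed above
    name        varchar2(100) not null      -- the correct spelling
);

insert into misspellings (misspelling, name) values ('John Dow', 'John Doe');
insert into misspellings (misspelling, name) values ('Johh Doe', 'John Doe');
insert into misspellings (misspelling, name) values ('Janee Doe', 'Jane Doe');
insert into misspellings (misspelling, name) values ('Jaan Doe', 'Jane Doe');

-- Hypothetical view: presents corrected names while the original rows stay as entered.
create or replace view tablename_clean as
select nvl(b.name, a.name) as name
  from tablename a
  left outer join misspellings b
    on a.name = b.misspelling;
```

If a name in the lookup table turns out to be a legitimate spelling, deleting its row from misspellings is enough to restore the original value in the view.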

Give this a whirl:

with testdata as
(
select 'John     Doe' as name from dual
union
select ' John dow' as name from dual
union
select ' JON DOH ' as name from dual
union
select ' Joe   wtf ' as name from dual
),
transdata as (
select 'JOHN DOW' as badval, 'JOHN DOE' as goodval from dual
union
select 'JON DOH' as badval, 'JOHN DOE' as goodval from dual
)
select 
'"' || td.name || '"' as raw_name,
--initcap(trim(regexp_replace(nvl(tr.goodval, td.name), '(\W){2,}', ' '))) as output
initcap(nvl(tr.goodval, trim(regexp_replace(td.name, '(\W){2,}', ' ')))) as output
from testdata td, transdata tr
where upper(trim(regexp_replace(td.name, '(\W){2,}', ' '))) = tr.badval(+);


RAW_NAME,OUTPUT
" John dow",John Doe
" JON DOH ",John Doe
"John     Doe",John Doe
" Joe   wtf ",Joe Wtf

Make sure to outer join to your translation table. That said, you might also do some basic cleanup on input.
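
One way the cleanup on input could look, as a sketch only: a row-level trigger that tidies whitespace before the value is stored. The trigger name is hypothetical, the table and column are assumed to be tablename and name, and the pattern '\s+' is a simpler stand-in for the '(\W){2,}' pattern used in the query above.

```sql
-- Hypothetical trigger: normalise whitespace in the name before it is saved.
create or replace trigger tablename_clean_name
before insert or update of name on tablename
for each row
begin
    -- Collapse each run of whitespace to a single space,
    -- then strip leading and trailing spaces.
    :new.name := trim(regexp_replace(:new.name, '\s+', ' '));
end;
/
```

This deliberately leaves capitalisation and spelling alone; those still need the translation table, since a trigger cannot know that "Dow" was meant to be "Doe".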

Forget about joins, initcaps, CASE etc. Those are not usually used to search text. Upper-case both sides and use the LIKE operator. Copy and paste this to see the results:

With t AS
(
 SELECT 'John Doe' name FROM dual
  UNION ALL
 SELECT  'John Dow' FROM dual
  UNION ALL 
 SELECT 'Jonh doe' FROM dual
  UNION ALL
 SELECT 'johh Doe' FROM dual
)
SELECT name FROM t WHERE Upper(name) Like Upper('%jo%do%') --or '%john%do%' - up to you
/ 

Output: 

NAME
-------------
John Doe
John Dow
Jonh doe
johh Doe
Licensed under: CC-BY-SA with attribution