Question

I'm working with a large database of businesses.

I'd like to be able to compare two business names for similarity to see if they possibly might be duplicates.

Below is a list of business names that should test as having a high probability of being duplicates, what is a good way to go about this?

George Washington Middle Schl
George Washington School

Santa Fe East Inc
Santa Fe East

Chop't Creative Salad Co
Chop't Creative Salad Company

Manny and Olga's Pizza
Manny's & Olga's Pizza

Ray's Hell Burger Too
Ray's Hell Burgers

El Sol
El Sol de America

Olney Theatre Center for the Arts
Olney Theatre

21 M Lounge
21M Lounge

Holiday Inn Hotel Washington
Holiday Inn Washington-Georgetown

Residence Inn Washington,DC/Dupont Circle
Residence Inn Marriott Dupont Circle

Jimmy John's Gourmet Sandwiches
Jimmy John's

Omni Shoreham Hotel at Washington D.C.
Omni Shoreham Hotel

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top