We solved a similar problem as follows:
We had a core banking application whose customer subsystem needed full-text search on customers' first name, family name, father's name, and so on.
Different encodings, legacy migrated data, keyboard layouts, and Farsi fonts made the search inaccurate.
We overcame the problem by replacing each problematic character with a standard one and storing the standardized string for search purposes (the same replacement has to be applied to the search term before comparing).
After several iterations we arrived at the replacement below, which may come in handy:
Formula="UPPER(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(FirsName || LastName || FatherName,
chr(32),''),
chr(13),''),
chr(9),''),
chr(10),''),
'-',''),
'-',''),
'آ','ا'),
'أ', 'ا'),
'ئ', 'ي'),
'ي', 'ي'),
'ك', 'ک'),
'آإئؤةي','اايوهي'),
'ء',''),
'شأل','شاال'),
'ا.','اله'),
'.',''),
'الله','اله'),
'ؤ','و'),
'إ','ا'),
'ة','ه'),
' ا لله','اله'),
'ا لله','اله'),
' ا لله','اله'))"
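
If you need the same kind of key outside the database, the idea can be sketched in Python roughly as below. This is only a minimal illustration under my own assumptions, not the formula we ran in production: `CHAR_MAP` and `search_key` are hypothetical names, only a subset of the pairs above is covered, and folding Persian yeh (U+06CC) into Arabic yeh (U+064A) is my assumption about the intent of the yeh pair. It uses per-character translation because REPLACE matches whole substrings, so a multi-character pair only fires on that exact sequence.

```python
# Single-character folds, mirroring a subset of the REPLACE pairs above.
# Code points are written as escapes so the mapping survives copy/paste.
CHAR_MAP = str.maketrans({
    "\u0622": "\u0627",  # alef with madda -> alef
    "\u0623": "\u0627",  # alef with hamza above -> alef
    "\u0625": "\u0627",  # alef with hamza below -> alef
    "\u0626": "\u064A",  # yeh with hamza -> Arabic yeh
    "\u06CC": "\u064A",  # Persian yeh -> Arabic yeh (assumption, see note above)
    "\u0624": "\u0648",  # waw with hamza -> waw
    "\u0629": "\u0647",  # teh marbuta -> heh
    "\u0643": "\u06A9",  # Arabic kaf -> Persian keheh
    "\u0621": None,      # standalone hamza is dropped
    " ": None, "\t": None, "\r": None, "\n": None,  # whitespace is dropped
    "-": None, ".": None,                           # punctuation is dropped
})


def search_key(*parts: str) -> str:
    """Build a normalized search key from name parts (hypothetical helper)."""
    text = "".join(parts).translate(CHAR_MAP)
    # Multi-character fold: 'الله' -> 'اله'
    text = text.replace("\u0627\u0644\u0644\u0647", "\u0627\u0644\u0647")
    return text.upper()
```

Store `search_key(first, last, father)` next to the customer record and run the user's search term through the same function before comparing, so both sides are folded identically.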