Java startsWith() method with custom rules

Question 1

I don't know all of your custom rules, but would a regular expression work?

The user is passing in a String. Create a method to convert that String to a regex, e.g.

replace a short hyphen with a character class matching a short or long hyphen ([-‒]),
do the same for your accents, so e becomes [eé],
prepend the word-boundary anchor (\b).

Then use the resulting string as the regex and give it a go.

Note that the list of replacements could be kept in a Map as suggested by Tobbias. Your code could be something like

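// fancyTransformMap is assumed to be a Map<String, String> field, e.g. "e" -> "[eé]"
public boolean myStartsWith(String testString, String startsWith) {

    for (Map.Entry<String, String> me : fancyTransformMap.entrySet()) {
        // replace() treats key and value literally; replaceAll() would parse the key as a regex
        startsWith = startsWith.replace(me.getKey(), me.getValue());
    }

    // "\\b" is the regex word boundary ('\b' in Java is the backspace character).
    // matches() must consume the whole input, so allow anything after the prefix.
    return testString.matches("(?s)\\b" + startsWith + ".*");
}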

P.S. I'm not a regex super-guru, so there may well be possible improvements.
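One possible improvement, as a minimal sketch rather than a definitive implementation: build the pattern in a single pass over the prefix, so that a later rule can never rewrite the output of an earlier one, and quote every character that has no rule so that regex metacharacters such as ? stay literal. The class name RegexPrefix, the RULES table and its contents are assumptions made for illustration. The sketch also swaps matches() for Matcher.lookingAt(), which anchors at the start of the input without requiring the whole string to match, and therefore drops the \b. It needs Java 9 or later for Map.of.

```java
import java.util.Map;
import java.util.regex.Pattern;

class RegexPrefix {
    // Hypothetical rule table: plain character -> regex character class.
    static final Map<Character, String> RULES = Map.of(
            '-', "[-\u2012\u2013]",   // hyphen, figure dash, en dash
            'e', "[e\u00e9\u00e8]");  // e, e acute, e grave

    // Builds the pattern one character at a time, so a replacement
    // is never processed again by another rule.
    static Pattern toPrefixPattern(String prefix) {
        StringBuilder regex = new StringBuilder();
        for (char c : prefix.toCharArray()) {
            String cls = RULES.get(c);
            // Characters without a rule are quoted so they match literally.
            regex.append(cls != null ? cls : Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString());
    }

    static boolean myStartsWith(String testString, String prefix) {
        // lookingAt() matches from the start of the input but,
        // unlike matches(), does not have to consume all of it.
        return toPrefixPattern(prefix).matcher(testString).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(myStartsWith("caf\u00e9\u2013au\u2013lait", "cafe-"));
        System.out.println(myStartsWith("xcafe", "cafe"));
    }
}
```

Note that the rules here are one-directional: a plain e in the prefix accepts é in the text, but an é in the prefix only matches é. If you compile the same prefix often, caching the Pattern would avoid rebuilding it on each call.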

Question 2

If you are worried about performance, I'd think something like a HashMap that maps the undesirable characters to what you want them to be interpreted as might be the way to go:

HashMap<Character, Character> fastMap = new HashMap<>();

// read it as '<long hyphen>' can be interpreted as '<regular hyphen>'
fastMap.put('–', '-');
fastMap.put('é', 'e');
fastMap.put('è', 'e');
fastMap.put('？', '?');
...
// and so on

That way you can look up the replacement for any character: value = fastMap.get(key). A result of null means the character has no entry and should be left as it is.

However, this only works as long as the keys are unique: each character maps to exactly one replacement, so é can't also be interpreted as è with this method. If you are worried about performance, though, this is an exceedingly fast way of doing it, since the lookup time for a HashMap is pretty close to O(1). But as others on this page have written, premature optimization is often a bad idea. Try implementing something that works first, and if at the end of it you find it is too slow, then optimize.
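To make the lookup idea concrete, here is a minimal sketch that normalizes both strings through the map and then falls back on the ordinary String.startsWith(). The class name NormalizingPrefix, the normalize() helper and the decision to normalize both the text and the prefix are assumptions for illustration, not something the question prescribes.

```java
import java.util.HashMap;
import java.util.Map;

class NormalizingPrefix {
    // Hypothetical table: variant character -> the character it is read as.
    static final Map<Character, Character> FAST_MAP = new HashMap<>();
    static {
        FAST_MAP.put('\u2013', '-'); // en dash -> hyphen
        FAST_MAP.put('\u00e9', 'e'); // e acute -> e
        FAST_MAP.put('\u00e8', 'e'); // e grave -> e
        FAST_MAP.put('\uff1f', '?'); // full-width question mark -> ?
    }

    // Replaces every mapped character; unmapped characters pass through unchanged.
    static String normalize(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            char mapped = FAST_MAP.getOrDefault(c, c);
            out.append(mapped);
        }
        return out.toString();
    }

    static boolean myStartsWith(String testString, String prefix) {
        // Both sides are normalized, so the comparison is symmetric.
        return normalize(testString).startsWith(normalize(prefix));
    }

    public static void main(String[] args) {
        System.out.println(myStartsWith("caf\u00e9\u2013bar", "cafe-"));
        System.out.println(myStartsWith("tea", "cafe"));
    }
}
```

Because both strings go through the same table, é and è end up equivalent to each other as well as to e, which sidesteps the unique-key limitation when the rules are symmetric. If a prefix typed with é should not match a plain e, normalize only the text being tested.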