In situations where you need to perform these checks frequently, interning can give a large performance improvement.
Interning still requires us to have some string lookup data structure, such as a tree or hash table. However, we do these heavy lookups more rarely: specifically, we do them only when some raw textual input arrives from the environment into our software system.
At that time, we take the input text as a character string and intern it: look it up in the existing set of strings, and convert it to an atom. An atom is a small unit of data, typically a single-word quantity such as a machine pointer. If the string doesn't exist yet, the interning function gives us a new, unique atom. Otherwise, it gives us the same atom that it gave us previously for that string.
Once we have interned input strings to atoms, we always use the atoms in their place. So instead of comparing whether two strings are the same, we compare whether two atoms are the same, which is a "blindingly fast" single-word comparison operation. (We still use the strings when we need to print atoms in a readable way: again, at the boundary between our system and the outside world.)
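The scheme described above can be sketched in C roughly as follows. This is only a minimal illustration, not how any particular system does it: the names `atom_t`, `intern` and `atom_name`, the fixed bucket count, and the choice of an FNV-1a hash are all assumptions made for the example. Here an atom is simply a pointer to the single stored copy of the string, so it fits in one machine word.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical atom type: a pointer to the one stored copy of a string. */
typedef const char *atom_t;

struct entry {
    struct entry *next;  /* next entry in the same hash bucket */
    char text[];         /* the interned string, NUL-terminated */
};

#define NBUCKETS 1024    /* arbitrary size chosen for this sketch */
static struct entry *buckets[NBUCKETS];

/* 32-bit FNV-1a hash; any reasonable string hash would do. */
static uint32_t hash_string(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* The "heavy" lookup, done only when text enters the system.
   Returns the existing atom if the string was seen before,
   otherwise stores a copy and returns a new atom.
   Returns NULL only if memory allocation fails. */
atom_t intern(const char *name)
{
    assert(name != NULL);
    struct entry **slot = &buckets[hash_string(name) % NBUCKETS];

    for (struct entry *e = *slot; e != NULL; e = e->next)
        if (strcmp(e->text, name) == 0)
            return e->text;          /* same string, same atom */

    size_t len = strlen(name) + 1;   /* include the terminating NUL */
    struct entry *e = malloc(sizeof *e + len);
    if (e == NULL)
        return NULL;
    memcpy(e->text, name, len);
    e->next = *slot;                 /* push onto the bucket's chain */
    *slot = e;
    return e->text;
}

/* Going back to text for printing: the atom points at the string. */
const char *atom_name(atom_t a)
{
    return a;
}
```

With this sketch, two atoms obtained from equal strings compare equal with a plain `==` on pointers, even when the strings came from different buffers, and `atom_name` recovers the text when something has to be printed. Entries are never freed here, which is a simplification; a real table would also need to grow and, in threaded code, be locked.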
Interning comes from Lisp: in Lisp, symbols are atoms. In the raw source code they are textual, so when code is read into memory, symbol names are interned to produce atoms, which are basically pointers to symbol objects.
Interning crops up in other places, such as the X Window System (the XInternAtom function):
Atom XInternAtom(Display *display, char *atom_name, Bool only_if_exists)
and in the Microsoft Windows API, where the term "intern" is not used, but the function returns something called an ATOM: Lisp terminology. What is interned is not a simple string but a "class" structure:
ATOM WINAPI RegisterClass(const WNDCLASS *lpWndClass);
In both systems, these atoms are system-wide (server-wide in the case of X) and can be compared for equality in place of the objects they represent. In Windows, if you have two ATOM values which are equal, they are the same class.
The Flyweight Design Pattern from the GoF book is essentially a reinvention of interning, extended to structures other than strings (like WNDCLASS in the above Win32 API); so if you want to "sell" the idea to your boss, you can take it from that angle.