You can easily test this yourself. Out of curiosity, I created a test case of four different scenarios:
1. Pattern.matcher().matches() with an on-demand Pattern instance (created for each run)
2. Pattern.matcher().matches() with a cached Pattern instance (created before all runs)
3. String.equals() for each element in the array, executed within a loop
4. Set.contains() on a cached Set (created before all runs)
Data Set: Input array containing 6000 randomly generated strings of 6 characters each. Each test was executed 10,000 times, and the results of all runs were totaled and averaged.
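If you want to reproduce this, a harness along the following lines would cover the four scenarios. This is a minimal sketch, not the original test code: the class and method names, the alternation-style regex built with a StringBuilder, the lowercase alphabet, and the choice of a single probe string per run are all assumptions. It also uses a naive System.nanoTime() loop with no JIT warm-up, so treat the absolute numbers as rough.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.function.BooleanSupplier;
import java.util.regex.Pattern;

// Hypothetical harness: all names and the regex shape are illustrative assumptions.
class MatchBenchmark {
    static final int WORDS = 6000;
    static final int LENGTH = 6;

    // Generates WORDS random lowercase strings of LENGTH characters each.
    static String[] randomWords(long seed) {
        Random rnd = new Random(seed);
        String[] words = new String[WORDS];
        for (int i = 0; i < WORDS; i++) {
            char[] c = new char[LENGTH];
            for (int j = 0; j < LENGTH; j++) {
                c[j] = (char) ('a' + rnd.nextInt(26));
            }
            words[i] = new String(c);
        }
        return words;
    }

    // Builds "word1|word2|..." (assumed regex shape), quoting each word.
    static String buildRegex(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (sb.length() > 0) sb.append('|');
            sb.append(Pattern.quote(w));
        }
        return sb.toString();
    }

    // Scenario 1: the Pattern is rebuilt and recompiled on every call.
    static boolean onDemandRegex(String[] words, String input) {
        return Pattern.compile(buildRegex(words)).matcher(input).matches();
    }

    // Scenario 2: the Pattern was compiled once, before all runs.
    static boolean cachedRegex(Pattern cached, String input) {
        return cached.matcher(input).matches();
    }

    // Scenario 3: String.equals() against each element in a loop.
    static boolean loop(String[] words, String input) {
        for (String w : words) {
            if (w.equals(input)) return true;
        }
        return false;
    }

    // Scenario 4: lookup in a Set built once, before all runs.
    static boolean setContains(Set<String> set, String input) {
        return set.contains(input);
    }

    // Runs the test 'runs' times and prints total and average time in ms.
    static void time(String label, int runs, BooleanSupplier test) {
        boolean sink = false; // printed below so the JIT cannot discard the calls
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink ^= test.getAsBoolean();
        }
        long totalMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("%s: %d (%.4f avg) [%b]%n",
                label, totalMs, (double) totalMs / runs, sink);
    }

    public static void main(String[] args) {
        int runs = args.length > 0 ? Integer.parseInt(args[0]) : 1_000;
        String[] words = randomWords(42L);
        String probe = words[words.length - 1]; // last element: worst case for the loop
        Pattern cached = Pattern.compile(buildRegex(words));
        Set<String> set = new HashSet<>(Arrays.asList(words));

        time("On-Demand Regex", runs, () -> onDemandRegex(words, probe));
        time("Pre-compiled Regex", runs, () -> cachedRegex(cached, probe));
        time("Loop", runs, () -> loop(words, probe));
        time("Set.contains", runs, () -> setContains(set, probe));
    }
}
```

The run count defaults to 1,000 here so the on-demand case finishes quickly; pass 10000 as the first argument to mirror the setup described above.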
The results are below (all times in ms; lower is better). The first number is the total execution time of all 10,000 runs; the second is the average per run:
On-Demand Regex: 12934 (1.29 avg)
Pre-compile Regex: 458 (0.05 avg)
Loop: 77 (0.01 avg)
Set.contains: 4 (0.00 avg)
Long story short: if you're going to use a regular expression (which you shouldn't), at least create and cache the Pattern. But assuming performance is what matters, you're not going to beat Set.contains() if you know the list of words ahead of time.
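In code, that advice might look like the following. This is only a sketch: the WordFilter class, its method names, and the three-word list are hypothetical placeholders, not part of the test above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical example: the class name and word list are placeholders.
class WordFilter {
    // Preferred: build the Set once and reuse it for every lookup.
    private static final Set<String> WORDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("alpha", "bravo", "charlie")));

    // If a regex is unavoidable, compile it once and cache the Pattern.
    private static final Pattern WORD_PATTERN =
            Pattern.compile("alpha|bravo|charlie");

    static boolean isKnownWord(String s) {
        return WORDS.contains(s);
    }

    static boolean isKnownWordRegex(String s) {
        // matches() requires the whole input to match, not just a prefix.
        return WORD_PATTERN.matcher(s).matches();
    }
}
```

Both methods answer the same question for this word list; the Set version simply skips the regex engine entirely.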
Note: The on-demand regex test includes the cost of constructing the StringBuilder instance that builds the expression given to the Pattern.compile() method, so not all of the extra time is necessarily spent in regex compilation. The Set.contains() test also has a slight advantage in that it's inlined and avoids the extra stack frame of a method call. I modified the test to execute it inside a separate method, but that didn't materially affect the results.