If there's an accepted standard way to do this, it's to use Lucene. There are some regex gimmicks you can use, like this one from RegexBuddy's library (where word1 and word2 are placeholders for the search terms, and the 3 in {1,3}? is the maximum number of words allowed between them; the 1 means at least one word must separate them):
\b(?:word1(?:\W+\w+){1,3}?\W+word2|word2(?:\W+\w+){1,3}?\W+word1)\b
Trouble is, this relies on an extremely simplistic, arbitrary notion of what constitutes a word. It counts contractions and hyphenated words as two words each, but it accepts "words" with digits and underscores in them. You could tweak the regex to deal with those problems, but more will pop up to replace them. And ugly as it already is, each tweak makes the regex that much less readable, that much harder to maintain.
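To make those limitations concrete, here is a minimal Python sketch that builds the same regex for two literal terms and runs it on a few sample strings. The helper name proximity_pattern and its max_gap parameter are made up for illustration; only the pattern itself comes from the regex above.

```python
import re

def proximity_pattern(word1, word2, max_gap=3):
    """Hypothetical helper: build the proximity regex for two literal terms."""
    a, b = re.escape(word1), re.escape(word2)
    # One to max_gap intervening "words" (runs of \w), each preceded by non-word characters.
    gap = r"(?:\W+\w+){1,%d}?\W+" % max_gap
    return re.compile(r"\b(?:%s%s%s|%s%s%s)\b" % (a, gap, b, b, gap, a))

quick_fox = proximity_pattern("quick", "fox")
cats_dogs = proximity_pattern("cats", "dogs")

# Works in either order when one to three words intervene.
print(bool(quick_fox.search("the quick brown fox")))        # True
print(bool(quick_fox.search("the fox and the quick cat")))  # True

# Adjacent terms are rejected, because {1,3} demands at least one word between.
print(bool(quick_fox.search("quick fox")))                  # False

# Three real words intervene, but each contraction counts as two: no match.
print(bool(cats_dogs.search("cats don't won't chase dogs")))  # False

# Tokens made of digits and underscores count as ordinary words.
print(bool(cats_dogs.search("cats x_1 42 dogs")))           # True
```

The contraction case is the one that tends to bite: a reader sees three words between the terms, while the regex sees five.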
This barely scratches the surface of what full-text search engines save you from. If you have a very specific, tightly constrained task to accomplish, regexes or other "syntax-level" tools might suit. But if you need to work at the semantic level, recognizing natural-language words and phrases, you want a search engine or other dedicated tool.