Question

I'm writing a very basic commenting system and want to implement a simple, efficient bad words filter.

I'm aware of the problems associated with bad word filters and realize it's basically impossible to write one that keeps misspellings and innuendo out. I just want a very simple one that keeps correctly spelled vulgar words from being displayed.

I found a bad words list of about 400 words and put it into preg_replace() with the pattern being:

/(these|are|bad|words|like|ass)/

The problem is that it replaces any occurrence of the characters in the pattern, even if they are in the middle of a word. So, for example, assist will be replaced with ist.

Second question: instead of replacing the bad words with an empty string, or with a fixed-width string such as ****, is there a way to replace each one with a string of asterisks of the same length as the replaced word?


Solution

$filtered = preg_replace_callback(
    '/\b(these|are|bad|words|like|ass)\b/',
    // Replace each matched word with asterisks of the same length.
    function (array $match) { return str_repeat('*', strlen($match[1])); },
    $comment
);

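\b matches a word boundary, so the pattern only matches whole words. That will suffice for most cases, though it won't be perfect: digits and underscores count as word characters, so something like ass_ or ass1 is not matched.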
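
With a list of about 400 words, it may be easier to build the pattern from an array than to type the alternation by hand. The following is a minimal sketch, not a definitive implementation: the function name censor_comment and the sample word list are made up for illustration, and the i flag (case-insensitive matching) is an added assumption that the question does not ask for.

```php
<?php
// Hypothetical helper: masks whole-word matches from $badWords with asterisks.
function censor_comment(string $comment, array $badWords): string
{
    if ($badWords === []) {
        return $comment; // nothing to filter
    }

    // Escape each word so characters such as "." or "/" are matched literally.
    $quoted = array_map(
        function (string $word): string { return preg_quote($word, '/'); },
        $badWords
    );

    // \b on both sides restricts matches to whole words; i ignores case (assumption).
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';

    $result = preg_replace_callback(
        $pattern,
        // strlen counts bytes, which equals the character count for ASCII words.
        function (array $match): string { return str_repeat('*', strlen($match[0])); },
        $comment
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $comment;
}

echo censor_comment('Do not be an Ass, please assist.', ['ass', 'bad']), "\n";
// prints: Do not be an ***, please assist.
```

If the word list contains non-ASCII words, strlen would over-count the asterisks, so a character-based length function would be needed instead.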

Other tips

You could use word boundaries:

/\b(these|are|bad|words|like|ass)\b/

First, use the word-boundary assertion \b. It is zero-width and matches at the boundary of a word, so your regex becomes:

/\b(these|are|bad|words|like|ass)\b/
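
To see what the word boundary does in practice, here is a small check using a single word from the list. The sample strings are made up for illustration.

```php
<?php
$pattern = '/\bass\b/';

// preg_match returns 1 when the pattern matches and 0 when it does not.
var_dump(preg_match($pattern, 'assist'));       // int(0): "ass" is followed by a word character
var_dump(preg_match($pattern, 'what an ass!')); // int(1): a space before, "!" after
var_dump(preg_match($pattern, 'class'));        // int(0): "ass" is preceded by a word character
var_dump(preg_match($pattern, 'ass_hat'));      // int(0): "_" counts as a word character

// \b is zero-width, so only the word itself is replaced, not its neighbours.
echo preg_replace($pattern, '***', 'what an ass!'), "\n"; // what an ***!
```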

Second, to replace the match with a string of equal length, use a function that operates on the match, for example preg_replace_callback() with a callback that returns str_repeat('*', strlen($match[0])).

License: CC-BY-SA with attribution
Not affiliated with StackOverflow