Question

I'm trying to linkify a Twitter post, but hashtags such as "#löövet" are not matched the way I want. They get cut off before the first non-ASCII character. Those characters should be allowed.

Does anyone know how to alter the regex for this purpose?

Below is my example:

//Hashtag
$tweet = preg_replace("/ +#([a-z0-9_]*)?/i", " <a href=\"http://twitter.com/tag/\\1\" target=\"_blank\">#\\1</a>", $tweet);



//Problem: 
/*
* The call above does not match non-ASCII characters such as å/ä/ö
* Tag result example: tag = #löövet
* After preg_replace: tag = #l öövet
* Desired after preg_replace: tag = #löövet
*/   

Solution

How about:

$tweet = preg_replace("/ +#(\p{Xwd}*)/u", " <a href=\"http://twitter.com/tag/$1\" target=\"_blank\">#$1</a>", $tweet);

\p{Xwd} matches the same kind of characters as \w, extended to all Unicode letters and numbers plus the underscore.

If you don't want to allow the underscore, use \p{Xan}.
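
As a quick check of the \p{Xwd} approach, here is a minimal sketch. The function name linkify_hashtags is hypothetical and only wraps the replacement shown above. It assumes the source file and the input string are UTF-8 encoded.

```php
<?php
// Hypothetical helper (not from the answer): wraps the preg_replace call above.
function linkify_hashtags(string $tweet): string
{
    // \p{Xwd} covers Unicode letters, numbers and the underscore.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_replace(
        '/ +#(\p{Xwd}*)/u',
        ' <a href="http://twitter.com/tag/$1" target="_blank">#$1</a>',
        $tweet
    );
}

echo linkify_hashtags('Jag älskar #löövet och #php_8'), "\n";
```

Both #löövet and #php_8 should come out wrapped in a link with the full tag intact. Note that the pattern requires at least one space before the #, so a hashtag at the very start of the string is left untouched.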

Other tips

Use \p{L} instead of a-z to match all Unicode letters (or \p{L}\p{N} to include numbers as well). The u modifier is needed so that the pattern and the subject are treated as UTF-8:

$tweet = preg_replace("/ +#([\p{L}\p{N}_]*)/u", " <a href=\"http://twitter.com/tag/\\1\" target=\"_blank\">#\\1</a>", $tweet);

To find out more about Unicode in regular expressions, look here.

Instead of chasing Unicode character classes, you can try this one if your hashtags do not contain any whitespace:

/ +#(\S+)/
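
One trade-off of \S+ is that it also swallows punctuation that directly follows the tag. A small sketch comparing it with \p{Xwd} follows. The sample tweet and variable names are made up for illustration, and the u modifier is added here as an assumption so that the UTF-8 input is handled per character.

```php
<?php
$tweet = 'Heja #löövet, nu kör vi!';

// Everything up to the next whitespace, including the trailing comma.
preg_match_all('/ +#(\S+)/u', $tweet, $m);
$bySpace = $m[1];

// Unicode word characters only, so the comma is left out.
preg_match_all('/ +#(\p{Xwd}+)/u', $tweet, $m);
$byXwd = $m[1];

var_dump($bySpace, $byXwd);
```

Here $bySpace holds "löövet," while $byXwd holds "löövet", so \S+ is simplest when tags are always followed by whitespace.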

If you want to limit the allowed letters to Latin letters, you can use:

$tweet = preg_replace('/ +#([\p{Latin}0-9_]*)/u', ' <a href="http://twitter.com/tag/$1" target="_blank">#$1</a>', $tweet);
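
To see what the Latin-only class does with a tag in another script, here is a short sketch. The sample tweet is made up, and the + variant is an assumption rather than part of the answer above.

```php
<?php
$tweet = 'Hej #löövet och #привет';

// Pattern from the answer: * allows an empty tag, so " #" before the
// Cyrillic word still matches, with an empty capture.
preg_match_all('/ +#([\p{Latin}0-9_]*)/u', $tweet, $star);

// Assumed variant: + requires at least one allowed character,
// so the Cyrillic hashtag is skipped entirely.
preg_match_all('/ +#([\p{Latin}0-9_]+)/u', $tweet, $plus);

var_dump($star[1], $plus[1]);
```

With *, the Cyrillic hashtag would be turned into a link containing only "#". Using + may be preferable if such tags should simply be left as plain text.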
License: CC-BY-SA with attribution
Not affiliated with StackOverflow