Question

I'm trying to linkify a Twitter post, but hashtags such as "#löövet" don't get filtered the way I want. They get cut off before the first foreign (non-ASCII) character. Those characters should be allowed.

Does anyone know how to alter the regex for this purpose?

Below is my example:

//Hashtag
$tweet = preg_replace("/ +#([a-z0-9_]*)?/i", " <a href=\"http://twitter.com/tag/\\1\" target=\"_blank\">#\\1</a>", $tweet);



//Problem: 
/*
* The function above does not match foreign characters such as å/ä/ö
* Tag result example: tag = #löövet
* After preg_replace: tag = #l öövet
* Desired after preg_replace: tag = #löövet
*/   

Solution

How about:

$tweet = preg_replace("/ +#(\p{Xwd}*)/u", " <a href=\"http://twitter.com/tag/$1\" target=\"_blank\">#$1</a>", $tweet);

\p{Xwd} means the same as \w, but extended to all Unicode letters and numbers, plus the underscore. The /u modifier tells PCRE to treat the pattern and the subject as UTF-8.

If you don't want the underscore, use \p{Xan}.
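To see the difference, here is a minimal sketch that runs the original ASCII-only class and the \p{Xwd} class side by side. The sample tweet is made up, the input is assumed to be UTF-8, and the replacement is shortened to square brackets instead of the full anchor tag so the output is easier to read.

```php
<?php
// Hypothetical sample input; assumes this file and the string are UTF-8.
$tweet = "I love #löövet and #snow_2013";

// ASCII-only class from the question: stops at the first non-ASCII letter.
$ascii = preg_replace('/ +#([a-z0-9_]*)/i', ' [#$1]', $tweet);

// Unicode-aware class; /u makes PCRE read the subject as UTF-8.
$unicode = preg_replace('/ +#(\p{Xwd}*)/u', ' [#$1]', $tweet);

echo $ascii, PHP_EOL;    // I love [#l]öövet and [#snow_2013]
echo $unicode, PHP_EOL;  // I love [#löövet] and [#snow_2013]
```

The first line shows the cut-off described in the question, the second keeps the whole tag together.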

Other tips

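Use \p{L} instead of a-z to match all Unicode letters (or \p{L}\p{N} to include numbers). Add the u modifier so the tweet is read as UTF-8; without it the multi-byte characters are still split.

$tweet = preg_replace("/ +#([\p{L}\p{N}_]*)?/iu", " <a href=\"http://twitter.com/tag/\\1\" target=\"_blank\">#\\1</a>", $tweet);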

To find out more about Unicode in regular expressions, look here.

Instead of chasing Unicode character classes, you can try this one, as long as your hashtags do not contain any whitespace.

/ +#(\S+)/
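One trade-off worth checking before using \S+: it takes everything up to the next whitespace, so punctuation glued to a tag ends up inside the link. A small sketch with a made-up tweet (assumed UTF-8), comparing it with the \p{Xwd} class:

```php
<?php
// Hypothetical sample input, assumed to be UTF-8.
$tweet = "Great day #löövet, see you at #fika!";

// \S+ grabs every non-whitespace character, including trailing punctuation.
preg_match_all('/ +#(\S+)/', $tweet, $loose);

// \p{Xwd}+ with /u stops at the first character that is not a
// letter, number or underscore.
preg_match_all('/ +#(\p{Xwd}+)/u', $tweet, $strict);

print_r($loose[1]);   // "löövet," and "fika!"
print_r($strict[1]);  // "löövet" and "fika"
```

If your tweets never have punctuation right after a tag, the simpler pattern may be good enough.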

If you want to limit allowed letters to latin letters, you can use:

$tweet = preg_replace('/ +#([\p{Latin}0-9_]*)/u', ' <a href="http://twitter.com/tag/$1" target="_blank">#$1</a>', $tweet);
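
As a quick check of what the Latin-only class accepts, here is a small sketch with a made-up tweet (assumed UTF-8). It uses + instead of * so that a tag with no Latin characters is skipped rather than matched as an empty tag.

```php
<?php
// Hypothetical sample input mixing Latin and Cyrillic tags; assumed UTF-8.
$tweet = "Tags: #löövet #привет #snow_2013";

// \p{Latin} matches letters of the Latin script, accented ones included.
preg_match_all('/ +#([\p{Latin}0-9_]+)/u', $tweet, $m);

print_r($m[1]);  // "löövet" and "snow_2013"; the Cyrillic tag is not matched
```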
License: CC-BY-SA with attribution
Not affiliated with StackOverflow