Find links in string with PHP. Differ from normal and youtube links
Frage
I have a string that contain links. I want my php to do different things with my links, depending on the url.
Answer:
function fixLinks($text)
{
$links = array();
$text = strip_tags($text);
$pattern = '!(https?://[^\s]+)!';
if (preg_match_all($pattern, $text, $matches)) {
list(, $links) = ($matches);
}
$i = 0;
$links2 = array();
foreach($links AS $link) {
if(strpos($link,'youtube.com') !== false) {
$search = "!(http://.*youtube\.com.*v=)?([a-zA-Z0-9_-]{11})(&.*)?!";
$youtube = '<a href="youtube.php?id=\\2" class="fancy">http://www.youtube.com/watch?v=\\2</a>';
$link2 = preg_replace($search, $youtube, $link);
} else {
$link2 = preg_replace('@(https?://([-\w\.]+)+(:\d+)?(/([\-\w/_\.]*(\?\S+)?)?)?)@', '<a href="$1" target="_blank"><u>$1</u></a>', $link);
}
$links2[$i] = $link2;
$i++;
}
$text = str_replace($links, $links2, $text);
$text = nl2br($text);
return $text;
}
Lösung
First of all, ditch eregi
. It's deprecated and will disappear soon.
Then, doing this in just one pass is maybe a stretch too far. I think you'll be better off splitting this into three phases.
Phase 1 runs a regex search over your input, finding everything that looks like a link, and storing it in a list.
Phase 2 iterates over the list, checking whether a link goes to youtube (parse_url
is tremendously useful for this), and putting a suitable replacement into a second list.
Phase 3: you now have two lists, one containing the original matches, one containing the desired replacements. Run str_replace over your original text, providing the match list for the search parameter and the replacement list for the replacements.
There are several advantages to this approach:
- The regular expression for extracting links can be kept relatively simple, since it doesn't have to take special hostnames into account
- It is easier to debug; you can dump the search and replace arrays prior to phase 3, and see if they contain what you expect
- Because you perform all replacements in one go, you avoid problems with overlapping matches or replacing a piece of already-replaced text (after all, the replaced text still contains a URL, and you don't want to replace that again)
Andere Tipps
tdammers' answer is good, but another option is to use preg_replace_callback
. If you go with that, then the process changes a little:
- Create a regular expression to match all links, same as his Phase 1
- In the callback, search for the YouTube video id. This will require running a second
preg_match
, which is (in my opinion) the biggest problem with this technique. - Return the replacement string, based on whether or not it's YouTube.
The code would look something like this:
function replaceem($matches) {
$url = $matches[0];
preg_match('~youtube\.com.*v=([\w\-]{11})~', $url, $matches);
return isset($matches[0]) ?
'<a href="youtube.php?id='.$matches[1].'" class="fancy">'.
'http://www.youtube.com/watch?v='.$matches[1].'</a>' :
'<a href="'.$url.'" title="Åben link" alt="Åben link" '.
'target="_blank">'.$url.'</a>';
}
$text = preg_replace_callback('~(?:f|ht)tps?://[^\s]+~', 'replaceem', $text);