Should I use strstr() to loosely validate URLs before passing them to preg_match()?

StackOverflow https://stackoverflow.com/questions/17960197

  •  04-06-2022

Question

I am writing a function to parse some video site URLs in order to generate embedding HTML:

if (strstr($url, 'a.com')) {
    $from = 'a';
} elseif (strstr($url, 'b.com')) {
    $from = 'b';
} else {
    return 'Wrong Video Url!';
}

if ($from == 'a') {
    // use preg_match() to retrieve video id to generate embedding html
    if (preg_match('#^http://a\.com/id_(\w*?)\.html$#', $url, $matches)) {
        // return video embedding html
    }
    return 'Wrong a.com Video Url!';      
}

if ($from == 'b') {
    if (preg_match('#^http://b\.com/v_(\w*?)\.html$#', $url, $matches)) {
        //return video embedding html
    }
    return 'Wrong b.com Video Url!';
}

My purpose in using strstr() is to reduce the number of preg_match() calls in some situations. For example, given a b.com URL such as http://www.b.com/v_OTQ2MDE4MDg.html, I don't have to call preg_match() twice.

But I am still not sure if this kind of practice is good or if there is a better way.


Solution

Why not just do an alternation? (At least in this case.)

'#^http://(?:a\.com/id|b\.com/v)_(\w*?)\.html$#'

That's one preg_match, and zero strstrs.
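
As a rough sketch of how that single pattern might be wrapped, assuming the two URL shapes from the question (the function name extractVideoId is made up for illustration):

```php
<?php
// Hypothetical helper: returns the captured video id, or null if the URL is not recognised.
function extractVideoId(string $url): ?string
{
    // One alternation covers both sites; the dots are escaped so they match literally.
    $pattern = '#^http://(?:a\.com/id|b\.com/v)_(\w*?)\.html$#';

    if (preg_match($pattern, $url, $matches) !== 1) {
        return null;
    }

    // Group 1 is the id, whichever branch of the alternation matched.
    return $matches[1];
}
```

Note that, as written, the pattern is anchored on http://a.com and http://b.com, so the www.b.com example from the question would be rejected unless the pattern is loosened to allow that prefix.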

Also, not that it is a big danger in this case, but escaping the dots when they should be literal dots is generally a good idea; with unescaped dots, a pattern such as #^http://b.com/v_(\w*?).html$# will match "http://bacom/v_id_xhtml" (with "id_" captured by (\w*?)).
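
To see the difference concretely, here is a small sketch comparing the b.com pattern written with bare dots against the escaped version. The variable names are illustrative only:

```php
<?php
// The same b.com pattern twice: once with bare dots, once with escaped dots.
$unescaped = '#^http://b.com/v_(\w*?).html$#';
$escaped   = '#^http://b\.com/v_(\w*?)\.html$#';

$bogus = 'http://bacom/v_id_xhtml';

// A bare dot matches any character, so "bacom" and "xhtml" slip through.
$looseMatch = preg_match($unescaped, $bogus, $m);

// With escaped dots the bogus string is rejected.
$strictMatch = preg_match($escaped, $bogus);
```

Here $looseMatch is 1 with $m[1] holding "id_", while $strictMatch is 0.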

If you can't make "one pattern that fits all" (and it's actually a bad idea if you have many options, because your legibility goes down the drain), use a pattern to extract the site name, then do a switch on it. It will then just be two preg_matches, and zero strstrs, no matter how many patterns you have.
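
A minimal sketch of that two-step approach, under some assumptions: the function name parseVideoUrl is hypothetical, and the optional "www." prefix is added here only so that the example URL from the question is accepted.

```php
<?php
// Hypothetical helper: returns the video id for a supported URL, or null otherwise.
function parseVideoUrl(string $url): ?string
{
    // First match: pull out the site name only (the optional "www." is an assumption).
    if (preg_match('#^http://(?:www\.)?(\w+)\.com/#', $url, $site) !== 1) {
        return null;
    }

    // Pick the per-site pattern; supporting a new site means adding one case.
    switch ($site[1]) {
        case 'a':
            $pattern = '#^http://(?:www\.)?a\.com/id_(\w*?)\.html$#';
            break;
        case 'b':
            $pattern = '#^http://(?:www\.)?b\.com/v_(\w*?)\.html$#';
            break;
        default:
            return null;
    }

    // Second match: extract the id with the site-specific pattern.
    return preg_match($pattern, $url, $matches) === 1 ? $matches[1] : null;
}
```

Each per-site pattern stays short and readable, and the number of preg_match() calls per URL stays at two regardless of how many sites are listed in the switch.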

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow