Question

I am building a blog and I want each post's URI to be derived from its title, the way Stack Overflow does with question titles or the way WordPress does.
What are the rules for sanitizing a URI?
Is there existing PHP code that does this?

Thanks in advance,
Omer


Solution

Many CMSs implement something like this; WordPress's version has been posted in another question. You might also be interested in the question about this technique in general.

OTHER TIPS

This might be the shortest way to lowercase a string, replace each run of characters other than a-z, 0-9 and the hyphen with a single hyphen, and trim hyphens from both ends:

trim(preg_replace('/[^a-z0-9-]+/', '-', strtolower($str)), '-')
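For reuse, the one-liner can be wrapped in a function. The sketch below is only an illustration: the name `slugify` is hypothetical, and it deliberately uses `[^a-z0-9]+` instead of `[^a-z0-9-]+` so that hyphens already present in the title are collapsed as well (with the original pattern, `a - b` becomes `a---b`).

```php
<?php
// Hypothetical helper; a variant of the one-liner above, not code from any CMS.
function slugify(string $str): string
{
    // Lowercase first so the character class only needs a-z.
    $slug = strtolower($str);
    // Collapse each run of anything other than a-z and 0-9 into one hyphen.
    // Assumption: existing hyphens are treated like any other separator.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Remove hyphens left over at the start or end.
    return trim($slug, '-');
}

echo slugify('What are the rules for sanitizing a URI?'), "\n";
// prints: what-are-the-rules-for-sanitizing-a-uri
```

Note that non-ASCII letters are treated as separators here, so a title written entirely in a non-Latin script yields an empty string.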

Here's how Drupal does it.

In case the linked site goes down, here is the code:

<?php
function pathauto_cleanstring($string)
{
    $url = $string;
    $url = preg_replace('~[^\\pL0-9_]+~u', '-', $url); // substitutes anything but letters, numbers and '_' with separator
    $url = trim($url, "-");
    $url = iconv("utf-8", "us-ascii//TRANSLIT", $url); // TRANSLIT does the whole job
    $url = strtolower($url);
    $url = preg_replace('~[^-a-z0-9_]+~', '', $url); // keep only letters, numbers, '_' and separator
    return $url;
}

Generally you'll want your URL to contain only 0-9 and a-z, all lowercase. Replace spaces with dashes (-) and strip everything else.
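A minimal sketch of those rules, assuming ASCII input; the function name `title_to_slug` is made up for illustration. Unlike approaches that turn every disallowed character into a separator, this one only turns whitespace into dashes and drops the rest, so `It's` becomes `its` rather than `it-s`.

```php
<?php
// Hypothetical helper illustrating the rules described above.
function title_to_slug(string $title): string
{
    // Rule 1: everything lowercase.
    $slug = strtolower($title);
    // Rule 2: each run of whitespace becomes a single dash.
    $slug = preg_replace('/\s+/', '-', $slug);
    // Rule 3: strip anything that is not a-z, 0-9 or a dash.
    $slug = preg_replace('/[^a-z0-9-]/', '', $slug);
    // Tidy up: collapse repeated dashes and trim them from the ends.
    $slug = preg_replace('/-{2,}/', '-', $slug);
    return trim($slug, '-');
}

echo title_to_slug("Hello, World! It's PHP"), "\n";
// prints: hello-world-its-php
```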

Stack Overflow pretty much has it figured out.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow