We need to generate a unique URL from the title of a book - where the title can contain any character. How can we search-replace all the 'invalid' characters so that a valid and neat lookoing URL is generated?
For instance:
"The Great Book of PHP"
www.mysite.com/book/12345/the-great-book-of-php
"The Greatest !@#$ Book of PHP"
www.mysite.com/book/12345/the-greatest-book-of-php
"Funny title     "
www.mysite.com/book/12345/funny-title
                Ah, slugification
// This function expects the input to be UTF-8 encoded.
function slugify($text)
{
    // Swap out Non "Letters" with a -
    $text = preg_replace('/[^\\pL\d]+/u', '-', $text); 
    // Trim out extra -'s
    $text = trim($text, '-');
    // Convert letters that we have left to the closest ASCII representation
    $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
    // Make text lowercase
    $text = strtolower($text);
    // Strip out anything we haven't been able to convert
    $text = preg_replace('/[^-\w]+/', '', $text);
    return $text;
}
This works fairly well, as it first uses the unicode properties of each character to determine if it's a letter (or \d against a number) - then it converts those that aren't to -'s - then it transliterates to ascii, does another replacement for anything else, and then cleans up after itself. (Fabrik's test returns "arvizturo-tukorfurogep")
I also tend to add in a list of stop words - so that those are removed from the slug. "the" "of" "or" "a", etc (but don't do it on length, or you strip out stuff like "php")
If “invalid” means non-alphanumeric, you can do this:
function foo($str) {
    return trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($str)), '-');
}
This will turn $str into lowercase, replace any sequence of one or more non-alphanumeric characters by one hyphen, and then remove leading and trailing hyphens.
var_dump(foo("The Great Book of PHP") === 'the-great-book-of-php');
var_dump(foo("The Greatest !@#$ Book of PHP") === 'the-greatest-book-of-php');
var_dump(foo("Funny title     ") === 'funny-title');
                        You can use a simple regular expression for this purpose:
<?php
    function safeurl( $v )
    {
        $v = strtolower( $v );
        $v = preg_replace( "/[^a-z0-9]+/", "-", $v );
        $v = trim( $v, "-" );
        return $v;
    }
    echo "<br>www.mysite.com/book/12345/" . safeurl( "The Great Book of PHP" );
    echo "<br>www.mysite.com/book/12345/" . safeurl( "The Greatest !@#$ Book of PHP" );
    echo "<br>www.mysite.com/book/12345/" . safeurl( "  Funny title  " );
    echo "<br>www.mysite.com/book/12345/" . safeurl( "!!Even Funnier title!!" );
?>
                        If you want to allow only letters, digits and underscore (usual word characters) you can do:
$str = strtolower(preg_replace(array('/\W/','/-+/','/^-|-$/'),array('-','-',''),$str));
It first replaces any non-word character(\W) with a -.
Next it replaces any consecutive - with a single -
Next it deletes any leading or trailing -.
Working link
This code comes from CodeIgniter's url helper. It should do the trick.
function url_title($str, $separator = 'dash', $lowercase = FALSE)
    {
        if ($separator == 'dash')
        {
            $search     = '_';
            $replace    = '-';
        }
        else
        {
            $search     = '-';
            $replace    = '_';
        }
        $trans = array(
                        '&\#\d+?;'              => '',
                        '&\S+?;'                => '',
                        '\s+'                   => $replace,
                        '[^a-z0-9\-\._]'        => '',
                        $replace.'+'            => $replace,
                        $replace.'$'            => $replace,
                        '^'.$replace            => $replace,
                        '\.+$'                  => ''
                      );
        $str = strip_tags($str);
        foreach ($trans as $key => $val)
        {
            $str = preg_replace("#".$key."#i", $val, $str);
        }
        if ($lowercase === TRUE)
        {
            $str = strtolower($str);
        }
        return trim(stripslashes($str));
    }
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With