
What is the best way to clean a string for placement in a URL, like the question name on SO?

I'm looking to create a URL string like the one SO uses for the links to the questions. I am not looking at rewriting the URL (mod_rewrite); I want to generate the link on the page.

Example: The question name is:

Is it better to use ob_get_contents() or $text .= ‘test’;

The URL ends up being:

http://stackoverflow.com/questions/292068/is-it-better-to-use-obgetcontents-or-text-test

The part I'm interested in is:

is-it-better-to-use-obgetcontents-or-text-test

So basically I'm looking to clean out anything that is not alphanumeric while still keeping the URL readable. I have the following created, but I'm not sure if it's the best way or if it covers all the possibilities:

$str = urlencode(
    strtolower(
    str_replace('--', '-', 
    preg_replace(array('/[^a-z0-9 ]/i', '/[^a-z0-9]/i'), array('', '-'), 
    trim($urlPart)))));

So basically:

  1. trim
  2. remove every character that is neither alphanumeric nor a space
  3. then replace every remaining non-alphanumeric character (i.e. the spaces) with a dash
  4. replace -- with -.
  5. strtolower()
  6. urlencode() -- probably not needed, but just for good measure.
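
To make the steps easier to follow, here is the same pipeline unrolled into one statement per step, so each intermediate value can be inspected. This is only a sketch: the input string is the example title from above, and the `$step…` variable names are made up for illustration.

```php
<?php
// Example input: the question title quoted earlier (the $ is escaped so PHP
// does not try to interpolate a variable).
$urlPart = "Is it better to use ob_get_contents() or \$text .= 'test';";

$step1 = trim($urlPart);                            // 1. strip surrounding whitespace
$step2 = preg_replace('/[^a-z0-9 ]/i', '', $step1); // 2. drop everything except letters, digits, spaces
$step3 = preg_replace('/[^a-z0-9]/i', '-', $step2); // 3. what is left (spaces) becomes dashes
$step4 = str_replace('--', '-', $step3);            // 4. collapse a double dash into one
$step5 = strtolower($step4);                        // 5. lowercase
$str   = urlencode($step5);                         // 6. no-op here: only [a-z0-9-] remain

echo $str, "\n"; // is-it-better-to-use-obgetcontents-or-text-test
```

For this particular title the result matches the slug in the SO URL.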
Darryl Hein asked Feb 12 '09 03:02



1 Answer

As you pointed out already, urlencode() is not needed in this case, and neither is trim(). If I understand correctly, step 4 is meant to avoid multiple dashes in a row, but it will not collapse runs of more than two dashes: str_replace() makes a single pass, so three dashes in a row still leave two. On the other hand, dashes connecting two words (as in "large-scale") are removed by your solution, while they seem to be preserved on SO.
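
Both observations are easy to check. The sketch below wraps the code from the question in a helper so it can be called on test inputs; the name `slugFromQuestion()` is made up for this example, and the body is the question's expression unchanged.

```php
<?php
// Hypothetical wrapper around the expression from the question.
function slugFromQuestion(string $urlPart): string
{
    return urlencode(
        strtolower(
            str_replace('--', '-',
                preg_replace(array('/[^a-z0-9 ]/i', '/[^a-z0-9]/i'), array('', '-'),
                    trim($urlPart)))));
}

// Three spaces turn into three dashes; the single str_replace() pass
// only reduces them to two.
echo slugFromQuestion('a   b'), "\n";            // a--b

// The hyphen is stripped by the first pattern, so the two words are fused.
echo slugFromQuestion('large-scale test'), "\n"; // largescale-test
```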

I'm not sure that this is really the best way to do it, but here's my suggestion:

$str = strtolower( 
  preg_replace( array('/[^a-z0-9\- ]/i', '/[ \-]+/'), array('', '-'), 
  $urlPart ) );

So:

  1. remove any character that is neither space, dash, nor alphanumeric
  2. replace any consecutive number of spaces or dashes with a single dash
  3. strtolower()
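
As a usage sketch, the suggestion can be wrapped in a function. The name `slugify()` is hypothetical, and the final `trim($slug, '-')` is an extra that is not part of the expression above: without it, input that starts or ends with spaces or dashes would yield a slug with a leading or trailing dash.

```php
<?php
// slugify() is a made-up name for illustration.
function slugify(string $urlPart): string
{
    $slug = strtolower(
        preg_replace(
            array('/[^a-z0-9\- ]/i', '/[ \-]+/'), // 1. drop other chars, 2. collapse runs
            array('', '-'),
            $urlPart
        )
    );

    // Extra step (an assumption, not in the suggestion above):
    // strip dashes left over at either end.
    return trim($slug, '-');
}

echo slugify("Is it better to use ob_get_contents() or \$text .= 'test';"), "\n";
// is-it-better-to-use-obgetcontents-or-text-test

echo slugify('  large-scale   test -- case  '), "\n";
// large-scale-test-case
```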
cg. answered Nov 04 '22 06:11