Strip hidden character from string

This is something that should be simple, but I can't figure it out.

The site in question is UTF-8 encoded.

A customer has been having trouble filling out a form on our website. Here is example data they have entered.

SPICER-SMITHS LOST

It looks like a regular string, but when you copy it into an editor such as Notepad++, a "?" appears in the word "SMITHS" ("SMITH?S").

The script sanitizes the field and goes the extra step of removing the following characters: "\r\n", "\n", "\r", "\t", "\0", "\x0B".

It's not catching this hidden character though.

Does anybody know what's going on here?
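One way to see what is actually in the string is to dump the bytes of every character outside printable ASCII. The following is only a minimal sketch, assuming PHP 7.1+ and UTF-8 input; `find_suspicious_chars` is a hypothetical helper name, not part of the original script.

```php
<?php
// Hypothetical helper: report each character outside printable ASCII
// (0x20-0x7E) with its byte offset and the hex of its UTF-8 bytes.
function find_suspicious_chars(string $str): array
{
    // The /u modifier makes PCRE treat the subject as UTF-8,
    // so each match is one whole code point rather than one byte.
    $count = preg_match_all('/[^\x20-\x7E]/u', $str, $matches, PREG_OFFSET_CAPTURE);
    if ($count === false) {
        // With /u, a failure here most likely means the input is not valid UTF-8.
        throw new InvalidArgumentException('Could not scan input (invalid UTF-8?)');
    }

    $found = [];
    foreach ($matches[0] as [$char, $byteOffset]) {
        $found[] = ['offset' => $byteOffset, 'hex' => bin2hex($char)];
    }
    return $found;
}
```

Calling it on a string containing a right single quotation mark, for example `find_suspicious_chars("SPICER-SMITH\u{2019}S")`, reports one entry at byte offset 12 with hex `e28099`, which identifies the character as U+2019 rather than a control character.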

EDIT: I'm using PHP. Here is the function that I use to sanitize the field:

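function strip_hidden_chars($str)
{
    // Line breaks, tabs, NUL bytes and vertical tabs to replace
    $chars = array("\r\n", "\n", "\r", "\t", "\0", "\x0B");

    // Swap each listed sequence for a single space
    $str = str_replace($chars, " ", $str);

    // Collapse runs of whitespace into one space
    return preg_replace('/\s+/', ' ', $str);
}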

EDIT 2: @thaJeztah led me to the answer. The string I was testing came from our support ticket, where the customer had pasted it from whatever application she was using. The actual input was:

SPICER-SMITH’S
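The character here is a typographic apostrophe (U+2019), which is ordinary punctuation rather than a control character, so a whitespace or control-character filter will not touch it. If the goal is to store plain ASCII, one option is to map such punctuation explicitly. This is a sketch under that assumption, for PHP 7+; `normalize_typographic_punctuation` and the choice of characters in the map are illustrative, not part of the original script.

```php
<?php
// Hypothetical helper: map common typographic punctuation to ASCII equivalents.
function normalize_typographic_punctuation(string $str): string
{
    $map = [
        "\u{2018}" => "'", // left single quotation mark
        "\u{2019}" => "'", // right single quotation mark (typographic apostrophe)
        "\u{201C}" => '"', // left double quotation mark
        "\u{201D}" => '"', // right double quotation mark
        "\u{2013}" => '-', // en dash
        "\u{2014}" => '-', // em dash
    ];

    // strtr() with an array replaces whole byte sequences,
    // so multi-byte UTF-8 keys are matched as a unit.
    return strtr($str, $map);
}

echo normalize_typographic_punctuation("SPICER-SMITH\u{2019}S"), "\n"; // SPICER-SMITH'S
```

Whether to normalize at all is a design choice: if the site is UTF-8 end to end, keeping the original apostrophe is also valid.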

Bill H asked Feb 01 '13 20:02


2 Answers

Try taking a look at this question on removing control characters:

Remove control characters from php String
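For reference, a control-character filter along those lines could look roughly like the following. This is a sketch, not the code from the linked question: it assumes PHP 7+ and valid UTF-8 input, `strip_control_chars` is a hypothetical name, and it uses the Unicode categories `\p{Cc}` (control) and `\p{Cf}` (invisible format characters such as the zero-width space).

```php
<?php
// Hypothetical helper: remove invisible characters while keeping visible
// punctuation such as the typographic apostrophe (U+2019).
function strip_control_chars(string $str): string
{
    // Drop format characters (category Cf, e.g. U+200B zero-width space) entirely.
    $clean = preg_replace('/\p{Cf}+/u', '', $str);
    if ($clean === null) {
        // With /u, a null result most likely means the input is not valid UTF-8.
        throw new InvalidArgumentException('Could not process input (invalid UTF-8?)');
    }

    // Replace control characters (category Cc: tabs, line breaks, NUL, ...) with a space.
    $clean = preg_replace('/\p{Cc}+/u', ' ', $clean) ?? $clean;

    // Collapse runs of whitespace into a single space.
    return preg_replace('/\s+/u', ' ', $clean) ?? $clean;
}
```

Note that `\p{Cf}` also covers characters like the zero-width joiner, which some scripts and emoji sequences rely on, so removing that category may be too aggressive for free-text fields.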

thaJeztah answered Oct 25 '22 08:10


This also works:

$chars = array("\r\n", '\\n', '\\r', "\n", "\r", "\t", "\0", "\x0B");
// str_replace() returns the new string, so assign the result
$data = str_replace($chars, "<br>", $data);
Mikael HOUNDEGNON answered Oct 25 '22 08:10