I am grabbing input from a file with the following code
$jap= str_replace("\n","",addslashes(strtolower(trim(fgets($fh), " \t\n\r"))));
i had also previously tried these while troubleshooting
$jap= str_replace("\n","",addslashes(strtolower(trim(fgets($fh)))));
$jap= addslashes(strtolower(trim(fgets($fh), " \t\n\r")));
and if I echo $jap it looks fine, so later in the code, without any other alterations to $jap it is inserted into the DB, however i noticed a comparison test that checks if this jap is already in the DB returned false when i can plainly see that a seemingly exact same entry of jap is in the DB. So I copy the jap entry that was inserted right from phpmyadmin or from my site where the jap is displayed and paste into a notepad i notice that it paste like this... (this is an exact paste into the below quotes)
"
バスにのって、うみへ行きました"
and obviously i need, it without that white space and breaks or whatever it is.
so as far as I can tell the trim is not doing what it says it will do. or im missing something here. if so what is it?
UPDATE: with regards to Jacks answer
the preg_replace did not help but here is what i did, i used the bin2hex() to determine that the part that "is not the part i want" is efbbbf i did this by taking $jap into str replace and removing the japanese i am expecting to find, and what is left goes into the bin2hex. and the result was the above "efbbbf"
echo bin2hex(str_replace("どちらがあなたの本ですか","",$jap));
output of the above was efbbbf but what is it? can i make a str_replace to remove this somehow?
Yes it does, see the manual: This function returns a string with whitespace stripped from the beginning and end of str .
Definition and Usage. The trim() function removes whitespace and other predefined characters from both sides of a string. Related functions: ltrim() - Removes whitespace or other predefined characters from the left side of a string.
PHP's trim() function removes all whitespace from the beginning and the end of a string.
Whitespaces are generally ignored in PHP syntax, but there are several places where you cannot put them without affecting the sense (and the result).
The trim
function doesn't know about Unicode white spaces. You could try this:
preg_replace('/^\p{Z}+|\p{Z}+$/u', '', $str);
As taken from: Trim unicode whitespace in PHP 5.2
Otherwise, you can do a bin2hex()
to find out what characters are being added at the front.
Update
Your file contains a UTF8 BOM; to remove it:
$f = fopen("file.txt", "r");
$s = fread($f, 3);
if ($s !== "\xef\xbb\xbf") {
// bom not found, rewind file
fseek($f, 0, SEEK_SET);
}
// continue reading here
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With