
Unable to explode on 'þ' when using file_get_contents()

Tags:

php

I need to get the contents of a remote file, and then explode those contents on the symbol: "þ".

I can make it work if the string I am exploding is just a local variable, but I can't get it to work with file_get_contents();

$string = '1þClassic Los 1/10þþ15þ1þTrueþ2þCú';
$parts = explode("þ", $string);
var_dump($parts);

result:

array(8) {
  [0]=>
  string(1) "1"
  [1]=>
  string(16) "Classic Los 1/10"
  [2]=>
  string(0) ""
  [3]=>
  string(2) "15"
  [4]=>
  string(1) "1"
  [5]=>
  string(4) "True"
  [6]=>
  string(1) "2"
  [7]=>
  string(2) "Cú"
}

$string = file_get_contents('file.txt');
$parts = explode("þ", $string);
var_dump($parts);

result:

array(1) {
  [0]=>
  string(42) "1þClassic Los 1/10þþ15þ1þTrueþ2þCú"
}

Why can't I explode on that symbol when I use file_get_contents()?

James Arnold asked Aug 13 '12 13:08


2 Answers

The encoding of the symbol as you typed it in your PHP script must match the encoding of the symbol in your text file.

Check your IDE to see what encoding your PHP script is being saved as.

If you can't or won't change either encoding for some reason:

  • If your PHP script is ISO-8859-1/Windows-1252, and the text file is UTF-8, use

    $parts = explode(utf8_encode("þ"), $string);
    
  • If your PHP script is UTF-8, and the text file is ISO-8859-1/Windows-1252, use

    $parts = explode(utf8_decode("þ"), $string);
    
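  • If your script and your text file are some other combination, you can use iconv() instead. iconv() also covers the two cases above on PHP 8.2 and later, where utf8_encode() and utf8_decode() are deprecated.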
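
If you would rather not depend on how the script itself is saved, one option is to write the delimiter as explicit bytes and pick the variant that matches the data. The following is only a sketch: explodeOnThorn() is a hypothetical helper, and it assumes the data is either valid UTF-8 or ISO-8859-1/Windows-1252.

```php
<?php
// Hypothetical helper, not part of PHP.
// Assumption: $data is either valid UTF-8 or ISO-8859-1/Windows-1252.
function explodeOnThorn(string $data): array
{
    // "þ" (U+00FE) written as explicit bytes, so the result does not
    // depend on the encoding this script file is saved in.
    $thornUtf8 = "\xC3\xBE";   // UTF-8 encoding of U+00FE
    $thornLatin1 = "\xFE";     // ISO-8859-1 / Windows-1252 encoding of U+00FE

    // preg_match() with the /u modifier fails on input that is not valid UTF-8.
    $isUtf8 = preg_match('//u', $data) === 1;

    return explode($isUtf8 ? $thornUtf8 : $thornLatin1, $data);
}

// The same record in both encodings.
$utf8 = "1\xC3\xBEClassic Los 1/10\xC3\xBE\xC3\xBE15";
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

var_dump(count(explodeOnThorn($utf8)));   // int(4)
var_dump(count(explodeOnThorn($latin1))); // int(4)
```

With real data you would pass the result of file_get_contents('file.txt') to the helper. If you already know the file's encoding, skip the detection and use the matching delimiter directly.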

Pekka answered Sep 23 '22 02:09


explode() compares raw bytes. The character "þ" is represented by different bytes in different encodings. If the encoding the character is saved in within your source code differs from the one used in file.txt (say, UTF-8 and Latin-1 respectively), the bytes won't match and explode() won't split the string.
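
A short sketch makes the byte mismatch visible. The variable names are illustrative, and $fileContents stands in for what file_get_contents('file.txt') returns, assuming the file is UTF-8. That assumption fits the string(42) in the question: the line has 34 characters, 8 of which ("þ" seven times and "ú" once) take two bytes each in UTF-8.

```php
<?php
// "þ" (U+00FE) in two encodings, written as explicit bytes.
$utf8Thorn = "\xC3\xBE";
$latin1Thorn = iconv('UTF-8', 'ISO-8859-1', $utf8Thorn);

echo bin2hex($utf8Thorn), "\n";   // c3be
echo bin2hex($latin1Thorn), "\n"; // fe

// Stand-in for file_get_contents('file.txt'); assumed to be UTF-8.
$fileContents = "1\xC3\xBEClassic Los 1/10\xC3\xBE\xC3\xBE15\xC3\xBE1"
    . "\xC3\xBETrue\xC3\xBE2\xC3\xBEC\xC3\xBA";

var_dump(strlen($fileContents)); // int(42)

// A Latin-1 delimiter never occurs in the UTF-8 data, so nothing is split.
var_dump(count(explode($latin1Thorn, $fileContents))); // int(1)

// A delimiter with the same bytes as the data splits as expected.
var_dump(count(explode($utf8Thorn, $fileContents)));   // int(8)
```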

deceze answered Sep 20 '22 02:09