
Why would var_dump return a bigger value than the string length?

I am fetching song lyrics from an API and converting the lyrics string into an array of words, but preg_replace is behaving unexpectedly. While debugging with var_dump, I noticed that it reports a length of 10 for the string "you", which suggests that something is wrong with the string itself. After that, preg_replace produces a result I did not expect.

This is my code:

$source = get_chart_lyrics_data("madonna","frozen");
$pieces = explode("\n", $source);
$lyrics = array();
for($i=0;$i<count($pieces);$i++){
  if($i>10){
    $words = explode(" ",$pieces[$i]);
    foreach($words as $_word){
      if($_word=="")
        continue;
      var_dump($_word);
      $word = strtolower($_word);
      var_dump($word);
      $word = trim($word);
      var_dump($word);
      $word = preg_replace("/[^A-Za-z ]/", '', $word);
      var_dump($word);
      $lyrics[$word]++;
    }
  }
}

These are the first 4 lines the code outputs:

string(10) "You"
string(10) "you"
string(10) "you"
string(8) "lyricyou"

Why does var_dump report a length of 10 for "you"? And why is preg_replace behaving like that?

Thanks.

Baykal asked Jan 23 '15 04:01



1 Answer

The likeliest explanation is that the string contains non-printable characters in addition to "you". To find out exactly what it contains, look at the raw bytes with echo bin2hex($word). This outputs a string like 666f6f..., in which every two hex digits represent one byte. You can make that more readable with something like:

echo join(' ', str_split(bin2hex($word), 2));
// 66 6f 6f ...

Now use your favourite ASCII/Unicode table (depending on the encoding of the string) to work out which characters those bytes represent and where they came from.
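As a minimal sketch of that inspection, the snippet below dumps the bytes of a sample value. The sample `<lyric>You` is only an assumption: it was picked because it is 10 bytes long and reduces to `lyricyou` under the preg_replace call from the question, and a browser would hide such a tag when rendering var_dump output. The real API response may contain something else entirely. `hex_bytes` is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: returns the bytes of a string as space-separated hex pairs.
function hex_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// Assumed sample value; the actual string returned by the API may differ.
$word = "<lyric>You";

echo strlen($word), "\n";            // byte length, the number var_dump reports: 10
echo hex_bytes($word), "\n";         // 3c 6c 79 72 69 63 3e 59 6f 75
echo htmlspecialchars($word), "\n";  // escaped, so a browser shows any markup as text
```

If the hex dump shows 3c and 3e around extra letters, the string carries markup rather than control characters, and viewing the page source or escaping the output with htmlspecialchars would make it visible.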

Perhaps your string is encoded in UTF-16, in which case you should see a telltale 00 byte next to the byte of each ASCII character.
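To illustrate what that UTF-16 pattern looks like, here is a rough sketch. The function `looks_like_utf16_ascii` is a hypothetical heuristic written for this example, not a real encoding detector: it only checks whether every second byte is 00, and it does not account for a byte order mark. The sample bytes are hand-written UTF-16LE for "you" and are likewise an assumption.

```php
<?php
// Heuristic sketch (hypothetical helper): true when exactly one of the two
// byte positions (even or odd) is 00 throughout, which is how UTF-16 looks
// for plain ASCII text. A leading byte order mark would defeat this check.
function looks_like_utf16_ascii(string $s): bool
{
    $len = strlen($s);
    if ($len === 0 || $len % 2 !== 0) {
        return false;
    }
    $evenZero = true;
    $oddZero = true;
    for ($i = 0; $i < $len; $i++) {
        if ($s[$i] !== "\0") {
            if ($i % 2 === 0) {
                $evenZero = false;
            } else {
                $oddZero = false;
            }
        }
    }
    return $evenZero !== $oddZero;
}

// "you" written out by hand as UTF-16LE bytes (assumed sample).
$utf16le = "y\x00o\x00u\x00";

echo implode(' ', str_split(bin2hex($utf16le), 2)), "\n"; // 79 00 6f 00 75 00
var_dump(looks_like_utf16_ascii($utf16le));               // bool(true)
var_dump(looks_like_utf16_ascii("you"));                  // bool(false)

// If the mbstring extension is available, the string can be converted to UTF-8.
if (function_exists('mb_convert_encoding')) {
    echo mb_convert_encoding($utf16le, 'UTF-8', 'UTF-16LE'), "\n";
}
```

Note that "you" in UTF-16 would be 6 bytes (8 with a byte order mark), so the hex dump is still the way to confirm which case applies.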

deceze answered Oct 06 '22 00:10