
Does (string) 'hard-copy' a string?

PHP uses a copy-on-modification system.

Does $a = (string) $a; (where $a is already a string) modify and copy anything?


Specifically, this is my problem:

Parameter 1 is mixed: I want to allow non-strings to be passed in and convert them to strings.
But sometimes these strings are very large, so I want to avoid copying a parameter that is already a string.

Can I use version Foo or do I have to use version Bar?

    class Foo {
        private $_foo;

        public function __construct($foo) {
            $this->_foo = (string) $foo;
        }
    }

    class Bar {
        private $_bar;

        public function __construct($bar) {
            if (is_string($bar)) {
                $this->_bar = $bar;
            } else {
                $this->_bar = (string) $bar;
            }
        }
    }
Asked by mzimmer on Feb 28 '13 at 15:02




1 Answer

The answer is that yes, it does copy the string. Sort-of... Not really. Well, it depends on your definition of "copy"...

>= 5.4

To see what's happening, let's look at the source. In 5.5, the executor handles a variable cast like this:

    zend_make_printable_zval(expr, &var_copy, &use_copy);
    if (use_copy) {
        ZVAL_COPY_VALUE(result, &var_copy);
        // if optimized out
    } else {
        ZVAL_COPY_VALUE(result, expr);
        // if optimized out
        zendi_zval_copy_ctor(*result);
    }

As you can see, the call uses zend_make_printable_zval() which just short-circuits if the zval is already a string.

So the code that's executed to do the copy is (the else branch):

    ZVAL_COPY_VALUE(result, expr);

Now, let's look at the definition of ZVAL_COPY_VALUE:

    #define ZVAL_COPY_VALUE(z, v)                   \
        do {                                        \
            (z)->value = (v)->value;                \
            Z_TYPE_P(z) = Z_TYPE_P(v);              \
        } while (0)

Note what that's doing. The string itself (which is stored in the ->value block of the zval) is NOT copied. It's just referenced: the pointer remains the same, so the string value is the same, no copy. But it is creating a new variable (the zval part that wraps the value).
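To make the "new wrapper, same string" point concrete, here is a minimal toy model in C. It is only a sketch: `toy_zval`, `toy_copy_value`, `toy_shares_buffer` and `TOY_IS_STRING` are hypothetical names invented for illustration, and the struct is not the real Zend layout.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a PHP 5 zval. The field names are hypothetical and
 * the layout is NOT the real Zend one. */
typedef struct {
    char  *val;   /* pointer to the string bytes */
    size_t len;   /* length of the string */
    int    type;  /* type tag */
} toy_zval;

enum { TOY_IS_STRING = 1 };  /* arbitrary tag value for this sketch */

/* Same idea as ZVAL_COPY_VALUE: copy the wrapper's fields, not the bytes. */
static void toy_copy_value(toy_zval *dst, const toy_zval *src)
{
    assert(dst != NULL && src != NULL);
    dst->val  = src->val;   /* pointer copied, so both share one buffer */
    dst->len  = src->len;
    dst->type = src->type;
}

/* Returns 1 when two wrappers point at the very same string bytes. */
static int toy_shares_buffer(const toy_zval *a, const toy_zval *b)
{
    return a->val == b->val;
}
```

After `toy_copy_value(&b, &a)`, `a` and `b` are two distinct structs, yet `toy_shares_buffer(&a, &b)` returns 1 because only the pointer was copied.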

Now we get to the zendi_zval_copy_ctor call, which does some interesting things of its own. Note:

    case IS_STRING:
        CHECK_ZVAL_STRING_REL(zvalue);
        if (!IS_INTERNED(zvalue->value.str.val)) {
            zvalue->value.str.val = (char *) estrndup_rel(zvalue->value.str.val, zvalue->value.str.len);
        }
        break;

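Basically, that means that if it's an interned string, it won't be copied, but if it's not, it will be. So what's an interned string, and what does that mean? Since 5.4, the engine stores certain strings, such as literals that appear in your source code, once in a shared table. Those are interned. Strings built at runtime are not.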
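The branch above can be modelled with a small C sketch. It is a toy under stated assumptions, not the engine's code: `toy_interned_pool`, `toy_is_interned` and `toy_copy_ctor_string` are hypothetical names, and the pool is a fixed buffer standing in for wherever the engine keeps interned strings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pool: any string living inside it counts as "interned". */
static char toy_interned_pool[64] = "foo";

/* Address-range check: is s stored inside the pool? */
static int toy_is_interned(const char *s)
{
    uintptr_t p     = (uintptr_t) s;
    uintptr_t start = (uintptr_t) toy_interned_pool;
    return p >= start && p < start + sizeof toy_interned_pool;
}

/* Mirrors the IS_STRING case: interned strings are shared as-is,
 * everything else is duplicated. When the result differs from s,
 * the caller owns it and must free() it. Returns NULL if malloc fails. */
static char *toy_copy_ctor_string(char *s, size_t len)
{
    assert(s != NULL);
    if (toy_is_interned(s)) {
        return s;                    /* no bytes copied */
    }
    char *dup = malloc(len + 1);
    if (dup == NULL) {
        return NULL;
    }
    memcpy(dup, s, len);
    dup[len] = '\0';
    return dup;                      /* fresh private copy */
}
```

Passing a pointer into the pool returns that same pointer, while passing a stack or heap string returns a new allocation with equal contents.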

<= 5.3

In 5.3, interned strings didn't exist. So the string is always copied. That's really the only difference...

Benchmark Time:

Well, in a case like this:

    $a = "foo";
    $b = (string) $a;

No copy of the string will happen in 5.4, but in 5.3 a copy will occur.

But in a case like this:

    $a = str_repeat("a", 10);
    $b = (string) $a;

A copy will occur for all versions. That's because in PHP, not all strings are interned...

Let's try it out in a benchmark: http://3v4l.org/HEelW

    $a = "foobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisoutfoobarbizbazbuztestingthisout";
    $b = str_repeat("a", 300);

    echo "Static Var\n";
    testCopy($a);
    echo "Dynamic Var\n";
    testCopy($b);

    function testCopy($var) {
        echo memory_get_usage() . "\n";
        $var = (string) $var;
        echo memory_get_usage() . "\n";
    }

Results:

  • 5.4 - 5.5 alpha 1 (not including other alphas, as the differences are minor enough to not make a fundamental difference)

    Static Var
    220152
    220200
    Dynamic Var
    220152
    220520

    So the static var increased by 48 bytes, and the dynamic var increased by 368 bytes.

  • 5.3.11 to 5.3.22:

    Static Var
    624472
    625408
    Dynamic Var
    624472
    624840

    The static var increased by 936 bytes while dynamic var increased by 368 bytes.

So notice that in 5.3, both the static and the dynamic variables were copied. So the string is always duplicated.

But in 5.4 with static strings, only the zval structure was copied. Meaning that the string itself, which was interned, remains the same and is not copied...
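As a sanity check on the arithmetic, the deltas quoted in the results follow directly from the printed memory_get_usage() readings. The helper names `usage_delta` and `print_deltas` below are made up for this sketch. The numbers are the ones reported above.

```c
#include <stdio.h>

/* Bytes gained between two memory_get_usage() readings. */
static long usage_delta(long before, long after)
{
    return after - before;
}

/* Prints the four deltas computed from the readings listed above. */
static void print_deltas(void)
{
    printf("5.4 static:  %ld\n", usage_delta(220152L, 220200L)); /* 48  */
    printf("5.4 dynamic: %ld\n", usage_delta(220152L, 220520L)); /* 368 */
    printf("5.3 static:  %ld\n", usage_delta(624472L, 625408L)); /* 936 */
    printf("5.3 dynamic: %ld\n", usage_delta(624472L, 624840L)); /* 368 */
}
```

The 5.4 static delta (48 bytes) is far smaller than the string itself, which is consistent with only the wrapper being allocated. The other three deltas are all larger than the strings involved, which is consistent with the bytes being duplicated.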

One Other Thing

Another thing to note is that all of the above is moot. You're passing the variable as a parameter to the function. Then you're casting inside the function. So copy-on-write will be triggered by your line. So running that will always (well, in 99.9% of cases) trigger a variable copy. So at best (interned strings) you're talking about a zval duplication and associated overhead. At worst, you're talking about a string duplication...
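The copy-on-write step described here can be sketched as a toy refcounted string in C. This models only the worst case (the bytes get duplicated) and is not how the engine is written. `toy_str`, `toy_share` and `toy_separate` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted string: several variables may point at one of these. */
typedef struct {
    char  *val;       /* NUL-terminated string bytes */
    size_t len;       /* length without the terminator */
    int    refcount;  /* how many variables share this value */
} toy_str;

/* Passing a value to a function: take another reference, copy nothing. */
static toy_str *toy_share(toy_str *s)
{
    assert(s != NULL);
    s->refcount++;
    return s;
}

/* Before a write: if the value is shared, detach into a private copy.
 * Returns s itself when it is not shared, or NULL if malloc fails. */
static toy_str *toy_separate(toy_str *s)
{
    assert(s != NULL);
    if (s->refcount <= 1) {
        return s;                           /* sole owner, write in place */
    }
    toy_str *own = malloc(sizeof *own);
    if (own == NULL) {
        return NULL;
    }
    own->val = malloc(s->len + 1);
    if (own->val == NULL) {
        free(own);
        return NULL;
    }
    memcpy(own->val, s->val, s->len + 1);   /* includes the terminator */
    own->len = s->len;
    own->refcount = 1;
    s->refcount--;                          /* the original loses one user */
    return own;
}
```

Sharing costs nothing, but the first write through the shared parameter goes through `toy_separate`, which is where the duplication happens. That is why the assignment inside the function is the line that pays.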

Answered by ircmaxell on Sep 19 '22 at 08:09
