Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using __get() (magic) to emulate readonly properites and lazy-loading

I'm using __get() to make some of my properties "dynamic" (initialize them only when requested). These "fake" properties are stored inside a private array property, which I'm checking inside __get.

Anyway, do you think it's better idea to create methods for each of these proprties instead of doing it in a switch statement?


Edit: Speed tests

I'm only concerned about performance, other stuff that @Gordon mentioned are not that important to me:

  • unneeded added complexity - it doesn't really increase my app complexity
  • fragile non-obvious API - I specifically want my API to be "isolated"; The documentation should tell others how to use it :P

So here are the tests that I made, which make me think that the performance hit agument is unjustified:

Results for 50.000 calls (on PHP 5.3.9):

enter image description here

(t1 = magic with switch, t2 = getter, t3 = magic with further getter call)

Not sure what the "Cum" thing mean on t3. It cant be cumulative time because t2 should have 2K then...

The code:

class B{}



class A{
  protected
    $props = array(
      'test_obj' => false,
    );

  // magic
  function __get($name){
    if(isset($this->props[$name])){
      switch($name){

        case 'test_obj':
          if(!($this->props[$name] instanceof B))
            $this->props[$name] = new B;

        break;
      }

      return $this->props[$name];
    }

    trigger_error('property doesnt exist');
  }

  // standard getter
  public function getTestObj(){
    if(!($this->props['test_obj'] instanceof B))
      $this->props['test_obj'] = new B;

    return $this->props['test_obj'];
  }
}



class AA extends A{

  // magic
  function __get($name){
    $getter = "get".str_replace('_', '', $name); // give me a break, its just a test :P

    if(method_exists($this, $getter))
      return $this->$getter();


    trigger_error('property doesnt exist');
  }


}


function t1(){
  $obj = new A;

  for($i=1;$i<50000;$i++){
    $a = $obj->test_obj;

  }
  echo 'done.';
}

function t2(){
  $obj = new A;

  for($i=1;$i<50000;$i++){
    $a = $obj->getTestObj();

  }
  echo 'done.';
}

function t3(){
  $obj = new AA;

  for($i=1;$i<50000;$i++){
    $a = $obj->test_obj;

  }
  echo 'done.';
}

t1();
t2();
t3();

ps: why do I want to use __get() over standard getter methods? the only reason is the api beauty; because i don't see any real disadvantages, I guess it's worth it :P


Edit: More Speed tests

This time I used microtime to measure some averages:

PHP 5.2.4 and 5.3.0 (similar results):

t1 - 0.12s
t2 - 0.08s
t3 - 0.24s

PHP 5.3.9, with xdebug active this is why it's so slow:

t1 - 1.34s
t2 - 1.26s
t3-  5.06s

PHP 5.3.9 with xdebug disabled:

t1 - 0.30
t2 - 0.25
t3 - 0.86


Another method:
 // magic
  function __get($name){
    $getter = "get".str_replace('_', '', $name);

    if(method_exists($this, $getter)){
      $this->$name = $this->$getter();   // <-- create it
      return $this->$name;                
    }


    trigger_error('property doesnt exist');
  }

A public property with the requested name will be created dynamically after the first __get call. This solves speed issues - getting 0.1s in PHP 5.3 (it's 12 times faster then standard getter), and the extensibility issue raised by Gordon. You can simply override the getter in the child class.

The disadvantage is that the property becomes writable :(

like image 336
Alex Avatar asked Feb 18 '12 15:02

Alex


2 Answers

Here is the results of your code as reported by Zend Debugger with PHP 5.3.6 on my Win7 machine:

Benchmark results

As you can see, the calls to your __get methods are a good deal (3-4 times) slower than the regular calls. We are still dealing with less than 1s for 50k calls in total, so it is negligible when used on a small scale. However, if your intention is to build your entire code around magic methods, you will want to profile the final application to see if it's still negligible.

So much for the rather uninteresting performance aspect. Now let's take a look at what you consider "not that important". I'm going to stress that because it actually is much more important than the performance aspect.

Regarding Uneeded Added Complexity you write

it doesn't really increase my app complexity

Of course it does. You can easily spot it by looking at the nesting depth of your code. Good code stays to the left. Your if/switch/case/if is four levels deep. This means there is more possible execution pathes and that will lead to a higher Cyclomatic Complexity, which means harder to maintain and understand.

Here is numbers for your class A (w\out the regular Getter. Output is shortened from PHPLoc):

Lines of Code (LOC):                                 19
  Cyclomatic Complexity / Lines of Code:           0.16
  Average Method Length (NCLOC):                     18
  Cyclomatic Complexity / Number of Methods:       4.00

A value of 4.00 means this is already at the edge to moderate complexity. This number increases by 2 for every additional case you put into your switch. In addition, it will turn your code into a procedural mess because all the logic is inside the switch/case instead of dividing it into discrete units, e.g. single Getters.

A Getter, even a lazy loading one, does not need to be moderately complex. Consider the same class with a plain old PHP Getter:

class Foo
{
    protected $bar;
    public function getBar()
    {
        // Lazy Initialization
        if ($this->bar === null) {
            $this->bar = new Bar;
        }
        return $this->bar;
    }
}

Running PHPLoc on this will give you a much better Cyclomatic Complexity

Lines of Code (LOC):                                 11
  Cyclomatic Complexity / Lines of Code:           0.09
  Cyclomatic Complexity / Number of Methods:       2.00

And this will stay at 2 for every additional plain old Getter you add.

Also, take into account that when you want to use subtypes of your variant, you will have to overload __get and copy and paste the entire switch/case block to make changes, while with a plain old Getter you simply overload the Getters you need to change.

Yes, it's more typing work to add all the Getters, but it is also much simpler and will eventually lead to more maintainable code and also has the benefit of providing you with an explicit API, which leads us to your other statement

I specifically want my API to be "isolated"; The documentation should tell others how to use it :P

I don't know what you mean by "isolated" but if your API cannot express what it does, it is poor code. If I have to read your documentation because your API does not tell me how I can interface with it by looking at it, you are doing it wrong. You are obfuscating the code. Declaring properties in an array instead of declaring them at the class level (where they belong) forces you to write documentation for it, which is additional and superfluous work. Good code is easy to read and self documenting. Consider buying Robert Martin's book "Clean Code".

With that said, when you say

the only reason is the api beauty;

then I say: then don't use __get because it will have the opposite effect. It will make the API ugly. Magic is complicated and non-obvious and that's exactly what leads to those WTF moments:

Code Quality: WTF per minute

To come to an end now:

i don't see any real disadvantages, I guess it's worth it

You hopefully see them now. It's not worth it.

For additional approaches to Lazy Loading, see the various Lazy Loading patterns from Martin Fowler's PoEAA:

There are four main varieties of lazy load. Lazy Initialization uses a special marker value (usually null) to indicate a field isn't loaded. Every access to the field checks the field for the marker value and if unloaded, loads it. Virtual Proxy is an object with the same interface as the real object. The first time one of its methods are called it loads the real the object and then delegates. Value Holder is an object with a getValue method. Clients call getValue to get the real object, the first call triggers the load. A ghost is the real object without any data. The first time you call a method the ghost loads the full data into its fields.

These approaches vary somewhat subtly and have various trade-offs. You can also use combination approaches. The book contains the full discussion and examples.

like image 79
Gordon Avatar answered Nov 13 '22 10:11

Gordon


If your capitalization of the class names and the key names in $prop matched, you could do this:

class Dummy {
    private $props = array(
        'Someobject' => false,
        //etc...
    );

    function __get($name){
        if(isset($this->props[$name])){
            if(!($this->props[$name] instanceof $name)) {
                $this->props[$name] = new $name();
            }

            return $this->props[$name];
        }

        //trigger_error('property doesnt exist');
        //Make exceptions, not war
        throw new Exception('Property doesn\'t exist');     
    }
}

And even if the capitalization didn't match, as long as it followed the same pattern it could work. If the first letter was always capitalized you could use ucfirst() to get the class name.


EDIT

It's probably just better to use plain methods. Having a switch inside a getter, especially when the code executed for each thing you try to get is different, practically defeats the purpose of the getter, to save you from having to repeat code. Take the simple approach:

class Something {
    private $props = array('Someobject' => false);

    public getSomeobject() {
        if(!($this->props['Someobject'] instanceof Someobject)) {
            //Instantiate and do extra stuff
        }

        return $this->props['Someobject'];
    }

    public getSomeOtherObject() {
        //etc..
    }
}
like image 2
Bailey Parker Avatar answered Nov 13 '22 11:11

Bailey Parker