Best way to handle dirty state in an ORM model

I don't want anyone saying "you should not reinvent the wheel, use an open source ORM"; I have an immediate requirement and cannot switch.

I'm writing a small ORM that supports caching. Even without caching I would need this feature anyway, to know whether an object has to be written to storage. The pattern is DataMapper.

Here is my approach:

  • I want to avoid runtime introspection (i.e. guessing attributes).
  • I don't want to use a CLI code generator for getters and setters (in practice I use the NetBeans one, via ALT+INSERT).
  • I want the model to be as close as possible to a POPO (plain old PHP object), meaning private attributes and "hardcoded" getters and setters for each attribute.

I have an abstract class called AbstractModel that all the models inherit from. It has a public method called isDirty(), backed by a private (it can be protected, if needed) attribute called is_dirty. The method must return true or false depending on whether the object's data has changed since it was loaded.

The issue is: is there a way to raise the internal flag "is_dirty" without writing $this->is_dirty = true in each setter? I mean: I want the setters to be just $this->attr = $value most of the time, unless business logic requires something more.

Another limitation is that I cannot rely on __set, because in the concrete model class the attributes already exist as private, so __set is never called from the setters.
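The limitation can be reproduced in a few lines. This is only a sketch: the class Person and the counter magicCalls are hypothetical names used for illustration. PHP calls __set() only when the property being written is inaccessible or undeclared from the calling scope, so a setter inside the class bypasses it.

```php
<?php
// Hypothetical class, used only to show when PHP invokes __set().
class Person
{
    private $name;
    public $magicCalls = 0;

    public function __set($attr, $value)
    {
        // Reached only for inaccessible or undeclared properties.
        $this->magicCalls++;
    }

    public function setName($value)
    {
        // $name is accessible in this scope: plain assignment, no __set().
        $this->name = $value;
    }

    public function getName()
    {
        return $this->name;
    }
}

$person = new Person();

$person->setName('Ada');              // written from inside the class
$insideCalls = $person->magicCalls;

$person->name = 'Grace';              // written from outside: private, so __set() runs
$outsideCalls = $person->magicCalls;
```

The write from outside the class is swallowed by __set(), while the setter never reaches it, which is why a dirty flag raised in __set() stays untouched by ordinary setters.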

Any ideas? Code examples from other ORMs are welcome.

One of my ideas was to modify the NetBeans setters template, but I think there should be a way of doing this without relying on the IDE.

Another thought I had was to create the setters and then rename the private attributes with an underscore prefix or something similar. That way each setter would end up calling __set, which would hold the code that deals with the "is_dirty" flag, but this breaks the POPO concept a little, and it's ugly.

asked Jun 07 '12 21:06 by Diego


2 Answers

Attention!
My opinion on the subject has somewhat changed in the past month. While the answer here is still valid, when dealing with large object graphs I would recommend using the Unit of Work pattern instead. You can find a brief explanation of it in this answer.

I'm kinda confused about how what-you-call-Model is related to ORM, especially since in MVC the Model is a layer (at least, that's how I understand it, and your "Models" seem to be more like Domain Objects).

I will assume that what you have is code that looks like this:

  $model = new SomeModel;
  $mapper = $ormFactory->build('something');

  $model->setId( 1337 );
  $mapper->pull( $model );

  $model->setPayload('cogito ergo sum');

  $mapper->push( $model );

And I will assume that what-you-call-Model has two methods designed to be used by data mappers: getParameters() and setParameters(). And that you call isDirty() before the mapper stores what-you-call-Model's state, and cleanState() when the mapper pulls data into what-you-call-Model.

BTW, if you have a better suggestion for getting values from-and-to data mappers instead of setParameters() and getParameters(), please share, because I have been struggling to come up with something better. This seems to me like an encapsulation leak.

This would make the data mapper methods look like:

  public function pull( Parametrized $object )
  {
      if ( !$object->isDirty() )
      {
          // there were NO conditions set on clean object
          // or the values have not changed since last pull
          return false; // or maybe throw exception
      }

      // placeholder: read the matching record from storage here
      $data = array();

      $object->setParameters( $data );
      $object->cleanState();

      return true; // or leave out, if the alternative is an exception
  }

  public static function push( Parametrized $object )
  {
      if ( !$object->isDirty() )
      {
          // there is nothing to save, go away
          return false; // or maybe throw exception
      }

      $data = $object->getParameters();
      // save values in storage
      $object->cleanState();

      return true; // or leave out, if the alternative is an exception
  }

In the code snippet, Parametrized is the name of an interface that the object should implement; in this case it declares the methods getParameters() and setParameters(). It has such a strange name because in OOP the implements keyword means has-abilities-of, while extends means is-a.

Up to this part you should already have something similar...
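For reference, such an interface could look roughly like the sketch below. Only the two method names come from the snippet above; the signatures, the class Note and its fields are assumptions made for illustration. The setter side copies known keys explicitly, so no runtime introspection is involved.

```php
<?php
// Sketch of the interface the mappers type-hint against.
// Method names are from the answer; the signatures are assumed.
interface Parametrized
{
    public function getParameters();
    public function setParameters(array $data);
}

// Hypothetical domain object implementing the interface.
final class Note implements Parametrized
{
    private $id;
    private $payload;

    public function getParameters()
    {
        return array('id' => $this->id, 'payload' => $this->payload);
    }

    public function setParameters(array $data)
    {
        // Copy only the keys this object knows about; ignore the rest.
        if (array_key_exists('id', $data)) {
            $this->id = $data['id'];
        }
        if (array_key_exists('payload', $data)) {
            $this->payload = $data['payload'];
        }
    }
}

$note = new Note();
$note->setParameters(array('id' => 1337, 'payload' => 'cogito ergo sum', 'unknown' => true));
$params = $note->getParameters();
```

A mapper that accepts Parametrized can then work with any object offering these two methods, whatever its class hierarchy.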


Now here is what the isDirty() and cleanState() methods should do:

  public function cleanState()
  {
      $this->is_dirty = false;
      $temp = get_object_vars($this);
      unset( $temp['variableChecksum'] );
      // checksum should not be part of itself
      $this->variableChecksum = md5( serialize( $temp ) );
  }

  public function isDirty()
  {
      if ( $this->is_dirty === true )
      {
          return true;
      }

      $temp = get_object_vars($this);
      unset( $temp['variableChecksum'] );
      // checksum should not be part of itself
      $current = md5( serialize( $temp ) );

      // compare only; storing the new checksum here would make
      // a second isDirty() call report a clean object
      return $current !== $this->variableChecksum;
  }
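One caveat worth checking: get_object_vars() returns only the properties accessible from the calling scope, so if these two methods live in an abstract parent while the concrete class declares its attributes as private, those attributes would be left out of the checksum. A minimal sketch of one way around it, assuming the methods sit in AbstractModel (the name from the question) and using a hypothetical User class: an (array) cast keeps private and protected properties under mangled keys. The helper computeChecksum() is also an assumed name.

```php
<?php
abstract class AbstractModel
{
    private $variableChecksum;

    // Hash every property except the checksum itself.
    private function computeChecksum()
    {
        // The array cast includes private properties of child classes,
        // stored under keys of the form "\0ClassName\0property".
        $state = (array) $this;
        unset($state["\0" . __CLASS__ . "\0variableChecksum"]);

        return md5(serialize($state));
    }

    public function cleanState()
    {
        $this->variableChecksum = $this->computeChecksum();
    }

    public function isDirty()
    {
        // Compare without storing, so repeated calls give the same answer.
        return $this->variableChecksum !== $this->computeChecksum();
    }
}

// Hypothetical concrete model with a private attribute and a plain setter.
class User extends AbstractModel
{
    private $email;

    public function setEmail($email)
    {
        $this->email = $email;
    }
}

$user = new User();
$user->setEmail('a@example.com');
$user->cleanState();
$afterClean = $user->isDirty();

$user->setEmail('b@example.com');
$afterChange = $user->isDirty();
$afterChangeAgain = $user->isDirty();
```

The setter stays a plain assignment, and writing the same value back leaves the object clean, since the comparison is made on the state itself and not on whether a setter was called.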
answered Oct 06 '22 00:10 by tereško


I would make a proxy for set, for example:

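class BaseModel {

   protected $is_dirty = false;

   // _get() is not shown in the answer; assumed here to be a plain read.
   // Child properties must be protected (not private) to be reachable.
   protected function _get($attr) {
      return $this->$attr;
   }

   protected function _set($attr, $value) {
      $current = $this->_get($attr);
      if($value !== $current) {
         $this->is_dirty = true;
      }

      $this->$attr = $value;
   }
}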

Then each child class would implement its setters by calling _set() and never set the property directly. Further, you can always inject more class-specific code into each subclass's _set() and just call parent::_set($attr, $processedValue) if needed. Then, if you want to use magic methods, you make those proxy to the property method, which in turn proxies to _set(). I suppose this isn't very POPO though.
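A minimal sketch of how a child class might use this, under some assumptions: Article, its fields and the slug normalisation are hypothetical, _get() is taken to be a plain property read, and the properties are protected so that the base class can reach them.

```php
<?php
class BaseModel
{
    protected $is_dirty = false;

    public function isDirty()
    {
        return $this->is_dirty;
    }

    // Assumed to be a plain read of the named property.
    protected function _get($attr)
    {
        return $this->$attr;
    }

    protected function _set($attr, $value)
    {
        if ($value !== $this->_get($attr)) {
            $this->is_dirty = true;
        }
        $this->$attr = $value;
    }
}

// Hypothetical child class: setters go through _set(), never direct writes.
class Article extends BaseModel
{
    protected $title;
    protected $slug;

    public function setTitle($title)
    {
        $this->_set('title', $title);
    }

    public function setSlug($slug)
    {
        $this->_set('slug', $slug);
    }

    public function getSlug()
    {
        return $this->slug;
    }

    // Class-specific processing, then delegate to the parent proxy.
    protected function _set($attr, $value)
    {
        if ($attr === 'slug') {
            $value = strtolower(trim($value));
        }
        parent::_set($attr, $value);
    }
}

$article = new Article();
$article->setTitle(null);               // same value as before: stays clean
$dirtyBefore = $article->isDirty();

$article->setSlug('  Hello-World ');    // processed, stored, flagged dirty
$dirtyAfter = $article->isDirty();
$slug = $article->getSlug();
```

Each setter costs one line, and the dirty flag is raised only when the value actually changes.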

answered Oct 05 '22 23:10 by prodigitalson