
Implementing 1 to n mapping for ORM c++

Tags: c++, orm, mapping

I am writing a project where I need to implement a stripped-down version of an ORM solution in C++. I am stuck on implementing 1-n relationships.

For instance, if the following are the classes:

#include <list>

class A
{
    ...
};

class B
{
    ...
    std::list<A> _a_list;
    ...
};

I have provided load/save methods for loading/saving to the db. Now, if I take the case of B and the following workflow:

  • 1 entry from _a_list is removed
  • 1 entry from _a_list is modified
  • 1 entry is added to _a_list

Now I need to update the db using something like "b.save()". What would be the best way to save the changes, i.e., to identify the additions, deletions and updates to _a_list?

karan asked Dec 16 '09 11:12


5 Answers

My first idea would be to encapsulate all possible db operations as command objects (Command pattern). This way you can create as many commands as you want until you call the Save() method to update the database. Here you need to ensure that these commands are handled as transactions. A quick implementation would be something like this:

Header:

#include <iostream>
#include <vector>

class Cmd;

// Stands in for the database: queues commands until Save() is called.
class B
{
    private:
        std::vector<Cmd*> m_commands;   // non-owning; commands must outlive Save()
    public:
        void AddCmd( Cmd* p_command );
        void Save();
};

// Abstract base class for every database operation.
class Cmd
{
    protected:
        B* m_database;

    public:
        Cmd( B* p_database );           // registers itself with p_database
        virtual ~Cmd() {}
        virtual void Execute() = 0;
        virtual void Undo() = 0;
};

class InsertCmd : public Cmd
{
    private:
        int m_newEntry;
    public:
        InsertCmd( B* p_database, int p_newEntry );
        void Execute() { std::cout << "insert " << m_newEntry << std::endl; }
        void Undo()    { /* undo of insert */ }
};

Source:

#include "DbClass.h"

void B::AddCmd( Cmd* p_command )
{
    m_commands.push_back(p_command);
}

void B::Save()
{
    for( unsigned int i=0; i<m_commands.size(); i++ )
        m_commands[i]->Execute();
}

Cmd::Cmd( B* p_database ) : m_database(p_database)
{
    m_database->AddCmd(this);
}

InsertCmd::InsertCmd( B* p_database, int p_newEntry ) 
: Cmd(p_database), m_newEntry(p_newEntry)
{
}

Test Main:

#include "DbClass.h"

int main()
{
    B database;
    InsertCmd  insert( &database, 10 );
    database.Save();

    return 0;
}

codencandy answered Nov 12 '22 10:11


One strategy would be to use an enum to represent the 'status' of a record, i.e.:

enum RecordState {
    RECORD_UNMODIFIED,
    RECORD_NEW,
    RECORD_CHANGED,
    RECORD_DELETED
};

You would give each record a RecordState (defaulting to RECORD_NEW or RECORD_UNMODIFIED as appropriate). When Save() is called, it performs the appropriate action for every record and resets its state to RECORD_UNMODIFIED. Deleted records are removed from the list as they are processed.
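As a rough sketch of how such a Save() pass could look: the Record struct and the free function save() below are hypothetical names, and the function returns the statements it would issue instead of talking to a real database.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

enum RecordState {
    RECORD_UNMODIFIED,
    RECORD_NEW,
    RECORD_CHANGED,
    RECORD_DELETED
};

// Hypothetical record type: an id, a payload, and its pending state.
struct Record {
    int id;
    std::string value;
    RecordState state;
};

// Walks the list once and produces one (pretend) SQL statement per pending
// change. Surviving records are reset to RECORD_UNMODIFIED; deleted records
// are erased from the list as they are processed.
std::vector<std::string> save(std::list<Record>& records)
{
    std::vector<std::string> sql;
    for (std::list<Record>::iterator it = records.begin(); it != records.end(); ) {
        switch (it->state) {
        case RECORD_NEW:
            sql.push_back("INSERT " + it->value);
            break;
        case RECORD_CHANGED:
            sql.push_back("UPDATE " + it->value);
            break;
        case RECORD_DELETED:
            sql.push_back("DELETE " + it->value);
            it = records.erase(it);   // erase() returns the next valid iterator
            continue;
        case RECORD_UNMODIFIED:
            break;
        }
        assert(it->state != RECORD_DELETED);  // deleted records never get here
        it->state = RECORD_UNMODIFIED;
        ++it;
    }
    return sql;
}
```

A second call to save() on the same list should then return no statements, since every remaining record is RECORD_UNMODIFIED.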

Adam Luchjenbroers answered Nov 12 '22 09:11


Record status is indeed a good idea.

I suggest that either:

(a) the app keeps deleted objects in the arrays and they are actually removed only when the ORM-like code is called to do a save (which is when it does INSERTs, UPDATEs and DELETEs)

OR

(b) the ORM context maintains internally a behind-the-scenes list of all objects that have either been SELECTed from disk or created in RAM for each database transaction (or, if not using transactions, each connection). This list is iterated when the ORM is asked to save, and the INSERTs, UPDATEs and DELETEs are based on it.

In the second case, you often find an additional requirement: to be able to dissociate/detach an object from the ORM in some parts of the system, for example to create a persistent snapshot of a state, or a modified version of an object that (according to some high-level data flow model or other) is not meant to be stored immediately. An extra bit or enum state is therefore desirable to reflect detachment. You may also wish to reassociate/reattach an object with an ORM transaction, but note that this can raise integrity issues; if they need handling, the way to handle them is often application-specific.

Note that freshly created objects that are deleted before their first save should not generate a SQL DELETE. Hence an enum with UNMODIFIED, NEW, CHANGED and DELETED is in practice often not enough: you also need NEW_DELETED and, if you go along with my theories, DETACHED.
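To illustrate, here is a minimal sketch of the state transitions with the two extra states. The function names onModify, onDelete and sqlFor are made up for this example, and the treatment of DETACHED (left alone on modify and delete) is an assumption.

```cpp
#include <cassert>

// Extended state set; the two extra states follow the note above.
enum RecordState {
    RECORD_UNMODIFIED,
    RECORD_NEW,
    RECORD_CHANGED,
    RECORD_DELETED,
    RECORD_NEW_DELETED,   // created and deleted before ever being saved
    RECORD_DETACHED       // no longer tracked by the ORM
};

// State after the application modifies a record.
RecordState onModify(RecordState s)
{
    // Editing an already deleted record is treated as a programming error.
    assert(s != RECORD_DELETED && s != RECORD_NEW_DELETED);
    // Only UNMODIFIED changes; NEW stays NEW, CHANGED stays CHANGED, etc.
    return s == RECORD_UNMODIFIED ? RECORD_CHANGED : s;
}

// State after the application deletes a record.
RecordState onDelete(RecordState s)
{
    switch (s) {
    case RECORD_NEW:         return RECORD_NEW_DELETED; // never reached the DB
    case RECORD_NEW_DELETED: return RECORD_NEW_DELETED; // already gone
    case RECORD_DETACHED:    return RECORD_DETACHED;    // assumption: not tracked, so ignored
    default:                 return RECORD_DELETED;
    }
}

// SQL verb (if any) that a save should issue for a record in state s.
const char* sqlFor(RecordState s)
{
    switch (s) {
    case RECORD_NEW:     return "INSERT";
    case RECORD_CHANGED: return "UPDATE";
    case RECORD_DELETED: return "DELETE";
    default:             return "";  // UNMODIFIED, NEW_DELETED, DETACHED: nothing to do
    }
}
```

With this table, a record that goes from NEW straight to deleted ends up in NEW_DELETED and produces no SQL at all.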

martinr answered Nov 12 '22 09:11


A bit late, but in my opinion what you need (or needed) is a Unit of Work. Your current design is like a Registry, which plays nicely with the UoW.
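A very small Unit of Work could be sketched as follows. The class layout and the method names (registerNew, registerDirty, registerRemoved, commit) are illustrative assumptions, and commit() only returns the statements it would run; a real version would execute them inside one database transaction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mapped object; only what the sketch needs.
struct A {
    int id;
    std::string value;
};

// Collects pending changes and flushes them in commit(). It stores
// non-owning pointers, so callers keep the objects alive until commit().
class UnitOfWork {
public:
    void registerNew(A* a) {
        assert(a != 0);
        m_new.push_back(a);
    }
    void registerDirty(A* a) {
        // A new object is INSERTed with its latest values, so no UPDATE is needed.
        if (!contains(m_new, a) && !contains(m_dirty, a))
            m_dirty.push_back(a);
    }
    void registerRemoved(A* a) {
        // An object that was never saved needs no DELETE: just forget it.
        if (erase(m_new, a)) return;
        erase(m_dirty, a);
        if (!contains(m_removed, a)) m_removed.push_back(a);
    }
    std::vector<std::string> commit() {
        std::vector<std::string> sql;
        for (std::size_t i = 0; i < m_new.size(); ++i)
            sql.push_back("INSERT " + m_new[i]->value);
        for (std::size_t i = 0; i < m_dirty.size(); ++i)
            sql.push_back("UPDATE " + m_dirty[i]->value);
        for (std::size_t i = 0; i < m_removed.size(); ++i)
            sql.push_back("DELETE " + m_removed[i]->value);
        m_new.clear();
        m_dirty.clear();
        m_removed.clear();
        return sql;
    }
private:
    typedef std::vector<A*> List;
    static bool contains(const List& l, A* a) {
        return std::find(l.begin(), l.end(), a) != l.end();
    }
    static bool erase(List& l, A* a) {
        List::iterator it = std::find(l.begin(), l.end(), a);
        if (it == l.end()) return false;
        l.erase(it);
        return true;
    }
    List m_new, m_dirty, m_removed;
};
```

In the question's setup, B would call registerNew, registerDirty or registerRemoved whenever _a_list changes, and b.save() would boil down to a commit().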

Derick Schoonbee answered Nov 12 '22 10:11


I want to share my view on how one could implement the “1 to n” relation. Side “1” is the master table, whereas side “n” corresponds to the slave (child) table. I suppose we want to manipulate the relation from both sides. From the slave's point of view the relation would look like a single object property, possibly with the ability to set/change/clear the object reference specified by the property. From the master's point of view the same relation would be a collection-like property, providing us with the means to iterate over, add and delete object references in that collection. Since any change made to one side of the relation must be instantly available from the other side, we have two options:

  1. Propagate the changes to all participants of the relation immediately. In this case the collection-like property mentioned above could be implemented using a general-purpose container class, with the alteration methods overridden.
  2. Introduce some kind of intermediate “instance-of-relation” object, which will own all of the information about the relation for just one instance of the master object. In this case each call for properties from either side will fetch the requested info from that intermediate object.

Choosing between the two involves answering several important questions:

  1. How should the instances of mapped classes be created? Could we create an instance without saving it in the IdentityMap? How about creating a linked structure of newly created objects?
  2. Could the instances of mapped classes be copied while retaining some knowledge about one another, to propagate the changes? Or maybe we should have exactly one instance for each table record?
  3. Who is responsible for object deletion in all possible cases?

In either case there are some features that are usually present in any ORM-like solution. For example, the IdentityMap design pattern assumes that you register all instances of mapped classes that should render their changes to the DB in some kind of registry. This is necessary in order to perform the “flush” operation later. Of course, this requires maintaining the record status.

I found the “instance-of-relation” approach to be relatively easier to implement. You can find the implementation in my still-in-development general-purpose ORM solution for C++: YB.ORM. In particular, take a look at the source files DataObject.h and DataObject.cpp, and the tests in TestDataObject.cpp (folder lib/orm/). In contrast to your sample code, the YB.ORM library employs variant-typed objects internally, with statically typed “thin” wrappers.

The DataObject class represents an instance of a mapped class, where the mapping rules are given in the metadata description. Such objects are always allocated on the heap and are not copyable. They store the data values and have a link to the metadata information for the mapped table. Of course, the current state (one of: New, Ghost, Dirty, Sync, ToBeDeleted, Deleted) is maintained within those objects. To support relations in which this class is the “n” side, each of them has a set of pointers to instances of the RelationObject class (the slave_relations_ member). To support relations in which this class is the “1” side, each of them also has a set of shared pointers to instances of the RelationObject class (the master_relations_ member).

The RelationObject class represents an instance of a relation. Such objects are always allocated on the heap and are not copyable. They store and enumerate pointers to the related DataObject instances: one pointer to the master and a set of shared pointers to the slaves. Thus they “own” the slave DataObject instances, and the DataObject instances “own” (indirectly) all of the slave objects. Note that RelationObject itself maintains something like a state, to support lazy loading.
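The ownership layout described above can be illustrated with a heavily simplified sketch. This is not YB.ORM's actual code: the structs below only borrow the names DataObject and RelationObject, assume a single relation per object instead of sets of relations, omit state tracking and lazy loading, and require C++11 for std::shared_ptr.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct RelationObject;

// Simplified stand-in for a mapped object.
struct DataObject {
    std::string name;
    std::shared_ptr<RelationObject> master_relation; // set when this object is the "1" side
    RelationObject* slave_relation;                  // set when this object is on the "n" side
    explicit DataObject(const std::string& n) : name(n), slave_relation(nullptr) {}
    DataObject* master() const;                      // reads through the relation object
};

// One instance of a 1-to-n relation: a single master plus its slaves.
struct RelationObject {
    DataObject* master;                               // non-owning back pointer
    std::vector<std::shared_ptr<DataObject>> slaves;  // owning
    explicit RelationObject(DataObject* m) : master(m) {}
    ~RelationObject() {
        // Slaves held elsewhere must not keep a dangling back pointer.
        for (auto& s : slaves) s->slave_relation = nullptr;
    }
    void add(const std::shared_ptr<DataObject>& slave) {
        assert(slave && slave->slave_relation == nullptr); // at most one master per slave
        slave->slave_relation = this;
        slaves.push_back(slave);
    }
    void remove(const std::shared_ptr<DataObject>& slave) {
        auto it = std::find(slaves.begin(), slaves.end(), slave);
        if (it == slaves.end()) return;
        (*it)->slave_relation = nullptr;
        slaves.erase(it);
    }
};

inline DataObject* DataObject::master() const {
    return slave_relation ? slave_relation->master : nullptr;
}
```

Because both sides read the same RelationObject, a change made through the master's collection is visible from the slave immediately, with no propagation step.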

vaclav answered Nov 12 '22 10:11