Self "HABTM" or "HasMany Through" concept confusion

Bounty:

+500 rep bounty to a GOOD solution. I've seriously banged my head against this wall for 2 weeks now, and am ready for help.

Tables/Models (simplified to show associations)

  • nodes
    • id
    • name
    • node_type_id
  • node_associations
    • id
    • node_id
    • other_node_id
  • node_types
    • id
    • name

General Idea:

A user can create node types (example "TV Stations", "TV Shows", and "Actors"...anything). If I knew ahead of time what the node types were and the associations between each, I'd just make models for them - but I want this to be very open-ended so the user can create any node-types they want. Then, each node (of a specific node-type) can relate to any other node of any other node-type.

Description and what I've tried:

Every node should be able to be related to any/every other node.

My assumption is that to do that, I must have an association table - so I made one called "node_associations" which has node_id and other_node_id.

Then I set up my association (using hasMany through I believe): (below is my best recollection of my set-up... it might be slightly off)

//Node model
public $hasMany = array(
    'Node' => array(
        'className' => 'NodeAssociation',
        'foreignKey' => 'node_id'

    ),
    'OtherNode' => array(
        'className' => 'NodeAssociation',
        'foreignKey' => 'other_node_id'
    )
);

//NodeAssociation model
public $belongsTo = array(
    'Node' => array(
        'className' => 'Node',
        'foreignKey' => 'node_id'

    ),
    'OtherNode' => array(
        'className' => 'Node',
        'foreignKey' => 'other_node_id'
    )
);

At first, I thought I had it - that this made sense. But then I started trying to retrieve the data, and have been banging my head against the wall for the past two weeks.

Example Problem(s):

Let's say I have the following nodes:

  • NBC
  • ER
  • George Clooney
  • Anthony Edwards
  • Tonight Show: Leno
  • Jay Leno
  • Fox
  • Family Guy

How can I set up my data structure so that I can pull all the TV Stations and contain their TV Shows, which in turn contain their Actors (as an example)? This would be SIMPLE with a normal model setup:

$this->TvStation->find('all', array(
    'contain' => array(
        'TvShow' => array(
            'Actor'
        )
    )
));

And then, maybe I want to retrieve all male Actors and contain the TV Shows, which contain the TV Station. Or TV Shows that start at 9pm, containing their actor(s) and their station, etc.

But with HABTM or HasMany Through on self (and, more importantly, an unknown data set), I wouldn't know which field (node_id or other_node_id) a given model is in, and overall I just can't wrap my head around how I'd get the content.

asked Sep 01 '12 at 03:09 by Dave


3 Answers

The Idea

Let's try to solve this with a convention: node_id holds the model whose alias comes first alphabetically, and other_node_id holds the one that comes second.
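To make the convention concrete, here is a minimal sketch in plain PHP. The function name joinColumnsFor() is hypothetical and not part of the Node class below; it assumes the join table uses the question's column names node_id and other_node_id.

```php
<?php
// Hypothetical helper (not part of the answer's Node class): decides which
// join-table column each side of a pairing uses, following the alphabetical
// convention. Column names are taken from the question's schema.
function joinColumnsFor(string $thisAlias, string $otherAlias): array
{
    $sorted = [$thisAlias, $otherAlias];
    sort($sorted, SORT_STRING);

    // The alias that sorts first always owns node_id.
    $thisIsFirst = ($sorted[0] === $thisAlias);

    return [
        'foreignKey'            => $thisIsFirst ? 'node_id' : 'other_node_id',
        'associationForeignKey' => $thisIsFirst ? 'other_node_id' : 'node_id',
    ];
}

// Seen from TvStation, a related TvShow sits in node_id,
// because "TvShow" sorts before "TvStation".
$columns = joinColumnsFor('TvStation', 'TvShow');
```

Note that the convention only orders two different aliases; linking two nodes of the same type would need an extra rule.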

For each contained model, we create a HABTM association to the Node class on the fly, with an alias for each association (see the bindNodes and bindNode methods).

To each table we query, we add an extra condition on node_type_id so that only nodes of that type are returned. The id of the NodeType is looked up via nodeTypeId() and should be cached.
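One way the caching could look is a small per-request memoizer. This is only a sketch: the class NodeTypeIdCache is hypothetical, the lookup callable stands in for the real NodeType query, and the ids in the fake table are made up for illustration.

```php
<?php
// Hypothetical wrapper: memoizes type-name => id lookups for one request.
// $lookup stands in for the real query (e.g. a NodeType field lookup).
final class NodeTypeIdCache
{
    /** @var callable */
    private $lookup;
    /** @var array<string, mixed> */
    private $ids = [];

    public function __construct(callable $lookup)
    {
        $this->lookup = $lookup;
    }

    public function get(string $name)
    {
        // array_key_exists() so that a cached "not found" result is reused too.
        if (!array_key_exists($name, $this->ids)) {
            $this->ids[$name] = ($this->lookup)($name);
        }
        return $this->ids[$name];
    }
}

// Usage with a fake lookup that counts how often it is hit.
$queries = 0;
$cache = new NodeTypeIdCache(function (string $name) use (&$queries) {
    $queries++;
    $fakeTable = ['TvStation' => 1, 'TvShow' => 2, 'Actor' => 3]; // made-up ids
    return $fakeTable[$name] ?? false;
});
$cache->get('TvShow');
$cache->get('TvShow'); // second call is served from memory
```

Because node types are user data, a cache like this would also need to be cleared when a type is renamed or deleted.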

To filter results by conditions on deeply related associations, you would need to add the extra joins manually: one join per join table, each with a unique alias, and then a join to each node type itself under its own alias so the conditions can be applied (e.g. selecting all TvStations that have Actor x). Create a helper method for this in the Node class.
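A rough sketch of such a helper, written as a standalone function so it can be read in isolation. The name nodeJoins(), the "Link" alias suffix, and the type ids 2 and 3 are assumptions for illustration; the array shape follows the 'joins' find option (table, alias, type, conditions), and the column names come from the question's schema.

```php
<?php
// Hypothetical helper: returns two entries for a 'joins' find option,
// linking $fromAlias to $toAlias through node_associations.
function nodeJoins(string $fromAlias, string $toAlias, int $toTypeId): array
{
    $linkAlias = $fromAlias . $toAlias . 'Link'; // unique alias per hop

    // Alphabetical convention: the alias that sorts first owns node_id.
    $fromFirst = strcmp($fromAlias, $toAlias) <= 0;
    $fromCol = $fromFirst ? 'node_id' : 'other_node_id';
    $toCol   = $fromFirst ? 'other_node_id' : 'node_id';

    return [
        [
            'table' => 'node_associations',
            'alias' => $linkAlias,
            'type' => 'INNER',
            'conditions' => ["{$linkAlias}.{$fromCol} = {$fromAlias}.id"],
        ],
        [
            'table' => 'nodes',
            'alias' => $toAlias,
            'type' => 'INNER',
            'conditions' => [
                "{$toAlias}.id = {$linkAlias}.{$toCol}",
                "{$toAlias}.node_type_id" => $toTypeId,
            ],
        ],
    ];
}

// TvStations that have a given Actor: TvStation -> TvShow -> Actor.
$query = [
    'joins' => array_merge(
        nodeJoins('TvStation', 'TvShow', 2), // 2 = made-up TvShow type id
        nodeJoins('TvShow', 'Actor', 3)      // 3 = made-up Actor type id
    ),
    'conditions' => ['Actor.name' => 'George Clooney'],
];
```

The column-to-column comparisons are written as plain strings so that the right-hand side is not quoted as a value. A station with several matching shows would come back more than once, so some de-duplication would still be needed.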

Notes

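In my demo, the join-table columns are literally named foreignKey (in place of node_id) and associationForeignKey (in place of other_node_id), which is why those strings appear as column names in bindNode. Substitute your own column names there.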

Node (incomplete)

<?php
/**
 * @property Model NodeType
 */
class Node extends AppModel {

    public $useTable = 'nodes';

    public $belongsTo = [
        'NodeType',
    ];

    public function findNodes($type = 'first', $query = []) {
        $node = ClassRegistry::init(['class' => 'Node', 'alias' => $query['node']]);
        return $node->find($type, $query);
    }

    // TODO: cache this
    public function nodeTypeId($name = null) {
        if ($name === null) {
            $name = $this->alias;
        }
        return $this->NodeType->field('id', ['name' => $name]);
    }

    public function find($type = 'first', $query = []) {
        $query = array_merge_recursive($query, ['conditions' => ["{$this->alias}.node_type_id" => $this->nodeTypeId()]]);
        if (!empty($query['contain'])) {
            $query['contain'] = $this->bindNodes($query['contain']);
        }
        return parent::find($type, $query);
    }

    // could be done better    
    public function bindNodes($contain) {
        $parsed = [];
        foreach($contain as $assoc => $deeperAssoc) {
            if (is_numeric($assoc)) {
                $assoc = $deeperAssoc;
                $deeperAssoc = [];
            }
            if (in_array($assoc, ['conditions', 'order', 'offset', 'limit', 'fields'])) {
                continue;
            }
            $parsed[$assoc] = array_merge_recursive($deeperAssoc, [
                'conditions' => [
                    "{$assoc}.node_type_id" => $this->nodeTypeId($assoc),
                ],
            ]);
            $this->bindNode($assoc);
            if (!empty($deeperAssoc)) {
                $parsed[$assoc] = array_merge($parsed[$assoc], $this->{$assoc}->bindNodes($deeperAssoc));
                foreach($parsed[$assoc] as $k => $v) {
                    if (is_numeric($k)) {
                        unset($parsed[$assoc][$k]);
                    }
                }
            }
        }
        return $parsed;
    }

    public function bindNode($alias) {
        $models = [$this->alias, $alias];
        sort($models);
        $this->bindModel(array(
            'hasAndBelongsToMany' => array(
                $alias => array(
                    'className' => 'Node',
                    'foreignKey' => ($models[0] === $this->alias) ? 'foreignKey' : 'associationForeignKey',
                    'associationForeignKey' => ($models[0] === $alias) ? 'foreignKey' : 'associationForeignKey',
                    'joinTable' => 'node_associations',
                )
            )
        ), false);
    }

}

Example

$results = $this->Node->findNodes('all', [
    'node' => 'TvStation', // the top-level node to fetch
    'contain' => [         // all child associated nodes to fetch
        'TvShow' => [
            'Actor',
        ]
    ],
]);
answered Nov 02 '22 at 06:11 by tigrang


I think you have incorrect relations between your models. I guess it will be enough with:

// Node Model
public $hasAndBelongsToMany = array(
    'AssociatedNode' => array(
        'className' => 'Node',
        'foreignKey' => 'node_id',
        'associationForeignKey' => 'associated_node_id',
        'joinTable' => 'nodes_nodes'
    )
);

// Tables

nodes

  • id
  • name
  • node_type_id

nodes_nodes

  • id
  • node_id
  • associated_node_id

node_types

  • id
  • name

Then you can try using ContainableBehavior to fetch your data. For example, to find all TV Shows belonging to a TV Station:

$options = array(
    'contain' => array(
        'AssociatedNode' => array(
            'conditions' => array(
                'AssociatedNode.node_type_id' => $id_of_tvshows_type
            )
        )
    ),
    'conditions' => array(
        'node_type_id' => $id_of_tvstations_type
    )
);
$nodes = $this->Node->find('all', $options);

EDIT:

You can even have second-level conditions (see the last example in this section, and look at the conditions on the 'Tag' model). Try this:

$options = array(
    'contain' => array(
        'AssociatedNode' => array(
            'conditions' => array(
                'AssociatedNode.node_type_id' => $id_of_tvshows_type
            ),
            'AssociatedNode' => array(
                'conditions' => array('AssociatedNode.node_type_id' => $id_of_actors_type)
            )
        )
    ),
    'conditions' => array(
        'node_type_id' => $id_of_tvstations_type
    )
);
$nodes = $this->Node->find('all', $options);
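Since the nesting follows the same pattern at every level, the contain array could also be generated from a list of type ids. This is only a sketch: containForTypeChain() is a hypothetical helper, and the ids 1, 2 and 3 are placeholders for the real type ids.

```php
<?php
// Hypothetical helper: nests one 'AssociatedNode' level per type id,
// in the same shape as the hand-written contain arrays.
function containForTypeChain(array $typeIds): array
{
    $contain = [];
    // Build from the deepest level outwards.
    foreach (array_reverse($typeIds) as $typeId) {
        $level = ['conditions' => ['AssociatedNode.node_type_id' => $typeId]];
        if ($contain !== []) {
            $level['AssociatedNode'] = $contain['AssociatedNode'];
        }
        $contain = ['AssociatedNode' => $level];
    }
    return $contain;
}

// TV Stations (1) containing TV Shows (2) containing Actors (3); placeholder ids.
$options = [
    'contain' => containForTypeChain([2, 3]),
    'conditions' => ['Node.node_type_id' => 1],
];
```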
answered Nov 02 '22 at 05:11 by Choma


I think part of the problem, unfortunately, is that you want your solution to contain user data in the code. Since all your node types are user data, you want to avoid using them as classes or methods in your application, as there could be an unbounded number of them. Instead, I would try to create methods that model the data operations you want to have.

One omission I see in the provided data model is a way to record the relationships between types. In your example you mention a relationship between TvStation -> TvShows -> Actor etc. But where are these relationships defined and stored? With all of your node types being user-defined data, I think you'll need to store those relationships somewhere. It seems like node_types needs some additional metadata about which child types are valid or desired for a given type. Having this recorded might make your situation a bit simpler when creating queries. It might help to think of all the questions or queries you're going to ask the database. If you cannot answer all those questions with data that is in the database, then you are probably missing some tables. Model associations are just a proxy for data relations that already exist in your tables. If there are gaps there, there are probably gaps in your data model.
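A minimal sketch of what such metadata could look like once loaded into PHP. Everything here is an assumption for illustration: the node_type_relations table, its parent_type_id and child_type_id columns, the type ids, and both function names.

```php
<?php
// Hypothetical rows from a node_type_relations table
// (parent_type_id, child_type_id); table name and ids are assumptions.
$relations = [
    ['parent_type_id' => 1, 'child_type_id' => 2], // TV Station -> TV Show
    ['parent_type_id' => 2, 'child_type_id' => 3], // TV Show -> Actor
];

// Index child type ids by parent type id.
function childTypesByParent(array $relations): array
{
    $index = [];
    foreach ($relations as $row) {
        $index[$row['parent_type_id']][] = $row['child_type_id'];
    }
    return $index;
}

// True when every step in $path is a recorded parent -> child relation.
function isValidTypePath(array $path, array $index): bool
{
    for ($i = 0; $i + 1 < count($path); $i++) {
        $children = $index[$path[$i]] ?? [];
        if (!in_array($path[$i + 1], $children, true)) {
            return false;
        }
    }
    return true;
}

$index = childTypesByParent($relations);
$stationShowActor = isValidTypePath([1, 2, 3], $index); // recorded chain
$stationActor     = isValidTypePath([1, 3], $index);    // no direct link recorded
```

With the relations stored as data, a query for "stations with their shows and actors" can be checked, or even generated, from the database instead of from hard-coded model names.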

I don't think this is the answer you're looking for but hopefully it helps you find the right one.

answered Nov 02 '22 at 06:11 by Mark Story