I'm really struggling with a recurring OOP / database concept.
Please allow me to explain the issue with pseudo-PHP-code.
Say you have a "user" class, which loads its data from the users
table in its constructor:
class User {
public $name;
public $height;
public function __construct($user_id) {
$result = Query the database where the `users` table has `user_id` of $user_id
$this->name= $result['name'];
$this->height = $result['height'];
}
}
Simple, awesome.
Now, we have a "group" class, which loads its data from the groups
table joined with the groups_users
table and creates user
objects from the returned user_id
s:
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['user_ids'] as $user_id) {
// Make the user objects
$users[] = new User($user_id);
}
}
}
A group can have any number of users.
Beautiful, elegant, amazing... on paper. In reality, however, making a new group object...
$group = new Group(21); // Get the 21st group, which happens to have 4 users
...performs 5 queries instead of 1. (1 for the group and 1 for each user.) And worse, if I make a community
class, which has many groups in it that each have many users within them, an ungodly number of queries are ran!
The Solution, Which Doesn't Sit Right To Me
For years, the way I've got around this, is to not code in the above fashion, but instead, when making a group
for instance, I would join the groups
table to the groups_users
table to the users
table as well and create an array of user-object-like arrays within the group
object (never using/touching the user
class):
class Group {
public $type;
public $schedule;
public $users;
public function __construct($group_id) {
$result = Query the `groups` table, joining the `groups_users` table,
**and also joining the `users` table,**
where `group_id` = $group_id
$this->type = $result['type'];
$this->schedule = $result['schedule'];
foreach ($result['users'] as $user) {
// Make user arrays
$users[] = array_of_user_data_crafted_from_the_query_result;
}
}
}
...but then, of course, if I make a "community" class, in its constructor I'll need to join the communities
table with the communities_groups
table with the groups
table with the groups_users
table with the users
table.
...and if I make a "city" class, in its constructor I'll need to join the cities
table with the cities_communities
table with the communities
table with the communities_groups
table with the groups
table with the groups_users
table with the users
table.
What an unmitigated disaster!
Do I have to choose between beautiful OOP code with a million queries VS. 1 query and writing these joins by hand for every single superset? Is there no system that automates this?
I'm using CodeIgniter, and looking into countless other MVC's, and projects that were built in them, and cannot find a single good example of anyone using models without resorting to one of the two flawed methods I've outlined.
It appears this has never been done before.
One of my coworkers is writing a framework that does exactly this - you create a class that includes a model of your data. Other, higher models can include that single model, and it crafts and automates the table joins to create the higher model that includes object instantiations of the lower model, all in a single query. He claims he's never seen a framework or system for doing this before, either.
Please Note: I do indeed always use separate classes for logic and persistence. (VOs and DAOs - this is the entire point of MVCs). I have merely combined the two in this thought-experiment, outside of an MVC-like architecture, for simplicity's sake. Rest assured that this issue persists regardless of the separation of logic and persistence. I believe this article, introduced to me by James in the comments below this question, seems to indicate that my proposed solution (which I've been following for years) is, in fact, what developers currently do to solve this issue. This question is, however, attempting to find ways of automating that exact solution, so it doesn't always need to be coded by hand for every superset. From what I can see, this has never been done in PHP before, and my coworker's framework will be the first to do so, unless someone can point me towards one that does.
And, also, of course I never load data in constructors, and I only call the load() methods that I create when I actually need the data. However, that is unrelated to this issue, as in this thought experiment (and in the real-life situations where I need to automate this), I always need to eager-load the data of all subsets of children as far down the line as it goes, and not lazy-load them at some future point in time as needed. The thought experiment is concise -- that it doesn't follow best practices is a moot point, and answers that attempt to address its layout are likewise missing the point.
EDIT : Here is a database schema, for clarity.
CREATE TABLE `groups` (
`group_id` int(11) NOT NULL, <-- Auto increment
`make` varchar(20) NOT NULL,
`model` varchar(20) NOT NULL
)
CREATE TABLE `groups_users` ( <-- Relational table (many users to one group)
`group_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL
)
CREATE TABLE `users` (
`user_id` int(11) NOT NULL, <-- Auto increment
`name` varchar(20) NOT NULL,
`height` int(11) NOT NULL,
)
(Also note that I originally used the concepts of wheel
s and car
s, but that was foolish, and this example is much clearer.)
SOLUTION:
I ended up finding a PHP ORM that does exactly this. It is Laravel's Eloquent. You can specify the relationships between your models, and it intelligently builds optimized queries for eager loading using syntax like this:
Group::with('users')->get();
It is an absolute life saver. I haven't had to write a single query. It also doesn't work using joins, it intelligently compiles and selects based on foreign keys.
As a beginner, OOP is also more difficult to read for several non-code related reasons. First, it's near impossible to understand why a piece of code exists if you're unfamiliar with the domain being modeled with classes. Secondly, OOP is a craft and is inherently opinionated.
Students find it very difficult to understand object oriented concepts like classes, constructor invocation, overloaded constructors, friend functions and other object oriented concepts [2]. Students who have been exposed to procedural programming find it a little difficult to move towards object oriented programming.
Best way to learn OOP concepts is to write more and more code and get it reviewed often. Practice maketh a good programmer. Think real world scenarios, define a problem statement - solve it in code and get it reviewed.
This course should take about 6 weeks to complete. You can check out the recommended course schedule below to see a quick overview of the lessons and assignments you'll complete each week. We're excited you're here learning with us. Let's get started!
Say you have a "wheel" class, which loads its data from the wheels table in its constructor
Constructors should not be doing any work. Instead they should contain only assignments. Otherwise you make it very hard to test the behavior of the instance.
Now, we have a "car" class, which loads its data from the cars table joined with the cars_wheels table and creates wheel objects from the returned wheel_ids:
No. There are two problems with this.
Your Car
class should not contain both code for implementing "car logic" and "persistence logic". Otherwise you are breaking SRP. And wheels are a dependency for the class, which means that the wheels should be injected as parameter for the constructor (most likely - as a collection of wheels, or maybe an array).
Instead you should have a mapper class, which can retrieve data from database and store it in the WheelCollection
instance. And a mapper for car, which will store data in Car
instance.
$car = new Car;
$car->setId( 42 );
$mapper = new CarMapper( $pdo );
if ( $mapper->fetch($car) ) //if there was a car in DB
{
$wheels = new WheelCollection;
$otherMapper = new WheelMapper( $pdo );
$car->addWheels( $wheels );
$wheels->setType($car->getWheelType());
// I am not a mechanic. There is probably some name for describing
// wheels that a car can use
$otherMapper->fetch( $wheels );
}
Something like this. The mapper in this case are responsible for performing the queries. And you can have several source for them, for example: have one mapper that checks the cache and only, if that fails, pull data from SQL.
Do I really have to choose between beautiful OOP code with a million queries VS. 1 query and disgusting, un-OOP code?
No, the ugliness comes from fact that active record pattern is only meant for the simplest of usecases (where there is almost no logic associated, glorified value-objects with persistence). For any non-trivial situation it is preferable to apply data mapper pattern.
..and if I make a "city" class, in its constructor I'll need to join the cities table with the cities_dealerships table with the dealerships table with the dealerships_cars table with the cars table with the cars_wheels table with the wheels table.
Jut because you need data about "available cares per dealership in Moscow" does not mean that you need to create Car
instances, and you definitely will not care about wheels there. Different parts of site will have different scale at which they operate.
The other thing is that you should stop thinking of classes as table abstractions. There is no rule that says "you must have 1:1 relation between classes and tables".
Take the Car
example again. If you look at it, having separate Wheel
(or even WheelSet
) class is just stupid. Instead you should just have a Car
class which already contains all it's parts.
$car = new Car;
$car->setId( 616 );
$mapper = new CarMapper( $cache );
$mapper->fetch( $car );
The mapper can easily fetch data not only from "Cars" table but also from "Wheel" and "Engines" and other tables and populate the $car
object.
P.S.: also, if you care about code quality, you should start reading PoEAA book. Or at least start watching lectures listed here.
my 2 cents
ActiveRecord in Rails implements the concept of lazy loading, that is deferring database queries until you actually need the data. So if you instantiate a my_car = Car.find(12)
object, it only queries the cars table for that one row. If later you want my_car.wheels
then it queries the wheels table.
My suggestion for your pseudo code above is to not load every associated object in the constructor. The car constructor should query for the car only, and should have a method to query for all of it's wheels, and another to query it's dealership, which only queries for the dealership and defers collecting all of the other dealership's cars until you specifically say something like my_car.dealership.cars
Postscript
ORMs are database abstraction layers, and thus they must be tuned for ease of querying and not fine tuning. They allow you to rapidly build queries. If later you decide that you need to fine tune your queries, then you can switch to issuing raw sql commands or trying to otherwise optimize how many objects you're fetching. This is standard practice in Rails when you start doing performance tuning - look for queries that would be more efficient when issued with raw sql, and also look for ways to avoid eager loading (the opposite of lazy loading) of objects before you need them.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With