Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you manage SQL Queries

Tags:

sql

php

mysql

At the moment my code (PHP) has too many SQL queries in it. eg...

// not a real example, but you get the idea...
$results = $db->GetResults("SELECT * FROM sometable WHERE iUser=$userid");
if ($results) {
    // Do something
}

I am looking into using stored procedures to reduce this and make things a little more robust, but I have some concerns..

I have hundreds of different queries in use around the web site, and many of them are quite similar. How should I manage all these queries when they are removed from their context (the code that uses the results) and placed in a stored procedure on the database?

like image 727
Rik Heywood Avatar asked Sep 01 '08 11:09

Rik Heywood


People also ask

How does SQL manage data?

SQL queries and other operations take the form of commands written as statements and are aggregated into programs that enable users to add, modify or retrieve data from database tables. A table is the most basic unit of a database and consists of rows and columns of data.


2 Answers

The best course of action for you will depend on how you are approaching your data access. There are three approaches you can take:

  • Use stored procedures
  • Keep the queries in the code (but put all your queries into functions and fix everything to use PDO for parameters, as mentioned earlier)
  • Use an ORM tool

If you want to pass your own raw SQL to the database engine then stored procedures would be the way to go if all you want to do is get the raw SQL out of your PHP code but keep it relatively unchanged. The stored procedures vs raw SQL debate is a bit of a holy war, but K. Scott Allen makes an excellent point - albeit a throwaway one - in an article about versioning databases:

Secondly, stored procedures have fallen out of favor in my eyes. I came from the WinDNA school of indoctrination that said stored procedures should be used all the time. Today, I see stored procedures as an API layer for the database. This is good if you need an API layer at the database level, but I see lots of applications incurring the overhead of creating and maintaining an extra API layer they don't need. In those applications stored procedures are more of a burden than a benefit.

I tend to lean towards not using stored procedures. I've worked on projects where the DB has an API exposed through stored procedures, but stored procedures can impose some limitations of their own, and those projects have all, to varying degrees, used dynamically generated raw SQL in code to access the DB.

Having an API layer on the DB gives better delineation of responsibilities between the DB team and the Dev team at the expense of some of the flexibility you'd have if the query was kept in the code, however PHP projects are less likely to have sizable enough teams to benefit from this delineation.

Conceptually, you should probably have your database versioned. Practically speaking, however, you're far more likely to have just your code versioned than you are to have your database versioned. You are likely to be changing your queries when you are making changes to your code, but if you are changing the queries in stored procedures stored against the database then you probably won't be checking those in when you check the code in and you lose many of the benefits of versioning for a significant area of your application.

Regardless of whether or not you elect not to use stored procedures though, you should at the very least ensure that each database operation is stored in an independent function rather than being embedded into each of your page's scripts - essentially an API layer for your DB which is maintained and versioned with your code. If you're using stored procedures, this will effectively mean you have two API layers for your DB, one with the code and one with the DB, which you may feel unnecessarily complicates things if your project does not have separate teams. I certainly do.

If the issue is one of code neatness, there are ways to make code with SQL jammed in it more presentable, and the UserManager class shown below is a good way to start - the class only contains queries which relate to the 'user' table, each query has its own method in the class and the queries are indented into the prepare statements and formatted as you would format them in a stored procedure.

// UserManager.php:

class UserManager
{
    function getUsers()
    {
        $pdo = new PDO(...);
        $stmt = $pdo->prepare('
            SELECT       u.userId as id,
                         u.userName,
                         g.groupId,
                         g.groupName
            FROM         user u
            INNER JOIN   group g
            ON           u.groupId = g.groupId
            ORDER BY     u.userName, g.groupName
        ');
        // iterate over result and prepare return value
    }

    function getUser($id) {
        // db code here
    }
}

// index.php:
require_once("UserManager.php");
$um = new UserManager;
$users = $um->getUsers();
foreach ($users as $user) echo $user['name'];

However, if your queries are quite similar but you have huge numbers of permutations in your query conditions like complicated paging, sorting, filtering, etc, an Object/Relational mapper tool is probably the way to go, although the process of overhauling your existing code to make use of the tool could be quite complicated.

If you decide to investigate ORM tools, you should look at Propel, the ActiveRecord component of Yii, or the king-daddy PHP ORM, Doctrine. Each of these gives you the ability to programmatically build queries to your database with all manner of complicated logic. Doctrine is the most fully featured, allowing you to template your database with things like the Nested Set tree pattern out of the box.

In terms of performance, stored procedures are the fastest, but generally not by much over raw sql. ORM tools can have a significant performance impact in a number of ways - inefficient or redundant querying, huge file IO while loading the ORM libraries on each request, dynamic SQL generation on each query... all of these things can have an impact, but the use of an ORM tool can drastically increase the power available to you with a much smaller amount of code than creating your own DB layer with manual queries.

Gary Richardson is absolutely right though, if you're going to continue to use SQL in your code you should always be using PDO's prepared statements to handle the parameters regardless of whether you're using a query or a stored procedure. The sanitisation of input is performed for you by PDO.

// optional
$attrs = array(PDO::ATTR_PERSISTENT => true);

// create the PDO object
$pdo = new PDO("mysql:host=localhost;dbname=test", "user", "pass", $attrs);

// also optional, but it makes PDO raise exceptions instead of 
// PHP errors which are far more useful for debugging
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare('INSERT INTO venue(venueName, regionId) VALUES(:venueName, :regionId)');
$stmt->bindValue(":venueName", "test");
$stmt->bindValue(":regionId", 1);

$stmt->execute();

$lastInsertId = $pdo->lastInsertId();
var_dump($lastInsertId);

Caveat: assuming that the ID is 1, the above script will output string(1) "1". PDO->lastInsertId() returns the ID as a string regardless of whether the actual column is an integer or not. This will probably never be a problem for you as PHP performs casting of strings to integers automatically.

The following will output bool(true):

// regular equality test
var_dump($lastInsertId == 1); 

but if you have code that is expecting the value to be an integer, like is_int or PHP's "is really, truly, 100% equal to" operator:

var_dump(is_int($lastInsertId));
var_dump($lastInsertId === 1);

you could run into some issues.

Edit: Some good discussion on stored procedures here

like image 194
Shabbyrobe Avatar answered Oct 14 '22 23:10

Shabbyrobe


First up, you should use placeholders in your query instead of interpolating the variables directly. PDO/MySQLi allow you to write your queries like:

SELECT * FROM sometable WHERE iUser = ?

The API will safely substitute the values into the query.

I also prefer to have my queries in the code instead of the database. It's a lot easier to work with an RCS when the queries are with your code.

I have a rule of thumb when working with ORM's: if I'm working with one entity at a time, I'll use the interface. If I'm reporting/working with records in aggregate, I typically write SQL queries to do it. This means there's very few queries in my code.

like image 44
Gary Richardson Avatar answered Oct 14 '22 23:10

Gary Richardson