Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to properly implement a custom session persister in PHP + MySQL?

I'm trying to implement a custom session persister in PHP + MySQL. Most of the stuff is trivial - create your DB table, make your read/write functions, call session_set_save_hander(), etc. There are even several tutorials out there that offer sample implementations for you. But somehow all these tutorials have conveniently overlooked one tiny detail about session persisters - locking. And now that's where the real fun starts!

I looked at the implementation of session_mysql PECL extension of PHP. That uses MySQL's functions get_lock() and release_lock(). Seems nice, but I don't like the way it's doing it. The lock is acquired in the read function, and released in the write function. But what if the write function never gets called? What if the script somehow crashes, but the MySQL connection stays open (due to pooling or something)? Or what if it the script enters a deadly deadlock?

I just had a problem where a script opened a session and then tried to flock() a file over an NFS share, while the other computer (that hosted the file) was also doing the same thing. The result was that the flock()-over-NFS call was blocking the script for about 30 seconds on each call. And it was in a loop of 20 iterations! Since that was an external operation, PHP's script timeouts didn't apply, and the session got locked for over 10 minutes every time this script was accessed. And, as luck would have it, this was the script that got polled by an AJAX shoutbox every 5 seconds... Major showstopper.

I already have some ideas on how to implement it in a better way, but I would really like to hear what other people suggest. I haven't had that much experience with PHP to know what subtle edge cases loom in the shadows which could one day jeopardize the whole thing.


Added:

OK, seems that nobody has anything to suggest. OK then, here's my idea. I'd like some opinon on where this could go wrong.

  1. Create a session table with InnoDB storage engine. This should ensure some proper locking of rows even under clustered scenarios. The table should have the columns ID, Data, LastAccessTime, LockTime, LockID. I'm omitting the datatypes here because they follow quite directly from the data that needs to be stored in them. The ID will be the ID of the PHP session. Data will of course contain the session data. LastAccessTime will be a timestamp which will be updated on each read/write operation and will be used by GC to delete old sessions. LockTime will be a timestamp of the last lock that was acquired on the session, and LockID will be a GUID of the lock.
  2. When a read operation is requested, there will be the following actions taken:
    1. Execute INSERT IGNORE INTO sessions (id, data, lastaccesstime, locktime, lockid) values ($sessid, null, now(), null, null); - this will create the session row if it is not there, but do nothing if it is already present;
    2. Generate a random lock id in the variable $guid;
    3. Execute UPDATE sessions SET (lastaccesstime, locktime, lockid) values (now(), now(), $guid) where id=$sessid and (lockid is null or locktime < date_add(now(), INTERVAL -30 seconds)); - this is an atomic operation which will either obtain a lock on the session row (if it's not locked or the lock is expired), or will do nothing.
    4. Check with mysql_affected_rows() if the lock was obtained or not. If it was obtained - proceed. If not - re-attempt the operation every 0.5 seconds. If in 40 seconds the lock is still not obtained, throw an exception.
  3. When a write operation is requested, execute UPDATE sessions SET (lastaccesstime, data, locktime, lockid) values (now(), $data, null, null) where id=$sessid and lockid=$guid; This is another atomic operation which will update the session row with the new data and remove the lock if it still has the lock, but do nothing if the lock was already taken away.
  4. When a gc operation is requested, simply delete all rows with lastaccesstime too old.

Can anyone see flaws with this?

like image 991
Vilx- Avatar asked Jun 20 '09 20:06

Vilx-


2 Answers

Ok. The answer is going to be a bit longer - so patience! 1) Whatever I am going to write is based on the experiments I have done over last couple of days. There may be some knobs/settings/inner working I may not be aware of. If you spot mistakes/ or do not agree then please shout!

2) First clarification - WHEN SESSION DATA is READ and WRITTEN

The session data is going to be read exactly once even if you have multiple $_SESSION reads inside your script. The read from session is a on a per script basis. Moreover the data fetch happens based on the session_id and not keys.

2) Second clarification - WRITE ALWAYS CALLED AT END OF SCRIPT

A) The write to session save_set_handler is always fired, even for scripts that only "read" from session and never do any writes. B) The write is only fired once, at the end of the script or if you explicitly call session_write_close. Again, the write is based on session_id and not keys

3) Third Clarification : WHY WE NEED Locking

  • What is this fuss all about?
  • Do we really need locks on session?
  • Do we really Need a Big Lock wrapping READ + WRITE

To explain the Fuss

Script1

  • 1: $x = S_SESSION["X"];
  • 2: sleep(20);
  • 3: if($x == 1 ) {
  • 4: //do something
  • 5: $_SESSION["X"] = 3 ;
  • 6: }
  • 4: exit;

Script 2

  • 1: $x = $_SESSION["X"];
  • 2: if($x == 1 ) { $_SESSION["X"] = 2 ; }
  • 3: exit ;

The inconsistency is that script 1 is doing something based on a session variable (line:3) value that has changed in by another script while script-1 was already running. This is a skeleton example but it illustrates the point. The fact that you are taking decisions based on something that is no longer TRUE.

when you are using PHP default session locking (Request Level locking) script2 will block on line 1 because it cannot read from the file that script 1 started reading at line1. So the requests to session data are serialized. When script2 reads a value, it is guaranteed to read the new value.

Clarification 4: PHP SESSION SYNCHRONIZATION IS DIFFERENT FROM VARIABLE SYNCHRONIZATION

Lot of people talk about PHP session synchronization as if it is like a variable synchronization, the write to memory location happening as soon as you overwrite variable value and the next read in any script will fetch the new value. As we see from CLARIFICATION #1 - That is not true. The script uses the values read at the start of the script throughout the script and even if some other script has changed the values, the running script will not know about new values till next refresh. This is a very important point.

Also, keep in mind that values in session changes even with PHP big locking. Saying things like, "script that finishes first will overwrite value" is not very accurate. Value change is not bad, what we are after is inconsistency, namely, it should not change without my knowledge.

CLARIFICATION 5: Do we REALLY NEED BIG LOCK?

Now, do we really need Big Lock (request level)? The answer, as in the case of DB isolation, is that it depends on how you want to do things. With the default implementation of $_SESSION, IMHO, only the big lock makes sense. If I am going to the use the value that I read at the beginning throughout my script then only the big lock makes sense. If I change the $_SESSION implementation to "always" fetch "fresh" value then you do not need BIG LOCK.

Suppose we implement a session data versioning scheme like object versioning. Now, script 2 write will succeed because script-1 has not come to write point yet. script-2 writes to session store and increments version by 1. Now, when script 1 tries to write to session, it will fail (line:5) - I do not think this is desirable, though doable.

===================================

From (1) and (2), it follows that no matter how complicated your script, with X reads and Y writes to session,

  • the session handler read() and write() methods are only called once
  • and they are always called

Now, there are custom PHP session handlers on net that try to do a "variable"-level locking etc. I am still trying to figure some of them. However I am not in favor of complex schemes.

Assuming that PHP scripts with $_SESSION are supposed to be serving web pages and are processed in milli-seconds, I do not think the additional complexity is worth it. Like Peter Zaitsev mentions here, a select for update with commit after write should do the trick.

Here I am including the code that I wrote to implement locking. It would be nice to test it with some "Race simulation" scripts. I believe it should work. There are not many correct implementations I found on net. It would be good if you can point out the mistakes. I did this with bare mysqli.

<?php
namespace com\indigloo\core {

    use \com\indigloo\Configuration as Config;
    use \com\indigloo\Logger as Logger;

    /*
     * @todo - examine row level locking between read() and write()
     *
     */
    class MySQLSession {

        private $mysqli ;

        function __construct() {

        }

        function open($path,$name) {
            $this->mysqli = new \mysqli(Config::getInstance()->get_value("mysql.host"),
                            Config::getInstance()->get_value("mysql.user"),
                            Config::getInstance()->get_value("mysql.password"),
                            Config::getInstance()->get_value("mysql.database")); 

            if (mysqli_connect_errno ()) {
                trigger_error(mysqli_connect_error(), E_USER_ERROR);
                exit(1);
            }

            //remove old sessions
            $this->gc(1440);

            return TRUE ;
        }

        function close() {
            $this->mysqli->close();
            $this->mysqli = null;
            return TRUE ;
        }

        function read($sessionId) {
            Logger::getInstance()->info("reading session data from DB");
            //start Tx
            $this->mysqli->query("START TRANSACTION"); 
            $sql = " select data from sc_php_session where session_id = '%s'  for update ";
            $sessionId = $this->mysqli->real_escape_string($sessionId);
            $sql = sprintf($sql,$sessionId);

            $result = $this->mysqli->query($sql);
            $data = '' ;

            if ($result) {
                $record = $result->fetch_array(MYSQLI_ASSOC);
                $data = $record['data'];
            } 

            $result->free();
            return $data ;

        }

        function write($sessionId,$data) {

            $sessionId = $this->mysqli->real_escape_string($sessionId);
            $data = $this->mysqli->real_escape_string($data);

            $sql = "REPLACE INTO sc_php_session(session_id,data,updated_on) VALUES('%s', '%s', now())" ;
            $sql = sprintf($sql,$sessionId, $data);

            $stmt = $this->mysqli->prepare($sql);
            if ($stmt) {
                $stmt->execute();
                $stmt->close();
            } else {
                trigger_error($this->mysqli->error, E_USER_ERROR);
            }
            //end Tx
            $this->mysqli->query("COMMIT"); 
            Logger::getInstance()->info("wrote session data to DB");

        }

        function destroy($sessionId) {
            $sessionId = $this->mysqli->real_escape_string($sessionId);
            $sql = "DELETE FROM sc_php_session WHERE session_id = '%s' ";
            $sql = sprintf($sql,$sessionId);

            $stmt = $this->mysqli->prepare($sql);
            if ($stmt) {
                $stmt->execute();
                $stmt->close();
            } else {
                trigger_error($this->mysqli->error, E_USER_ERROR);
            }
        }

        /* 
         * @param $age - number in seconds set by session.gc_maxlifetime value
         * default is 1440 or 24 mins.
         *
         */
        function gc($age) {
            $sql = "DELETE FROM sc_php_session WHERE updated_on < (now() - INTERVAL %d SECOND) ";
            $sql = sprintf($sql,$age);
            $stmt = $this->mysqli->prepare($sql);
            if ($stmt) {
                $stmt->execute();
                $stmt->close();
            } else {
                trigger_error($this->mysqli->error, E_USER_ERROR);
            }

        }

    }
}
?>

To register the object session Handler,

$sessionHandler = new \com\indigloo\core\MySQLSession();
session_set_save_handler(array($sessionHandler,"open"),
                            array($sessionHandler,"close"),
                            array($sessionHandler,"read"),
                            array($sessionHandler,"write"),
                            array($sessionHandler,"destroy"),
                            array($sessionHandler,"gc"));

ini_set('session_use_cookies',1);
//Defaults to 1 (enabled) since PHP 5.3.0
//no passing of sessionID in URL
ini_set('session.use_only_cookies',1);
// the following prevents unexpected effects 
// when using objects as save handlers
// @see http://php.net/manual/en/function.session-set-save-handler.php 
register_shutdown_function('session_write_close');
session_start();

Here is another version done with PDO. This one checks for existence of sessionId and does update or Insert. I have also removed the gc function from open() as it unnecessarily fires a SQL query on each page load. The stale session cleanup can easily be done via a cron script. This should be the version to use if you are on PHP 5.x. Let me know if you find any bugs!

=========================================

namespace com\indigloo\core {

    use \com\indigloo\Configuration as Config;
    use \com\indigloo\mysql\PDOWrapper;
    use \com\indigloo\Logger as Logger;

    /*
     * custom session handler to store PHP session data into mysql DB
     * we use a -select for update- row leve lock 
     *
     */
    class MySQLSession {

        private $dbh ;

        function __construct() {

        }

        function open($path,$name) {
            $this->dbh = PDOWrapper::getHandle();
            return TRUE ;
        }

        function close() {
            $this->dbh = null;
            return TRUE ;
        }

        function read($sessionId) {
            //start Tx
            $this->dbh->beginTransaction(); 
            $sql = " select data from sc_php_session where session_id = :session_id  for update ";
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":session_id",$sessionId, \PDO::PARAM_STR);
            $stmt->execute();
            $result = $stmt->fetch(\PDO::FETCH_ASSOC);
            $data = '' ;
            if($result) {
                $data = $result['data'];
            }

            return $data ;
        }

        function write($sessionId,$data) {

            $sql = " select count(session_id) as total from sc_php_session where session_id = :session_id" ;
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":session_id",$sessionId, \PDO::PARAM_STR);
            $stmt->execute();
            $result = $stmt->fetch(\PDO::FETCH_ASSOC);
            $total = $result['total'];

            if($total > 0) {
                //existing session
                $sql2 = " update sc_php_session set data = :data, updated_on = now() where session_id = :session_id" ;
            } else {
                $sql2 = "insert INTO sc_php_session(session_id,data,updated_on) VALUES(:session_id, :data, now())" ;
            }

            $stmt2 = $this->dbh->prepare($sql2);
            $stmt2->bindParam(":session_id",$sessionId, \PDO::PARAM_STR);
            $stmt2->bindParam(":data",$data, \PDO::PARAM_STR);
            $stmt2->execute();

            //end Tx
            $this->dbh->commit(); 
        }

        /*
         * destroy is called via session_destroy
         * However it is better to clear the stale sessions via a CRON script
         */

        function destroy($sessionId) {
            $sql = "DELETE FROM sc_php_session WHERE session_id = :session_id ";
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":session_id",$sessionId, \PDO::PARAM_STR);
            $stmt->execute();

        }

        /* 
         * @param $age - number in seconds set by session.gc_maxlifetime value
         * default is 1440 or 24 mins.
         *
         */
        function gc($age) {
            $sql = "DELETE FROM sc_php_session WHERE updated_on < (now() - INTERVAL :age SECOND) ";
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":age",$age, \PDO::PARAM_INT);
            $stmt->execute();
        }

    }
}
?>
like image 198
rjha94 Avatar answered Nov 02 '22 07:11

rjha94


I just wanted to add (and you may already know) that PHP's default session storage (which uses files) does lock the sessions files. Obviously using files for sessions has plenty of shortcomings which is probably why you are looking at a database solution.

like image 30
pifantastic Avatar answered Nov 02 '22 07:11

pifantastic