Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

File vs database for storage efficiency in chat app

I'm working on a simple AJAX chat add-on for my PHP application so I can provide real-time support to my users. I'm currently using a MySQL database storing the text, timestamp and user_id of the person who is chatting. I got to thinking about how I could optimize my chat and I got to thinking about removing the need for the SQL database.

My question, would it be more efficient to use fwrite() to append extra data to a PHP file to store the same information rather than creating an SQL connection to retrieve new posts to the chat? I know how I would accomplish this effectively, I'm just trying to figure out which way would be more efficient.

I've looked a little bit at SQLite as well; would that be a better alternative than using a MySQL database?

like image 293
Phillip Avatar asked Jun 11 '11 07:06

Phillip


2 Answers

Database Management Systems (DBMS) exists because it's not as easy as it seems to store and access data the right way.

Storing data in a file implies access concurrency issues. When the file will grow bigger, you will have to face important memory usage or write a lot of code to load just what you need. It will also be quite hard to do basic operations like filtering (SQL WHERE clause) or updating a row. BTW, changing the data structure promises to be error-prone. I simpler words: you will have to write a lot of code and face a lot of bugs.

IMO, not using any kind of DBMS is recreating the wheel. However choosing the right one is important.

like image 96
SandRock Avatar answered Oct 13 '22 03:10

SandRock


Ill throw my hat in the ring here even though this is an old question. For speed, reliability, and ease of use a DB is the obvious easy choice... With one major caveat that a lot of people are overlooking, and that is most shared hosts (the most common form of web hosting) only allow 15 or so connections at once, even VPS's usually only allow 100-200, dedicated being 500 or more. What that means is if you have (n) users pooling every (s) seconds those connections are going to get eaten up fast, even faster if you are also running any sort of CMS. Being in the middle of developing my own chat room code on a VPS I am also facing these issues myself.

So far my method has been this.

  • Make sure to pass a lastMessageReceived variable to limit the response.
  • If a public chatroom pass a timestamp filter as well as the above
  • if at all possible use a DB caching engine such as MySQLnd with query caching enabled and TTL set to whatever your pooling rate is.
  • Don't get crazy with your pooling rate 1-2 second intervals may seem neat and snappy, but it'll kill your connection count. Dropping that to 5s or even more wont really make a huge difference and users probably wont notice, and your server load will be much lighter. Even consider a variable pooling rate that raises itself during high load.
  • Write your ajax to use timeout instead of interval for its pooling and place the timeout call in the ajax success callback, this prevents the requests from stacking up during peak ussage.
  • And the biggest one, if using a shared chat room with many users, write your own code to cache the SQL query into a json file and serve that to ajax requests, and write some custom TTL code to check its age and re-populate it as needed during requests, CRON would be great here if your host allows. Age checking a file and redirecting an AJAX request to it is a higher level function with very little server overhead especially compared to querying a DB. And don't parse the file in PHP looking to filter out old messages, store the file with the first message in the filename such as chat_243.json and save it as already formatted json, then just serve up the entire file if a request comes into the php with lastMessageReceived = 243. Since this will create several files you will need a function to clean up files older than (m) minutes, but this is also light duty work for the server.

There are also options like DB engines designed for chat and sockets (node.js), but those require more server tweaking than typical hosting accounts allow, for my purposes I have been writing my chat-room keeping in mind the idea that it may get deployed to a shared server at some point.

Here is the code I am using currently, and it has pretty much allowed me to expand my chat to an unlimited number of connection (within the servers bandwidth)

$cacheFile = 'cache/chat_'.$_GET['last'].'.json';

if (file_exists($cacheFile) && filemtime($cacheFile) + QUERY_REFRESH_RATE > time())
{
    readfile($cacheFile);
} else {
    require_once("../../../../wp-load.php");
    $timestampMin = gmdate("Y-m-d H:i:s", (time() - 7200));

    $sql= "/*qc=on*/" . "SELECT * FROM ". DB_TABLE ."chat_posts WHERE ID > ". $_GET['last'] . " AND timestamp > '".$timestampMin."' ORDER BY ID;";
    $posts = $wpdb->get_results($sql);

    $json = json_encode($posts);
    echo $json;
    file_put_contents($cacheFile,$json);
}
like image 36
Wobbles Avatar answered Oct 13 '22 05:10

Wobbles