I have done a lot of research on caching data in files (serialize/unserialize vs. json_encode/decode, var_export, igbinary) and on MySQL queries (optimizations, stored procedures, query cache), but at this point I wonder what is the best way to optimize a concrete case like the following.
Sorry in advance: this is a long topic for what is probably a small answer, but it is necessary to understand the project. And excuse my poor English, which is not my first language.
Let's imagine that we have these database relationships.
Description of the database (estimated number of records in parentheses):
So we have several relationships, and each is recorded at ITEM creation/update.
I have already cached ITEM data in folders and files, as in the following example:
public function cacheItem()
{
    $req = mysql_query("SELECT id, title, content, id_mod, id_cat
                        FROM ITEM
                        WHERE ITEM.id='".$this->id."'") or die(mysql_error());
    if (mysql_num_rows($req) == 1) {
        $this->itemData = mysql_fetch_array($req);
        $this->folder = floor($this->id / 1000); // 1000 items max per folder
        if (!is_dir($this->folder)) {
            mkdir($this->folder); // make sure the folder exists before writing
        }
        $this->itemUrl = $this->folder."/".$this->id.".txt";
        // file_put_contents() creates the file itself, so no touch() is needed
        file_put_contents($this->itemUrl, serialize($this->itemData), LOCK_EX);
    }
}
And I get them back with unserialize(file_get_contents($url)); this part works like a charm!
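One detail worth noting: file_get_contents() on its own does not take any lock, while file_put_contents(..., LOCK_EX) only takes an advisory one, so a reader can still see a half-written file. A minimal sketch of a locked read (the function name is mine, not from the question):

```php
<?php
// Sketch: read a cached item under a shared lock, so the read blocks while
// another process still holds LOCK_EX from file_put_contents().
function readItemCache($itemUrl)
{
    $fp = fopen($itemUrl, 'rb');
    if ($fp === false) {
        return false; // cache miss
    }
    flock($fp, LOCK_SH);             // waits for any writer holding LOCK_EX
    $raw = stream_get_contents($fp); // read the whole file
    flock($fp, LOCK_UN);
    fclose($fp);
    return unserialize($raw);
}

// usage
$file = sys_get_temp_dir() . '/item_42.txt';
file_put_contents($file, serialize(array('id' => 42, 'title' => 'demo')), LOCK_EX);
$data = readItemCache($file);
```

Note that flock() is advisory: it only protects readers that also call flock(), which a plain file_get_contents() does not.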
Now I wish to optimize the lists of ITEMs so I can display them sorted by several options (for example), each display limited to 100 items per pagination page:
I already know how to do this in SQL and how to put the results in a cache tree.
My problem with those cache files is that when a new ITEM is created/updated, the lists may have to be refreshed immediately and consistently.
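To make that refresh concrete, one simple policy is to delete the cached list pages whenever an ITEM changes, so the next request rebuilds them. This is a hypothetical sketch (the function and file names are mine, mirroring the 1000-items-per-folder scheme above):

```php
<?php
// Sketch: invalidate every cached list page in a directory; the next
// request that misses the cache regenerates the pages from MySQL.
function invalidateListCache($listDir)
{
    foreach (glob($listDir . '/page_*.txt') as $page) {
        unlink($page);
    }
}

// usage: build two fake list pages, then invalidate them
$dir = sys_get_temp_dir() . '/lists_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/page_1.txt', serialize(array(1, 2, 3)), LOCK_EX);
file_put_contents($dir . '/page_2.txt', serialize(array(4, 5, 6)), LOCK_EX);
invalidateListCache($dir);
$left = count(glob($dir . '/page_*.txt'));
```

Deleting rather than rewriting keeps the invalidation path cheap; the cost of rebuilding is paid lazily by the next reader.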
First question:
So what will happen if ITEMs are created/updated (and therefore those lists too) at the same time? Will the LOCK_EX of file_put_contents() do its job while files are being read with file_get_contents()?
Second question:
I understand that the more work PHP does, the less MySQL does (and the other way around), but what is the better (faster to display) way to build those paginated lists, which will be read every second or more often, and modified only when an ITEM is added or updated?
Any ideas, examples, links greatly appreciated.
P.S.: just for fun, I may ask "how does Facebook do it?" and "how does Stack Overflow do it?"
In-memory data lookup: if you have a mobile/web app front end, you might want to cache information like user profiles, some historical/static data, or some API responses, depending on your use case. Caching helps store such data.
In-memory data caching can be one of the most effective strategies to improve your overall application performance and to reduce your database costs. Caching can be applied to any type of database, including relational databases such as Amazon RDS, or NoSQL databases such as Amazon DynamoDB, MongoDB, and Apache Cassandra.
You can optimize your database caching by tweaking the schema. Indexing in a database is one way of optimizing database reads. The database also has a working set of data in-memory to handle frequent requests to the same data. Server caching is the custom caching of data in a server application.
A proper caching strategy includes effective use of both write-through and lazy loading of your data and setting an appropriate expiration for the data to keep it relevant and lean.
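The lazy-loading half of that strategy can be sketched in a few lines of PHP. This is only an illustration; cachedFetch(), the file layout, and the TTL are my assumptions, not a real API:

```php
<?php
// Sketch of cache-aside ("lazy loading") with an expiration: serve from a
// serialized file while it is fresh, otherwise reload and rewrite it.
function cachedFetch($key, $ttl, $loadFromDatabase)
{
    $file = sys_get_temp_dir() . '/cache_' . md5($key) . '.txt';
    if (file_exists($file) && (time() - filemtime($file)) < $ttl) {
        return unserialize(file_get_contents($file)); // cache hit
    }
    $value = $loadFromDatabase($key);                 // miss: lazy load
    file_put_contents($file, serialize($value), LOCK_EX);
    return $value;
}

// usage: the closure stands in for a real MySQL query
$calls = 0;
$loader = function ($key) use (&$calls) {
    $calls++;
    return strtoupper($key);
};
$key = uniqid('item_');
$a = cachedFetch($key, 60, $loader); // miss: calls the loader
$b = cachedFetch($key, 60, $loader); // hit: served from the file
```

The TTL keeps stale entries from living forever; write-through would instead update the file at the moment the ITEM itself is written.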
First question:
Your operations should be fine with LOCK_EX. The files may get locked if accessed simultaneously, which will definitely slow things down, but all operations should complete correctly. However, this is a good example of why you should not implement your own cache system.
Second question:
MySQL will definitely be faster than your cache system (unless you do some seriously wicked coding, and not in PHP). Databases like MySQL have put a lot of work into optimizing their performance.
I don't believe that stored procedures in MySQL will offer you any real benefit over plain old SELECT queries in the examples provided above.
Using a NoSQL approach like MongoDB can help if you use sharding on a server cluster. This is more difficult to write, and more servers cost more money. Also, it is not clear from your question whether moving to a different database system is an option.
If you stick with MySQL, it is probably easier to implement load-balanced application servers than a database server cluster. With this in mind, more work done in PHP is preferable to more work in MySQL. I would not follow this approach, though, because you would be giving up much for only a small benefit.
In short, I recommend that you stick to plain SELECT queries to get what you need. Run your application and database on separate servers, and use the more powerful server as your DB server.
PS. Facebook wrote a pre-compiler for PHP (HipHop) to make their code run faster. In my opinion, PHP is not a very fast language, and you can get better results from Python or Node.js.
Stack Overflow uses ASP.NET MVC with MS SQL Server. They have a single big, powerful database server and apparently prefer DB queries where they can. They also use load-balanced application servers that are separate from their DB server.