I am working on a system that will store more than 4 million records per day.
To reduce I/O and increase speed I changed the storage from a database to files, so each record is converted to JSON and written directly to a file.
The system is a PPC (pay-per-click) system written in PHP that shows banners on several sites, each with their own servers, through an iframe.
Whenever a banner loads on any site, I store one record of its info in a file (this was an insert into the database before) and update two fields in two tables in the database.
When traffic rises to almost 3,000 visits per minute, iframe loading slows down significantly, and the iframe sometimes even shows a server timeout.
I'm looking for ways to reduce resource usage, increase loading speed, and prevent timeouts.
Any help will be highly appreciated...
There are various ways to tackle this situation. One is to use JavaScript to load the iframe content only after the rest of the page has loaded: an onload handler on the body tag fires once the page content in the html tag has loaded, and only then do we start loading the iframe content.
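A minimal sketch of that approach (the banner URL, element IDs, and dimensions are placeholders, not taken from the question):

<!-- The page's own content renders first; the banner iframe is
     injected only after the body's onload event fires. -->
<body onload="loadBanner()">
  <div id="banner-slot"></div>
  <script>
    function loadBanner() {
      var frame = document.createElement('iframe');
      frame.src = 'https://ads.example.com/banner.php?zone=1'; // placeholder banner endpoint
      frame.width = 728;  // placeholder dimensions
      frame.height = 90;
      document.getElementById('banner-slot').appendChild(frame);
    }
  </script>
</body>

This keeps the hosting page responsive even when the banner server is slow, because the iframe request no longer competes with the page's own resources.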
Every iframe on a page increases the memory used as well as other computing resources like your bandwidth. So you should not use iframes excessively without monitoring what's going on, or you might end up harming your page performance.
Content inside iFrames tends to neither help nor hurt your search engine ranking, since search engines don't treat it as your page's own content. For this reason, it's best to refrain from using iFrames on main pages that you want to rank high in search engine results. Instead, fill high-priority pages with useful, unique content and save iFrames for other pages.
iframes should load asynchronously without any effort on your part.
With 3k requests a minute, and with hope of growth, you need to start utilizing big data architecture and tools.
Here are some broad picture highlights to consider:
- A CDN to store and serve the images.
- MapReduce software to store the data, such as Hadoop.
- A distributed architecture, as opposed to one huge server.
- A load balancing server.

At this particular point in the size of your application I would focus on two specific improvements to your infrastructure and code logic.
By separating these two concerns you will be able to provide more consistent performance for the customers of your PPC service. If you aren't already, using a CDN or otherwise offloading the demand of serving the images themselves from your servers can help improve response time considerably.
Another area that will give you large gains is separating the serving of banner code from the processes that store the impression data to disk. There are a number of ways to do this, but a successful solution I have had experience with is utilizing ActiveMQ (http://activemq.apache.org/) or a similar queuing system. A queuing system will help balance your impression storage load over time by storing impression data in memory and sending off those data points at a consistent rate to consumer (aka worker) processes that can store that data into a DB or other storage medium. This allows the workload of actually storing impressions on disk to be separated from the process of serving ads. You can also set up multiple processes to consume the queued up jobs, which leads to the second area of improvement.
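As a rough illustration, here is how the two halves might look in PHP using the PECL Stomp extension against an ActiveMQ broker (the broker address, queue name, and the storeImpression() helper are assumptions for this sketch):

<?php
// producer.php -- runs inside the ad-serving request: enqueue the
// impression and return immediately, instead of writing storage in-line.
$stomp = new Stomp('tcp://localhost:61613'); // assumed broker address
$impression = [
    'ad_id' => $adID, // set by the serving code
    'site'  => $_SERVER['HTTP_REFERER'] ?? '',
    'ts'    => time(),
];
$stomp->send('/queue/impressions', json_encode($impression), ['persistent' => 'true']);

<?php
// consumer.php -- a separate long-running worker: drain the queue at a
// steady rate and persist each impression to the DB or file storage.
$stomp = new Stomp('tcp://localhost:61613');
$stomp->subscribe('/queue/impressions', ['ack' => 'client']);
while (true) {
    $frame = $stomp->readFrame();
    if ($frame !== false) {
        storeImpression(json_decode($frame->body, true)); // hypothetical storage helper
        $stomp->ack($frame); // acknowledge only after a successful write
    }
}

Because the consumer is the only process touching storage, you can tune how fast impressions are flushed to disk independently of how fast ads are served.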
Building a horizontally scalable solution basically means that instead of needing to increase the size and power of a single server, you can just add additional smaller servers that will evenly share the workload of the system demands. This has multiple advantages, one being that it is easier (and usually cheaper) to add a few more small servers to a pool than to upgrade one big server to be larger and more powerful. It also has the advantage of being more robust in the event that a server fails.
In this case I think a good solution would be to have one server or process acting as a router, which will just load balance requests by sending them off to different servers that do the actual processing of the request. There are a lot of good resources on building routing or load balancing scripts in PHP out there on the internet, but basically you will receive requests at one endpoint and then send each request to a different server to actually be fulfilled. If you build a dynamic list of servers that are ready to receive requests, then you can easily increase the number of servers fulfilling requests when you start to see unacceptable performance. This also gives you the benefit of being able to easily remove a server from the list if it goes down, so that any traffic just gets routed to a different server that is still up.
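A very small sketch of that idea in PHP (the backend host names are made up, and a real router would also health-check its list):

<?php
// router.php: the single public endpoint; it does no real work itself,
// it only forwards each request to one of the fulfillment servers.
$backends = [
    'http://ad1.example.com',
    'http://ad2.example.com',
    'http://ad3.example.com',
];

// Random selection spreads load roughly evenly; in production the list
// would be rebuilt dynamically so dead servers drop out.
$backend = $backends[array_rand($backends)];

// 302-redirect the iframe request to the chosen backend.
header('Location: ' . $backend . $_SERVER['REQUEST_URI'], true, 302);
exit;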
If you haven't already, it would be good to look into lighttpd (http://www.lighttpd.net/) or nginx (https://www.nginx.com/) as alternatives to Apache which are built to be able to handle large volumes of requests with less overhead. These would be especially well suited to handling the requests on your router server.
Once you have horizontal scaling set up for requests, it would be fairly simple to set up horizontal scaling for storage servers as well. You can do this by modding an ID by the number of servers in the pool to determine where to send the request:
$serverNumber = $adID % $availableServers; // $availableServers = the number of servers in the pool
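For instance (host names are hypothetical):

<?php
$storageServers = ['store1.internal', 'store2.internal', 'store3.internal'];

// The same ad ID always maps to the same server, so all data for one
// ad lives in one predictable place.
function pickStorageServer($adID, array $servers)
{
    return $servers[$adID % count($servers)];
}

$host = pickStorageServer(42, $storageServers); // 42 % 3 = 0 -> "store1.internal"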
Although you can definitely see good performance improvements by optimizing storage methods and server tuning, at some point in a large application you will want to be able to add additional servers to get the job done. I think with the above steps you will be in a very good place to scale your application smoothly as it grows in size.