People often talk about 'heavy load' in their optimization- and performance-related questions and answers.
I'm trying to quantify this for a regular web application on a typical server (take SO and its fairly small infrastructure as an example) in terms of requests per minute, assuming the requests return immediately (to simplify things and take database speeds etc. out of the equation).
I'm looking for a nominal number/range, not 'where the CPU maxes out' or similar. A rough approximation would be great (e.g. >5000/min). Thank you!
If you know the number of concurrent users at any given time, the response time of their requests, and the average user think time, then you can calculate the number of requests per minute. Typically, you start by estimating the number of concurrent users on the system.
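As an illustration, here is a minimal sketch of that calculation (essentially Little's Law); all the figures are made-up assumptions, not measurements:

    # Minimal sketch of the calculation above (Little's Law).
    # All numbers are illustrative assumptions, not measurements.
    concurrent_users = 300    # users on the system at any given time
    response_time_s = 0.5     # average response time per request
    think_time_s = 10.0       # average user think time between requests

    # Each user issues one request every (response + think) seconds,
    # so throughput = users / cycle time, scaled to a minute.
    requests_per_minute = concurrent_users * 60 / (response_time_s + think_time_s)
    print(round(requests_per_minute))  # -> 1714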
Average: 200-300 connections per second, spiking to 800 connections per second.
As a rule of thumb, a web server can handle around 250 concurrent requests per CPU core, so with 2 CPU cores your server can handle about 500 visitors at the same time. Getting the balance right between performance and cost is crucial as your site grows in popularity.
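As a quick sketch of that rule of thumb (the 250-requests-per-core figure is this answer's heuristic, not a guarantee):

    # Rule-of-thumb capacity estimate: ~250 concurrent requests per
    # CPU core, per the answer above (a heuristic, not a guarantee).
    def concurrent_capacity(cores: int, per_core: int = 250) -> int:
        return cores * per_core

    print(concurrent_capacity(2))  # -> 500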
Given that you don't want a hardware load measure (CPU, memory, I/O utilization), I would say the proper answer is that heavy load is a rate of requests per unit of time at or above the required maximum rate.
That required maximum rate is whatever has been defined with the customer, or with whoever is in charge of the overall architecture.
Say X is that required maximum load for the application. I think something like this would approximate the answer:
0 < Light Load < X/2 < Regular Load < 2X/3 < High Load < X <= Heavy Load
The thing with a single number pulled out of thin air is that it has no relation whatsoever to your application. What heavy load is, is totally, absolutely, ineluctably tied to what the application is supposed to do.
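To make the thresholds concrete, here is a minimal sketch of that classification as code; the function name and sample numbers are my own assumptions:

    # Sketch of the thresholds above, where x is the required maximum
    # load agreed with the customer (numbers are assumptions).
    def classify_load(rpm: float, x: float) -> str:
        """Classify a requests-per-minute rate against the required maximum x."""
        if rpm < x / 2:
            return "light"
        if rpm < 2 * x / 3:
            return "regular"
        if rpm < x:
            return "high"
        return "heavy"

    print(classify_load(4000, x=5000))  # -> high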
For context, 200 requests per second (~12,000 a minute) is already a load that would keep small web servers busy.
The out-of-the-box limit on open connections for most servers is usually around 256 or fewer, ergo roughly 256 requests per second. You can push it up to 2000-5000 for ping requests, or to 500-1000 for lightweight requests. Pushing it even higher is very difficult and requires changes all the way through the network, hardware, OS, server application, and user application (see the C10k problem).
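To make the 256 figure concrete, here is a minimal sketch of one place such a limit appears: the backlog argument to listen(). On Linux the effective value is also clamped by kernel settings such as net.core.somaxconn. The port number here is an arbitrary assumption:

    import socket

    # The listen() backlog caps how many pending connections the
    # kernel will queue for this socket before refusing new ones.
    # Historically the default/clamped value was around 128-256.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 8080))
    srv.listen(256)  # ask the kernel for a queue of 256 pending connections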
Seek time plus latency for HDDs is around 1-10 ms; for SSDs it's 0.1-1 ms. So that's 100-100,000 IOPS. Let's take 100,000 as the top value (SSD sequential write). A connection usually stays open for at least one latency interval, and latency from client to server is rarely below 50-100 ms, so only 100,000 / 50 = 2,000 IOPS are left for creating new connections.
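A back-of-the-envelope sketch of that estimate, mirroring the answer's own arithmetic and assumed figures:

    # Back-of-the-envelope version of the estimate above, using the
    # answer's assumed figures (100,000 IOPS, 50 ms minimum latency).
    iops = 100_000     # top-end SSD sequential-write IOPS
    latency_ms = 50    # connections stay open at least this long

    # Each new connection ties up I/O for at least one latency
    # interval, so divide total IOPS by the latency in milliseconds.
    new_connections_per_sec = iops / latency_ms
    print(new_connections_per_sec)  # -> 2000.0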
So, 2,000 ping requests per second from different clients is a baseline upper limit for a normal server. It can be improved by using a RAM disk or adding more SSDs to increase IOPS, routing requests to reduce ping, changing or modifying the OS to reduce kernel overhead, and so on. It's usually also higher because many requests come from the same client (connection) and the total number of clients is limited; under good conditions it can go up to hundreds of thousands of requests per second.
On the other hand, higher ping, application execution time, and OS and hardware imperfections can easily reduce the base value to a few hundred requests per second. Also, typical web servers and applications are usually not very well suited for high-level optimization, so Vinko Vrsalovic's suggestion of 200 is pretty realistic.