 

Stress testing web applications on less capable hardware

My organization is having an interesting internal debate right now that raises a question that I would like to open to the community at large.

The issue at hand is our environment in which we do stress-testing, capacity-testing, performance-regression-testing, and the like.

On one side of the debate are some software engineers who would like this environment to mirror the production environment as much as possible, in the interest of making the results as meaningful as possible. While we currently do have an environment for such testing, it is far less capable than the production system, and these software engineers feel that they are reaching the limits of what they can learn from it.

On the other side of the debate are some network engineers who both administer the environments and control the purse-strings. While they concede that capacity-testing would be better in an environment that is a better replica of the production environment, they argue that – for the purposes of stress testing – a more modest environment would have the effect of magnifying performance bottlenecks, making them easier to discover and fix.

This finally brings us to the part that piqued my interest: one software engineer suggests that while a more modest stress environment will increase the likelihood that you will encounter some bottleneck, it does not necessarily follow that it would help you find the next bottleneck you may encounter in production. The scaling effect, he argues, may not be linear.

Is there merit to that point of view? If yes, then why? What are the sources of that nonlinearity?

There are a lot of moving parts involved here: a cluster of java application servers, a cluster of database servers, lots of dynamic content being generated for each HTTP hit.


Edit: I appreciate everybody's thoughts so far, but I was really hoping that someone would do more than re-affirm one side or the other and actually tackle the question of "why". If there is such nonlinearity, what gives rise to it? Better yet, it would be great if the reasons were expressed in terms of CPU, memory, bandwidth, latency, interactions between subsystems, what have you... TerryE, you have come the closest; you should re-post your comment as an answer for the bounty if no one else steps up.

asked Oct 22 '12 by pohl




2 Answers

Your software engineer is right, and I will take the point even further.

When you test an application's components, like a web service, to see their behaviour under load, it is understandable to use a less capable environment. You can find the bottlenecks around memory, I/O, and so on, and you will most probably find bugs and oversights such as out-of-memory errors and log files growing huge.
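A minimal sketch of that kind of component-level load test, in Java (the endpoint URL, request count, and thread count below are hypothetical placeholders; adjust them for your own service):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class ComponentLoadTest {
        public static void main(String[] args) throws Exception {
            // Hypothetical target and load shape; tune for your own service.
            URI target = URI.create("http://test-env.example.com/api/quote");
            int concurrency = 50;
            int totalRequests = 5_000;

            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(target).GET().build();
            ExecutorService pool = Executors.newFixedThreadPool(concurrency);
            List<Long> latenciesMs = Collections.synchronizedList(new ArrayList<>());
            CountDownLatch done = new CountDownLatch(totalRequests);

            for (int i = 0; i < totalRequests; i++) {
                pool.submit(() -> {
                    long start = System.nanoTime();
                    try {
                        client.send(request, HttpResponse.BodyHandlers.discarding());
                        latenciesMs.add((System.nanoTime() - start) / 1_000_000);
                    } catch (Exception e) {
                        // Failures under load are data too: log them, don't hide them.
                        System.err.println("request failed: " + e.getMessage());
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await();
            pool.shutdown();

            if (latenciesMs.isEmpty()) {
                System.out.println("no successful requests");
                return;
            }
            Collections.sort(latenciesMs);
            System.out.printf("ok: %d of %d requests%n", latenciesMs.size(), totalRequests);
            System.out.printf("p50: %d ms, p95: %d ms, p99: %d ms%n",
                    latenciesMs.get(latenciesMs.size() / 2),
                    latenciesMs.get((int) (latenciesMs.size() * 0.95)),
                    latenciesMs.get((int) (latenciesMs.size() * 0.99)));
        }
    }

Left running against a modest box, even a crude driver like this tends to surface the out-of-memory errors and runaway log files mentioned above.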

But when your application components are running as intended and you need to test the whole shebang, you need to test in the real environment.

When you run stress tests in an environment, you measure that environment's behaviour under load and that environment's bottlenecks. While these tests may provide valuable information, the information will not be about your production system. The bottlenecks you find might not be relevant to your real system, and you may spend precious development time fixing bugs that do not exist there. To learn about the bottlenecks you might really face, you should run your stress tests on your real production system (preferably before the grand opening).

answered Nov 24 '22 by ali köksal


The assumption of the network engineers is that the modest system is basically a scale model of the production system: the various characteristics of the production environment that affect software performance are mirrored in the modest system at lower levels, but in the same ratios. For instance, the CPU is not as fast, there is not quite as much memory, the storage is a bit slower, and so on, with all of these differences in similar ratios, such that if everything were magically multiplied by some factor, say 1.77, the resulting modest system would be exactly like the production system.

However, it is difficult for me to believe that the modest system is an exact scale model of the production system in all particulars.

Here is a specific example. Let's say that measurements on the production system indicate that CPU utilization, the percentage of time the CPU is not idle, is too high. So you put the software on the modest system, take measurements, and discover that on the modest system the CPU utilization is lower. An investigation reveals that the modest system has slower storage, so the CPU spends more time idle waiting for data transfers from storage to complete: the application is I/O bound on the modest system, whereas on the production system it is not. This difference arises because the modest machine is not an exact scale model of the production machine; its CPU ratio differs from its I/O transfer ratio.
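A rough way to see which side of that ratio a given box falls on is to sample CPU load while the workload runs. A sketch using the com.sun.management extension of OperatingSystemMXBean, which HotSpot-based JVMs provide (this is an assumption about your JVM; on others, OS tools such as vmstat or iostat serve the same purpose):

    import java.lang.management.ManagementFactory;

    public class CpuSampler {
        public static void main(String[] args) throws InterruptedException {
            // com.sun.management.OperatingSystemMXBean is a HotSpot extension,
            // not part of the standard java.lang.management API.
            com.sun.management.OperatingSystemMXBean os =
                    (com.sun.management.OperatingSystemMXBean)
                            ManagementFactory.getOperatingSystemMXBean();

            // Sample once a second while the load test runs against this machine.
            // (On very recent JDKs, getCpuLoad() replaces getSystemCpuLoad().)
            for (int i = 0; i < 60; i++) {
                double cpu = os.getSystemCpuLoad(); // 0.0..1.0; negative if unavailable
                System.out.printf("system CPU load: %.0f%%%n", cpu * 100);
                // Low CPU while throughput has flattened usually means the box is
                // I/O bound (waiting on disk or network) rather than CPU bound,
                // which is exactly the inversion described above.
                Thread.sleep(1000);
            }
        }
    }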

Another example would be the larger memory of the production environment allowing fewer page faults. When the software is loaded onto the more modest machine, there are more page faults because there is less physical memory. As the various applications page in and out, they begin to affect each other: pages of other applications are swapped out and then swapped back in again. On the production machine, with its larger memory, this cascading page-fault behavior is not seen because there is sufficient memory to hold more applications simultaneously.
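The same kind of cliff appears at every level of the memory hierarchy and is easy to demonstrate. A self-contained sketch that measures random-access cost as the working set grows (the sizes are arbitrary; where the steps fall depends on the machine's caches and RAM):

    import java.util.Random;

    public class WorkingSetDemo {
        public static void main(String[] args) {
            // Run with a large heap (e.g. -Xmx1g) so the biggest array fits.
            Random rnd = new Random(42);
            int accesses = 10_000_000;
            for (int mb = 1; mb <= 256; mb *= 4) {
                int n = mb * 1024 * 1024 / 4;      // ints per working set
                int[] data = new int[n];
                long sum = 0;
                long start = System.nanoTime();
                for (int i = 0; i < accesses; i++) {
                    sum += data[rnd.nextInt(n)];   // random access defeats prefetching
                }
                double nsPerAccess = (System.nanoTime() - start) / (double) accesses;
                System.out.printf("%4d MB working set: %.1f ns/access (sum=%d)%n",
                        mb, nsPerAccess, sum);
            }
            // Cost per access jumps in steps as the working set spills out of the
            // L1, L2, and L3 caches and eventually out of RAM; nothing here is linear.
        }
    }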

The point that I am really trying to make here is that a computer, with all its various parts and applications, is a complex, dynamic system. The idea that one computing environment is just a scale model of another is too simplistic an assumption. Using a modest system can certainly provide valuable data. However, once the gross adjustments have been made to the software and you are beginning to make more subtle, detailed adjustments, the differences in the environment can have a large impact on the results of the testing.
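One classic source of the nonlinearity the questioner asks about is queueing. In the textbook M/M/1 model, mean response time is R = S / (1 - U) for service time S and utilization U, so latency explodes as a machine approaches saturation rather than degrading in proportion to its speed. A few lines make the shape plain (the 10 ms service time is an arbitrary assumption):

    public class QueueingDemo {
        public static void main(String[] args) {
            double serviceTimeMs = 10.0; // assumed per-request service time
            // M/M/1 mean response time: R = S / (1 - U)
            for (double u : new double[]{0.50, 0.70, 0.90, 0.95, 0.99}) {
                System.out.printf("utilization %.0f%% -> response time %.0f ms%n",
                        u * 100, serviceTimeMs / (1 - u));
            }
            // 50% -> 20 ms, 90% -> 100 ms, 99% -> 1000 ms: losing headroom
            // multiplies latency rather than adding to it, so a scaled-down
            // box can hit a wall at loads the production system shrugs off.
        }
    }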

Some citations: "Computer Systems Are Dynamical Systems" by Todd Mytkowicz, Amer Diwan, and Elizabeth Bradley; "Bayesian Fault Detection and Diagnosis in Dynamic Systems" by Uri Lerner, Ronald Parr, Daphne Koller, and Gautam Biswas.

answered Nov 24 '22 by Richard Chambers