
How can I correlate pageviews with memory spikes?

I'm having some memory problems with an application, but it's difficult to figure out exactly where they come from. I have two sets of data:

Pageviews

  • The page that was requested
  • The time said page was requested

Memory use

  • The amount of memory being used
  • The time this memory use was recorded

I'd like to see exactly which pageviews are correlated with high memory usage. My guess is that I'll need a t-test of some kind to determine which pageviews are correlated with increased memory usage, but I'm not sure which kind of t-test to use. Can someone at least point me in the right direction?

Jason Baker asked Feb 28 '23 15:02



1 Answer

I would suggest constructing a dataset with two columns and one row per page. The first column would hold each page's proportion of the pageviews that occurred during the highest memory-usage times of the distribution, and the second would hold that same page's proportion of the pageviews across the rest of the memory distribution.
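As a rough illustration, the two-column dataset could be built along these lines in Python. This is only a sketch under stated assumptions: the function name `page_proportions`, the 0.9 quantile cutoff for "highest memory usage", and the rule that each pageview is attributed to the first memory sample taken at or after it are all illustrative choices of mine, not something the data itself dictates.

```python
from bisect import bisect_left
from collections import Counter


def page_proportions(pageviews, memory_samples, quantile=0.9):
    """Return {page: (share_in_high, share_in_rest)}.

    pageviews:      list of (timestamp, page)
    memory_samples: list of (timestamp, memory_used), sorted by timestamp
    """
    times = [t for t, _ in memory_samples]
    levels = sorted(m for _, m in memory_samples)
    # Assumed cutoff: the memory value at the chosen quantile.
    threshold = levels[min(int(quantile * len(levels)), len(levels) - 1)]

    high, rest = Counter(), Counter()
    for t, page in pageviews:
        # Assumption: attribute each pageview to the first memory
        # sample recorded at or after the request time.
        i = bisect_left(times, t)
        if i == len(times):
            continue  # no later sample to attribute this pageview to
        bucket = high if memory_samples[i][1] >= threshold else rest
        bucket[page] += 1

    n_high = sum(high.values()) or 1  # avoid dividing by zero
    n_rest = sum(rest.values()) or 1
    pages = sorted(set(high) | set(rest))
    return {p: (high[p] / n_high, rest[p] / n_rest) for p in pages}
```

Each value in the returned dictionary is one row of the dataset: the page's share of high-memory pageviews and its share of the remaining pageviews.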

Then you would perform a paired test of the null hypothesis (H0) that the median of the differences (high - rest) is less than or equal to zero, against the alternative hypothesis (H1) that the median of the differences is greater than zero. I would suggest the nonparametric Wilcoxon signed-rank test, which is the paired-sample counterpart of the Mann-Whitney test. It also takes into account the magnitude of the differences in each pair, something that other tests (e.g. the sign test) ignore.
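To make the mechanics of the test concrete, here is a minimal hand-rolled sketch in Python of the one-sided signed-rank statistic. It is not a substitute for a statistics library: it uses the large-sample normal approximation without a continuity or tie correction, it drops zero differences, and the function name `wilcoxon_signed_rank_greater` is hypothetical. With only a handful of pages, an exact p-value would be more appropriate.

```python
import math


def wilcoxon_signed_rank_greater(high, rest):
    """One-sided signed-rank test for H1: median(high - rest) > 0.

    Returns (w_plus, p_value), with p_value from the normal approximation.
    """
    # Zero differences carry no sign information, so they are dropped here.
    diffs = [h - r for h, r in zip(high, rest) if h != r]
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # W+ is the sum of the ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return w_plus, p_value
```

A small p-value suggests that pages tend to take a larger share of pageviews during high-memory periods than during the rest; the pages with the largest positive differences are then the ones to look at first.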

Keep in mind that ties (including zero differences) present numerous problems in the derivation of nonparametric methods and should be avoided. The preferable way to deal with ties is to add a slight bit of "noise" to the data. That is, run the test after modifying the tied values by adding a random variable small enough that it will not affect the ranking of the other differences.
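One possible way to add such noise, sketched in Python: only zero or tied differences are perturbed, and the noise is bounded by a quarter of the smallest gap between distinct absolute differences so that untied values keep their relative order. The helper name `jitter_ties` and the quarter-gap bound are my own assumptions, chosen for illustration.

```python
import random


def jitter_ties(diffs, rng=None):
    """Break zero and tied differences with noise too small to reorder the rest."""
    rng = rng or random.Random()
    abs_vals = sorted({abs(d) for d in diffs} | {0.0})
    if len(abs_vals) < 2:
        raise ValueError("all differences are zero; nothing to rank")
    # Smallest gap between distinct absolute values (zero included).
    gap = min(b - a for a, b in zip(abs_vals, abs_vals[1:]))

    # Count how many differences share each absolute value.
    counts = {}
    for d in diffs:
        counts[abs(d)] = counts.get(abs(d), 0) + 1

    out = []
    for d in diffs:
        if d == 0 or counts[abs(d)] > 1:
            # Noise bounded by gap / 4 cannot push a value past a
            # neighbouring distinct absolute difference.
            out.append(d + rng.uniform(-gap / 4, gap / 4))
        else:
            out.append(d)
    return out
```

Because the noise is random, rerunning the test with a few different seeds is a cheap way to check that the conclusion does not hinge on how the ties happened to be broken.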

I hope that the test's results, together with a plot of the distribution of the differences, will give you insight into where the problem is.

An implementation of the Wilcoxon signed-rank test is available in R as `wilcox.test` (with `paired = TRUE`).

George Dontas answered Mar 05 '23 17:03
