I'm evaluating k6 for my load testing needs. I've set up a basic load test and I'm currently trying to interpret the error messages and result values I get. Maybe someone can help me interpret what I'm seeing:
If I crank up the VUs to about 300, I start seeing error messages in the console, and at 500 I see lots of them.
These mostly consist of:
I also have problems with several checks:
How can res.status be 0 while the body still contains the proper values?
I suspect that I'm reaching the connection limit of my load-producing machine and that's why I get the error messages. So would I have to set up a cluster or move to the Cloud runners?
The stats generated by k6 show long http_req_blocked values, which I interpret as the time spent waiting to get a connection port. This seems to indicate that the connection pool of my test-running machine is at its limit.
http_req_blocked...........: avg=5.66s min=0s med=3.26s max=59.38s p(90)=13.12s p(95)=20.31s
http_req_connecting........: avg=1.85s min=0s med=280.16ms max=24.27s p(90)=4.2s p(95)=9.24s
http_req_duration..........: avg=2.05s min=0s med=496.24ms max=1m0s p(90)=4.7s p(95)=8.39s
http_req_receiving.........: avg=600.94ms min=0s med=82.89µs max=58.8s p(90)=436.95ms p(95)=2.67s
http_req_sending...........: avg=1.42ms min=0s med=35.8µs max=11.76s p(90)=56.22µs p(95)=62.45µs
http_req_tls_handshaking...: avg=3.85s min=0s med=1.78s max=58.49s p(90)=8.93s p(95)=15.81s
http_req_waiting...........: avg=1.45s min=0s med=399.43ms max=1m0s p(90)=3.23s p(95)=5.87s
Can anyone help me interpret the results I'm seeing?
You are likely running out of CPU on the runner.
As explained in the HTTP-specific metrics section of the documentation, you are right about http_req_blocked: it is (mostly) the time from when we say we want to make a request to when we get a socket on which to do it. This is most likely because:
You will need to monitor CPU usage on both the runner and the system under test (you are highly advised to do this regardless), as tests run with the runner's CPUs at 100% are probably not very representative :) and you likely don't want the system you are testing to get to 100% either.
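As a rough cross-check of the averages posted in the question, here is a minimal sketch in plain JavaScript. It relies on the documented relationship that http_req_duration is the sum of http_req_sending, http_req_waiting and http_req_receiving, with http_req_blocked counted separately. The helper name durationToMs is made up for illustration and is not part of k6.

```javascript
// Hypothetical helper (not a k6 API): convert a Go-style duration string as
// printed in the k6 summary ("5.66s", "496.24ms", "82.89µs", "1m0s") to ms.
function durationToMs(text) {
  const factors = {
    h: 3600000, m: 60000, s: 1000, ms: 1,
    "µs": 0.001, "μs": 0.001, us: 0.001, ns: 0.000001,
  };
  // Longer unit names come first so "ms" is not read as "m" followed by "s".
  const pattern = /(\d+(?:\.\d+)?)(ms|µs|μs|us|ns|h|m|s)/g;
  let total = 0;
  let matched = false;
  for (const [, value, unit] of text.matchAll(pattern)) {
    total += parseFloat(value) * factors[unit];
    matched = true;
  }
  if (!matched) {
    throw new Error(`not a duration: ${text}`);
  }
  return total;
}

// Average values copied from the summary in the question.
const avg = {
  blocked: durationToMs("5.66s"),
  sending: durationToMs("1.42ms"),
  waiting: durationToMs("1.45s"),
  receiving: durationToMs("600.94ms"),
  duration: durationToMs("2.05s"),
};

// http_req_duration should be close to sending + waiting + receiving.
const rebuilt = avg.sending + avg.waiting + avg.receiving;
console.log(`sending + waiting + receiving: ${rebuilt.toFixed(2)} ms`);
console.log(`reported http_req_duration:    ${avg.duration.toFixed(2)} ms`);

// Time spent before a socket was available, relative to the request itself.
console.log(`blocked / duration ratio:      ${(avg.blocked / avg.duration).toFixed(2)}`);
```

The three parts add up to roughly the reported 2.05s, while the average http_req_blocked is several times larger than that. In other words, requests spent more time waiting for a connection than actually being sent and answered.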
The status code === 0 means that we couldn't make the request or read the response for some reason, usually explained by the error and error_code of the response.
As I commented, if you have status code 0 and a body, this is most likely a bug; at least I don't remember there being a case where this wouldn't be true.
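To see why a given request ended up with status 0, it can help to log error and error_code next to the status. A minimal sketch, assuming the object passed in has the status, error, error_code and body properties of a k6 HTTP response; the function name describeResponse and the sample values are hypothetical:

```javascript
// Hypothetical helper: turn a k6-style response into a one-line log message.
// In a k6 script it could be called roughly like this:
//   const res = http.get(url);
//   if (res.status === 0) console.warn(describeResponse(res));
function describeResponse(res) {
  if (res.status !== 0) {
    return `status ${res.status}`;
  }
  // Status 0 together with a non-empty body is the combination that
  // looks like a bug, so flag it explicitly.
  const hasBody = typeof res.body === "string" && res.body.length > 0;
  const suffix = hasBody ? " (unexpected: body present with status 0)" : "";
  return `request failed: error="${res.error}" error_code=${res.error_code}${suffix}`;
}

// Sample object with made-up values, shaped like a failed response.
const failed = {
  status: 0,
  error: "dial tcp XXX:443: i/o timeout",
  error_code: 1000, // placeholder number for illustration only
  body: "",
};
console.log(describeResponse(failed));
```

Logging only the failed responses keeps the output manageable at several hundred VUs.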
The errors you have listed mean (most likely):
dial tcp XXX:443: i/o timeout
This literally means we tried to get a TCP connection and it took too long (probably the reason for the large http_req_blocked values).
read tcp YYY(local ip):35252->XXX(host ip):443: read: connection reset by peer
The other side closed the connection, likely because some timeout was reached. For example, if we don't read for over 30 seconds, the server decides that we won't read anymore and closes the connection. When the CPU is at 100%, there is a good chance some connections won't get time to be read from.
level=warning msg="Request Failed" error="unexpected EOF"
Literally what it says: the connection was closed when we didn't expect it, or more accurately when the Go net/http standard library didn't expect it. This is likely a timeout again, just at a point in the life of the request where the other errors aren't returned.
Get https://REQUEST_URL/: context deadline exceeded
This is because a request took longer than the timeout (60s by default); at some point this will be changed to a better error message.
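The mapping above can be sketched as a small lookup that tallies error strings by their most likely cause. This is plain JavaScript with hypothetical names (LIKELY_CAUSES, likelyCause, tallyCauses); the causes are just the interpretations given above, not something k6 reports itself.

```javascript
// Hypothetical lookup table: substring of the error text -> likely cause,
// following the explanations given in this answer.
const LIKELY_CAUSES = [
  { pattern: /i\/o timeout/, cause: "getting a TCP connection took too long" },
  { pattern: /connection reset by peer/, cause: "the other side closed the connection" },
  { pattern: /unexpected EOF/, cause: "connection closed at an unexpected point" },
  { pattern: /context deadline exceeded/, cause: "request exceeded the request timeout" },
];

// Return the likely cause for one error string.
function likelyCause(errorText) {
  for (const { pattern, cause } of LIKELY_CAUSES) {
    if (pattern.test(errorText)) {
      return cause;
    }
  }
  return "unknown, inspect error and error_code";
}

// Count how often each likely cause occurs in a list of error strings.
function tallyCauses(errorTexts) {
  const counts = {};
  for (const text of errorTexts) {
    const cause = likelyCause(text);
    counts[cause] = (counts[cause] || 0) + 1;
  }
  return counts;
}

// The error strings quoted in the question, with their placeholders kept.
const sampleErrors = [
  "dial tcp XXX:443: i/o timeout",
  "read tcp YYY:35252->XXX:443: read: connection reset by peer",
  "unexpected EOF",
  "Get https://REQUEST_URL/: context deadline exceeded",
  "dial tcp XXX:443: i/o timeout",
];
console.log(tallyCauses(sampleErrors));
```

If most of the failures fall into the first bucket, that points at connection setup, which fits the high http_req_blocked values.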