I usually run
varnishadm -T localhost:6082 debug.health
to check the backends' health status, but how can I find out in detail why a probe fails (e.g. timeouts, wrong HTTP status code)?
In Varnish 4.0 you can see the status of all backends and their recent probe success rate with
varnishadm backend.list
This is a little tricky to find [1], but:
Every poll is recorded in the shared memory log as follows:
NB: subject to polishing before 2.0 is released!
0 Backend_health - b0 Still healthy 4--X-S-RH 9 8 10 0.029291 0.030875 HTTP/1.1 200 Ok
...
Notice that the second word indicates present state, and the first word == "Still" indicates unchanged state.
- 4--X-S-RH -- Flags indicating how the latest poll went
  - 4 -- IPv4 connection established
  - 6 -- IPv6 connection established
  - x -- Request transmit failed
  - X -- Request transmit succeeded
  - s -- TCP socket shutdown failed
  - S -- TCP socket shutdown succeeded
  - r -- Read response failed
  - R -- Read response succeeded
  - H -- Happy with result
- 9 -- Number of good polls in the last .window polls
- 8 -- .threshold (see above)
- 10 -- .window (see above)
- 0.029291 -- Response time this poll or zero if it failed
- 0.030875 -- Exponential average (r=4) of response time for good polls.
- HTTP/1.1 200 Ok -- The HTTP response from the backend.
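To make the field layout above concrete, here is a minimal Python sketch that splits one Backend_health record into named fields and decodes the flag string. It is not part of Varnish: the function name parse_backend_health, the dictionary keys and the regular expression are my own assumptions, and the layout is taken only from the sample line quoted above, so other Varnish versions may format the record differently.

```python
import re

# Flag letters as documented in the list above; "-" means "not reached / not set".
FLAG_MEANINGS = {
    "4": "IPv4 connection established",
    "6": "IPv6 connection established",
    "x": "Request transmit failed",
    "X": "Request transmit succeeded",
    "s": "TCP socket shutdown failed",
    "S": "TCP socket shutdown succeeded",
    "r": "Read response failed",
    "R": "Read response succeeded",
    "H": "Happy with result",
}

# Assumed layout (from the sample line only):
# Backend_health - <backend> <Still|...> <state> <flags> <good> <threshold> <window> <last> <avg> <response>
LINE_RE = re.compile(
    r"Backend_health\s+-\s+(?P<backend>\S+)\s+(?P<change>\S+)\s+(?P<state>\S+)\s+"
    r"(?P<flags>\S+)\s+(?P<good>\d+)\s+(?P<threshold>\d+)\s+(?P<window>\d+)\s+"
    r"(?P<last>[\d.]+)\s+(?P<avg>[\d.]+)\s*(?P<response>.*)$"
)


def parse_backend_health(line):
    """Return a dict describing one Backend_health record, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    flags = m.group("flags")
    return {
        "backend": m.group("backend"),
        # First word "Still" means the state did not change on this poll.
        "changed": m.group("change") != "Still",
        "state": m.group("state"),
        "flags": [FLAG_MEANINGS[c] for c in flags if c in FLAG_MEANINGS],
        # Lowercase x/s/r mark the step that failed.
        "failed_steps": [FLAG_MEANINGS[c] for c in flags if c in "xsr"],
        "happy": "H" in flags,
        "good_polls": int(m.group("good")),
        "threshold": int(m.group("threshold")),
        "window": int(m.group("window")),
        "last_response_time": float(m.group("last")),
        "avg_response_time": float(m.group("avg")),
        "response": m.group("response").strip(),
    }


if __name__ == "__main__":
    sample = ("0 Backend_health - b0 Still healthy 4--X-S-RH 9 8 10 "
              "0.029291 0.030875 HTTP/1.1 200 Ok")
    for key, value in parse_backend_health(sample).items():
        print(f"{key}: {value}")
```

A probe that is not "happy" shows up as a missing H. If the flags show a successful read (R) but no H, the response line at the end tells you which status code the backend returned. A response time of zero together with missing or lowercase flags points at a connection, transmit or read problem such as a timeout.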
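So you should use varnishlog to see in detail why a probe failed, for example by filtering on the Backend_health tag with varnishlog -i Backend_health (on Varnish 4.x, add -g raw).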
[1] https://www.varnish-cache.org/trac/wiki/BackendPolling#SHMlog