I'm implementing a /_status/
endpoint which does some sanity checks on data in our database.
For example, we are collecting measurements and the status should go "bad" if the latest measurement is over an hour old.
I would like to point Pingdom at this URL to leverage their alerting infrastructure and tell us when something's wrong.
On a "good" status I will serve an HTML page with an HTTP 200 OK status. But what would an appropriate HTTP status code be for "bad"? Or would it be more correct not to convey this information via status code, but via HTML content instead?
Thanks!
The 424 (Failed Dependency) status code means that the method could not be performed on the resource because the requested action depended on another action and that action failed.
An update can be implemented with an HTTP PUT or PATCH method. The difference lies in the amount of data the client has to send to the backend. 200 OK - This is the most appropriate code for most use-cases.
The 200 status code is by far the most common returned. It means, simply, that the request was received and understood and is being processed. A 201 status code indicates that a request was successful and as a result, a resource has been created (for example a new page).
500: “There was an error on the server and the request could not be completed.” This is generic code that simply means “internal server error”. Something went wrong on the server and the requested resource was not delivered.
Well... this is an old question, but I ended up here, so I thought I'd give my two cents here: It seems pretty clear that a 2xx should be returned if all is OK
If health is not OK, I think it should return a 5xx result (4xx talks about the client being at fault in the request; 2xx and 3xx are all successful to some degree).
I think that a 5xx is correct because this is a special request that is answering about the state of the whole service. Also, because most Load Balancers offer liveliness checks based on response codes and not all offer a way to parse a more complex payload (other than perhaps a RegExp Match which can make the check brittle).
I agree with @Julien that a 500 (specifically) doesn't seem appropriate, and we've decided on 503 Service Unavailable.
503 seems to fit for a couple of reasons:
We just had a similar discussion in our group. We decided for our purposes that the HTTP response codes should be reporting on your server's success or failure to honor the request. For a GET, this would mean whether or not you can respond with the requested resource. In this case, the requested resource is a health report, so as long as you're returning that successfully, it should be a 200 response.
We're returning JSON for our health check, with a top-level "isHealthy" field set to true or false. Our load balancer and other monitors will parse the JSON and use this field to determine if the system is healthy or not.
If you don't want to parse JSON in your monitors, you could try putting a custom response header to indicate binary health of the system, e.g., System-Health: true
or System-Health: false
. You might have better luck getting monitors which can check that.
If you really want to use a response code, I would recommend an additional endpoint called something like "health" which returns a "204 No Content" when healthy, and a "404 Not Found" when not healthy. In this case, the resource defined by the URL is, symbolically, the health of your system, and so if it's healthy, you can return a successful response. If it's unhealthy, then it's health can't be found, hence the 404.
If your data is 'bad' because there is a service failure (even if that is a backend job failing) then a HTTP 500 seems like a valid response. It indicates that something, somewhere is broken.
It isn't very specific, you're shrugging your shoulders and saying:
The 500 (Internal Server Error) status code indicates that the server encountered an unexpected condition that prevented it from fulfilling the request.
ietf rfc7231
If you ask for health and the server state is not healthy, I'm partial to 409 Conflict which "Indicates that the request could not be processed because of conflict in the current state of the resource" .
Some people might object that if you can respond then the request can be processed, but I disagree. Every error message is a response. The server defines resource semantics. If you ask for the good news resource and the server responds "here is bad news", it didn't give you what it defines to have offered at that resource.
In practice, it's much easier to say 2**="up" 4**="down" and pipe request counts into an availability metric and have a load balancer remove the server from its pool based on the response code. Coming up with ways to argue that "hey, we told you something, so 200 OK" just seems like missing the forrest for the trees to me.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With