We have been using Cowboy in production on our Compute Engine machines on GCP and we started benchmarking and improving the performance of our service to handle more Reqs/sec (in our case since we are in Adtech it is bids/sec).
After isolating and handling a lot of the issues separately we came down to Cowboy optimization, these are our current findings and limitations:
We are using Cowboy 2.5 with 200 acceptors and max backlog of 1024
init(Req, _State) ->
T1 = erlang:monotonic_time(),
{ok, BRjson, _} = cowboy_req:read_body(Req),
%% ---- rest of work goes here but is switched off for our test---
erlang:send_after(60, self(), {'RSP', x, no_workers}),
{cowboy_loop, Req, #state{t1 = T1}, hibernate}.
OTP 21
VM args: -smp auto +P 134217727 +K true +A 64 -rate 1200 +stbt db +scl false +sfwi 500 +spp true +zdbbl 8092
Json requests ~4KB in size. And testing is done using a separate machine on the same internal network (no SSL) using jmeter. All requests are POST with keep-alive
GCP Compute Engine 10 vcpu cores and 14GB RAM (now and tested before with the 4 vcpu)
We are able to reach to ~1900 reqs/sec but all CPU cores in htop are showing almost 80% utilization
At 1000 reqs/sec we se cpu utilization at 45-50% per core (still high bearing in mind that no other part of our application is running)
*Note: using the 4 vcpu machine we were able to get close to 700 reqs/sec and memory in all of our tests is barely utilizied or changing with load
QUESTION: How to improve cowboy's performance in terms of cpu usage?
First off, thanks @Pouriya for suggestions--actually, discussing this back and forth made me go back and re-check one of my comments about the right tool for the job. PS: we are on GCP so 72 cores would be out of question at this stage.
Cowboy is great! but it does add a bit of overhead in the critical path of each request--a feature (or issue in my case) that is not needed.
We tested again with Elli (https://github.com/elli-lib/elli) but built a proper testing setup this time and it provided improvement up to 20% ~ exactly what we needed!
If anyone at Cowboy/Ranch team has a way of drastically improving CPU overhead will gladly test since we still use it in our APIs but not the critical path.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With