 

How should a node.js stack for a high demand application be setup?

I'm currently working on a Node.js application used by over 25,000 people. We're using the Sails.js framework with MongoDB. The application runs on an EC2 instance with 30 GB of RAM; the database runs on a MongoLab AWS-based cluster in the same zone as the EC2 instance. We also have an ElastiCache Redis instance with 1.5 GB of storage.

The main and huge problem we're facing is LATENCY. When we reach a peak of concurrent users, we get multiple timeouts, the Sails application climbs past 7.5 GB of RAM, HTTP requests to the API take longer than 15 seconds (which is unacceptable), and we even get 502 and 504 responses from nginx.

I can see that Mongo write operations are our main latency issue, but even GET requests take long during a demand peak. I can't access the production servers; I only have the Keymetrics monitoring tool by PM2 (which is actually great) and New Relic alerts.

So I'd like a roadmap to cope with these issues. Maybe more detailed information is needed; so far I can say the application seems stable when not many users are present.

What are the main factors and setup choices to consider?

So far I know roughly what I should do, but I'm not sure about the details or the how.

IMHO:

  1. Cache as much as possible.
  2. Delay MongoDB write operations.
  3. Separate Mongo databases with higher write demand.
  4. Virtualize?
  5. Tune up node setups.

On optimising code, I've posted another Stack Overflow question with an example of the code patterns I'm following.

What is your advice and opinion for production applications?

asked Aug 01 '15 by diegoaguilar

2 Answers

Most of the main points are already present in the other answers; I'll just summarise them.

To optimize your application you could do several main things.

  1. Try to move from Node.js to io.js: it still has slightly better performance and the latest cutting-edge updates (but read carefully about its experimental features). Or at least move from Node.js v0.10 to v0.12; there were a lot of performance optimisations.

  2. Avoid synchronous functions that perform I/O or operate on large amounts of data.

  3. Switch from a single Node process to a cluster of processes (see the cluster sketch after this list).

  4. Check your application for memory leaks. I'm using memwatch-next for Node.js v0.12 and memwatch for Node.js v0.10 (see the sketch after this list).

  5. Try to avoid saving data in global variables.

  6. Use caching. For data that should be accessible globally you could use Redis; Memcached is also a great store (see the cache sketch after this list).

  7. Avoid mixing the async library with Promises. Both do the same job, so there is no need to use both of them. (I saw that in your code example.)

  8. Combine async.waterfall with async.parallel where possible. For example, if you need to fetch some data from Mongo that is related only to the user, you can fetch the user first and then fetch all the other data in parallel (see the sketch after this list).

  9. If you are using Sails.js, make sure it runs in production mode. (I assume you already did this.)

  10. Disable all hooks that you don't need. In most cases the grunt hook is useless, and if you don't need Socket.io in your application, disable it in the .sailsrc file. Something like:

    { "generators": { "modules": {} }, "hooks": { "grunt": false, "sockets": false } }

Other hooks that can be disabled are i18n, csrf, and cors, BUT only if you don't use them in your system.
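
For point 3, here is a minimal sketch using Node's built-in cluster module; app.js is assumed to be your HTTP entry point:

    // master.js - fork one worker per CPU core
    var cluster = require('cluster');
    var os = require('os');

    if (cluster.isMaster) {
      // Master process: fork a worker per core and replace any that die
      os.cpus().forEach(function () {
        cluster.fork();
      });
      cluster.on('exit', function (worker) {
        console.log('Worker ' + worker.process.pid + ' died, restarting');
        cluster.fork();
      });
    } else {
      // Each worker runs its own copy of the server on a shared port
      require('./app'); // assumed entry point that calls listen()
    }

Since you already run PM2, its cluster mode (pm2 start app.js -i max) achieves the same thing without extra code.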
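
For point 4, a sketch of leak detection with memwatch-next (the same events exist in the older memwatch):

    var memwatch = require('memwatch-next');

    // Fired when the heap has kept growing over several consecutive full GCs
    memwatch.on('leak', function (info) {
      console.error('Possible memory leak:', info);
    });

    // Fired after every full GC with heap statistics
    memwatch.on('stats', function (stats) {
      console.log('Heap used after GC:', stats.current_base);
    });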
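
For point 6, a minimal cache-aside sketch with the node redis client; User is assumed to be a Waterline model, and the key name and TTL are only examples:

    var redis = require('redis');
    var client = redis.createClient(); // assumes Redis on localhost:6379

    function getUserCached(id, cb) {
      var key = 'user:' + id; // hypothetical cache key
      client.get(key, function (err, cached) {
        if (!err && cached) return cb(null, JSON.parse(cached)); // cache hit
        // Cache miss: read from Mongo, then cache the result for 60 seconds
        User.findOne({ id: id }).exec(function (err, user) {
          if (err) return cb(err);
          client.setex(key, 60, JSON.stringify(user));
          cb(null, user);
        });
      });
    }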
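
For point 8, a sketch combining both: fetch the user first, then fetch everything that only depends on the user in parallel (User, Post and Friend are hypothetical Waterline models):

    var async = require('async');

    async.waterfall([
      function (next) {
        // Step 1: everything below depends on the user
        User.findOne({ id: userId }).exec(next);
      },
      function (user, next) {
        // Step 2: these queries are independent of each other, so run them in parallel
        async.parallel({
          posts: function (done) { Post.find({ owner: user.id }).exec(done); },
          friends: function (done) { Friend.find({ user: user.id }).exec(done); }
        }, function (err, results) {
          next(err, user, results);
        });
      }
    ], function (err, user, results) {
      if (err) return res.serverError(err);
      res.view({ user: user, posts: results.posts, friends: results.friends });
    });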

  11. Disable useless globalisation in config/globals.js. I assume _, async, and services could be disabled by default, because Sails.js bundles old versions of the lodash and async libraries, and the new versions have much better performance.

  12. Manually install lodash and async into your Sails.js project and use the new versions (see point 11).

  13. Some "write to Mongo" operations can be performed after returning the result to the user. For example, you can call the res.view() method, which sends the response to the user, before Model.save(); the code keeps running with all its variables, so you can still save the data to MongoDB, and the user doesn't see the delay of the write operation (see the sketch after this list).

  14. You could use a queue like RabbitMQ for operations that require a lot of resources. For example, if you need to store a big data collection, you can send it to RabbitMQ and return the response to the user, then handle the message in a "background" process and store the data. This will also help you scale your application (see the producer sketch after this list).
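
For point 13, a sketch of responding before the write completes; Order and the view name are illustrative:

    // Send the response first: the user doesn't wait for Mongo
    res.view('confirmation', { order: order });

    // Execution continues after the response has been sent
    Order.create(order).exec(function (err, saved) {
      if (err) {
        // The client already has its page, so just log the failure
        sails.log.error('Deferred save failed:', err);
      }
    });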
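
For point 14, a minimal producer sketch with the amqplib callback API; the queue name and payload are examples, and a separate worker process would consume the message and perform the Mongo writes:

    var amqp = require('amqplib/callback_api');

    amqp.connect('amqp://localhost', function (err, conn) {
      if (err) throw err;
      conn.createChannel(function (err, ch) {
        if (err) throw err;
        var queue = 'bulk-writes'; // hypothetical queue name
        ch.assertQueue(queue, { durable: true });
        // Hand the heavy payload off and respond to the user immediately;
        // the background worker stores it in Mongo at its own pace
        ch.sendToQueue(queue, new Buffer(JSON.stringify(bigPayload)), { persistent: true });
      });
    });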

answered Sep 28 '22 by Konstantin Zolotarev


Firstly, ensure that you are not using synchronous I/O. If you can run on io.js, there is a --trace-sync-io flag (iojs --trace-sync-io server.js) that will warn you whenever you use synchronous code, with the console warning: WARNING: Detected use of sync API.
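
For example, the first call below would trigger that warning, while the asynchronous variant keeps the event loop free:

    var fs = require('fs');

    // Synchronous: blocks the whole process, and with it every concurrent request
    var config = fs.readFileSync('config.json', 'utf8');

    // Asynchronous: other requests keep being served while the file is read
    fs.readFile('config.json', 'utf8', function (err, data) {
      if (err) throw err;
      var config = JSON.parse(data);
    });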

Secondly, find out why your RAM usage goes so high. If it's because of lots of data loaded into memory (XML parsing, large amounts of data returned from MongoDB, etc.), you should consider using streams. V8's garbage collection (Google's JavaScript VM used in Node.js / io.js) can cause slowdowns if memory usage gets very high. More here: Node.js Performance Tip of the Week: Managing Garbage Collection and Node.js Performance Tip of the Week: Heap Profiling.
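
For example, streaming a large file to the client instead of buffering it keeps per-request memory flat (a sketch; export.csv is a placeholder):

    var fs = require('fs');
    var http = require('http');

    http.createServer(function (req, res) {
      // readFile/readFileSync would buffer the whole file per request;
      // a stream sends it chunk by chunk with roughly constant memory use
      var stream = fs.createReadStream('export.csv'); // placeholder file
      stream.on('error', function () {
        res.statusCode = 500;
        res.end('Read failed');
      });
      stream.pipe(res);
    }).listen(3000);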

Thirdly, experiment with Node.js clustering and MongoDB sharding.

Lastly, check whether you are using, or can switch to, MongoDB 3.x. We've observed some significant performance gains just by upgrading from 2.x to 3.x.

answered Sep 28 '22 by krl