
Should Node.js be used for intensive processing?

Let's say I'm building a 3-tier web site, with MongoDB on the back end and some really lightweight JavaScript in the browser (let's say just validation on forms, maybe a couple of fancy controls which fire off some AJAX requests).

I need to choose a technology for the 'middle' tier (we could segment this into sub-tiers but that detail isn't the focus here, just the overall technology choice), where I want to crunch some raw data coming out of the DB, and render this into some HTML which I push to the browser. A fairly typical thin-client web architecture.

My safe choice would be to just implement this middle tier in Java, using a library like Jongo to talk to MongoDB, maybe Jackson to marshal/unmarshal JSON for my fancy controls when they make AJAX requests, and some Java templating framework for rendering my HTML on the server.

However, I'm really intrigued by the idea of throwing all this out the window and using Node.js for this middle tier, for the following reasons:

  • I like JavaScript (the good parts), and let's say for this application's business logic it would be more expressive than Java.

  • It's JavaScript everywhere. No need to switch between languages, or indeed between the OO and functional paradigms, when working anywhere on the stack. There's no translation plumbing between the tiers, since JSON is supported natively everywhere.

  • I can reuse validation logic on the client and server.

  • If in the future I decide to do the HTML rendering client-side in the browser, I can reuse the existing templates with something like Backbone with a pretty minimal refactoring / retesting effort.
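To make the validation-reuse point concrete, here is a minimal sketch of one validator shared by browser and server. The function name `validateSignup`, the field names, and the rules are all hypothetical, chosen only for illustration.

```javascript
// validation.js: hypothetical module shared by browser and server.
// The field names and rules below are illustrative assumptions.
function validateSignup(form) {
  const errors = [];
  if (typeof form.email !== 'string' || !/^[^@\s]+@[^@\s]+$/.test(form.email)) {
    errors.push('email: must look like name@host');
  }
  if (typeof form.age !== 'number' || !Number.isInteger(form.age) || form.age < 0) {
    errors.push('age: must be a non-negative integer');
  }
  return errors; // an empty array means the form passed
}

// Export for Node (CommonJS); when loaded by a plain <script> tag
// the function is simply a global.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = { validateSignup };
}
```

The browser would call `validateSignup` before sending the AJAX request, and the server would call the same function again on the received JSON, since client-side checks can be bypassed.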

If you're at this point and like Node, all the above will seem obvious. So I should choose Node, right?

BUT... this is where it falls down for me: as we all know Node is based around a single-threaded async I/O web server model. This is great for my scalability and performance in terms of servicing requests for data, but what about my business logic? What about my template rendering? Won't this stuff cause a huge bottleneck for all requests on the single thread?
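The concern can be reproduced in a few lines. This is only a sketch: `crunch` is a hypothetical stand-in for business logic or template rendering, and the 200 ms figure is arbitrary.

```javascript
// Busy-loop for roughly `ms` milliseconds, standing in for CPU-bound work.
function crunch(ms) {
  const end = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < end) iterations++;
  return iterations;
}

const started = Date.now();

// Ask for a callback as soon as possible.
setTimeout(() => {
  console.log(`timer fired after ${Date.now() - started} ms`);
}, 0);

// Nothing else on this thread (timers, I/O callbacks, other requests)
// can run until this call returns.
crunch(200);
```

The timer is requested with a delay of 0, yet its callback cannot run until `crunch` returns, so the reported delay is about 200 ms. Every pending request in the same Node process would wait in the same way.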

Two obvious solutions come to mind, but neither of them sits right:

  1. Keep the 'blocking' business logic in there and just use a cluster of Node instances and a load balancer, to service requests in true parallel. Ok great, so why isn't Node just multi-threaded in the first place? Or was this always the idea, to Keep It Simple Stupid and avoid the possibility of multi-threaded complexity in the base case, making the programmer do the extra setup work on top of this if multi-core processing power is desired?

  2. Keep a single Node instance, and keep it non-blocking by just calling out to some Java implementation of my business logic running on some other, multi-threaded, app server. Ok, this option completely nullifies every pro I listed of using Node (in fact it adds complexity over just using Java), other than the possible gains in performance and scalability for CRUD requests to the DB.

Which leads me finally to the point of my question - am I missing some huge important piece of the Node puzzle, have I just got my facts completely wrong, or is Node just unsuitable for crunching business logic on the server? Put another way, is Node just useful for sitting over a database and servicing many CRUD requests in a more performant and scalable way than some other implementation which blocks on I/O? And you have to do all your business logic in some tier below, or even client-side, to maintain any reasonable levels of performance and scalability?

Considering all the buzz over Node, I'd rather hoped it brought more to the table than this. I'd love to be convinced otherwise!

davnicwil asked Nov 12 '12 20:11




1 Answer

On any given system you have N CPUs available (1-64, or whatever N happens to be). In any CPU-intensive application, you're going to be stuck with a throughput of N CPUs. There's no magical way to fix that by adding more than N threads/processes/whatever. Either your code has to be more efficient, or you need more CPUs. More threads won't help.

One of the little-appreciated facts about multiple-CPU performance is that if you need to run N+1 CPU-intensive operations at the same time, your throughput per CPU goes down quite a bit. A CPU-intensive process tends to hang on to that CPU for a very long time before giving it up, starving the other tasks pretty badly. In the majority of cases, it is blocking I/O and the concomitant task-switching that makes modern OS multitasking work as well as it does. If more of our everyday common tasks were CPU-bound, we would discover we needed a lot more CPUs in our machines than we do currently.

The nice thing that Node.js brings to the server party efficiency-wise is a thorough use of each thread. Ideally, you end up with less task switching. This isn't a huge win, but having N threads handling N*C connections asynchronously is going to have a performance advantage over N*C blocking threads running on the same number of CPUs. But the bottom line on CPUs remains the same: if you have more than N worth of actual CPU work to be done, you're going to feel some pain.

The last time I looked at the Node.js API there was a way to launch a server with one listener plus one worker per CPU (the cluster module; note that its workers are separate processes rather than threads). If you can do that, I would be inclined to go with Node.js provided a few caveats are met:

  • The JavaScript-everywhere approach buys you some simplicity. For something complicated, I would be concerned about the asynchronous programming style making things harder rather than easier.
  • The template-processing and other CPU-intensive tasks aren't appreciably slower in Node.js than your other language/platform choices.
  • The database drivers are reliable.

There is one downside that I can see:

  • If a worker crashes, you lose all of the connections being serviced by that worker.
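For reference, the one-worker-per-CPU setup can be sketched with Node's built-in `cluster` module, whose workers are separate processes. The port number, the helper names (`workerCount`, `startPrimary`, `startWorker`, `main`) and the restart-on-exit policy are assumptions made for this sketch, not something taken from the question.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: one worker per CPU, and never fewer than one.
function workerCount(cpus = os.cpus().length) {
  return Math.max(1, cpus);
}

function startPrimary() {
  for (let i = 0; i < workerCount(); i++) cluster.fork();
  // Replace a worker that dies. The connections it was servicing are
  // still lost; only future requests benefit from the restart.
  cluster.on('exit', (worker, code, signal) => {
    console.error(`worker ${worker.process.pid} exited (${signal || code}), restarting`);
    cluster.fork();
  });
}

function startWorker(port) {
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(port);
}

// Entry point: the same file runs as the primary and as each forked worker.
function main(port = 3000) {
  // isPrimary exists in newer Node versions; isMaster is the older name.
  const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) startPrimary();
  else startWorker(port);
}
```

Calling `main()` at the bottom of the file and running it with `node` starts the primary, which forks the workers. Each worker still blocks on its own CPU-bound work, so total throughput remains capped at N CPUs, as described above. A real deployment would also want to limit how often a repeatedly crashing worker is restarted.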

Finally, try to remember that programmer time is generally more expensive than servers or bandwidth.

slashingweapon answered Oct 19 '22 17:10
