For a long time, I had thought of C++ as being faster than JavaScript. However, today I wrote a benchmark script to compare the speed of floating-point calculations in the two languages, and the result is amazing!
JavaScript appears to be almost 4 times faster than C++!
I had both languages do the same job on my i5-430M laptop, performing a = a + b 100000000 times. C++ takes about 410 ms, while JavaScript takes only about 120 ms.
I really do not have any idea why JavaScript runs so fast in this case. Can anyone explain that?
The code I used for the JavaScript is (run with Node.js):
    (function() {
        var a = 3.1415926, b = 2.718;
        var i, j, d1, d2;
        for(j=0; j<10; j++) {
            d1 = new Date();
            for(i=0; i<100000000; i++) {
                a = a + b;
            }
            d2 = new Date();
            console.log("Time Cost:" + (d2.getTime() - d1.getTime()) + "ms");
        }
        console.log("a = " + a);
    })();
And the code for C++ (compiled by g++) is:
    #include <stdio.h>
    #include <ctime>

    int main() {
        double a = 3.1415926, b = 2.718;
        int i, j;
        clock_t start, end;
        for(j=0; j<10; j++) {
            start = clock();
            for(i=0; i<100000000; i++) {
                a = a + b;
            }
            end = clock();
            /* clock_t is not an int, so cast before printing with %d */
            printf("Time Cost: %dms\n", (int)((end - start) * 1000 / CLOCKS_PER_SEC));
        }
        printf("a = %lf\n", a);
        return 0;
    }
In effect, JavaScript often performs better than unoptimized C++ in this case, because the JavaScript engine optimizes hot code by default, while C++ compilers need to be told to optimize.
Rather than forking for every incoming request, or using a pool of processes, node typically uses a single process. This process then calls a 'handler' function every time a web request comes in. This design allows extremely fast responses, and low overhead per-request.
Short Answer. If you are a proficient C# developer and novice JavaScript developer - your C# will most certainly be faster. If you are proficient at both then your C# will probably be faster, but the difference may not be as much as you thought - this is all very program specific.
C++ is often considerably faster than JavaScript, but not across the board: the gap depends on the workload, the JavaScript engine, and the compiler settings. When C++ does win, it is largely because it is a mid-level language compiled ahead of time to machine code.
I may have some bad news for you if you're on a Linux system (which complies with POSIX at least in this situation). The clock() call returns the number of clock ticks consumed by the program, scaled by CLOCKS_PER_SEC, which is 1,000,000.
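That means, if you're on such a system and the C++ figure is a raw tick count, you're talking in microseconds for C++ and milliseconds for JavaScript (as per the JS online docs). In that case, rather than JS being four times faster, C++ would actually be nearly 300 times faster. The posted code does multiply by 1000 and divide by CLOCKS_PER_SEC, though, so its output is in milliseconds as long as that constant reflects the real tick rate.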
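To make the units explicit, the tick-to-millisecond conversion can be sketched as below. This is a minimal sketch rather than part of the original benchmark: the helper names `ticks_to_ms` and `time_additions_ms` are hypothetical. It does the arithmetic in `double`, which also avoids handing a `clock_t` to `%d`.

```cpp
#include <ctime>

// Hypothetical helper: convert a clock() tick count to milliseconds.
// Dividing by CLOCKS_PER_SEC yields seconds, whatever value that
// constant has on the current platform.
double ticks_to_ms(std::clock_t ticks) {
    return 1000.0 * static_cast<double>(ticks) / CLOCKS_PER_SEC;
}

// Hypothetical helper: run `iterations` additions of b onto a and
// return the processor time reported by clock(), in milliseconds.
double time_additions_ms(long iterations, double &a, double b) {
    std::clock_t start = std::clock();
    for (long i = 0; i < iterations; i++) {
        a = a + b;
    }
    std::clock_t end = std::clock();
    return ticks_to_ms(end - start);
}
```

Calling `time_additions_ms(100000000, a, 2.718)` and printing the result with `%f` would give a figure directly comparable to the JavaScript one, assuming both are measured on the same machine.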
Now it may be that you're on a system where CLOCKS_PER_SEC is something other than a million. You can run the following program on your system to see if it's scaled by the same value:
    #include <stdio.h>
    #include <time.h>
    #include <stdlib.h>

    #define MILLION * 1000000

    static void commaOut (int n, char c) {
        if (n < 1000) {
            printf ("%d%c", n, c);
            return;
        }
        commaOut (n / 1000, ',');
        printf ("%03d%c", n % 1000, c);
    }

    int main (int argc, char *argv[]) {
        int i;

        system("date");
        clock_t start = clock();
        clock_t end = start;

        while (end - start < 30 MILLION) {
            for (i = 10 MILLION; i > 0; i--) {};
            end = clock();
        }

        system("date");
        commaOut (end - start, '\n');

        return 0;
    }
The output on my box is:
    Tuesday 17 November 11:53:01 AWST 2015
    Tuesday 17 November 11:53:31 AWST 2015
    30,001,946
showing that the scaling factor is a million. If you run that program, or investigate CLOCKS_PER_SEC, and it's not a scaling factor of one million, you need to look at some other things.
The first step is to ensure your code is actually being optimised by the compiler. That means, for example, setting -O2 or -O3 for gcc.
On my system with unoptimised code, I see:
    Time Cost: 320ms
    Time Cost: 300ms
    Time Cost: 300ms
    Time Cost: 300ms
    Time Cost: 300ms
    Time Cost: 300ms
    Time Cost: 300ms
    Time Cost: 300ms
    Time Cost: 300ms
    Time Cost: 300ms
    a = 2717999973.760710
and it's three times faster with -O2, albeit with a slightly different answer, though only by about one millionth of a percent:
    Time Cost: 140ms
    Time Cost: 110ms
    Time Cost: 100ms
    Time Cost: 100ms
    Time Cost: 100ms
    Time Cost: 100ms
    Time Cost: 100ms
    Time Cost: 100ms
    Time Cost: 100ms
    Time Cost: 100ms
    a = 2718000003.159864
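The "one millionth of a percent" figure can be checked with a small helper like the one below. It is only an illustration of the arithmetic; the function name `relative_difference_percent` is hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: difference between two results, expressed as a
// percentage of the reference value.
double relative_difference_percent(double reference, double other) {
    return std::fabs(other - reference) / std::fabs(reference) * 100.0;
}
```

Feeding it the two printed values, `relative_difference_percent(2717999973.760710, 2718000003.159864)`, gives roughly 1.08e-6 percent: the results differ by about 29.4 out of 2.7 billion.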
That would bring the two situations back on par with each other, something I'd expect since JavaScript is not some interpreted beast like in the old days, where each token is interpreted whenever it's seen.
Modern JavaScript engines (V8, Rhino, etc) can compile the code to an intermediate form (or even to machine language), which may allow performance roughly equal to that of compiled languages like C.
But, to be honest, you don't tend to choose JavaScript or C++ for speed; you choose them for their areas of strength. There aren't many C compilers floating around inside browsers, and I've not noticed many operating systems or embedded apps written in JavaScript.
Doing a quick test with turning on optimization, I got results of about 150 ms for an ancient AMD 64 X2 processor, and about 90 ms for a reasonably recent Intel i7 processor.
Then I did a little more to give some idea of one reason you might want to use C++. I unrolled four iterations of the loop, to get this:
    #include <stdio.h>
    #include <ctime>

    int main() {
        double a = 3.1415926, b = 2.718;
        double c = 0.0, d=0.0, e=0.0;
        int i, j;
        clock_t start, end;
        for(j=0; j<10; j++) {
            start = clock();
            for(i=0; i<100000000; i+=4) {
                a += b;
                c += b;
                d += b;
                e += b;
            }
            a += c + d + e;
            end = clock();
            printf("Time Cost: %fms\n", (1000.0 * (end - start))/CLOCKS_PER_SEC);
        }
        printf("a = %lf\n", a);
        return 0;
    }
This let the C++ code run in about 44 ms on the AMD (I forgot to run this version on the Intel). Then I turned on -Qpar with VC++ (strictly speaking the auto-parallelizer; the auto-vectorizer is enabled by default in optimized builds). This reduced the time a little further still, to about 40 ms on the AMD, and 30 ms on the Intel.
Bottom line: if you want to use C++, you really need to learn how to use the compiler. If you want to get really good results, you probably also want to learn how to write better code.
I should add: I didn't attempt to test a version under Javascript with the loop unrolled. Doing so might provide a similar (or at least some) speed improvement in JS as well. Personally, I think making the code fast is a lot more interesting than comparing Javascript to C++.
If you want code like this to run fast, unroll the loop (at least in C++).
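The reason unrolling can help here is that each `a += b` has to wait for the previous one to finish, while four separate accumulators give the CPU four independent chains of additions to overlap. Compilers generally will not do this regrouping of floating-point additions on their own at ordinary optimization levels, because it can change the rounding. The sketch below contrasts the two forms; the function names are hypothetical, and the leftover loop is an addition so that counts that are not a multiple of four still work.

```cpp
#include <cstddef>

// Reference version: a single accumulator, so each addition depends
// on the result of the one before it.
double add_repeated(double start, double inc, std::size_t count) {
    double a = start;
    for (std::size_t i = 0; i < count; i++) {
        a += inc;
    }
    return a;
}

// Unrolled by four with independent accumulators. The four additions
// in the loop body do not depend on each other.
double add_repeated_unrolled(double start, double inc, std::size_t count) {
    double a = start, c = 0.0, d = 0.0, e = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        a += inc;
        c += inc;
        d += inc;
        e += inc;
    }
    // Handle the remaining 0 to 3 iterations.
    for (; i < count; i++) {
        a += inc;
    }
    return a + c + d + e;
}
```

Because the additions are grouped differently, the two functions can return slightly different values for increments such as 2.718 that are not exactly representable; any speedup depends on the compiler and CPU and should be measured rather than assumed.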
Since the subject of parallel computing arose, I thought I'd add another version using OpenMP. While I was at it, I cleaned up the code a little bit, so I could keep track of what was going on. I also changed the timing code a bit, to display the overall time instead of the time for each execution of the inner loop. The resulting code looked like this:
    #include <stdio.h>
    #include <ctime>

    int main() {
        double total = 0.0;
        double inc = 2.718;
        int j;
        clock_t start, end;
        start = clock();

        #pragma omp parallel for reduction(+:total) firstprivate(inc)
        for(j=0; j<10; j++) {
            double a=0.0, b=0.0, c=0.0, d=0.0;
            /* i is declared inside the parallel loop so each thread gets its own copy;
               a single i declared outside would be shared between threads */
            for(int i=0; i<100000000; i+=4) {
                a += inc;
                b += inc;
                c += inc;
                d += inc;
            }
            total += a + b + c + d;
        }

        end = clock();
        printf("Time Cost: %fms\n", (1000.0 * (end - start))/CLOCKS_PER_SEC);
        printf("a = %lf\n", total);
        return 0;
    }
The primary addition here is the following (admittedly somewhat arcane) line:
#pragma omp parallel for reduction(+:total) firstprivate(inc)
This tells the compiler to execute the outer loop in multiple threads, with a separate copy of inc for each thread, and to add together the individual values of total after the parallel section.
The result is about what you'd probably expect. If we don't enable OpenMP with the compiler's -openmp flag, the reported time is about 10 times what we saw for individual executions previously (409 ms for the AMD, 323 ms for the Intel). With OpenMP turned on, the times drop to 217 ms for the AMD, and 100 ms for the Intel.
So, on the Intel the original version took 90ms for one iteration of the outer loop. With this version we're getting just slightly longer (100 ms) for all 10 iterations of the outer loop -- an improvement in speed of about 9:1. On a machine with more cores, we could expect even more improvement (OpenMP will normally take advantage of all available cores automatically, though you can manually tune the number of threads if you want).
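One caveat when reproducing this elsewhere: on POSIX systems clock() reports processor time for the whole process, summed over its threads, so a parallel run can look no faster than a serial one. A wall-clock timer avoids that. The sketch below is one way to do it, assuming a C++11 compiler; the function names are hypothetical, and OpenMP would be enabled with the compiler's own switch (for example -fopenmp with g++). Without that switch the pragma is ignored and the loop simply runs serially.

```cpp
#include <chrono>

// Hypothetical helper: add `inc` a total of outer * inner times.
// `inner` is assumed to be a multiple of 4, as in the code above.
double parallel_total(int outer, int inner, double inc) {
    double total = 0.0;
    #pragma omp parallel for reduction(+:total) firstprivate(inc)
    for (int j = 0; j < outer; j++) {
        double a = 0.0, b = 0.0, c = 0.0, d = 0.0;
        // i is local to the loop body, so every thread has its own copy.
        for (int i = 0; i < inner; i += 4) {
            a += inc;
            b += inc;
            c += inc;
            d += inc;
        }
        total += a + b + c + d;
    }
    return total;
}

// Hypothetical helper: run parallel_total and return the elapsed
// wall-clock time in milliseconds; the sum is stored in `result`.
double timed_parallel_total_ms(int outer, int inner, double inc, double &result) {
    auto start = std::chrono::steady_clock::now();
    result = parallel_total(outer, inner, inc);
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Calling `timed_parallel_total_ms(10, 100000000, 2.718, total)` mirrors the benchmark above; comparing a build with and without the OpenMP switch should show whether the extra cores are actually being used.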