I work at a small company, building banking software. I now have to build a data structure like:
Array [Int-Max] [2] // Large 2D array
Save that to disk and load it next day for future work.
Since I only know Java (and a little C), my colleagues keep insisting that I use C or C++ instead. Their reasoning:
They have seen that Array [Int-Max] [2] in Java takes nearly 1.5 times more memory than in C, and that C++ also has a more reasonable memory footprint than Java.
C and C++ can handle arbitrarily large files, whereas Java can't.
According to them, Java simply becomes infeasible as the database/data structure grows large. Since we have to work on such large databases/data structures, C/C++ is always preferable.
Now my question is,
Why is C or C++ always preferable to Java for a large database/data structure? I can see it for C, perhaps, but C++ is also an OOP language. So how does it gain an advantage over Java?
Should I stay with Java, or will their suggestion (switching to C++) be helpful in the future in a large database/data structure environment? Any suggestions?
Sorry, I know very little about all this and have only just started working on a project, so I am really confused. Until now I have only built school projects and have no idea about relatively large projects.
Note that in Java, an array field arr that has been declared but not initialized exists but is null-valued. In C, arr doesn't exist until a complete declaration appears. Array objects in Java must be instantiated with a new operation, and it's there that the array size is specified: int[] arr = new int[10]; int[][] arr2D = new int[10][20];
The memory allocated for an array consists of the array header (12 bytes here; the exact size depends on the JVM and its settings), plus the number of elements multiplied by the size of the element type, plus whatever padding is needed to make the memory block a multiple of 8 bytes.
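To make the arithmetic above concrete, here is a rough estimator. It takes the 12-byte header from the paragraph above as a parameter, and it assumes 4-byte references for the outer array of a 2D array (my assumption, not something stated above; the real value depends on the JVM). The class and method names are made up for illustration, and the results are estimates, not measurements.

```java
public class ArrayFootprint {
    // Round a size up to the next multiple of 8 bytes.
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Estimated size of a one-dimensional array: header + elements, padded to 8 bytes.
    static long arrayBytes(long headerBytes, long length, long elementBytes) {
        return align8(headerBytes + length * elementBytes);
    }

    // Estimated size of int[rows][cols]: one outer array holding `rows` references,
    // plus `rows` separate inner int arrays, each with its own header and padding.
    static long int2DBytes(long headerBytes, long refBytes, long rows, long cols) {
        long outer = arrayBytes(headerBytes, rows, refBytes);
        long inner = arrayBytes(headerBytes, cols, Integer.BYTES);
        return outer + rows * inner;
    }

    public static void main(String[] args) {
        long header = 12; // header size quoted in the text; JVM-dependent
        long ref = 4;     // assumed reference size; JVM-dependent
        long n = 1_000_000;

        System.out.println("int[10]    ~ " + arrayBytes(header, 10, Integer.BYTES) + " bytes");
        System.out.println("int[n][2]  ~ " + int2DBytes(header, ref, n, 2) + " bytes");
        System.out.println("int[2][n]  ~ " + int2DBytes(header, ref, 2, n) + " bytes");
        System.out.println("raw data   = " + (n * 2 * Integer.BYTES) + " bytes");
    }
}
```

Under these assumptions, every two-element row of int[n][2] pays for its own header and for a reference in the outer array, while int[2][n] stays close to the size of the raw data. So the shape of the array matters at least as much as the choice of language.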
A plain array can be faster because an ArrayList is backed by a fixed-size array. When you add an element and that backing array is full, the ArrayList allocates a new, larger array and copies every element from the old one into it.
1) Why is C/C++ always preferable to Java for a large database/data structure? I can see it for C, perhaps, but C++ is also an OOP language. So how does it gain an advantage over Java?
Remember that a Java array (of objects) [1] is actually an array of references. For simplicity, let's look at a 1D array:
java:
[ref1,ref2,ref3,...,refN]
ref1 -> object1
ref2 -> object2
...
refN -> objectN
c++:
[object1,object2,...,objectN]
In the C++ version, the array holds the objects themselves rather than references to them, so the overhead of the references disappears. If the objects are small, this overhead can indeed be significant.
Also, as I already stated in the comments, there is another issue when allocating small objects in arrays in C++ vs Java. In C++, you allocate an array of objects and they are contiguous in memory, while in Java the objects themselves are not. In some cases this can give the C++ program much better performance, because it is much more cache efficient than the Java program. I once addressed this issue in this thread.
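For primitives, Java can get a similar contiguous layout by storing the table in one flat int[] and computing the index by hand. Below is a minimal sketch of that idea; the class name FlatPairs and its methods are hypothetical, chosen only for this example.

```java
public class FlatPairs {
    // Two ints per row, stored back to back: [r0c0, r0c1, r1c0, r1c1, ...]
    private final int[] data;

    public FlatPairs(int rows) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        data = new int[Math.multiplyExact(rows, 2)];
    }

    public int get(int row, int col) {
        return data[2 * row + col];
    }

    public void set(int row, int col, int value) {
        data[2 * row + col] = value;
    }

    public int rows() {
        return data.length / 2;
    }

    public static void main(String[] args) {
        FlatPairs table = new FlatPairs(1_000);
        table.set(1, 0, 7);
        table.set(1, 1, 9);
        System.out.println(table.get(1, 0) + ", " + table.get(1, 1));
    }
}
```

Note that a single Java array is indexed by int, so a table with Int-Max rows and 2 columns will not fit into one flat array. In that case you would need, for example, one array per column or several chunks.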
2) Should I stay with Java, or will their suggestion (switching to C++) be helpful in the future in a large database/data structure environment? Any suggestions?
I don't believe we can answer that for you. You should be aware of all the pros and cons of each language for your purpose (memory efficiency, libraries you can use, development time, ...) and make a decision. Don't be afraid to get advice from senior developers in your company, who have more information about the system than we do.
If there were a simple, easy, and generic answer to this question, we engineers would not be needed, would we?
You can also write a stub algorithm that uses the expected array size and profile it before implementing the core, to see what the real difference is likely to be (assuming the array is indeed the main space consumer).
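A profiling stub along those lines could look like the sketch below. It is only a rough probe: the class and method names are made up, the heap readings from Runtime are approximate, and System.gc() is only a hint to the JVM. Treat the printed numbers as an indication and use a real profiler before deciding anything.

```java
public class FootprintProbe {
    // Approximate number of heap bytes currently in use.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Table laid out as int[rows][2]: many tiny inner arrays.
    static int[][] rowMajor(int rows) {
        int[][] t = new int[rows][2];
        for (int i = 0; i < rows; i++) {
            t[i][0] = i;
            t[i][1] = -i;
        }
        return t;
    }

    // Same data laid out as int[2][rows]: two large inner arrays.
    static int[][] columnMajor(int rows) {
        int[][] t = new int[2][rows];
        for (int i = 0; i < rows; i++) {
            t[0][i] = i;
            t[1][i] = -i;
        }
        return t;
    }

    // Stub "algorithm": touches every element so the tables are really used.
    static long checksum(int[][] table) {
        long sum = 0;
        for (int[] row : table) {
            for (int v : row) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int rows = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;

        System.gc(); // a hint only; the JVM may ignore it
        long before = usedHeap();
        int[][] a = rowMajor(rows);
        long afterA = usedHeap();
        int[][] b = columnMajor(rows);
        long afterB = usedHeap();

        System.out.println("int[rows][2] ~ " + (afterA - before) + " bytes");
        System.out.println("int[2][rows] ~ " + (afterB - afterA) + " bytes");
        // Using both tables here keeps them reachable during the measurement.
        System.out.println("checksum: " + (checksum(a) + checksum(b)));
    }
}
```

Run it with the row count you actually expect (and a large enough -Xmx), then compare the figures with what the equivalent C or C++ array would need.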
[1] The overhead I describe above is not relevant for arrays of primitives. In these cases (primitives) the arrays are arrays of values, not of references, same as in C++, with minor overhead for the array itself (the length field, for example).