I am pretty impressed by the C++ library Eigen, which uses expression templates to gain enormous speedups in matrix/vector calculations.
I would like to clone this library in Scala. As far as I know, Scala's type system is not powerful enough to do something like this, but it should be possible with lightweight modular staging (LMS). There seem to be several projects out there (Delite, virtualization-lms, etc.). Which would be the right one to use for this kind of project in terms of reliability and performance?
Thanks
Edit: I just came across macros in Scala 2.10. Maybe this is what I want to use here.
@om-nom-nom
The important part is explained in http://eigen.tuxfamily.org/dox/TopicInsideEigenExample.html
The example explains that a vector addition
u = v + w
does not perform well in naive C++ (plain operator overloading), since a temporary variable is created for the addition and this temporary is then assigned to u:
for(int i = 0; i < size; i++) tmp[i] = v[i] + w[i];
for(int i = 0; i < size; i++) u[i] = tmp[i];
Eigen uses template metaprogramming (explained step by step in the link above) to reduce this at compile time to
for(int i = 0; i < size; i++) u[i] = v[i] + w[i];
which is obviously faster and does not need an extra variable.
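For comparison, here is a rough sketch of the same idea in Scala (my own illustration, not taken from Eigen or from any of the libraries mentioned above). Plain operator overloading can at least defer the work so that no temporary array is materialized: the addition is fused into a single loop when the result is assigned. It does not remove the abstraction at compile time the way Eigen's templates (or LMS/macros) would, but it shows the fusion being discussed. The names VecExpr, Vec and Sum are hypothetical.

// Runtime-only sketch of expression-template-style fusion in Scala.
trait VecExpr {
  def length: Int
  def apply(i: Int): Double                       // element computed on demand
  def +(that: VecExpr): VecExpr = Sum(this, that) // builds an expression node, no loop yet
}

final case class Vec(data: Array[Double]) extends VecExpr {
  def length = data.length
  def apply(i: Int) = data(i)
  def :=(expr: VecExpr): Unit = {                 // single loop, no temporary array
    var i = 0
    while (i < length) { data(i) = expr(i); i += 1 }
  }
}

final case class Sum(a: VecExpr, b: VecExpr) extends VecExpr {
  def length = a.length
  def apply(i: Int) = a(i) + b(i)
}

// u := v + w evaluates v(i) + w(i) directly into u's array: one pass, no tmp.

The remaining cost compared to Eigen is the per-element virtual call through VecExpr, which is exactly what compile-time staging (LMS) or macros would be used to eliminate.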
To answer this question properly, you need to ask yourself a few extra questions:
Are you really sure that C++ code with templates performs better than Scala code? Modern benchmarks show Java being faster than C++ on matrix inversion; this is mainly due to improvements in VMs as well as in hardware.
How big are the business benefits you would obtain from a faster matrix tool, compared to the increased cost of dealing with unmanaged memory, dangling pointers, and the additional errors and bugs that come with coding in C++?
If the problem can be solved with satisfactory performance in Scala at small scale, and the differences only become significant at large scale, wouldn't it be reasonable to look at dividing the problem (matrix/vector multiplication) into tasks that can be executed in parallel? (A rough sketch follows this list.)
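As a minimal sketch of that last point (my own illustration, with a hypothetical helper name), a large vector addition can be split into chunks that Scala's parallel collections execute concurrently:

// Hypothetical sketch: chunked, parallel vector addition via Scala parallel collections.
def addInParallel(u: Array[Double], v: Array[Double], w: Array[Double]): Unit = {
  val chunk = 1 << 16                               // chunk size to tune per cache/core count
  (0 until u.length by chunk).par.foreach { start =>
    val end = math.min(start + chunk, u.length)
    var i = start
    while (i < end) { u(i) = v(i) + w(i); i += 1 }  // each chunk runs on a pool thread
  }
}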
Personal note: I had a few email discussions with Joshua Bloch, one of the most influential Java developers ever and author of Effective Java, and he pointed me towards an interesting presentation by Brian Goetz (author of Java Concurrency in Practice and very influential in the Java world as well): Not Your Father's Von Neumann Machine: A Crash Course in Modern Hardware.
If you conclude that the benefit is there and is significant, and that your problem size won't grow in the future to the point where you would need multi-core execution, you probably need to stay in C++. Otherwise, have a look at Scala macros, which have been available since 2.10-M3.
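For illustration only, here is a rough, untested sketch (names are mine, not from any library) of what a Scala 2.10-style def macro for fused vector addition might look like. The macro expands at the call site into a single while loop, so no intermediate vector is ever allocated:

import scala.language.experimental.macros
import scala.reflect.macros.Context   // Scala 2.10 macro API

object VecMacros {
  // Hypothetical entry point; usage: VecMacros.addInto(u, v, w)
  def addInto(u: Array[Double], v: Array[Double], w: Array[Double]): Unit = macro addIntoImpl

  def addIntoImpl(c: Context)(u: c.Expr[Array[Double]], v: c.Expr[Array[Double]],
                              w: c.Expr[Array[Double]]): c.Expr[Unit] = {
    import c.universe._
    reify {
      // expanded inline at the call site: one loop, no temporary array
      val uu = u.splice; val vv = v.splice; val ww = w.splice
      var i = 0
      while (i < uu.length) { uu(i) = vv(i) + ww(i); i += 1 }
    }
  }
}

Note that the macro definition must be compiled before (in a separate compilation run from) the code that calls it.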
Extra: avoiding an intermediate variable does not make much sense in languages which run on top of a VM such as Java or C#. In fact, as the article you pointed to correctly describes, there is a certain hazard in how the JVM translates Java bytecode into machine code with the JIT. Many of the optimizations you would apply by hand are already applied by the JVM, and much of this reasoning becomes moot if you take the precaution of declaring methods and variables final.