I am curious why Raku performs so badly when manipulating multidimensional arrays. I made a quick test initializing a two-dimensional matrix in Python, C# and Raku, and the elapsed time is surprisingly high for the latter.
For Raku
my @grid[4000;4000] = [[0 xx 4000] xx 4000];
# Elapsed time 42 seconds !!
For Python
table= [ [ 0 for i in range(4000) ] for j in range(4000) ]
# Elapsed time 0.51 seconds
For C#
int [,]matrix = new int[4000,4000];
//Just to mimic the same behaviour
for(int i=0;i<4000;i++)
    for(int j=0;j<4000;j++)
        matrix[i,j] = 0;
// Elapsed time 0.096 seconds
Am I doing something wrong? It seems like way too much of a difference.
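A minimal wall-clock harness like the following reproduces the Python measurement (a sketch; exact figures of course vary by machine):

```python
# Sketch of a wall-clock harness for the Python timing above;
# the 4000x4000 size matches the snippet in the question.
import time

n = 4000
start = time.perf_counter()
table = [[0 for i in range(n)] for j in range(n)]
elapsed = time.perf_counter() - start
print(f"{n}x{n} table initialized in {elapsed:.2f} seconds")
```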
I'll begin with code that is much more closely aligned with your Python code than your own translation. I think the Raku code that's most directly equivalent to your Python is:
my \table = [ [ 0 for ^4000 ] for ^4000 ];
say table[3999;3999]; # 0
This code declares a sigil-free identifier1. It:

- Drops "shaping" (the [4000;4000] in my @table[4000;4000]). I dropped it because your Python code isn't doing it. Shaping confers advantages but has performance implications.2

- Uses binding instead of assignment. I switched to binding because your Python code is doing binding, not assignment. (Python doesn't distinguish between the two.) While Raku's assignment approach brings fundamental advantages that are worth having for general code, it has performance implications.3
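The distinction is easy to see from the Python side: Python's = binds a name to an object rather than copying values into a container, which becomes observable when two names end up bound to the same list (a hypothetical illustration, not part of the original benchmarks):

```python
# Python's '=' binds names to objects; nothing is copied.
inner = [0, 0, 0]
table = [inner, inner]       # both rows are bound to the SAME list object

table[0][0] = 99
print(table[1][0])           # 99 -- the write is visible through both rows
print(table[0] is table[1])  # True

# Raku assignment into an @-sigil'd array would instead copy values into
# fresh containers; Raku's ':=' binding matches the Python behaviour above.
```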
This code I've begun my answer with is still slow.
First, the Raku code, run via a Rakudo compiler from December 2018, is about 5 times slower than your Python code, using a Python interpreter from June 2019, on the same hardware.3
Second, both the Raku code and the Python code are slow, e.g. compared to your C# code. We can do better...
The following code is worth considering:
my \table = [ [ 0 xx Inf ] xx Inf ];
say table[ 100_000; 100_000 ]; # 0
Despite this code corresponding to a notional 10 billion element array rather than the mere 16 million element one in your Python and C# code, the wallclock time for running it is less than half that of the Python code, and just 5 times slower than the C# code. That suggests Rakudo is running the Raku code over a thousand times as fast as the equivalent Python code, and a hundred times as fast as the C# code.
The Raku code appears to be so much faster because the table is being initialized lazily by using xx Inf.4 The only significant work is done on running the say line. This causes creation of 100,000 first-dimension arrays, and then populating just the 100,000th second-dimension array with 100,000 elements, so that the say can display the 0 held in that array's last element.
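The shape of that laziness can be sketched in Python too (an illustrative analogue using collections.defaultdict, not a translation of the Raku): a row only comes into existence the first time it is indexed.

```python
from collections import defaultdict

# A lazily-populated "2-D table" of zeros: each row is itself a lazy
# mapping, created only on first access.
table = defaultdict(lambda: defaultdict(int))

print(table[100_000][100_000])  # 0 -- reads like a huge zero-filled table
print(len(table))               # 1 -- but only one row was materialized
```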
One issue underlying your question is that there's always more than one way to do it.5 If you encounter poor performance for code where speed is critical, coding it differently, as I have done, may make a dramatic difference.6
(Another really good option is to ask an SO question...)
Raku is carefully designed to be highly optimizable, i.e. able, given sufficient compiler work over coming years, to one day run much faster than, say, Perl 5 or Python 3 can in theory ever run unless they go through a ground-up redesign and years of corresponding compiler work.
A somewhat OK analogy is what happened with Java's performance over the last 25 years. Rakudo/NQP/MoarVM are about halfway through the maturing process that the Java stack has gone through.
1 I could have written my $table := .... But declarations of the form my \foo ... eliminate consideration of sigils and allow use of = rather than :=, which would be required with a sigil'd identifier. (As a bonus, "slashing out the sigil" results in a sigil-free identifier, familiar to coders in the many languages that don't use sigils, which of course includes Python and C#.)
2 Shaping may one day result in faster array operations for some code. In the meantime, as mentioned in comments on your question, it clearly currently does the opposite, significantly slowing code down. I imagine that's in large part because every array access is currently being naively dynamically bounds-checked, slowing everything down, and because no effort has yet been made to use the fixed size to speed things up. In addition, when I tried to come up with a fast workaround for your code, I failed to find one using the fixed-size array, because many operations on fixed-size arrays are currently unimplemented. Again, these will hopefully be implemented one day, but they have presumably not been a sufficient pain point for anyone to work on them to date.
3 At the time of writing, TIO is using Python 3.7.4, from June 2019, and Rakudo v2018.12, from December 2018. Rakudo's performance is currently improving over time significantly faster than the official Python 3 interpreter's, so I would expect the gap between the latest Rakudo and the latest Python, where Rakudo is slower, to be significantly narrower than stated in this answer. In particular, current work is significantly improving the performance of assignments.
4 xx defaults to lazy processing, but some expressions force eager evaluation due either to language semantics or current compiler limitations. In the year-old v2018.12 Rakudo, for an expression of the form [ [ foo xx bar ] xx baz ] to remain lazy and not be forced to evaluate eagerly, both bar and baz must be Inf. In contrast, my \table = [0 xx 100_000 for ^100_000] is lazy with no use of Inf. (The latter code actually stores 100,000 Seqs in the first dimension rather than 100,000 Arrays -- say WHAT table[0] displays Seq rather than Array -- but most code won't be able to spot the difference -- say table[99_999;99_999] will still display 0.)
5 Some folk think that it's a weakness to accept that there's more than one way to think about and code solutions to given problems. In reality it's a strength in at least three regards. First, in general, any given non-trivial problem can be solved by many distinct algorithms with dramatic differences in performance profile. This answer includes an approach, already available with a year-old Rakudo, that will be more than a thousand times faster than Python in practice in some scenarios. Second, a deliberately flexible and multi-paradigm language like Raku allows a coder (or team of coders) to express a solution that they consider elegant and maintainable, or that just gets the job done, based on what they think is best, not what the language imposes. Third, Rakudo's performance as an optimizing compiler is currently notably variable. Fortunately it has a great profiler6, so one can see where a bottleneck is, and great flexibility, so one can try alternate coding that may produce vastly faster code.
6 When performance matters, or if you're investigating performance issues, consult the Raku doc page on performance; the page covers a range of options including use of the Rakudo profiler.
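As a cross-language sketch of that workflow (Python's stdlib profiler here, standing in for Rakudo's), one can confirm where the time goes in the eager initialization:

```python
import cProfile
import io
import pstats

def build_table(n):
    # The eager nested-list initialization from the question, scaled down.
    return [[0 for _ in range(n)] for _ in range(n)]

profiler = cProfile.Profile()
profiler.enable()
build_table(500)
profiler.disable()

# Report the hottest calls by cumulative time; build_table dominates.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```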