Consider the following code:
a=rand(10000); b=rand(10000);
tic; 2*(a<b); toc;
tic; 2.*(a<b); toc;
The result is:
Elapsed time is 0.938957 seconds.
Elapsed time is 0.426517 seconds.
Why is the second case twice as fast as the first?
Edit: I obtain the same result for any matrix size, in whatever order I run the tests, with
(a<b).*3.56 vs (a<b)*3.56
for example, but not with
(a.*b)*2 vs (a.*b).*2
or
(a*b)*2 vs (a*b).*2
It seems there is a link with the logical array, because I have the same result with
(a&b)*2 vs (a&b).*2
Computer: MATLAB R2015b, Windows 10 x64
I suggest performing a stricter performance check. Put your test in a named function so that MATLAB can optimize both pieces of code, and run each version several times, keeping the quickest runtime. My hunch is that they should take the same amount of time, although I can't check right now with reasonable matrix sizes. Here's what I'd do:
function product_timing(N)
a = rand(N);
b = rand(N);

tmin = inf;
for k = 1:10
    tic;
    res1 = 2*(a<b); %#ok<NASGU> result unused, kept to prevent dead-code elimination
    t = toc;
    if t < tmin
        tmin = t;
    end
end
disp(tmin);

tmin = inf;
for k = 1:10
    tic;
    res2 = 2.*(a<b); %#ok<NASGU>
    t = toc;
    if t < tmin
        tmin = t;
    end
end
disp(tmin);
end
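As an aside (my suggestion, not part of the original post), the manual tic/toc loop above can be replaced by timeit (available since R2013b), which handles warm-up and repetition itself:

% Sketch using timeit instead of the manual tic/toc loop; timeit calls
% each function handle several times and returns a robust measurement.
N = 5000;                          % smaller than the original 10000 to keep runs short
a = rand(N);
b = rand(N);

t_mtimes = timeit(@() 2*(a<b));    % scalar * logical  (mtimes)
t_times  = timeit(@() 2.*(a<b));   % scalar .* logical (times)
fprintf('mtimes: %.4f s, times: %.4f s\n', t_mtimes, t_times);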
On my R2012b there doesn't seem to be a marked difference between the two methods. However, as others have indicated, R2015b with its new execution engine makes all the difference.
While I'm still unsure about the answer, let me collect the feedback from @x1hgg1x (comments on both this answer and the question) and @LuisMendo (in chat), just to elaborate on my ignorance: c*3.56 is an integer factor (the number of threads?) slower than c.*3.56 (with any scalar) if c is logical, but not if c is uint8 or double.
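A minimal sketch (my own, with an arbitrary matrix size) to check this class dependence with timeit:

% Compare scalar * vs scalar .* across classes; per the discussion above,
% only the logical input is expected to show a gap on R2015b.
n = 4000;
c_log = rand(n) < 0.5;        % logical
c_u8  = uint8(255*rand(n));   % uint8
c_dbl = rand(n);              % double

for c = {c_log, c_u8, c_dbl}
    x = c{1};
    fprintf('%7s:   *: %.4f s   .*: %.4f s\n', class(x), ...
            timeit(@() x*3.56), timeit(@() x.*3.56));
end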
As it's stated on a MATLAB product page:
Run your programs faster with the redesigned MATLAB® execution engine.
The improved architecture uses just-in-time (JIT) compilation of all MATLAB code with a single execution pathway. The engine offers improved language quality and provides a platform for future enhancements.
Specific performance improvements include those made to:
...
Element-Wise Math Operations
The execution of many element-wise math operations is optimized. These operations are element-by-element arithmetic operations on arrays such as the following:
>> b = ((a+1).*a)./(5-a);
However, looking at the docs of .* and *, I can't see much information relating to the problem. A note from the "Array vs. Matrix Operations" page concerning array operations like .*:

If one operand is a scalar and the other is not, then MATLAB applies the scalar to every element of the other operand. This property is known as scalar expansion because the scalar expands into an array of the same size as the other input, then the operation executes as it normally does with two arrays.
And the doc of the matrix product * says:

If at least one input is scalar, then A*B is equivalent to A.*B and is commutative.

As we see, the claimed equivalence of A*B and A.*B is arguable. They are equivalent mathematically, but something strange is going on performance-wise.
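Scalar expansion does mean the two expressions produce identical values, whatever their relative speed; a quick check (my own sketch):

% The results are numerically identical; only the code path differs.
a = rand(100);
b = rand(100);
isequal(2*(a<b), 2.*(a<b))   % returns logical 1 (true)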
Due to the above notes, and the fact that the performance difference only arises for logical arrays, I would consider this an undocumented feature. I would've thought that it's related to logicals only occupying 1 byte each, but the speed-up doesn't manifest with uint8 arrays. I suspect that since logicals actually carry their information in a single bit, some internal optimization is possible. This still doesn't explain why mtimes doesn't do the same, and it's surely related to the internal workings of times vs mtimes.

One thing is sure: times doesn't actually fall back on mtimes for scalar operands (maybe it should?). Since the whole effect is missing in R2012b, I believe that the optimized array operations of the new execution engine mentioned above treat logical arrays separately, allowing the special case scalar.*logical_array to be sped up, while the same optimization is missing from mtimes.