Suppose I have a map like this:
var map = {"a" : 100, "b" : 200, "c": 700};
And I want an Array consisting of "a" 100 times, "b" 200 times and "c" 700 times:
map_array = [a, a, a, a, ... a, b, b, b, ... b, c, c, c, ... c]
A simple solution is to loop frequency-many times for each key and push the key into the array:
var map_array = [];
for (var key in map)
{
    for (var i = 1; i <= map[key]; i++)
    {
        map_array.push(key);
    }
}
But this will obviously take time to process with large data. Can we rework the above function to make it more efficient?
JavaScript's String repeat() method returns a new string containing a given number of copies of the original string, without changing the original. Combined with split(), it offers a very concise way to build the repeated array, as sketched below.
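Here is a minimal sketch of that approach, applied to the original map; it assumes, as in the question, that every key is a single character, since split("") breaks the joined string apart character by character:

var map = {"a": 100, "b": 200, "c": 700};
// Build one run of repeated characters per key with repeat(),
// join the runs, then split the combined string into characters.
var map_array = Object.keys(map)
    .map(function (key) { return key.repeat(map[key]); })
    .join("")
    .split("");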
It seems to me that the real problem here is constructing the sub-arrays of repeated "a"'s, "b"'s, and "c"'s. Once you have them, you can just concat them to make your final array.
So, what we really want is a function f(x, n) which creates an array filled with n copies of x.
So, as a standard testbed, I'm going to define a pair of clock functions. The first measures the time it takes some array-filling function to create 500000 arrays, each containing 2187 "a"'s. The second measures the time it takes some array-filling function to create 500 arrays, each containing 1594323 "a"'s. I chose powers of three because some of my algorithms are binary-based, and I wanted to avoid any coincidences. Regardless, all of the algorithms work for any n.
var clock1 = function (f)
{
    var m, t;
    m = 500000;          // many small arrays
    t = Date.now();
    while (m--)
    {
        f("a", 2187);
    }
    t = Date.now() - t;
    return t;
};

var clock2 = function (f)
{
    var m, t;
    m = 500;             // a few very large arrays
    t = Date.now();
    while (m--)
    {
        f("a", 1594323);
    }
    t = Date.now() - t;
    return t;
};
I'm running this test on my local machine running plain v8 in strict mode. Below are some candidates for f:
As already suggested by Alex, you can do this using a linear loop. Simply define an array and run a loop which executes n times, each time adding one x to our array.
var f = function (x, n)
{
    var y;
    y = Array(n);        // pre-allocate to the target length
    while (n--)          // fill backwards by direct assignment
    {
        y[n] = x;
    }
    return y;
};
We can optimize by using the counting variable n to avoid calling push or reading y.length, as well as by pre-initializing the array to the desired length. (Both suggested by Alex.) My backwards while loop is just an old habit that may boost performance slightly.
This function takes 2200ms to pass clock1, and 90658ms to pass clock2.
We can also try constructing it using binary concatenation. The idea is that you start out with a single-element array; then, if its length is significantly less than the target length, you concat it with itself, effectively doubling it. When you get it close to the target size, switch back to adding elements one at a time until it reaches its target size:
var f = function (x, n)
{
    var y, m;
    y = [x];
    m = 1;                   // m tracks the current length of y
    while (m < n)
    {
        if (m * 2 <= n)
        {
            y = y.concat(y); // double the array while it still fits
            m *= 2;
        }
        else
        {
            y[m] = x;        // top up one element at a time
            m++;
        }
    }
    return y;
};
Here, m is just a counting variable to keep track of the size of y.
This function takes 3630ms to pass clock1, and 42591ms to pass clock2, making it 65% slower than the linear method for small arrays, but 112% faster for large ones.
We can boost performance still further, however, by using full binary construction. The partial binary method suffers because it is forced to switch to element-by-element addition when it approaches its target length (on average, about 75% of the way through). We can fix this:
First, convert the target size into binary and save it to an array. Now, define y to be a single-element array and z to be an empty array. Then, loop (backwards) through the binary array, on each iteration concating y with itself. If the respective binary digit is 1, save the current y into z. Finally, concat all of the elements of z together. The result is your complete array.
So, in order to fill an array of length 700, it first converts 700 to binary:
[1,0,1,0,1,1,1,1,0,0]
Looping backwards across it, it performs 9 concats and 6 element-additions, generating a z which looks like this:
[4, 8, 16, 32, 128, 512]
// I've written the lengths of the sub-arrays rather than the arrays themselves.
When it concats everything in z together (4 + 8 + 16 + 32 + 128 + 512 = 700), it gets a single array of length 700, our result.
var f = function (x, n)
{
    var y, z, c;
    c = 0;
    y = [x];                 // current power-of-two block
    z = [];                  // blocks corresponding to the 1-bits of n
    while (n > 0)
    {
        if (n % 2)           // current binary digit is 1
        {
            z[c++] = y;
            n--;
        }
        if (n === 0)         // avoid one final, useless doubling
        {
            break;
        }
        n /= 2;
        y = y.concat(y);     // double the block for the next digit
    }
    return z.concat.apply([], z);  // flatten z into one array
};
To optimize, I've compressed the binary conversion step and the loop together here. z.concat.apply([],z) uses a bit of apply magic to flatten z, an array of arrays, into a single array. For some reason, this is faster than concating to z on the fly. The second if statement prevents it from doubling y one last time after the computation is complete.
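In case the apply trick looks opaque, here is a small standalone illustration: apply invokes concat with the sub-arrays of z spread out as individual arguments, joining them in a single pass.

var z = [[1], [2, 2], [3, 3, 3]];
var flat = z.concat.apply([], z);  // same as [].concat([1], [2, 2], [3, 3, 3])
// flat is now [1, 2, 2, 3, 3, 3]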
This function takes 3157ms to pass clock1 and 26809ms to pass clock2, making it 15% faster than the partial binary method for small arrays and 59% faster for large ones. It is still 44% slower than the linear method for small arrays.
The concat function is weird: the larger the arrays being concatenated, the more efficient it becomes, relatively speaking. In other words, combining two arrays of length 100 is significantly faster than combining four arrays of length 50 using concat. As a result, as the involved arrays get larger, concat becomes more efficient than push or direct assignment. This is one of the main reasons why the binary methods are faster than the linear method for large arrays. Unfortunately, concat also suffers because it copies the involved arrays each time. Because arrays are objects, this gets quite costly. Strings are less complex than arrays, so perhaps using them would avoid this drain? We can simply use string addition (analogous to concatenation) to construct our array, and then split the resulting string.
A full binary method based on strings would look like this:
var f = function (x, n)
{
    var y, z;
    y = "" + x;          // current power-of-two block, as a string
    z = "";              // accumulates blocks for the 1-bits of n
    while (n > 0)
    {
        if (n % 2)
        {
            z += y;
            n--;
        }
        if (n === 0)
        {
            break;
        }
        y += y;          // double the block
        n /= 2;
    }
    return z.split("");  // note: assumes x is a single character
};
This function takes 3484ms to pass clock1 and 14534ms to pass clock2, making it 10% slower than the array-based full binary method at computing small arrays, but 85% faster for large arrays.
So, overall, it's a mixed bag. The linear method gets very good performance on smaller arrays and is extremely simple. The binary string method, however, is a whopping 524% faster on large arrays, and is actually slightly less complex than the binary array method.
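To connect this back to the original map question: whichever f you pick, you can build map_array by concatenating one generated block per key. A minimal sketch (f here stands for any of the candidates above; substitute whichever variant benchmarks best for your data):

var map = {"a": 100, "b": 200, "c": 700};
var map_array = [];
for (var key in map)
{
    // f is one of the array-filling functions defined above
    map_array = map_array.concat(f(key, map[key]));
}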
Hope this helps!