I'm trying to solve a rather large linear programming problem in Mathematica, but for some reason the bottleneck is setting up the array of linear constraints.
My code for initializing the matrix looks like this:
AbsoluteTiming[S = SparseArray[{{i_, 1} -> iaa[[i]],
{i_, j_} /; Divisible[a[[j]], aa[[i]]] -> 1.}, {2*n, n}]]
Here n is 4455, iaa is a list of reals, and a, aa are lists of integers. The output I get for this line is
{2652.014773,SparseArray[<111742>,{8910,4455}]}
In other words, it takes about 44 minutes to initialize this matrix, even though it only has 111,742 nonzero entries. For comparison, actually solving the linear program only takes 17 seconds. What gives?
Edit: Also, can anyone explain why this takes up so much memory while it is running? In user time, this calculation takes less than ten minutes; most of the wall-clock time is spent paging through memory.
Is Mathematica for some reason storing this matrix as a non-sparse matrix while it is building it? Because that would be really really dumb.
You can surely do a lot better. Here is code based on the low-level sparse array API posted here, which I will reproduce to make the code self-contained:
ClearAll[spart, getIC, getJR, getSparseData, getDefaultElement, makeSparseArray];
HoldPattern[spart[SparseArray[s___], p_]] := {s}[[p]];
getIC[s_SparseArray] := spart[s, 4][[2, 1]];
getJR[s_SparseArray] := Flatten@spart[s, 4][[2, 2]];
getSparseData[s_SparseArray] := spart[s, 4][[3]];
getDefaultElement[s_SparseArray] := spart[s, 3];
makeSparseArray[dims : {_, _}, jc : {__Integer}, ir : {__Integer}, data_List, defElem_: 0] :=
SparseArray @@ {Automatic, dims, defElem, {1, {jc, List /@ ir}, data}};
Clear[formSparseDivisible];
formSparseDivisible[a_, aa_, iaa_, chunkSize_: 100] :=
  Module[{getDataChunkCode, i, start, ic, jr, sparseData, dims, dataChunk, res},
    (* Divisibility mask for the i-th chunk of a, as a sparse matrix; {} once a is exhausted *)
    getDataChunkCode :=
      If[# === {}, {}, SparseArray[1 - Unitize@(Mod[#, aa] & /@ #)]] &@
        If[i*chunkSize >= Length[a],
          {},
          Take[a, {i*chunkSize + 1, Min[(i + 1)*chunkSize, Length[a]]}]];
    i = 0;
    start = getDataChunkCode;
    i++;
    ic = getIC[start];
    jr = getJR[start];
    sparseData = getSparseData[start];
    dims = Dimensions[start];
    (* Append the remaining chunks row-wise *)
    While[True,
      dataChunk = getDataChunkCode;
      i++;
      If[dataChunk === {}, Break[]];
      ic = Join[ic, Rest@getIC[dataChunk] + Last@ic];
      jr = Join[jr, getJR[dataChunk]];
      sparseData = Join[sparseData, getSparseData[dataChunk]];
      dims[[1]] += First[Dimensions[dataChunk]];
    ];
    (* Rows so far index a; transpose so that rows index aa, then fill column 1 *)
    res = Transpose[makeSparseArray[dims, ic, jr, sparseData]];
    res[[All, 1]] = N@iaa;
    res]
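To make the row-wise stacking step concrete, here is a small sketch on two toy matrices. It assumes the helper definitions above (getIC, getJR, getSparseData, makeSparseArray) have already been evaluated; the names top, bottom and stacked are made up for this illustration, and the commented values are what the compressed-row layout should give.

```mathematica
(* Assumes getIC, getJR, getSparseData and makeSparseArray from above are defined. *)
top = SparseArray[{{1, 2} -> 1, {2, 1} -> 1, {2, 3} -> 1}, {2, 3}];
bottom = SparseArray[{{1, 3} -> 1}, {1, 3}];

getIC[top]     (* row pointers, should be {0, 1, 3} *)
getJR[top]     (* column indices, should be {2, 1, 3} *)
getIC[bottom]  (* should be {0, 1} *)

(* Stack the chunks: shift the second chunk's row pointers by the running
   count of stored elements, then concatenate indices and values. *)
ic = Join[getIC[top], Rest@getIC[bottom] + Last@getIC[top]];  (* {0, 1, 3, 4} *)
jr = Join[getJR[top], getJR[bottom]];                         (* {2, 1, 3, 3} *)
data = Join[getSparseData[top], getSparseData[bottom]];

stacked = makeSparseArray[{3, 3}, ic, jr, data];
Normal[stacked] == Join[Normal[top], Normal[bottom]]  (* should be True *)
```

This is the same update that the While loop performs for each chunk, so no dense intermediate of the full size is ever created.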
Now, here are the timings:
In[249]:=
n = 1500;
iaa = aa = Range[2 n];
a = Range[n];
AbsoluteTiming[res = formSparseDivisible[a, aa, iaa, 100];]
Out[252]= {0.2656250, Null}
In[253]:= AbsoluteTiming[
res1 = SparseArray[{{i_, 1} :>
iaa[[i]], {i_, j_} /; Divisible[a[[j]], aa[[i]]] -> 1.}, {2*n, n}];]
Out[253]= {29.1562500, Null}
So, we've got a 100-fold speedup for this size of the array. And of course, the results are the same:
In[254]:= Normal@res1 == Normal@res
Out[254]= True
The main idea of the solution is to vectorize the problem (Mod), and build the resulting sparse array incrementally, in chunks, using the low-level API above.
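As a minimal illustration of the vectorization step on its own (the toy lists below are made up for this example): Mod threads over a list of divisors, and Unitize turns the remainders into a 0/1 mask.

```mathematica
a = {6, 8, 9};
aa = {1, 2, 3, 4};

(* Mod threads over the list aa, so each element of a yields a vector of remainders. *)
remainders = Mod[#, aa] & /@ a
(* {{0, 0, 0, 2}, {0, 0, 2, 0}, {0, 1, 0, 1}} *)

(* Unitize maps nonzero -> 1 and zero -> 0; subtracting from 1 flags divisibility. *)
mask = 1 - Unitize[remainders]
(* {{1, 1, 1, 0}, {1, 1, 0, 1}, {1, 0, 1, 0}} *)

(* Same result as testing every pair with Divisible, one entry at a time. *)
mask == Table[Boole[Divisible[x, d]], {x, a}, {d, aa}]
(* True *)
```

Each row of mask corresponds to one element of a, which is why the assembled matrix is transposed at the end of formSparseDivisible.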
EDIT
The code assumes that the lists have the right lengths: in particular, a should have length n, while aa and iaa should have length 2n. So, to compare with other answers, the test code has to be slightly modified (for a only):
n = 500;
iaa = RandomReal[{0, 1}, 2 n];
a = Range[ n]; aa = RandomInteger[{1, 4 n}, 2 n];
In[300]:=
AbsoluteTiming[U=SparseArray[ReplacePart[Outer[Boole[Divisible[#1,#2]]&,
a[[1;;n]],aa],1->iaa]]\[Transpose]]
AbsoluteTiming[res = formSparseDivisible[a,aa,iaa,100]]
Out[300]= {0.8281250,SparseArray[<2838>,{1000,500}]}
Out[301]= {0.0156250,SparseArray[<2838>,{1000,500}]}
In[302]:= Normal@U==Normal@res
Out[302]= True
EDIT 2
A matrix of your desired size is built in about 3 seconds on my not very fast laptop (M8), and with fairly decent memory usage as well:
In[323]:=
n=5000;
iaa=RandomReal[{0,1},2 n];
a=Range[ n];aa=RandomInteger[{1,4 n},2 n];
AbsoluteTiming[res = formSparseDivisible[a,aa,iaa,200]]
Out[326]= {3.0781250,SparseArray[<36484>,{10000,5000}]}
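If you want to check the memory footprint yourself, one rough way is to compare ByteCount of a sparse matrix with that of its dense form. The sketch below uses a made-up matrix s rather than res, and the exact numbers will depend on version and platform, so none are quoted here.

```mathematica
(* A 2000 x 1000 matrix with only two stored entries. *)
s = SparseArray[{{1, 1} -> 1., {2000, 1000} -> 2.}, {2000, 1000}];

ByteCount[s]          (* storage for the sparse representation *)
ByteCount[Normal[s]]  (* storage after expanding to a dense list of lists *)
```

The dense form has to hold all 2,000,000 entries, so it should come out several orders of magnitude larger than the sparse one.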