Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to pre-allocate array for matrix factorization?

Tags:

julia

My question is instead of F = svd(A), can one first allocate an appropriate memory for an SVD structure, and then do F .= svd(A) ?

What I had in mind is something like the following:

function main()
    F = Vector{SVD}(undef,10)
    # how to preallocate F?
    test(F)
end

function test(F::Vector{SVD})
    for i in 1:10
        F .= svd(rand(3,3))
    end
end
like image 796
Rain Avatar asked Oct 11 '20 02:10

Rain


People also ask

How do you pre allocate an array?

To pre-allocate an array (or matrix) of numbers, you can use the "zeros" function. To pre-allocate an array (or matrix) of strings, you can use the "cells" function. Here is an example of a script showing the speed difference. Make sure you "clear" the array variable if you try the code more than once.

What is Preallocate memory?

"Pre-allocated memory" means that a program should allocate all the required memory blocks once after startup (using the new operator, as usual), rather than allocate memory multiple times during execution and leave memory which is no longer needed for the garbage collector to free.


2 Answers

Your code almost works. But what you probably wanted was this:

using LinearAlgebra

function main()
    F = Vector{SVD}(undef, 10)
    test(F)
end

function test(F::Vector{SVD})
    for i in 1:10
        F[i] = svd(rand(3, 3))
    end
    return F
end

The line that you had in the for loop was this:

F .= svd(rand(3,3))

which does the same operation on every loop, since you were not indexing into F. In particular, this operation was trying to broadcast a single SVD object into all the fields of F on each iteration of the loop. (And that broadcast operation failed because by default structs are treated as iterable objects with a length method, but SVD does not have a length method.)

However, I would recommend against pre-allocating a vector in this situation. First, let's look at the type of F:

julia> typeof(Vector{SVD}(undef, 10))
Array{SVD,1}

The problem with this vector is that it is parameterized by an abstract type. There is a section in the Performance Tips chapter of the manual that advises against this. SVD is an abstract type because the types of its parameters have not been specified. To make it concrete, you need to specify the types of the parameters, like this:

julia> SVD{Float64,Float64,Array{Float64,2}}
SVD{Float64,Float64,Array{Float64,2}}

julia> Vector{SVD{Float64,Float64,Array{Float64,2}}}(undef, 2)
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
 #undef
 #undef

As you can see, it is difficult to correctly specify the concrete type when you are working with complicated types like SVD. Additionally, if you do so, your code will not be as generic as it could be.

A better approach for a problem like this is to use mapping, broadcasting, or a list comprehension. Then the correct output type will automatically be inferred. Here are some examples:

List comprehension

julia> [svd(rand(3, 3)) for _ in 1:2]
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
 SVD{Float64,Float64,Array{Float64,2}}([-0.6357040496635746 -0.2941425771794837 -0.7136949667270628; -0.45459999623274916 -0.6045700314848496 0.654090147040599; -0.6238743500629883 0.7402534845042064 0.2506104028424691], [1.4535849689665463, 0.7212190827260345, 0.05010669163393896], [-0.5975505057447164 -0.588792736048385 -0.5442945039782142; 0.7619724725128861 -0.6283345569895092 -0.15682358121595258; -0.2496624605679292 -0.5084474392397449 0.8241054891903787])
 SVD{Float64,Float64,Array{Float64,2}}([-0.5593632049776268 0.654338345992878 -0.5088753618327984; -0.6687620652652163 -0.7189576326033171 -0.18936003428293915; -0.4897653570633183 0.23439550227070827 0.8397551092645418], [1.8461274187259178, 0.21226179692488983, 0.14194607536315287], [-0.29089551972856004 -0.7086270946133293 -0.6428276887173754; -0.9203610429640889 0.023709029028269546 0.390350397126212; 0.2613720474647311 -0.7051847436823973 0.6590896221923739])

Map

julia> map(_ -> svd(rand(3, 3)), 1:2)
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
 SVD{Float64,Float64,Array{Float64,2}}([-0.5807809149601634 0.5635242755434755 0.5874809951745127; -0.6884131975465821 0.0451903888051729 -0.7239095925620322; -0.43448912329507794 -0.8248625459025509 0.3616918330643316], [1.488618654040125, 0.4122166626927311, 0.004235624485479941], [-0.6721098925787947 -0.2684664121709399 -0.6900681689759235; -0.7384292974335966 0.31185073633575333 0.5978890289498324; -0.05468514413847799 -0.9114136842196914 0.4078414290231468])
 SVD{Float64,Float64,Array{Float64,2}}([-0.3677873424759118 0.8090638526628051 -0.4584191892023337; -0.43071684640222546 -0.5851169278783189 -0.6871107472129654; -0.8241452960126802 -0.055261768200600137 0.5636760310989947], [1.6862363968739773, 0.5899255050748418, 0.24246688716190598], [-0.3751742784957875 -0.7172409091515735 -0.5872050229643736; 0.8600668700980193 -0.505618838823938 0.06807766730822862; -0.3457300098559026 -0.4794945964927631 0.8065703268899])

Broadcasting

julia> g = (rand(3, 3) for _ in 1:2)
Base.Generator{UnitRange{Int64},var"#17#18"}(var"#17#18"(), 1:2)

julia> svd.(g)
2-element Array{SVD{Float64,Float64,Array{Float64,2}},1}:
 SVD{Float64,Float64,Array{Float64,2}}([-0.7988295268840152 0.5443221484534134 -0.256095266807727; -0.5436890668169485 -0.8354777569473182 -0.0798693700362902; -0.257436566171119 0.07543418554831638 0.963346302244777], [1.8188722412547844, 0.3934389096422389, 0.2020398396772306], [-0.7147404794808727 -0.37763644211761316 -0.5886737335538281; -0.6944558966482991 0.4830041206449164 0.5333273169925189; -0.08292800854873916 -0.7899985677359054 0.607474450798845])
 SVD{Float64,Float64,Array{Float64,2}}([-0.5910620103531503 0.3599866268397522 0.7218416228050514; -0.7367495542691711 0.12340124384185132 -0.664809918173956; -0.3283988340440176 -0.9247603805931685 0.1922821996018057], [1.826019614357666, 0.5333148215847028, 0.11639139812894106], [-0.6415954756495915 -0.6888196183142843 -0.33746522643279503; -0.5845558664639438 0.7239484700883465 -0.3663236978948133; -0.4966383841474222 0.037764349353666515 0.8671356118331964])

Furthermore, mapping, broadcasting, and list comprehensions should be just as efficient as pre-allocating the vector. If you're doing a simple mapping, then it's usually easier and more readable to use mapping, broadcasting, or list comprehensions. Pre-allocating vectors is a tool I reserve for writing custom algorithms from scratch.

A final note. In most cases, type parameters are considered an implementation detail and are not a part of the public API for a type. As such, it's best to use generic programming approaches that do not rely on fixing the types for type parameters. Of course there are some exceptions to this rule of thumb, like Array{T,N} and Dict{K,V}.

like image 174
Cameron Bieganek Avatar answered Jan 04 '23 04:01

Cameron Bieganek


There's a differnent way of preallocation -- you can reuse the input array by always overwriting it, with both the rand call and svd's internal needs:

function test!(F::Vector{SVD})
    A = Matrix{Float64}(undef, 3, 3)
    for i in 1:10
        rand!(A)
        F[i] = svd!(A)
    end
end

Cameron's advice still holds. I'd probably use something like

function test()
    A = Matrix{Float64}(undef, 3, 3)
    return map(1:10) do i
        svd!(rand!(A))
    end
end

given that the number of loops seems not be the critical part.

like image 23
phipsgabler Avatar answered Jan 04 '23 06:01

phipsgabler