Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

merge two sorted arrays in julia

Is there a neat function in julia which will merge two sorted arrays and return the sorted array for me? I have written:

c=1
p=1
i=1
n=length(tc)+length(tp)
t=Array{Float64}(n)
while(c<=length(tc) && p<=length(tp))
    if(tp[p]<tc[c])
        t[i]=tp[p]
        p=p+1;
        i=i+1;
    else
        t[i]=tc[c]
        c=c+1;
        i=i+1;
    end
end
while(p<=length(tp))
    t[i]=tp[p]
    i=i+1
    p=p+1
end
while(c<=length(tc))
    t[i]=tc[c]
    i=i+1
    c=c+1
end

but is there no native function in base julia to do this?

like image 800
Lindon Avatar asked Mar 07 '16 00:03

Lindon


People also ask

How do you combine arrays in Julia?

The hvcat() is an inbuilt function in julia which is used to concatenate the given arrays horizontally and vertically in one call. The first parameter specifies the number of arguments to concatenate in each block row.

When to use merge sort?

Mergesort is used when we want a guaranteed running time of O ( n log ⁡ n ) O(n \log n) O(nlogn), regardless of the state of the input. Mergesort is a stable sort with a space complexity of O ( n ) O(n) O(n).


1 Answers

Contrary to the other answers, there is in fact a method to do this in base Julia. BUT, it only works for arrays of integers, AND it will only work if the arrays are unique (in the sense that no integer is repeated in either array). Simply use the IntSet type as follows:

a = [2, 3, 4, 8]
b = [1, 5]
union(IntSet(a), IntSet(b))

If you run the above code, you'll note that the union function removes duplicates from the output, which is why I stated initially that your arrays must be unique (or else you must be happy to have duplicates removed in the output). You'll also notice that the union operation on the IntSet works much faster than union on a sorted Vector{Int}, since the former exploits the fact that an IntSet is pre-sorted.

Of course, the above is not really in the spirit of the question, which more concerns a solution for any type for which the lt operator is defined, as well as allowing for duplicates.

Here is a function that efficiently finds the union of two pre-sorted unique vectors. I've never had a need for the non-unique case myself so have not written a function that covers that case I'm afraid:

"union <- Return the union of the inputs as a new sorted vector"
function union_vec(x::Vector{T}, y::Vector{T})::Vector{T} where {T}
    (nx, ny) = (1, 1)
    z = T[]
    while nx <= length(x) && ny <= length(y)
        if x[nx] < y[ny]
            push!(z, x[nx])
            nx += 1
        elseif y[ny] < x[nx]
            push!(z, y[ny])
            ny += 1
        else
            push!(z, x[nx])
            nx += 1
            ny += 1
        end
    end
    if nx <= length(x)
        [ push!(z, x[n]) for n = nx:length(x) ]
    elseif ny <= length(y)
        [ push!(z, y[n]) for n = ny:length(y) ]
    end
    return z
end

Another option is to look at sorted dictionaries, available in the DataStructures.jl package. I haven't done it myself, but a method that just inserts all observations into a sorted dictionary (checking for key duplication as you go) and then iterates over (keys, values) should also be a fairly efficient way to attack this problem.

like image 68
Colin T Bowers Avatar answered Nov 14 '22 16:11

Colin T Bowers