I am trying to find an efficient way to create a new array by repeating each element of an old array a different, specified number of times. I have come up with something that works, using array comprehensions, but it is not very efficient, either in memory or in computation:
LENGTH = 1e6
A = collect(1:LENGTH) ## arbitrary values that will be repeated specified numbers of times
NumRepeats = [rand(20:100) for idx = 1:LENGTH] ## arbitrary numbers of times to repeat each value in A
B = vcat([ [A[idx] for n = 1:NumRepeats[idx]] for idx = 1:length(A) ]...)
Ideally, what I would like would be a structure akin to the sparse matrix apparatus that Julia has but that would instead store data efficiently based on the indices where repeated values occur. Barring that, I would at least like an efficient way to create a vector such as B in the example above. I looked into the repeat() function, but as far as I can tell from the documentation and my experimentation with the function, it is just for repeating slices of an array the same number of times for each slice. What is the best way to approach this?
Sounds like you're looking for run-length encoding. There's an RLEVectors.jl package here: https://github.com/phaverty/RLEVectors.jl. Not sure how usable it is. You could also make your own data type fairly easily.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With