I'm writing some code to generate and process large amounts of video data. At first I intend only to work with randomized data.
My technique is to treat a pixel as a map of R, G, B, A integer values, to treat a frame of video as a vector of these pixel maps, and to treat video across time as a vector of these vectors of pixel maps. I've written three functions that do this reliably but am running into performance issues when they are scaled.
(defn generateFrameOfRandomVideoData
"Generates a frame of video data which is a vector of maps of pixel values."
[num-pixels-in-frame]
(loop [num-pixels-in-frame num-pixels-in-frame
pixels-added 0
frame '[]]
(if (> num-pixels-in-frame pixels-added)
(recur num-pixels-in-frame
(inc pixels-added)
(conj frame (assoc '{}
:r (rand-int 256)
:g (rand-int 256)
:b (rand-int 256)
:a (rand-int 256))))
frame)))
(defn generateRandomVideoData
"Generates a vector of frames of video data."
[number-of-frames frame-height frame-width]
(loop [number-of-frames number-of-frames
frame-height frame-height
frame-width frame-width
frames '[]]
(if (> number-of-frames (count frames))
(recur number-of-frames
frame-height
frame-width
(conj frames (generateFrameOfRandomVideoData (* frame-height frame-width))))
frames)))
(defn generateRandomizedVideo
"Generates video data based on the specified parameters."
[number-of-frames frame-height frame-width]
(assoc '{}
:number-of-frames number-of-frames
:frame-height frame-height
:frame-width frame-width
:frames (generateRandomVideoData number-of-frames frame-height frame-width)))
Call this to use the functions to generate 60 frames of 1920X1080p video:
(generateRandomizedVideo 60 1920 1080)
When I run this call to generate 10 frames worth of 1920X1080p video the algorithm completes quite quickly. When I call it to produce 60 frames of video it bogs down, does not complete, and generates huge amounts of memory. I watched it take up 16gb worth of memory.
This doesn't really make any sense to me. My algorithm is O(number of frames * (height of frame * width of frame)). Number of frames is O(n) and (height of frame * width of frame is constant at O(height * width). These arguments resolve to O(n).
Now that I've convinced myself and hopefully you that my algorithm isn't simply intractable, I think I have some coherent questions:
How much memory does an integer in Clojure take up in bits? I cant seem to find this information anywhere.
What kind of overhead does storing Integers bound to map keys cause? Is it costlier in terms of memory than just keeping them in a vector?
Why is the algorithm bogging down in terms of time and memory for large numbers of frames? What is Clojure doing to hog so much memory?
Thanks!
How much memory does an integer in Clojure take up in bits?
16 bytes, according to clj-memory-meter:
(mem/measure (rand-int 256))
=> "16 B"
Only 4 bytes are used to represent a 32-bit integer value, but a java.lang.Integer
in Clojure is the same as in Java, and there's additional storage "overhead" for every java.lang.Object
:
(type (rand-int 256))
=> java.lang.Integer
What kind of overhead does storing Integers bound to map keys cause? Is it costlier in terms of memory than just keeping them in a vector?
Yes, almost twice as much in this case:
(mem/measure [(rand-int 256) (rand-int 256) (rand-int 256) (rand-int 256)])
=> "320 B"
(mem/measure {:r (rand-int 256)
:g (rand-int 256)
:b (rand-int 256)
:a (rand-int 256)})
=> "544 B"
Each frame is going to be quite large:
(mem/measure
(into [] (repeatedly (* 1920 1080)
(fn [] {:r (rand-int 256)
:g (rand-int 256)
:b (rand-int 256)
:a (rand-int 256)}))))
=> "232.2 MB"
Why is the algorithm bogging down in terms of time and memory for large numbers of frames? What is Clojure doing to hog so much memory?
Storing a hash map per pixel is going to add up very quickly, if each 1920x1080 frame is ~232 MB that's ~1 GB every 4 frames. I don't think this is specific to Clojure — this is an expensive storage scheme for any language. I'd consider a few things:
Store the individual pixel values more efficiently e.g. represent each pixel as as four unsigned bytes packed into a single 32-bit integer. An open hash map is probably one of the least space-efficient structures when you have this many data points, all in the same structure.
Since your map shape is well-defined, you could use a record to save space and have map-like semantics:
(defrecord Pixel [r g b a])
(mem/measure (->Pixel (rand-int 256)
(rand-int 256)
(rand-int 256)
(rand-int 256)))
=> "112 B" ;; similar deftype is 96 B
A primitive integer array of four is only slightly larger than a single Integer
object:
(mem/measure (int-array (range 4)))
=> "32 B"
A similar vector is 10x larger:
(mem/measure [(int 0) (int 1) (int 2) (int 3)])
=> "320 B"
You could try an array of bytes, but JVM doesn't have unsigned byte primitives:
(mem/measure (byte-array 4))
=> "24 B"
There's a lot of immutable-data-structure-changing happening where each pixel and frame is getting conj
'd onto an existing vector, and that doesn't come "for free" with Clojure's persistent data structures. A more efficient way to do this is using transients, but...
Do you need to store all these frames in-memory? If not, you could stream these lazily without holding them all. If you have to build them into a large, realized collection, maybe use transients, JVM arrays, etc.
(defn gen-frame [num-pixels]
(repeatedly num-pixels
#(->Pixel (rand-int 256) (rand-int 256) (rand-int 256) (rand-int 256))))
(defn frame-op [frame] ;; not very interesting for random pixels
(let [num-pixels (count frame)
avg #(double (/ (apply + (map % frame)) num-pixels))]
(->Pixel (avg :r) (avg :g) (avg :b) (avg :a))))
(time
(->> (repeatedly #(gen-frame (* 1920 1080)))
(map frame-op)
(take 60)
(doall)))
"Elapsed time: 240527.803662 msecs"
=>
(#sandbox.core.Pixel{:r 127.4540152391975, :g 127.4542722800926, :b 127.3754962384259, :a 127.4886294367284}
#sandbox.core.Pixel{:r 127.4727488425926, :g 127.4447955246914, :b 127.4472164351852, :a 127.4626080246914}
...
This example is lazily analyzing each frame of an infinite sequence and taking the first 60 results; the analyzed frame/pixel data is getting garbage collected as this runs, so it won't run out of memory (but the GC will be busy).
These arguments resolve to O(n).
Large constants matter, sometimes!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With