I need to send video from a Kinect camera over a network. I'm capturing two streams from the Kinect: the RGB color stream and the depth stream.
Together these amount to a bandwidth of roughly 53 MB/s, which is why I need to encode (compress) both streams at the origin and decode them at the target. The RGB-D data will then be processed by an object tracking algorithm at the target.
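For illustration, here is a back-of-the-envelope calculation with one plausible parameter set (640x480 at 30 fps, 32-bit RGBA color frames plus 16-bit depth frames; these exact parameters are an assumption, not necessarily the streams above):

```cpp
#include <cstdio>

int main() {
    // Illustrative parameters only: 640x480 at 30 fps,
    // 32-bit RGBA color frames and 16-bit depth frames.
    const double fps        = 30.0;
    const double rgbBytes   = 640.0 * 480.0 * 4;  // one RGBA frame
    const double depthBytes = 640.0 * 480.0 * 2;  // one 16-bit depth frame

    const double mbPerSec = (rgbBytes + depthBytes) * fps / (1024.0 * 1024.0);
    std::printf("Raw RGB-D bandwidth: %.1f MB/s\n", mbPerSec);  // ~53 MB/s with these values
    return 0;
}
```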
So far I've found many papers discussing algorithms for this task, for instance: RGB and depth intra-frame Cross-Compression for low bandwidth 3D video
The problem is that the algorithms described in such papers have no publicly available implementation. I know I could implement them myself, but they rely on many other complex image processing techniques that I do not know well enough (edge detection, contour characterization, ...).
I also found some C++ libraries based on a discrete median filter, delta encoding (to avoid sending redundant data), and LZ4 compression: http://thebytekitchen.com/2014/03/24/data-compression-for-the-kinect/
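The library's own source is on the linked page; just to show the general shape of the delta + LZ4 idea, here is a minimal sketch using the standard LZ4 C API (lz4.h). The function name and frame layout below are my own assumptions, not the library's code:

```cpp
#include <cstdint>
#include <vector>
#include <lz4.h>   // link with -llz4

// Delta-encode the current depth frame against the previous one, then LZ4-compress.
// Unchanged pixels become zeros, which LZ4 packs very efficiently.
// Returns the compressed payload; `prev` is updated to hold the current frame.
std::vector<char> compressDepthFrame(const std::vector<uint16_t>& curr,
                                     std::vector<uint16_t>& prev)
{
    std::vector<uint16_t> delta(curr.size());
    for (size_t i = 0; i < curr.size(); ++i)
        delta[i] = curr[i] - (i < prev.size() ? prev[i] : 0);  // wraps harmlessly on uint16_t
    prev = curr;

    const int srcSize = static_cast<int>(delta.size() * sizeof(uint16_t));
    std::vector<char> out(LZ4_compressBound(srcSize));
    const int written = LZ4_compress_default(
        reinterpret_cast<const char*>(delta.data()), out.data(),
        srcSize, static_cast<int>(out.size()));
    out.resize(written > 0 ? written : 0);   // empty vector signals a compression error
    return out;
}
```

On the receiving side the steps are reversed: LZ4_decompress_safe, then add the deltas back onto the previously reconstructed frame.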
My question is: is there a simpler and/or more efficient way of compressing RGB-D data from a Kinect source?
PS: I'm coding in C++.
In a more recent search on the problem I found a paper that describes compressing depth images with the H.264 video codec. The authors also provide basic software:
One problem is that H.264 can introduce compression artifacts. To limit the errors introduced by the codec, the depth image is split into multiple channels that represent different distance ranges.
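I haven't verified the exact mapping the authors use, but the general idea can be sketched like this (the channel count, depth range, and encoding below are assumptions for illustration only):

```cpp
#include <cstdint>
#include <array>

// Illustration only: map a 16-bit depth value (millimetres) onto N 8-bit planes,
// each covering a fixed slice of the distance range, so that lossy H.264
// quantization errors stay bounded within one slice.
constexpr int   kNumChannels = 4;
constexpr float kMaxDepthMm  = 8000.0f;                     // assumed sensor range
constexpr float kSliceMm     = kMaxDepthMm / kNumChannels;  // distance covered per channel

std::array<uint8_t, kNumChannels> encodeDepth(uint16_t depthMm)
{
    std::array<uint8_t, kNumChannels> planes{};                  // all zero by default
    if (depthMm == 0 || depthMm >= kMaxDepthMm) return planes;   // invalid / out of range

    const int   slice  = static_cast<int>(depthMm / kSliceMm);
    const float within = (depthMm - slice * kSliceMm) / kSliceMm;        // 0..1 inside the slice
    planes[slice] = static_cast<uint8_t>(1.0f + within * 254.0f + 0.5f); // 1..255, 0 means "not this slice"
    return planes;
}
```

Each plane is then fed to the H.264 encoder as an ordinary 8-bit image; the decoder finds the nonzero plane and reconstructs the depth from the slice index plus the in-slice value.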