
Working with depth data - Kinect

Tags: c#, kinect

I just started learning about Kinect through some quick start videos and was trying out the code to work with depth data.

However, I don't understand how the distance is calculated using bit-shifting, or the other formulas that are used when working with this depth data.

http://channel9.msdn.com/Series/KinectSDKQuickstarts/Working-with-Depth-Data

Are these details Kinect-specific, and are they explained anywhere in the documentation? Any help would be appreciated.

Thanks

Cipher asked Nov 10 '11 17:11




1 Answer

Pixel depth

When you don't have the Kinect set up to detect players, the depth frame is simply an array of bytes, with two bytes representing a single depth measurement.

So it is just like a 16-bit color image, except that each 16 bits represent a depth rather than a color.

If the array were for a hypothetical 2x2 pixel depth image, you might see: [0x12 0x34 0x56 0x78 0x91 0x23 0x45 0x67] which would represent the following four pixels:

AB
CD

A = (0x34 << 8) + 0x12
B = (0x78 << 8) + 0x56
C = (0x23 << 8) + 0x91
D = (0x67 << 8) + 0x45

The << 8 simply moves that byte into the upper 8 bits of a 16-bit number, which is the same as multiplying it by 256. The 16-bit numbers become 0x3412, 0x7856, 0x2391 and 0x6745. You could instead write A = 0x34 * 256 + 0x12. Note that in C#, + binds more tightly than <<, so the shift has to be parenthesized: (0x34 << 8) + 0x12.

In simpler terms, it's like saying I have 329 items and 456 thousands of items. To get the total, I multiply the 456 by 1,000 and add it to the 329. The Kinect has broken the whole number up into two pieces, and you simply have to put them back together. I could also "shift" the 456 to the left by three decimal digits, which gives 456000 and is the same as multiplying by 1,000. So for powers of 10, shifting and multiplying are the same thing. Computers work the same way with powers of 2: 2^8 is 256, so multiplying by 256 is the same as shifting left by 8 bits.

And that would be your four pixel depth image - each resulting 16 bit number represents the depth at that pixel.
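As a rough C# sketch of the byte-pair combination described above: the class and method names here (DepthDecoder, ToDepthValues) are made up for illustration and are not part of the Kinect SDK, and the code assumes the low byte comes first in each pair, as in the example array.

```csharp
using System;

// Hypothetical helper for illustration; not a Kinect SDK type.
public static class DepthDecoder
{
    // Combines each pair of bytes (low byte first) into one 16-bit value per pixel.
    public static ushort[] ToDepthValues(byte[] raw)
    {
        if (raw.Length % 2 != 0)
            throw new ArgumentException("Expected two bytes per pixel.", nameof(raw));

        var values = new ushort[raw.Length / 2];
        for (int i = 0; i < values.Length; i++)
        {
            byte low = raw[2 * i];
            byte high = raw[2 * i + 1];

            // Parentheses are required: in C#, + binds more tightly than <<.
            values[i] = (ushort)((high << 8) + low);
        }
        return values;
    }
}
```

Calling DepthDecoder.ToDepthValues(new byte[] { 0x12, 0x34, 0x56, 0x78, 0x91, 0x23, 0x45, 0x67 }) should give the four values 0x3412, 0x7856, 0x2391 and 0x6745, matching pixels A to D.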

Player depth

When you select to show player data it becomes a little more interesting. The bottom three bits of the whole 16 bit number tell you the player that number is part of.

To simplify things, ignore the complicated method they use to get the remaining 13 bits of depth data, and just do the above, and steal the lower three bits:

A = (0x34 << 8) + 0x12
B = (0x78 << 8) + 0x56
C = (0x23 << 8) + 0x91
D = (0x67 << 8) + 0x45

Ap = A % 8
Bp = B % 8
Cp = C % 8
Dp = D % 8

A = A / 8
B = B / 8
C = C / 8
D = D / 8

Now the pixel A has player Ap and depth A. The % gets the remainder of the division - so take A, divide it by 8, and the remainder is the player number. The result of the division is the depth, the remainder is the player, so A now contains the depth since we got rid of the player by A=A/8.

If you don't need player support, at least at the beginning of your development, skip this and just use the first method. If you do need player support, though, this is one of many ways to get it. There are faster methods, but the compiler usually turns the above division and remainder (modulus) operations into more efficient bitwise logic operations so you don't need to worry about it, generally.
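For reference, one of the faster methods is a mask and a shift, which give the same results as % 8 and / 8 for non-negative values. This is only a sketch assuming the layout described in this answer (player index in the low three bits, depth in the upper 13); PlayerDepth and Split are hypothetical names, not SDK calls.

```csharp
using System;

// Hypothetical helper for illustration; not a Kinect SDK type.
public static class PlayerDepth
{
    // Splits a combined 16-bit value into the player index (low 3 bits)
    // and the depth (remaining upper 13 bits).
    public static (int Player, int Depth) Split(ushort packed)
    {
        int player = packed & 0x07; // same result as packed % 8
        int depth = packed >> 3;    // same result as packed / 8
        return (player, depth);
    }
}
```

For pixel A above, PlayerDepth.Split(0x3412) gives player 2 and depth 1666, the same as 0x3412 % 8 and 0x3412 / 8.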

Adam Davis answered Nov 08 '22 04:11