Working with depth data - Kinect

Tags:

kinect

I just started learning about Kinect through some quick start videos and was trying out the code to work with depth data.

However, I can't understand how the distance is calculated using bit-shifting, or the various other formulas used to compute other values when working with this depth data.

http://channel9.msdn.com/Series/KinectSDKQuickstarts/Working-with-Depth-Data

Are these Kinect-specific details explained anywhere in the documentation? Any help would be appreciated.

Thanks

asked Nov 10 '11 17:11

Cipher

1 Answer

Pixel depth

When you don't have the Kinect set up to detect players, the data is simply an array of bytes, with every two bytes representing a single depth measurement.

So, just like in a 16-bit color image, each group of sixteen bits represents a depth rather than a color.

If the array were for a hypothetical 2x2 pixel depth image, you might see: [0x12 0x34 0x56 0x78 0x91 0x23 0x45 0x67] which would represent the following four pixels:

AB
CD

A = (0x34 << 8) + 0x12
B = (0x78 << 8) + 0x56
C = (0x23 << 8) + 0x91
D = (0x67 << 8) + 0x45

The << 8 simply moves that byte into the upper 8 bits of a 16-bit number, which is the same as multiplying it by 256. The resulting 16-bit numbers are 0x3412, 0x7856, 0x2391 and 0x6745. You could instead write A = 0x34 * 256 + 0x12.

In simpler terms, suppose I have 329 items plus 456 thousands of items. To get the total, I multiply the 456 by 1,000 and add the 329, giving 456,329. Shifting 456 to the left by three decimal digits gives 456000, which is the same as multiplying by 1,000. The Kinect has broken the whole number up into two pieces in the same way, and you simply have to put them back together.

So in decimal, shifting and multiplying by a power of 10 are the same thing. In binary the same holds for powers of 2: 2^8 is 256, so multiplying by 256 is the same as shifting left by 8 bits.

And that would be your four pixel depth image - each resulting 16 bit number represents the depth at that pixel.
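As a rough illustration of the byte-pair arithmetic above, here is a minimal Python sketch (the Kinect SDK samples themselves are not in Python). The function name decode_depths and the variable frame are made up for this example, and the low-byte-first ordering is taken from the 2x2 example above rather than from the SDK documentation.

```python
def decode_depths(raw: bytes) -> list[int]:
    """Combine each (low byte, high byte) pair into one 16-bit value."""
    if len(raw) % 2 != 0:
        raise ValueError("expected an even number of bytes")
    depths = []
    for i in range(0, len(raw), 2):
        low, high = raw[i], raw[i + 1]
        # Parenthesise the shift: in Python, + binds tighter than <<.
        depths.append((high << 8) + low)
    return depths


# The hypothetical 2x2 frame from the example above.
frame = bytes([0x12, 0x34, 0x56, 0x78, 0x91, 0x23, 0x45, 0x67])
depths = decode_depths(frame)
print([hex(d) for d in depths])  # ['0x3412', '0x7856', '0x2391', '0x6745']
```

Replacing (high << 8) with high * 256 gives the same results, which is the shift/multiply equivalence described above.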

Player depth

When you choose to include player data, it becomes a little more interesting. The bottom three bits of the whole 16-bit number tell you which player that pixel belongs to.

To simplify things, ignore the complicated method they use to get the remaining 13 bits of depth data, and just do the above, and steal the lower three bits:

A = (0x34 << 8) + 0x12
B = (0x78 << 8) + 0x56
C = (0x23 << 8) + 0x91
D = (0x67 << 8) + 0x45

Ap = A % 8
Bp = B % 8
Cp = C % 8
Dp = D % 8

A = A / 8
B = B / 8
C = C / 8
D = D / 8

Now the pixel A has player Ap and depth A. The % operator gets the remainder of a division, so A % 8 is the player number. The integer result of the division is the depth, so after A = A / 8, A contains only the depth, with the player bits removed.
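The remainder/division split can be sketched in Python as follows. The helper name split_player_depth is hypothetical, and the input values are the made-up ones from the example, so the resulting player and depth numbers are only illustrative. Note that Python needs // for integer division, where the C-style pseudocode above uses /.

```python
def split_player_depth(value: int) -> tuple[int, int]:
    """Return (player, depth) for one 16-bit value that carries player data."""
    player = value % 8   # remainder: the bottom three bits
    depth = value // 8   # integer quotient: the remaining 13 bits
    return player, depth


for name, value in zip("ABCD", [0x3412, 0x7856, 0x2391, 0x6745]):
    player, depth = split_player_depth(value)
    print(f"{name}: player={player} depth={depth}")
# A: player=2 depth=1666
# B: player=6 depth=3850
# C: player=1 depth=1138
# D: player=5 depth=3304
```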

If you don't need player support, at least at the beginning of your development, skip this and just use the first method. If you do need player support, though, this is one of many ways to get it. There are faster methods, but the compiler usually turns the above division and remainder (modulus) operations into more efficient bitwise logic operations so you don't need to worry about it, generally.
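For reference, one of the bitwise alternatives looks roughly like this in Python: mask off the bottom three bits for the player and shift right by three for the depth. The names PLAYER_BITS, PLAYER_MASK and split_bitwise are invented for this sketch, and it assumes non-negative 16-bit values.

```python
PLAYER_BITS = 3
PLAYER_MASK = (1 << PLAYER_BITS) - 1  # 0b111, selects the bottom three bits


def split_bitwise(value: int) -> tuple[int, int]:
    """Return (player, depth) using a mask and a shift."""
    player = value & PLAYER_MASK   # same result as value % 8
    depth = value >> PLAYER_BITS   # same result as value // 8
    return player, depth


# Compare with the remainder/division form for every 16-bit value.
mismatches = [v for v in range(1 << 16) if split_bitwise(v) != (v % 8, v // 8)]
print(len(mismatches))  # 0
```

Both forms give identical results for unsigned values, so the choice is mostly about readability.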

answered Nov 08 '22 04:11

Adam Davis
