Question

What is the output stream format of Kinect cameras? For instance, it is said to be like conventional video: 640x480, 30 fps, 4 bytes per pixel (3 for RGB plus depth), so 1 second of the raw stream would take 640x480x30x4 bytes. Is there any way to apply layered compression to the streams?

Solution

The video (color) stream delivers frames at 4 bytes per pixel in BGRA format (blue, green, red, alpha), with pixels stored row by row, left to right within each row. The fourth byte is the alpha channel, not depth; depth arrives as a separate stream. A full uncompressed 640x480 frame takes 640x480x4 = 1,228,800 bytes.
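The frame-size arithmetic and the scan-line layout can be checked with a few lines of code. This is a minimal sketch, not SDK code: the function names are made up for illustration, and the 30 fps default is taken from the question rather than from any SDK constant.

```python
BYTES_PER_COLOR_PIXEL = 4  # one byte each for blue, green, red, alpha


def color_frame_bytes(width=640, height=480):
    """Size in bytes of one uncompressed BGRA frame."""
    return width * height * BYTES_PER_COLOR_PIXEL


def raw_color_rate(width=640, height=480, fps=30):
    """Bytes per second of the uncompressed color stream (fps is assumed)."""
    return color_frame_bytes(width, height) * fps


def bgra_pixel(frame, x, y, width=640):
    """Return (blue, green, red, alpha) for the pixel at column x, row y.

    `frame` is a bytes-like object holding the rows one after another.
    """
    offset = (y * width + x) * BYTES_PER_COLOR_PIXEL
    blue, green, red, alpha = frame[offset:offset + BYTES_PER_COLOR_PIXEL]
    return blue, green, red, alpha
```

With the defaults, `color_frame_bytes()` evaluates to 1,228,800 and `raw_color_rate()` to 36,864,000 bytes per second for the color stream alone.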

The depth stream delivers frames at 2 bytes per pixel, each an unsigned short. The 3 least significant bits hold the index of the player at that pixel (the SDK headers define NUI_IMAGE_PLAYER_INDEX_SHIFT as 3); the remaining 13 bits, obtained by shifting right by 3, give the distance from the camera plane in millimeters. A full uncompressed 320x240 frame takes 320x240x2 = 153,600 bytes.
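Unpacking a depth pixel comes down to a shift and a mask. The sketch below assumes little-endian 16-bit pixels; `split_depth_pixel`, `decode_depth_frame` and `player_bits` are hypothetical names, and the width of the player-index field is passed in rather than hard-coded so the code can follow whichever bit layout your SDK version uses.

```python
import struct


def split_depth_pixel(raw, player_bits):
    """Split one 16-bit depth value into (distance_mm, player_index).

    The low `player_bits` bits are the player index; the rest is depth.
    """
    player_mask = (1 << player_bits) - 1
    return raw >> player_bits, raw & player_mask


def decode_depth_frame(frame, player_bits):
    """Decode a raw depth frame (2 bytes per pixel, little-endian assumed)
    into a list of (distance_mm, player_index) tuples in scan order."""
    count = len(frame) // 2
    values = struct.unpack("<%dH" % count, frame[:count * 2])
    return [split_depth_pixel(value, player_bits) for value in values]
```

For example, with a 3-bit player field, a pixel packed as `(1200 << 3) | 2` splits back into a distance of 1200 mm and player index 2.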

You can compress the frames with standard image compression algorithms, for example in Java through a Java wrapper library for the Kinect SDK.
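As an illustration of the general idea only, here is a lossless round trip over a raw depth frame. It is a sketch under stated assumptions: Python's standard `zlib` module stands in for whatever codec you choose (it is not the Java library mentioned above), the frame is synthetic, and the helper names are hypothetical. A lossless codec is used because depth pixels pack two fields into one value, so any change to the bits would alter both the distance and the player index.

```python
import struct
import zlib


def compress_frame(frame, level=6):
    """Losslessly compress one raw frame (bytes) with DEFLATE."""
    return zlib.compress(frame, level)


def decompress_frame(blob):
    """Recover the exact original frame bytes."""
    return zlib.decompress(blob)


def synthetic_depth_frame(width=320, height=240, value=9600):
    """Build a fake depth frame where every pixel holds the same 16-bit
    value (little-endian assumed), for testing only."""
    count = width * height
    return struct.pack("<%dH" % count, *([value] * count))
```

Real frames are far less uniform than this synthetic one, so the compression ratio you see in practice will depend on the scene.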

OTHER TIPS

The depth image is uncompressed 16-bit data, of which 13 bits are depth. The format is unique to the Kinect because it also carries player-tracking data in the 3 least significant bits.

However, there are several image types, and which one you get depends on your configuration: whether you're using near mode, what your video resolution is, and so on:

http://msdn.microsoft.com/en-us/library/nuiimagecamera.nui_image_type.aspx

Licensed under: CC-BY-SA with attribution