Kinect (codenamed Project Natal during development) is a line of motion-sensing input devices by Microsoft for the Xbox 360 and Xbox One video game consoles and Windows PCs. The first-generation Kinect was introduced in November 2010. Microsoft released the Kinect software development kit for Windows 7 on June 16, 2011.
The Kinect is a depth camera. Normal cameras collect the light that bounces off the objects in front of them. They turn this light into an image that resembles what we see with our own eyes. The Kinect, on the other hand, records the distance of the objects in front of it. It uses infrared light to create an image (a depth image) that captures not what the objects look like, but where they are in space.
Why would we actually want a depth image? What can we do with a depth image that we can’t with a conventional colour image? First of all, a depth image is much easier for a computer to “understand” than a conventional colour image. Any program that tries to understand an image starts with its pixels and attempts to find and recognize the people and objects they represent. Working from colour pixels, it is very difficult to differentiate objects and people: much of a pixel’s colour is determined by the lighting in the room when the image was captured, the aperture and colour shift of the camera, and so on. How would you even know where one object begins and another ends, let alone which object was which and whether any people were present? In a depth image, on the other hand, the value of each pixel tells you how far that part of the scene is from the camera. Since these values correspond directly to where objects are in space, they are much more useful for determining where one object begins, where another ends, and whether any people are around. Also, because of how the Kinect creates its depth image, it is not sensitive to the lighting conditions in the room: the Kinect will capture the same depth image in a bright room as in a pitch-black one. This makes depth images more reliable and even easier for a computer program to understand.
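To make this concrete, here is a minimal sketch (plain Python, no Kinect hardware) of how simple segmentation becomes when pixel values are distances. The tiny synthetic image and the millimetre values below are illustrative, not real Kinect output:

```python
# Synthetic 4x4 depth image: values are depths in millimetres.
# A "person" stands roughly 0.9 m from the camera; the wall is at 2.2 m.
depth_image = [
    [2200, 2200, 2200, 2200],
    [2200,  900,  950, 2200],
    [2200,  920,  940, 2200],
    [2200, 2200, 2200, 2200],
]

def foreground_mask(depth, near_mm=500, far_mm=1500):
    """Flag pixels whose depth falls inside a band of interest.

    With colour pixels this separation would need sophisticated vision
    algorithms; with depth values it is one comparison per pixel.
    """
    return [[1 if near_mm <= d <= far_mm else 0 for d in row] for row in depth]

mask = foreground_mask(depth_image)
# The four inner pixels (the nearby "person") are flagged; the wall is not.
```

Because the comparison depends only on distance, the same mask comes back whether the room is bright or pitch black.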
A depth image also contains accurate three-dimensional information about whatever is in front of the camera. Unlike a conventional camera, which captures how things look, a depth camera captures where things are. As a result, we can use the data from a depth camera like the Kinect to reconstruct a 3D model of whatever the camera sees. We can then manipulate this model: viewing it interactively from other angles, combining it with pre-existing 3D models, and even using it as part of a digital fabrication process to produce new physical objects. None of this is possible with a conventional colour camera.
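The 3D reconstruction step can be sketched with the standard pinhole camera model: each pixel, together with its depth value, back-projects to a point in space. The focal length and image centre below are illustrative values, not calibrated Kinect intrinsics:

```python
# Assumed pinhole intrinsics for a 640 x 480 depth image (illustrative only).
FX = FY = 580.0        # focal length in pixels (assumed)
CX, CY = 320.0, 240.0  # principal point (image centre)

def pixel_to_point(u, v, depth_mm):
    """Back-project pixel (u, v) with depth in millimetres to a 3D point in metres."""
    z = depth_mm / 1000.0
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return (x, y, z)

# A pixel at the image centre, 2 m away, lies on the optical axis:
print(pixel_to_point(320, 240, 2000))  # (0.0, 0.0, 2.0)
```

Applying this to every pixel of a depth image yields a point cloud, the raw material for the 3D models described above.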
And finally, since depth images are so much easier to process than conventional colour images, we can run some truly cutting-edge processing on them. Specifically, we can use them to detect and track individual people, even locating their individual joints and body parts. In many ways, this is the Kinect’s most exciting capability. Tracking users’ individual body parts creates amazing possibilities for our own interactive applications. We have access to software that can perform this processing and simply give us the location of the users. We don’t have to analyze the depth image ourselves in order to obtain this information, but it’s only accessible because of the depth image’s suitability for processing.
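Once the tracking software hands us joint positions instead of raw pixels, interaction logic reduces to simple geometry on a handful of 3D points. The data layout below is hypothetical (real skeleton-tracking APIs differ), but it conveys the idea:

```python
# Hypothetical per-frame joint data: name -> (x, y, z) in metres, y pointing up.
# Real middleware (e.g. the Kinect SDK) supplies positions like these per user.
def hands_above_head(joints):
    """Return True if both hands are higher than the head."""
    head_y = joints["head"][1]
    return joints["left_hand"][1] > head_y and joints["right_hand"][1] > head_y

frame = {
    "head":       (0.00, 1.60, 2.0),
    "left_hand":  (-0.30, 1.85, 2.0),
    "right_hand": (0.30, 1.90, 2.0),
}
print(hands_above_head(frame))  # True
```

A gesture like "both hands raised", which would be daunting to detect from colour pixels, becomes two comparisons per frame.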
If we remove the black plastic casing, the Kinect appears to have three eyes: two in its center and one off to the side. That “third eye” is the secret to how the Kinect works. Like most robot “eyes,” the two protuberances at the center of the Kinect are cameras, but the third eye is actually an infrared projector. Infrared light has a wavelength longer than that of visible light, so we cannot see it with the naked eye. Infrared is perfectly harmless; we are exposed to it every day in the form of sunlight. The Kinect’s infrared projector shines a grid of infrared dots over everything in front of it. These dots are normally invisible to us, but we can capture a picture of them with an IR camera. One of the Kinect’s two “eyes” is exactly that: a sensor specifically designed to capture infrared light. The IR camera is the one on the right; its lens has a greenish iridescent sheen compared with the standard visible-light camera next to it.
The Kinect has four microphones for capturing spatial sound, attenuating noise and interference, and compensating for room acoustics. It also includes a colour camera with 640 × 480 resolution. The Kinect can track up to six people, including two active users for full motion analysis and tracking.
The Kinect is mainly used to implement augmented reality. Augmented reality (AR) is a live direct or indirect view of a physical, real-world environment whose elements are augmented (or supplemented) by computer-generated sensory input such as sound, video, graphics, or GPS data.
The Kinect has several advantages: it requires no handheld input device, supports voice recognition and facial recognition, is portable, and can perform full-body 3D motion capture. It also has disadvantages: privacy concerns, limited available research, an inability to detect crystalline or highly reflective objects, and sensitivity to external infrared sources such as sunlight. The Kinect is presently used in virtual dressing rooms, virtual pianos, healthcare, military applications, and many other fields.
This technique aims to provide an application that uses gestures to interact with virtual objects in an augmented reality application. It also provides a way to use gesture-based interactions to manage operations in a virtual walk-through environment.