KinectFlix: A Hand Gesture Recognition Application for Kinect to Watch NetFlix

KinectFlix is a hand gesture recognition application that uses an Xbox 360 Kinect to interact with NetFlix while watching online movies and TV shows. The project was designed and developed with the intention of providing an easily adaptable interface for using the Xbox 360 Kinect with Windows Presentation Foundation (WPF) applications, enabling a controller-less interactive user interface.


Introduction
For decades, researchers have been exploring new approaches to integrating physical and digital space [1]. Recent years have been particularly exciting due to the relatively inexpensive, gesture-based control interfaces now commonplace on gaming consoles. Although the Nintendo Wii pioneered motion sensing and gesture recognition technology, the Microsoft Kinect is arguably becoming the platform of choice for gamers and researchers alike [2]. Microsoft released the Xbox 360 Kinect in North America and the UK in November 2010. The Kinect is a controller-free entertainment peripheral that has been a monumental success for Microsoft. Using only hand gestures and voice commands, this hardware allows users to interact with software without assistance from any other input device. Input devices are a significant part of every computer system, but with this sort of implementation the user essentially becomes the input device [3]. The idea for KinectFlix came shortly after the Kinect was released.
KinectFlix is a Windows Presentation Foundation (WPF) based application; WPF is a next-generation presentation system for building Windows client applications with visually rich user experiences. The NetFlix OData service, which streams and updates information from the NetFlix database, was used in creating the application's user interface. KinectFlix works by processing information read by the two cameras on the Xbox 360 Kinect. That data is then mapped from the coordinates of the user's hand to the mouse coordinates of the user's computer. The depth of the user's hand is used to implement mouse clicks and hand gestures. This functionality is then used to navigate and search a user interface containing a repository of the movies and TV shows available on NetFlix.
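The depth-based click described above can be sketched as a small state machine: a click fires when the tracked hand moves toward the sensor by more than some threshold, and the detector re-arms once the hand returns near its resting depth. The sketch below is illustrative Python (the actual application is C#/WPF), and the threshold values are assumptions, not values from the project:

```python
class PushClickDetector:
    """Fires a single click when the hand is pushed toward the sensor.

    Illustrative sketch only; push_mm and release_mm are hypothetical
    thresholds, not values taken from KinectFlix.
    """

    def __init__(self, push_mm=120, release_mm=60):
        self.push_mm = push_mm        # depth decrease needed to click
        self.release_mm = release_mm  # how close to baseline re-arms
        self.rest_depth = None        # baseline hand depth (mm)
        self.armed = True

    def update(self, hand_depth_mm):
        """Feed one depth sample; return True exactly once per push."""
        if self.rest_depth is None:
            self.rest_depth = hand_depth_mm   # first sample sets baseline
            return False
        if self.armed:
            if self.rest_depth - hand_depth_mm > self.push_mm:
                self.armed = False            # hand pushed forward: click
                return True
        elif hand_depth_mm >= self.rest_depth - self.release_mm:
            self.armed = True                 # hand withdrew: re-arm
        return False
```

Keeping the detector disarmed until the hand withdraws prevents a single sustained push from registering as a burst of repeated clicks.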

The Problem
Although many applications have been created that use the Kinect to provide some functionality, none allows a developer to plug in a user interface and immediately have the Kinect's basic functionality available for use with NetFlix. As tangible interfaces increasingly become commercially available products in the current generation [4], such a starting point spares developers from repeatedly having to "reinvent" Kinect functionality.
On the other end, the provided user interface allows a user to search the NetFlix database for movies without using a remote control. This is convenient in that it saves time when the remote cannot be found, or when the user simply does not feel like getting up to retrieve it.
The intention of this application is to give software developers a convenient starting point for applications that use the Microsoft Kinect, and to give users a convenient means of browsing the NetFlix repository in a controller-less environment. While the application is not yet complete, the following sections of this report walk through the implementation and development process that makes these functionalities possible.

Combining ManagedNite and OpenNI with the Xbox Kinect
A combination of the ManagedNite library and the OpenNI device drivers is used with the Microsoft Xbox 360 Kinect to let a PC use the sensor, providing a .NET library for accessing and controlling the motor functions and the data being processed by the Kinect.
OpenNI is an open-source framework that provides the APIs necessary for writing applications that use natural interaction. This framework is used for tracking the user's hands with the Kinect. Once the user's hand is registered with the API, a point is created that follows the movements and gestures of the user's hand until that point is lost. The period during which the API is tracking a user's hand is called a session.
The ManagedNite library is a C# wrapper around the OpenNI APIs that lets the application integrate the Kinect through a set of object classes for reading and manipulating data. The library allows the developer to create session states and events that correspond to hand movements. This combination of session states and events is what makes hand gestures and controller-less interaction with an application possible.
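The session lifecycle described above, from idle to tracking and back when the hand point is lost, can be modeled roughly as follows. This is an illustrative Python sketch; the class and state names are hypothetical, not ManagedNite types:

```python
from enum import Enum


class SessionState(Enum):
    IDLE = 0       # waiting for the focus gesture (e.g. a wave)
    TRACKING = 1   # a hand point is registered and being followed


class HandSession:
    """Minimal model of the OpenNI-style session lifecycle."""

    def __init__(self):
        self.state = SessionState.IDLE
        self.listeners = []            # callbacks receiving hand points

    def on_point(self, callback):
        """Register a listener for tracked hand-point updates."""
        self.listeners.append(callback)

    def focus_gesture_detected(self):
        """The focus gesture starts a session."""
        self.state = SessionState.TRACKING

    def hand_point(self, x, y, z):
        """Deliver a hand position only while a session is active."""
        if self.state is SessionState.TRACKING:
            for callback in self.listeners:
                callback(x, y, z)

    def point_lost(self):
        """Losing the hand point ends the session."""
        self.state = SessionState.IDLE
```

The key property this models is that hand positions are only meaningful inside a session: before the focus gesture and after the point is lost, samples are simply ignored.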
The Kinect for Xbox 360 is equipped with two cameras, four microphones, and a motor used to pivot the cameras vertically. One of the cameras is paired with an infrared emitter to create a depth map of the scene it observes; this depth map reports the distance to the nearest surface at each pixel [5]. The other camera captures the visible spectrum at a 640 × 480 resolution. Together, the two cameras are used to build a three-dimensional model of everything the sensor sees. In essence, this sensor model observes the user's behavior and generates data [6], enabling the identification of hand gestures regardless of the user's orientation to the camera [7].
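To give a sense of how a per-pixel depth map yields three-dimensional positions, the sketch below back-projects one depth pixel through a simple pinhole camera model. The field-of-view value is a nominal assumption for illustration, not a calibrated parameter of the Kinect:

```python
import math


def depth_pixel_to_3d(u, v, depth_mm, width=640, height=480, fov_h_deg=57.0):
    """Back-project a depth-map pixel (u, v) to a 3-D point in millimetres.

    Simple pinhole model with square pixels; fov_h_deg is a nominal
    horizontal field of view, not a calibrated sensor value.
    """
    # Focal length in pixels from the horizontal field of view.
    fx = (width / 2.0) / math.tan(math.radians(fov_h_deg) / 2.0)
    fy = fx  # assume square pixels
    # Offset from the optical centre, scaled by depth.
    x = (u - width / 2.0) * depth_mm / fx
    y = (v - height / 2.0) * depth_mm / fy
    return (x, y, depth_mm)
```

The centre pixel maps straight down the optical axis, and pixels further from the centre map to points further off-axis in proportion to their depth, which is how a flat depth image becomes a 3-D point cloud.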

KinectFlix Back End Structure
KinectFlix has an elaborate back end that handles the interaction between the Kinect and the user as well as between the Kinect and the user interface. This section details the implementation of the libraries mentioned earlier and the streaming of content using the NetFlix OData service.
When the application starts, the Kinect is initialized and sits in an idle state until it recognizes a gesture telling it to start tracking and filtering data. Upon initialization, the cameras and the motor are started, and the monitor area is determined and scaled accordingly. Once the initializing gesture is detected, various event handlers and listeners are created, such as Swipe and Click, as shown in Figure 1. These listeners define what the Kinect is looking for. As shown in Figure 2, when the Kinect sees that a listener's conditions are met, the event handler for that gesture is fired. When a gesture is triggered, a predefined programmed instruction set is executed [8]. This method of interaction can easily be altered so that a gesture behaves differently based on the needs of the application. No single method for automatic hand gesture recognition is suitable for every application; each gesture recognition algorithm depends on the user's cultural background, the application domain, and the environment [9].
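The listener/event-handler pattern described above can be sketched as a small dispatcher that maps gesture names to callbacks. The gesture names "Swipe" and "Click" follow the text; everything else in this Python sketch is illustrative, not the project's actual code:

```python
class GestureDispatcher:
    """Maps named gestures to handler callbacks.

    Mirrors the listener/event-handler pattern described in the text:
    handlers are registered up front, then fired when the sensor
    reports that a gesture's conditions were met.
    """

    def __init__(self):
        self.handlers = {}   # gesture name -> list of callbacks

    def register(self, gesture, handler):
        """Attach a handler to a gesture name (e.g. "Swipe", "Click")."""
        self.handlers.setdefault(gesture, []).append(handler)

    def fire(self, gesture, *args):
        """Invoke every handler registered for this gesture."""
        for handler in self.handlers.get(gesture, []):
            handler(*args)
```

Because handlers are looked up by name, swapping what a gesture does for a different application is a matter of registering a different callback, which matches the report's point that the same gesture can be made to behave differently per application.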
When a specific hand gesture is not being executed, the application tracks the hand coordinates and uses them to control the mouse. This is done by continuously relaying the coordinates of the hand to the coordinates of the mouse, as shown in Figure 3. An algorithm is used to translate hand coordinates into mouse coordinates. The interactions described above work together to provide complete human-computer interaction without the use of a physical input device. The innovations introduced by hands-free gaming systems indicate that technology is progressively reaching more natural ways for humans to interact with machines [10]. The same holds for the future of software applications.
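One simple form the hand-to-mouse algorithm can take is a clamped linear mapping from a physical "active box" in front of the user onto screen pixels. The sketch below is an assumption about the approach, not the project's actual algorithm; the active-box dimensions and screen size are hypothetical:

```python
def hand_to_mouse(hand_x, hand_y,
                  active_box=(-300.0, -200.0, 300.0, 200.0),
                  screen=(1920, 1080)):
    """Map a hand position (mm, relative to the session start point)
    into screen pixel coordinates.

    active_box is (left, top, right, bottom) in millimetres; its size
    here is an illustrative guess, not a value from the project.
    """
    left, top, right, bottom = active_box
    # Normalise into [0, 1], clamping so the cursor stays on screen
    # even when the hand wanders outside the active box.
    nx = min(max((hand_x - left) / (right - left), 0.0), 1.0)
    ny = min(max((hand_y - top) / (bottom - top), 0.0), 1.0)
    return (round(nx * (screen[0] - 1)), round(ny * (screen[1] - 1)))
```

Clamping matters in practice: without it, a hand drifting past the edge of the active box would push the cursor off screen, and the user would lose the pointer.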

Graphical User Interface
The user interface was developed with the intention of allowing the user to browse the NetFlix repository by movie genre. The following section provides a brief walkthrough of the user interface and what it provides.
The application starts at the Welcome screen shown in Figure 4, where the user is prompted to begin waving. Once the Kinect has detected the user's hand, the application automatically navigates to another screen to notify the user that his/her hand is now being tracked by the Kinect. The user is then prompted to click the "Browse" button, which brings the user to the NetFlix repository.
Upon reaching the actual NetFlix repository within the application, the user is presented with a multitude of options. By default, a category is already selected and displays the movies for that genre, as shown in Figure 5.
To switch categories, the user moves his/her hand from right to left or from left to right. This registers within the application as a "Swipe"; when a swipe is detected, the storyboard activates and automatically changes the genre. Once users have found a genre they would like to explore, they may either scroll up and down through the list of movies or select a movie to view more information about it. To scroll up and down, click the up and down arrows located on the right side of the list of available titles. To perform a click, push the hand that is guiding the mouse forward; this performs a mouse click within the application and scrolls up or down depending on which button is pressed. Similarly, to view more details about a title, click on that title; a window then pops up displaying details about the chosen title, as shown in Figure 6.
While the provided interface is not intricate, it does a good job of demonstrating the Kinect's functionality. It also provides a simple, well-developed starter application using the NetFlix OData service. The basic hand gesture functionalities include:
• Wave, to begin hand tracking;
• Swipe, to change the selected genre;
• Push, to perform a mouse click.

Conclusion
This project successfully integrates the Microsoft Kinect into a modern business application, implements a Kinect-controlled mouse pointer, and uses hand gestures as "shortcuts." Future work will focus on multiple-hand recognition and additional mouse controls.