Abstract—Reliable depth estimation is a cornerstone of many autonomous robotic control systems. The Microsoft Kinect is a new, low cost, commodity game controller peripheral that calculates a depth map of the environment with good accuracy and at a high rate. In this paper we calibrate the Kinect depth and image sensors and then use the depth map to control the altitude of a quadrotor helicopter. This paper presents the first results of using this sensor in a real-time robotics control application.

Index Terms—quadrotor, visual flight control, Microsoft Kinect, depth map

I. INTRODUCTION

The Microsoft Kinect (Figure 1) is a low cost peripheral, released November 2010, for use as a game controller with the Xbox 360 game system. The device can be modified to obtain, simultaneously at 30 Hz, a 640 × 480 pixel monochrome intensity coded depth map and a 640 × 480 RGB video stream.

Fig. 1. The Microsoft Kinect sensor. From left to right, the sensors shown are: the IR projector, the RGB camera, and the monochrome camera used for the depth computation.

Computation of depth maps is common in visual robotic control systems. Depth maps are used in autonomous navigation [1], map building [2], [3] and obstacle avoidance [4].

Due to the importance of depth maps in robotics, this paper attempts to quantify the accuracy, performance and operation of the Kinect sensor, as no public information is available. Furthermore, we test the Kinect sensor and its suitability for use in dynamic robotic environments by using the computed depth map to control the altitude of a flying quadrotor helicopter.

This paper will proceed as follows. Section I introduces the Kinect sensor hardware and the use of depth maps in research. Section II describes the calibration procedure and calibration results. Section III introduces quadrotor helicopters and the experimental platform and control system against which the Kinect was tested. The paper concludes with Section IV, a discussion of experimental flight results using the sensor.

A. Computation of Depth Maps

Attempts to compute depth maps can be grouped into passive or active methods [5]. Passive depth sensing tries to infer depth from 2D images from multiple cameras, for example through stereo correspondence algorithms [6] or optical flow [7]. Active methods usually employ additional physical sensors such as lasers, lighting or infra-red illumination cast on the scene. Structured light based sensors [9] use triangulation to detect the ranges of points within their field of view [10]. This solves the correspondence problem of stereo vision via the constraints induced by the structure of the light source. Once it is determined that a camera pixel contains the primary laser return (i.e. not laser light returned from secondary reflections), the range of the reflecting surface viewed in the direction of the pixel is immediately determined to within the resolution capabilities of the system. Thus the correspondence problem, which consumes a great deal of the CPU time in stereo vision algorithms, is replaced with the computationally much simpler problem of determining which pixels of the sensor detect the primary laser return. Time-of-flight (TOF) cameras also avoid the correspondence problem, instead utilising the time-of-flight principle. They illuminate the scene for a short period of time, for example by using a brief pulse of light, and measure the duration before the illumination pulse is reflected back and detected on the image sensor. TOF cameras typically have high power consumption due to the high illumination switch currents required, while only achieving moderate resolution [8].

The PrimeSense¹ chipset in the Kinect uses a form of structured light: a proprietary Light Coding™ technique to compute depth. The Kinect sensor consists of an infrared laser projector combined with a monochrome CMOS camera, and a second RGB video camera. Both cameras provide 640 × 480 pixel images at 30 Hz.

¹http://www.primesense.com/?p=535
Little information is available on the Light Coding™ technology, or on the accuracy of the depth map from the Kinect. This paper quantifies the absolute accuracy of the Kinect depth map and verifies its performance in the dynamic environment of quadrotor helicopter flight.

B. Kinect Hardware Details

The Kinect sensor connects to a PC/Xbox using a modified USB cable². The physical USB interface remains unchanged; however, subsequent to the Kinect release the protocol³ was decoded and software to access the Kinect was enabled.

The Kinect features two cameras: a Micron MT9M112 640 × 480 pixel RGB camera, and a 1.3 megapixel monochrome Micron MT9M001 camera fitted with an IR pass filter. Accompanying the monochrome IR camera is a laser diode for illuminating the scene. Through reverse engineering it was determined that the depth map has 11-bit resolution, and the video 8-bit. Despite the monochrome IR camera having higher resolution, both cameras only deliver 640 × 480 pixel images. Both image sensors have an angular field of view of 57° horizontally and 43° vertically. The Kinect also features a microphone array and a motorized pivot, although neither of these features was required for visual flight control nor subsequently tested as part of this evaluation.

²Necessary to provide additional current.
³Gratefully started by the OpenKinect project; https://github.com/OpenKinect.
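As an illustration of how such software can access the sensor, the sketch below grabs one depth frame and one RGB frame through the Python bindings of the open-source libfreenect driver from the OpenKinect project. This is a minimal sketch under the assumption that the freenect Python module and numpy are installed; it is not the code used in this work.

    # Sketch: grab one raw depth frame and one RGB frame via libfreenect.
    # Assumes the 'freenect' Python bindings and numpy are installed.
    import freenect
    import numpy as np

    # sync_get_depth() returns the 11-bit depth map (in a 16-bit array) and a
    # timestamp; sync_get_video() returns the 8-bit RGB image and a timestamp.
    depth, _ = freenect.sync_get_depth()   # shape (480, 640), raw values 0..2047
    rgb, _ = freenect.sync_get_video()     # shape (480, 640, 3), uint8

    # Pixels where the Kinect could not estimate depth are filled with 2047.
    valid = depth != 2047
    print("Depth frame %dx%d, %.1f%% valid pixels"
          % (depth.shape[1], depth.shape[0], 100.0 * valid.mean()))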
II. CALIBRATION OF THE KINECT SENSORS

A. Depth Camera Calibration

The depth camera returns an 11-bit number (raw values in the range 0...2047) which needs further processing in order to extract the true depth from the sensor. A calibration procedure was performed whereby a number of reference images were captured at known distances (Figure 2). This process was repeated multiple times over varied ambient light conditions in order to check the insensitivity of the depth measurement to environmental conditions. The results of this calibration procedure are shown in Figure 3.

Fig. 2. The calibration environment for testing. The board in the centre of the frame is placed 650 mm from the image plane. (a) Image captured from RGB camera. (b) False coloured depth map from depth camera.

Let f(x) be the true depth from the image sensor, and x the raw range value; then

    f(x) = a_1 e^{-((x - b_1)/c_1)^2} + a_2 e^{-((x - b_2)/c_2)^2},    (1)

where

    a_1 = 3.169 × 10^4
    b_1 = 1338.0
    c_1 = 140.4
    a_2 = 6.334 × 10^18
    b_2 = 2.035 × 10^4
    c_2 = 3154.0.

It can be seen from the calibration results (Figure 3) that the depth map is accurate and repeatable over the 0.4...7.0 m range. Additionally, if the Kinect is unable to estimate the depth to certain regions in the image, those pixels are filled with the value 2047, making it easy to ignore these pixels in further image analysis.
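A small sketch of how equation (1) might be applied in practice follows: it maps a raw 11-bit depth frame to calibrated depths and masks out the invalid value 2047. The units of f(x) are not stated above; from the quoted coefficients and the 0.4...7.0 m working range they appear to be centimetres, which this sketch assumes.

    # Sketch: apply the two-term Gaussian calibration of equation (1).
    # Output units are assumed (not stated in the text) to be centimetres.
    import numpy as np

    # Calibration coefficients from equation (1).
    A1, B1, C1 = 3.169e4, 1338.0, 140.4
    A2, B2, C2 = 6.334e18, 2.035e4, 3154.0

    INVALID = 2047  # raw value the Kinect uses for "no depth estimate"

    def raw_to_depth(raw):
        """Map raw 11-bit range values to calibrated depth via equation (1).

        Invalid pixels are returned as NaN so they drop out of later analysis.
        """
        x = raw.astype(np.float64)
        f = (A1 * np.exp(-((x - B1) / C1) ** 2)
             + A2 * np.exp(-((x - B2) / C2) ** 2))
        f[raw == INVALID] = np.nan
        return f

    # Example: a robust altitude estimate could be the median calibrated
    # depth over the valid pixels of a frame (synthetic frame shown here).
    raw = np.full((480, 640), 900, dtype=np.uint16)
    altitude = np.nanmedian(raw_to_depth(raw))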
B. Camera Calibration and Alignment

Future research may involve combining the images from both cameras. In order to do so accurately, the intrinsic and extrinsic parameters of both cameras must be known so their images may be represented in a single co-ordinate system. Standard stereo computer vision techniques, illustrated in Figure 4, were used to perform this calibration [11].

Fig. 4. The standard chessboard approach for calculating the camera intrinsic parameters for the two cameras. (a) and (b) Manual matching of 4 points in both images (the corner of the chessboard) in order to calculate the R and T matrices.

The alignment between the cameras was computed by manually matching the outline of the chessboard between the frames. The rotation (R) and translation (T) matrices were thus computed; the translation obtained was
    T = [  2.09 × 10^{-2}
          -7.12 × 10^{-4}
          -1.34 × 10^{-2} ].
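The sketch below illustrates the kind of standard chessboard procedure [11] such a calibration follows, using OpenCV to recover the intrinsics of each camera and the inter-camera R and T. It is a generic illustration, not the authors' pipeline; the file names, image count and board geometry are assumptions.

    # Sketch: intrinsic and extrinsic calibration of the RGB and IR cameras
    # from simultaneously captured chessboard images (hypothetical files).
    import cv2
    import numpy as np

    PATTERN = (8, 6)     # inner chessboard corners (assumed geometry)
    SQUARE_MM = 25.0     # square size (assumed)

    # 3D corner positions of the board in its own co-ordinate frame.
    objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

    obj_pts, rgb_pts, ir_pts = [], [], []
    for i in range(10):  # hypothetical image pairs
        rgb = cv2.imread("rgb_%02d.png" % i, cv2.IMREAD_GRAYSCALE)
        ir = cv2.imread("ir_%02d.png" % i, cv2.IMREAD_GRAYSCALE)
        ok1, c1 = cv2.findChessboardCorners(rgb, PATTERN)
        ok2, c2 = cv2.findChessboardCorners(ir, PATTERN)
        if ok1 and ok2:
            obj_pts.append(objp)
            rgb_pts.append(c1)
            ir_pts.append(c2)

    size = (640, 480)
    # Intrinsics of each camera from the detected corners.
    _, K_rgb, d_rgb, _, _ = cv2.calibrateCamera(obj_pts, rgb_pts, size, None, None)
    _, K_ir, d_ir, _, _ = cv2.calibrateCamera(obj_pts, ir_pts, size, None, None)

    # Extrinsics: rotation R and translation T from the IR to the RGB camera.
    _, _, _, _, _, R, T, _, _ = cv2.stereoCalibrate(
        obj_pts, ir_pts, rgb_pts, K_ir, d_ir, K_rgb, d_rgb, size,
        flags=cv2.CALIB_FIX_INTRINSIC)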
III. QUADROTOR HELICOPTER EXPERIMENTAL PLATFORM

The first quadrotors for UAV research were developed by Pounds et al. [12] and Bouabdallah et al. [13], and quadrotors are now a popular rotorcraft concept for unmanned aerial vehicle (UAV) research platforms. The vehicle consists of four rotors in total, with two pairs of counter-rotating, fixed-pitch blades located at the four corners of the aircraft (Figure 5).

Due to their specific capabilities, quadrotors also provide a good basis for visual flight control research. First, quadrotors do not require complex mechanical control linkages for rotor actuation, relying instead on fixed-pitch rotors and using variation in motor speed for vehicle control. This simplifies both the design and maintenance of the vehicle. Second, the use of four rotors ensures that individual rotors are smaller in diameter than the equivalent main rotor on a helicopter, relative to the airframe size. The individual rotors, therefore, store less kinetic energy during flight, mitigating the risk posed by the rotors should they collide with people or objects. Combined, these factors greatly accelerate the design and test flight process by allowing testing to take place indoors, by inexperienced pilots, with a short turnaround time for recovery from incidents. Finally, the improvement of lithium polymer battery technology has enabled longer flight times with heavier payloads, increasing the computational power that can be carried onboard, and thus the complexity of the visual algorithms that can be experimented with in real time.

A. Experimental Hardware

The quadrotor is of custom design and construction [14]. It features a real-time embedded attitude controller running on a 32-bit ARM7 microprocessor at 60 MHz. An inertial measurement unit (IMU) is also present, and contains a 3-axis accelerometer, three single-axis gyroscopes, and a 3-axis magnetometer.

The quadrotor hardware consists of a cross-frame made of square aluminum tubes joined by plates in the center. This design has proved to be robust, usually only requiring a propeller replacement after a crash.
On this frame are mounted four brushless motor / propeller combinations, four brushless motor controllers, the avionics and the battery. The two pairs of oppositely rotating propellers are mounted at a distance of 400 mm.

The Kinect sensor is mounted under the craft, pointing towards the ground (Figure 5). Using the calibrated output from the depth camera, a control system was developed to maintain a constant altitude during flight of the quadrotor helicopter and thus evaluate the suitability of the Kinect sensor.

[Figure: commanded altitude setpoint and measured value over time.]
Fig. 6. Co-ordinate system for the quadrotor helicopter and Kinect camera (represented here by the arrow adjacent to Z_k).

Let θ be the pitch angle and φ be the roll angle. Let Z_k be the depth observed in the Kinect body frame, that is, the depth perpendicular to the image sensor. The true depth, Z_b, corrected for the craft attitude is given by

    Z_b = Z_k cos θ cos φ.    (2)
A proportional-integral (PI) controller was implemented to control Z_b, the quadrotor altitude. With e = Z_sp - Z_b the altitude error for a setpoint Z_sp, the commanded output, c, from the controller is given by

    c = K_p e + K_i ∫ e dt.    (3)
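As an illustration of how equations (2) and (3) combine in such a control loop, a minimal sketch follows. The gains, setpoint and update rate are placeholders, not values from this work.

    # Sketch of a PI altitude controller driven by Kinect depth (eqs. 2 and 3).
    import math

    class AltitudePI:
        """Gains and setpoint are illustrative placeholders only."""
        def __init__(self, kp, ki, setpoint):
            self.kp, self.ki = kp, ki
            self.setpoint = setpoint
            self.integral = 0.0

        def update(self, z_k, pitch, roll, dt):
            # Equation (2): correct the perpendicular Kinect depth for attitude.
            z_b = z_k * math.cos(pitch) * math.cos(roll)
            # Equation (3): PI law on the altitude error.
            error = self.setpoint - z_b
            self.integral += error * dt
            return self.kp * error + self.ki * self.integral

    # Hypothetical usage at the Kinect's 30 Hz frame rate (units in metres).
    ctrl = AltitudePI(kp=0.8, ki=0.2, setpoint=1.5)
    cmd = ctrl.update(z_k=1.62, pitch=0.05, roll=-0.02, dt=1.0 / 30.0)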
V. CONCLUSION

The successful control of quadrotor altitude using the Kinect depth map demonstrates that the sensor is capable of operation in dynamic environments. Its low cost, high frame rate and absolute depth accuracy over a useful range make it suitable for use on robotic platforms.

Further work will involve integrating the Kinect into the navigation layer of the quadrotor system. It will likely be moved into a traditional forward-pointing orientation and the depth and RGB images combined in the manner described in Section II-B. The forward orientation of the depth camera will require more robust methods to detect the ground plane; the Hough transform and RANSAC will be explored for this purpose.
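As a pointer to that future work, the sketch below shows one common form a RANSAC ground-plane detector could take over back-projected depth points. It is a generic illustration of the named technique, not part of the work reported here; all parameters are placeholders.

    # Sketch: RANSAC plane fit over Nx3 points (e.g. back-projected depths).
    import numpy as np

    def ransac_plane(points, iters=200, tol=0.02, rng=None):
        """Fit a plane n.p + d = 0 with RANSAC; returns (n, d, inlier mask).

        Iteration count and inlier tolerance (in the units of 'points') are
        illustrative placeholders.
        """
        rng = rng or np.random.default_rng(0)
        best_n, best_d = np.array([0.0, 0.0, 1.0]), 0.0
        best_mask = np.zeros(len(points), bool)
        for _ in range(iters):
            # Sample three distinct points and form their plane.
            p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
            n = np.cross(p1 - p0, p2 - p0)
            norm = np.linalg.norm(n)
            if norm < 1e-9:  # degenerate (nearly collinear) sample
                continue
            n /= norm
            d = -np.dot(n, p0)
            mask = np.abs(points @ n + d) < tol  # inliers within tolerance
            if mask.sum() > best_mask.sum():
                best_n, best_d, best_mask = n, d, mask
        return best_n, best_d, best_mask

    # Hypothetical usage on synthetic near-planar points 1.5 m from the sensor.
    rng = np.random.default_rng(1)
    points = rng.normal(scale=[1.0, 1.0, 0.01], size=(1000, 3)) + [0.0, 0.0, 1.5]
    n, d, inliers = ransac_plane(points)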
REFERENCES

[1] D. Murray and J. J. Little, "Using real-time stereo vision for mobile robot navigation," Autonomous Robots, vol. 8, no. 2, p. 161, 2000.
[7] S. C. Diamantas, A. Oikonomidis, and R. M. Crowder, "Depth estimation for autonomous robot navigation: A comparative approach," in Imaging Systems and Techniques (IST), 2010 IEEE International Conference on, July 2010, pp. 426-430.
[8] A. Medina, F. Gaya, and F. del Pozo, "Compact laser radar and three-dimensional camera," J. Opt. Soc. Am. A, vol. 23, no. 4, pp. 800-805, Apr. 2006.
[9] S. Yi, J. Suh, Y. Hong, and D. Hwang, "Active ranging system based on structured laser light image," in SICE Annual Conference 2010, Proceedings of, Aug. 2010, pp. 747-752.
[10] D. E. G. Hønstrup, "Single Frame Processing for Structured Light Based Obstacle Detection," in Proceedings of the 2008 National Technical Meeting of The Institute of Navigation, San Diego, CA, 2008, pp. 514-520.
[11] R. I. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. Cambridge University Press, ISBN: 0521540518, 2004.
[12] P. Pounds, R. Mahony, P. Hynes, and J. Roberts, "Design of a Four-Rotor Aerial Robot," Auckland, New Zealand, November 2002.
[13] S. Bouabdallah, P. Murrieri, and R. Siegwart, "Design and control of an indoor micro quadrotor," vol. 5, Apr. 2004, pp. 4393-4398.
[14] J. Stowers, M. Hayes, and A. Bainbridge-Smith, "Quadrotor Helicopters for Visual Flight Control," in Proceedings of Electronics New Zealand Conference 2010, Hamilton, New Zealand, 2010, pp. 21-26.