Face As Mouse Through Visual Face Tracking
… For nose tracking, [8] claimed that defining the nose as an extremum of the 3D curvature of the nose surface makes the nose the most robust feature for tracking with high accuracy. For nostril tracking, the skin-color region is usually extracted first, and the nostrils can be distinguished by their dark color and unique contour shape. By tracking the X-Y facial feature coordinates, the mouse cursor can be navigated. However, we also notice that the movement and location of a facial feature in video usually does not coincide with the user's focus of attention on the screen. This makes the navigation operations unintuitive and inconvenient. To avoid that problem, people have proposed to navigate the mouse cursor by 3D head pose. The estimation of 3D head pose usually requires tracking more than one feature. Head pose can be inferred by stereo triangulation if more than one camera is employed [16][3], or by inference from anthropological characteristics of face geometry [18][10]. Based on the technical developments in this area, some commercial products have been developed in recent years [15][11][7].

For the mouse control module, the conversion from human motion parameters, i.e. position and/or rotation (orientation), to mouse cursor navigation can be categorized into direct mode, joystick mode, and differential mode. In direct mode, a one-to-one mapping from the motion parameter domain to screen coordinates is established by off-line calibration or by design based on a priori knowledge about the human-monitor setting [18]. Joystick mode navigates the mouse cursor by the direction (or the sign) of the motion parameters, and the speed of the cursor motion is determined by the magnitude of the motion parameters [9]. In differential mode, the accumulation of motion parameter displacements drives the navigation of the mouse cursor, and some extra motion parameter switches the accumulation mechanism on and off so that the motion parameter can be shifted backwards without influencing the cursor position. This mode is therefore very similar to a standard mouse, where the user can lift the mouse and move it back to the origin of the mouse pad after performing a dragging operation [9].

After the mouse cursor is navigated to the desired location, the execution of mouse operations, such as mouse button clicks, is carried out according to further interpretation of the changes of the user's motion parameters. The most straightforward interpretation is to threshold some specified motion parameters. In [13], mouse clicks are generated based on "dwell time", e.g. a mouse click is generated if the user keeps the mouse cursor still for 0.5s. In [4], the confirmation and cancelation of mouse operations is conveyed by head nodding and head shaking, and a timed finite state machine is designed to detect the nodding and shaking from the raw motion parameters.

In this paper, we propose to use a 3D model-based visual face tracking approach [17] to retrieve facial motion parameters for mouse control. This approach uses only one camera as video input, but is able to retrieve 3D head motion parameters. It is also not merely a facial feature tracker: the non-rigid facial deformation is formulated as a linear model and can be retrieved as well. Based on the motion parameters retrieved by our system, we designed 3 different mouse control modes. In the experiments, the controllability of the 3 mouse control modes is compared. Finally, we demonstrate how our system can be utilized to play a computer card game in the Windows XP environment.

The rest of the paper is organized as follows: Section 2 summarizes our face model. Section 3 explains how our tracking system works. Section 4 describes the design of our mouse control strategies. Section 5 shows the experimental evaluation of the controllability of our system. Section 6 summarizes the paper and gives some analysis of future directions.

2. 3D face modelling

The 3D geometry of the human facial surface can be represented by a set of vertices {(x_i, y_i, z_i) | i = 1, ..., N} in space, where N is the total number of vertices. In order to model facial articulations, a so-called Piecewise Bezier Volume Deformation Model is developed [17]. With this tool, some pre-specified 3D facial deformations can be manually crafted. These crafted facial deformations are called Action Units (AU) [5]. For our tracking system, 12 action units are crafted, as shown in Fig. 1.

AU   Description
1    Vertical movement of the center of upper lip
2    Vertical movement of the center of lower lip
3    Horizontal movement of left mouth corner
4    Vertical movement of left mouth corner
5    Horizontal movement of right mouth corner
6    Vertical movement of right mouth corner
7    Vertical movement of right eyebrow
8    Vertical movement of left eyebrow
9    Lifting of right cheek
10   Lifting of left cheek
11   Blinking of right eye
12   Blinking of left eye

Table 1. Action Units

If the human face surface is represented by concatenating the vertex coordinates into a long vector V = (x_1, y_1, z_1, x_2, y_2, z_2, ..., x_N, y_N, z_N)^T, the Action Units can be modeled as the vertex displacements of the deformed facial surface from a neutral facial surface, i.e.,
ΔV^(k) = (Δx_1^(k), Δy_1^(k), Δz_1^(k), Δx_2^(k), Δy_2^(k), Δz_2^(k), ..., Δx_N^(k), Δy_N^(k), Δz_N^(k))^T

where k = 1, ..., K, with K being the total number of key facial deformations. Therefore an arbitrary face articulation can be formulated as

V = V̄ + Lp    (1)

where V̄ is the neutral facial surface, L_{3N×K} = {ΔV^(1), ΔV^(2), ..., ΔV^(K)} is the Action Unit matrix, and p_{K×1} is the vector of AU coefficients that defines the articulation. An illustration of such a synthesis is shown in Fig. 2.
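To make the linear articulation model of Eq. (1) concrete, the following is a minimal C sketch of how a deformed face surface could be synthesized from the neutral shape, the AU matrix L, and the coefficient vector p. The data layout and function name are illustrative assumptions, not the paper's actual implementation.

/* Sketch: synthesize an articulated face surface V = Vbar + L*p (Eq. 1).
 * Assumed layout: vertex arrays hold 3N coordinates (x1,y1,z1,...,xN,yN,zN),
 * and L stores the K action-unit displacement vectors, each of length 3N,
 * one after another (column-major by AU). Illustrative names only. */
void synthesize_face(const double *Vbar,   /* neutral surface, length 3N   */
                     const double *L,      /* AU matrix, 3N x K            */
                     const double *p,      /* AU coefficients, length K    */
                     int N, int K,
                     double *V)            /* output surface, length 3N    */
{
    for (int i = 0; i < 3 * N; i++) {
        double v = Vbar[i];
        for (int k = 0; k < K; k++)
            v += L[k * 3 * N + i] * p[k];  /* add k-th AU displacement scaled by p_k */
        V[i] = v;
    }
}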
With rigid head motion, the i-th vertex of the articulated face surface moves to

V_i' = R(θ, φ, ψ)(V̄_i + L_i p) + T    (2)

where R(θ, φ, ψ) is the rotation matrix and T = [Tx Ty Tz]^T defines the head translation. Assuming a pseudo-perspective camera model with projection matrix

M = | fs/z    0     0 |
    |   0    fs/z   0 |

where f denotes the focal length and s denotes a scaling factor, the projection of the face surface onto the image plane can be described as

V_i^Image = M ( R(θ, φ, ψ)(V̄_i + L_i p) + T )    (3)

where V_i^Image is the projection of the i-th vertex node of the face surface. Therefore the face motion model is characterized by the rigid motion parameters, i.e. the rotation angles θ, φ, ψ and the translation T = [Tx Ty Tz]^T, and by the non-rigid facial motion parameter, the AU coefficient vector p.
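For illustration, the following C sketch projects a single face vertex into the image according to Eqs. (1)-(3). The rotation matrix R is taken as an input and the constant "scale" stands in for the pseudo-perspective factor fs/z; the names and layout are hypothetical, a sketch of the model rather than the paper's implementation.

/* Sketch: project the i-th deformed vertex into the image (Eqs. 1-3).
 * R is a 3x3 rotation matrix (row-major), T a 3-vector, and scale stands
 * in for f*s/z. Li holds the 3 rows of L for this vertex, row-major (3 x K). */
void project_vertex(const double R[3][3], const double T[3], double scale,
                    const double vbar[3],     /* neutral position of vertex i */
                    const double *Li,         /* 3 x K block of L for vertex i */
                    const double *p, int K,
                    double out[2])            /* image coordinates X, Y       */
{
    double v[3], w[3];
    /* non-rigid articulation: v = vbar_i + L_i * p  (Eq. 1) */
    for (int a = 0; a < 3; a++) {
        v[a] = vbar[a];
        for (int k = 0; k < K; k++)
            v[a] += Li[a * K + k] * p[k];
    }
    /* rigid motion: w = R * v + T  (Eq. 2) */
    for (int a = 0; a < 3; a++)
        w[a] = R[a][0] * v[0] + R[a][1] * v[1] + R[a][2] * v[2] + T[a];
    /* pseudo-perspective projection: keep x and y, scaled by f*s/z  (Eq. 3) */
    out[0] = scale * w[0];
    out[1] = scale * w[1];
}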
3. Visual face tracking

3.1. Initialization of face tracking

The face modeling equation (3) defines a highly nonlinear system. Fortunately, visual face tracking is not a process of finding a solution from a random initial guess. An initial solution is usually provided with high accuracy by manual labeling or by automatic detection of the face to be tracked in the first frame of the video.

Our system provides an automatic tracking initialization procedure. In the first frame of the video, we do face detection using the Adaboosting algorithm [22]. After face detection, the locations of the facial features are identified by ASM techniques [2][19]. Finally, the generic 3D face model is adapted and deformed to fit the detected 2D facial features. In the worst case, if the automatic procedure goes awry, the user is also provided with a GUI tool to fine-tune the initialization result. An initialization result is shown in Figure 3.

Figure 3. The initialization of our tracking system
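The initialization stage is essentially a three-step pipeline: face detection, 2D landmark localization, and 3D model fitting. A schematic C sketch of that flow is given below; every function in it is a hypothetical placeholder for the corresponding component in the text (the AdaBoost detector [22], the ASM fitter [2][19], and the model adaptation step), not an actual API from the paper.

/* Schematic sketch of the Section 3.1 initialization pipeline.
 * All functions below are hypothetical placeholders, not real APIs. */
#define NUM_LANDMARKS 64   /* assumed number of 2D facial feature points */

typedef struct { int x, y, w, h; } Rect;

int  detect_face_adaboost(const unsigned char *img, int w, int h, Rect *face);
void fit_asm_landmarks(const unsigned char *img, int w, int h,
                       const Rect *face, double landmarks[2 * NUM_LANDMARKS]);
void adapt_3d_face_model(const double landmarks[2 * NUM_LANDMARKS],
                         double pose[6], double *au_coeffs);

int initialize_tracker(const unsigned char *frame, int w, int h,
                       double pose[6]    /* theta, phi, psi, Tx, Ty, Tz */,
                       double *au_coeffs /* K AU coefficients           */)
{
    Rect face;
    /* 1. AdaBoost face detection in the first frame */
    if (!detect_face_adaboost(frame, w, h, &face))
        return 0;                 /* fall back to the manual GUI initialization */
    /* 2. Locate the 2D facial features inside the face box with ASM */
    double landmarks[2 * NUM_LANDMARKS];
    fit_asm_landmarks(frame, w, h, &face, landmarks);
    /* 3. Adapt the generic 3D face model to the 2D features,
     *    yielding the initial pose and AU coefficients.        */
    adapt_3d_face_model(landmarks, pose, au_coeffs);
    return 1;
}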
3.2. Tracking by iteratively solving differential equations

Though the system is nonlinear, at each video frame it can be approximated locally by a first-order Taylor expansion around the solution at the previous frame. Therefore tracking can be formulated as an iterative process of solving linearized differential equations. The displacement of V_i^Image, i = 1, ..., N can be estimated from the video sequence as optical flow ΔV_i^Image = [ΔX_i, ΔY_i]^T, i = 1, ..., N. The displacement of the model parameters, i.e. dW = [dθ, dφ, dψ]^T, dp, and dT = [dTx, dTy, dTz]^T, can be computed
with Least-Mean-Square error once the Jacobian matrix ∂V_i^Image/∂(dW, dp, dT) for each vertex node V_i, i = 1, ..., N is computed. More details of the computation can be found in [17].
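The least-squares update can be pictured as stacking the per-vertex optical-flow constraints into one linear system J·dq ≈ d and solving the normal equations for the parameter displacement dq. The sketch below shows that step for a generic parameter vector; it is a simplified stand-in for the computation detailed in [17], using a naive Gaussian-elimination solver and hypothetical names.

/* Sketch: least-mean-square estimate of the parameter displacement dq from
 * stacked optical-flow constraints J*dq ~= d (J is rows x M, d is rows).
 * Solves the normal equations (J^T J) dq = J^T d with plain Gaussian
 * elimination (no pivoting). Illustrative only; see [17] for the real method. */
#include <math.h>

int solve_parameter_update(const double *J, const double *d,
                           int rows, int M, double *dq)
{
    double A[M][M], b[M];                          /* normal equations (C99 VLAs) */
    for (int i = 0; i < M; i++) {
        b[i] = 0.0;
        for (int r = 0; r < rows; r++) b[i] += J[r * M + i] * d[r];
        for (int j = 0; j < M; j++) {
            A[i][j] = 0.0;
            for (int r = 0; r < rows; r++) A[i][j] += J[r * M + i] * J[r * M + j];
        }
    }
    for (int k = 0; k < M; k++) {                  /* forward elimination */
        if (fabs(A[k][k]) < 1e-12) return 0;       /* singular / ill-conditioned */
        for (int i = k + 1; i < M; i++) {
            double f = A[i][k] / A[k][k];
            for (int j = k; j < M; j++) A[i][j] -= f * A[k][j];
            b[i] -= f * b[k];
        }
    }
    for (int i = M - 1; i >= 0; i--) {             /* back substitution */
        double s = b[i];
        for (int j = i + 1; j < M; j++) s -= A[i][j] * dq[j];
        dq[i] = s / A[i][i];
    }
    return 1;
}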
The whole tracking procedure is illustrated in Fig. 4. At the first frame of the video, the model parameters are initialized. From the second frame on, the optical flow at each vertex of the face surface is computed, the displacement of the model parameters is estimated by solving the system of linear equations, and the model parameters are updated accordingly. This procedure iterates for each frame, and within each frame it also iterates in a coarse-to-fine fashion.
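Putting the pieces together, the per-frame tracking loop described above could be organized as follows. This is only a skeleton under the assumption that compute_optical_flow, compute_jacobian, and solve_parameter_update (the sketch above) stand in for the corresponding components; it is not the paper's actual code.

/* Skeleton of the iterative, coarse-to-fine tracking loop (Section 3.2).
 * The helper functions are hypothetical placeholders for the components
 * named in the text. */
#define MAX_ROWS 2048   /* assumed upper bound on stacked flow constraints (2N) */
#define MAX_PARAMS 18   /* 6 rigid parameters + 12 AU coefficients */

int  compute_optical_flow(int frame, int level, const double *q, double *d);
void compute_jacobian(int frame, int level, const double *q, double *J);
int  solve_parameter_update(const double *J, const double *d,
                            int rows, int M, double *dq);

void track_sequence(int num_frames, int num_levels, int num_params, double *q)
{
    /* q holds the model parameters (head pose + AU coefficients); frame 0 is
     * assumed to be filled in by the initialization procedure of Section 3.1. */
    for (int frame = 1; frame < num_frames; frame++) {
        for (int level = num_levels - 1; level >= 0; level--) {   /* coarse to fine */
            static double J[MAX_ROWS * MAX_PARAMS], d[MAX_ROWS];  /* scratch buffers */
            double dq[MAX_PARAMS];
            int rows = compute_optical_flow(frame, level, q, d);  /* per-vertex flow */
            compute_jacobian(frame, level, q, J);                 /* dV_image / dq   */
            if (solve_parameter_update(J, d, rows, num_params, dq))
                for (int i = 0; i < num_params; i++)
                    q[i] += dq[i];                                /* update parameters */
        }
    }
}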
3.4. Performance of our tracking system

Our tracking system currently runs on a Dell workstation with dual 2GHz CPUs and a SCSI RAID hard drive. The setup of our camera system is shown in Figure 5. The camera is mounted beneath the screen and looks upward at the user's face. Sequences (120 frames, about 10 seconds) of the head pose parameters Tx, Ty, Tz, Rx, Ry, Rz and of the AU coefficients related to the mouth and eyes, as estimated by the tracker, are shown in Figure 6. The figure indicates that the user moved his head horizontally (in the x coordinate) and back and forth (in the z coordinate) and rotated his head about the y axis (yaw movement) in the first 40 frames, opened his mouth wide at about the 80th frame, and blinked his eyes at about the 20th, 40th, and 85th frames.
The normalized correlation between the image patch of f centered at (x, y) and the m × n template t(i, j) is

NCorr(x, y) = [ Σ_i Σ_j (f(x+i, y+j) - μ_f)(t(i, j) - μ_t) ] /
              { ( Σ_i Σ_j (f(x+i, y+j) - μ_f)^2 ) ( Σ_i Σ_j (t(i, j) - μ_t)^2 ) }^(1/2)    (4)

where the sums run over i = -m/2, ..., m/2 and j = -n/2, ..., n/2, and

μ_f(x, y) = ( Σ_i Σ_j f(x+i, y+j) ) / (mn),    μ_t = ( Σ_i Σ_j t(i, j) ) / (mn)

Expanding the products, the correlation can also be computed as

NCorr(x, y) = [ mn Σ_i Σ_j f(x+i, y+j) t(i, j) - μ_f μ_t ] /
              { ( mn Σ_i Σ_j f^2(x+i, y+j) - μ_f^2 ) ( mn Σ_i Σ_j t^2(i, j) - μ_t^2 ) }^(1/2)    (5)

which is faster in terms of multiplication operations than using Eq. (4).
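To illustrate why the rearranged form is cheaper, here is a small C sketch of single-pass normalized correlation over an m × n window. It accumulates the raw sums in one loop and applies the standard identity Σ(f-μ_f)(t-μ_t) = Σft - (Σf)(Σt)/(mn), rather than subtracting the means pixel by pixel as Eq. (4) suggests; this is a generic illustration, not the paper's exact formulation.

/* Sketch: single-pass normalized correlation of an m x n template t against
 * the patch of image f centered at (x, y). Generic illustration of the
 * rearrangement behind Eq. (5); not the paper's exact code. */
#include <math.h>

double ncorr(const unsigned char *f, int f_width,
             const unsigned char *t, int m, int n,
             int x, int y)
{
    double sf = 0, st = 0, sff = 0, stt = 0, sft = 0;
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double fv = f[(y - n / 2 + j) * f_width + (x - m / 2 + i)];
            double tv = t[j * m + i];
            sf  += fv;        st  += tv;        /* sums            */
            sff += fv * fv;   stt += tv * tv;   /* sums of squares */
            sft += fv * tv;                     /* cross term      */
        }
    }
    double mn  = (double)m * n;
    double num = sft - sf * st / mn;                           /* covariance term    */
    double den = sqrt((sff - sf * sf / mn) * (stt - st * st / mn));
    return den > 0 ? num / den : 0.0;                          /* guard flat patches */
}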
… of the tracker. All these advantages make our tracking system a good candidate for the visual tracking module in a camera mouse.

4. Mouse cursor control

The direct mode, joystick mode, and differential mode are implemented for the mouse control module. For the direct mode, the face orientation angles Rx, Ry (the rotation angles with respect to the x and y coordinates) are mapped to the mouse cursor coordinates (X, Y) on the screen. As the reliable tracking range of Rx and Ry is about 40 degrees, and the resolution of the screen is 1600 × 1200, we empirically let the mapping function be

X = 40(Ry - Ry0)
Y = 30(Rx - Rx0)

where Rx0 and Ry0 are the initial face orientation angles.

For the joystick mouse control mode, the following control rule is employed:

X^{t+1} = X^t + Δ(Ry - Ry0)
Y^{t+1} = Y^t + Δ(Rx - Rx0)

with the rule function Δ(x) defined as

double Delta(double x) {
    double y;
    /* step function: a larger head rotation gives a larger cursor speed;
     * sgn(x) returns the sign of x (+1, -1, or 0) */
    if      (fabs(x) > 15) y = sgn(x) * 64;
    else if (fabs(x) > 10) y = sgn(x) * 16;
    else if (fabs(x) > 5)  y = sgn(x) * 4;
    else if (fabs(x) > 3)  y = sgn(x) * 1;
    else                   y = 0;   /* dead zone: small rotations keep the cursor still */
    return y;
}

The Δ(x) function is a step function in which the constants are specified empirically. We found that it is easier for the user to learn to move the mouse cursor at the desired speed, and to keep the cursor still at the desired location, by changing his/her head pose with such a step control function.
For the differential mouse control mode, we have the following control rules:

X^{t+1} = X^t + α ΔRy^t b^t
Y^{t+1} = Y^t + β ΔRx^t b^t

where

b^t = 0 if Tz^t < Tz^0,  and  b^t = 1 if Tz^t >= Tz^0.

Therefore the mouse is navigated by the accumulation of the head orientation displacements ΔRx^t and ΔRy^t: moving the head toward the camera turns on the mouse dragging state, and moving the head away from the camera turns on the mouse lifting state.
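A small C sketch of this differential rule is given below. The gains alpha and beta and the reference depth Tz0 correspond to α, β, and Tz^0 in the rules above; the concrete values and the surrounding event loop are assumptions made only for illustration.

/* Sketch of the differential mouse control rule. The cursor accumulates
 * head-orientation displacements only while b^t = 1 (dragging state);
 * b^t = 0 freezes the cursor, like lifting a physical mouse.
 * Names and gains are illustrative assumptions. */
typedef struct { double X, Y; } Cursor;

void differential_update(Cursor *c,
                         double dRx, double dRy,   /* frame-to-frame rotation change */
                         double Tz, double Tz0,    /* current and reference depth    */
                         double alpha, double beta)
{
    int b = (Tz >= Tz0) ? 1 : 0;   /* b^t from the rule above: 1 = dragging, 0 = lifted */
    c->X += alpha * dRy * b;
    c->Y += beta  * dRx * b;
}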
Variations in the non-rigid motion parameters trigger the mouse button events. While there are 12 AUs to select from, not all of them are good for triggering mouse events. Ideally the detection of an AU should be robust against head pose changes and noisy outliers. AU7 and AU8 (eyebrow raising) are not good because eyebrow movements are relatively subtle to detect. AU9 and AU10 (cheek lifting) are not good because the lack of texture on the cheek makes the estimation unreliable. And AU11 and AU12 (eye blinking) are not good because the user may blink his eyes unintentionally.
… of 1600 by 1200. The clicking results are illustrated in Fig. 7. From the figure, we can see that the average localization error is about 10 pixels for the direct mode, about 3 pixels for the joystick mode, and about 5 pixels for the differential mode. The localization error is mostly caused by measurement noise introduced during the tracking process, and partly by the fact that a person cannot hold his head perfectly still.

We then evaluated the reliability of the detection of mouth opening and stretching. When the user assumes a near-frontal view (within ±30 degrees) and the tracker is in a benign tracking state, the detection of mouth opening and stretching achieves a 100% detection rate with zero false alarms.
Figure 8. Using the camera mouse to navigate in Windows XP. The user is opening his mouth to activate the Start menu.

Figure 9. Using the camera mouse to play the Windows card game Solitaire. The user is dragging the Spade-7 from the right toward the Heart-8 on the left by turning his head with his mouth opened.
[19] J. Tu, Z. Zhang, Z. Zeng, and T. Huang. Face localization via
hierarchical condensation with fisher boosting feature selec-
tion. In IEEE Computer Society Conference on Computer
Vision and Pattern Recognition (CVPR’04), volume 2, pages
719–724, 2004.
[20] M. Turk, C. Hu, R. Feris, F. Lashkari, and A. Beall. TLA based face tracking. In International Conference on Vision Interface, pages 229–235, Calgary, 2002.
[21] M. Turk and G. Robertson. Perceptual user interfaces. Communications of the ACM, volume 43, pages 32–34, 2000.
[22] P. Viola and M. Jones. Fast and robust classification using
asymmetric adaboost and a detector cascade. In Advances
in Neural Information Processing System, volume 14. MIT
Press, Cambridge, MA, 2002.