Mobile AR 2
1. Introduction
Modern camera phones have become a compelling platform for mobile augmented reality (AR) applications: they contain almost all the equipment typically required for video-based AR. In the past few years, many researchers have worked to implement algorithms such as natural-feature matching (Wagner et al., 2010a, b) and online mapping (Klein and Murray, 2009) on mobile phones to achieve real-time registration. By contrast, little attention has been paid to achieving wide-area registration on camera phones for mobile AR use. With small and flexible mobile devices, mobile AR allows more unrestricted user movement, so the requirement for wide-area registration is more urgent for mobile AR than for traditional PC-based AR.
In this paper, we focus on camera-phone-based wide-area registration, one of the most important problems in the mobile AR field and one that has not yet been solved properly. While great strides have been made to address wide-area registration for PC-based AR, this is not the case for camera-phone-based mobile AR systems.
There are mainly two factors that limit the usability of camera phones as platforms for wide-area registration (Wagner and Schmalstieg, 2009a, b): first, rather than for raw processing speed, mobile phone units are primarily designed
Sensor Review, 33/3 (2013), 209-219. © Emerald Group Publishing Limited [ISSN 0260-2288] [DOI 10.1108/02602281311324663]
2. Related work
\[
\hat{P}(d) = \frac{1}{M}\sum_{m=1}^{M}\sum_{f=1}^{F}\hat{P}\bigl(d,\, R_{h_m}(z_f)\bigr)
\]
If the user decides that the recognized key-frame belongs to a valid scene, he can confirm it, and the system then loads the corresponding map into memory for real-time registration.
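The fern-based recognition score above can be sketched in code. This is a minimal illustration, not the authors' implementation: it assumes each fern is a list of random pixel-pair tests, that the per-leaf, per-key-frame probabilities have been trained offline, and all function and variable names here are hypothetical.

```python
import numpy as np

def fern_leaf_index(patch, tests):
    """Evaluate one fern: each random test compares two pixel positions
    in the patch and contributes one bit to the leaf index."""
    idx = 0
    for (y1, x1), (y2, x2) in tests:
        idx = (idx << 1) | (1 if patch[y1, x1] < patch[y2, x2] else 0)
    return idx

def keyframe_scores(patches, ferns, leaf_probs):
    """Average the trained per-leaf probabilities over all M ferns and
    all F feature patches to score every key-frame d.

    leaf_probs[m] has shape (2**depth, num_keyframes) and holds the
    probabilities P(d, R_h) for fern m.
    """
    M, F = len(ferns), len(patches)
    scores = np.zeros(leaf_probs[0].shape[1])
    for m, tests in enumerate(ferns):
        for patch in patches:
            h = fern_leaf_index(patch, tests)
            scores += leaf_probs[m][h]
    return scores / (M * F)
```

The key-frame with the highest score would then be presented to the user for confirmation before its map is loaded.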
\[
\tilde{P}\bigl(d, R_{h_i}\bigr) = \left[\frac{\min\bigl(\hat{P}(d, R_{h_i}),\, \hat{P}_{95}\bigr) - \hat{P}_{\min}}{\hat{P}_{95} - \hat{P}_{\min}}\,\bigl(2^{8} - 1\bigr)\right]
\]
where P̂_min denotes the minimum value occurring over all leaves in the current fern and P̂_95 is the corresponding 95th percentile. With the above modification, we reduce memory requirements by a factor of 4. Moreover, since the recognition process can then be performed with integer values instead of floating-point values, we speed up recognition by a factor of at least 3 on camera phones, some of which do not even support floating-point arithmetic.
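The quantization step can be sketched as follows. This is our own minimal illustration, assuming the factor-of-4 saving comes from storing 32-bit floats as 8-bit integers; the function name and NumPy-based formulation are hypothetical, not the authors' code.

```python
import numpy as np

def compress_fern_probs(probs):
    """Quantize one fern's floating-point probabilities to 8-bit integers.

    Values are clamped at the 95th percentile and mapped linearly from
    [min, p95] onto [0, 255], cutting memory by 4x (float32 -> uint8)
    and allowing integer-only arithmetic during recognition.
    """
    p_min = probs.min()
    p95 = np.percentile(probs, 95)
    clipped = np.minimum(probs, p95)          # clamp rare large values
    scaled = (clipped - p_min) / (p95 - p_min) * 255.0
    return np.round(scaled).astype(np.uint8), p_min, p95
```

Clamping at the 95th percentile keeps a few rare large probabilities from stretching the quantization range and destroying resolution for the common values.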
Second, as can be seen from the left part of Figure 2 (we use a fern containing three random tests for simplicity), with the first modification we need to store all the compressed
Figure 2 [Compressed probabilities, per-class threshold, and the resulting inverted file for a fern with three random tests (leaf indices 000-111)]
by using equation (4) for this class. Only values larger than the computed threshold are stored in the inverted file, while the others are discarded to save memory. We conducted an experiment on the ZUBUD image database to verify the effectiveness of the above improvements in reducing memory usage and recognition time. The compression ratios and recognition times are shown in Figure 3(a) and (b), respectively, from which we can see that memory usage and recognition time decrease markedly as 1/a increases. Meanwhile, as shown in Figure 3(c), the recognition performance does not degrade sharply as 1/a changes. In our system, we set 1/a to 0.08, which yields a satisfactory compression ratio together with reasonable recognition performance.
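The thresholded inverted-file construction can be sketched as below. This is a hypothetical illustration: the per-class threshold rule (equation (4)) is not reproduced here, so the thresholds are simply passed in, and all names are ours.

```python
import numpy as np

def build_inverted_file(quantized, threshold):
    """For each fern leaf, keep postings (keyframe id, prob) only for the
    key-frames whose quantized probability exceeds that class's threshold.

    quantized: (num_leaves, num_keyframes) uint8 array for one fern.
    threshold: per-class thresholds (the paper computes them with its
               equation (4); here they are given).
    """
    num_leaves, num_classes = quantized.shape
    inverted = [[] for _ in range(num_leaves)]
    for d in range(num_classes):
        for h in range(num_leaves):
            p = int(quantized[h, d])
            if p > threshold[d]:
                inverted[h].append((d, p))   # sub-threshold entries dropped
    return inverted
```

At recognition time, only the (usually short) posting list of the reached leaf needs to be scanned, which is what makes the discarded low probabilities a memory and speed win.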
Third, we also compress the built inverted files using index compression, an established method in the text-retrieval literature, to further reduce memory consumption. To our knowledge, index compression has never before been applied to compressing a key-frame recognition engine for mobile AR use. We use an improved Rice coding algorithm; with this method, we can process about two to three times more key-frames than with a standard inverted file when using the same amount of memory.
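Index compression typically stores the gaps between successive sorted posting ids and codes each gap with a Rice code. The sketch below shows plain Rice coding only; the improved variant used in the paper is not reproduced, and the bit-list representation is purely illustrative.

```python
def rice_encode(gaps, k):
    """Rice-code positive integers: each value v is written as
    q = (v - 1) >> k unary ones, a zero terminator, then the
    low k bits of v - 1."""
    bits = []
    for v in gaps:
        v -= 1                               # gaps are >= 1
        q, r = v >> k, v & ((1 << k) - 1)
        bits.extend([1] * q + [0])           # unary quotient
        bits.extend((r >> i) & 1 for i in reversed(range(k)))
    return bits

def rice_decode(bits, k, count):
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == 1:                # read unary quotient
            q += 1
            pos += 1
        pos += 1                             # skip the 0 terminator
        r = 0
        for _ in range(k):                   # read k remainder bits
            r = (r << 1) | bits[pos]
            pos += 1
        out.append((q << k) + r + 1)
    return out

# Posting list of key-frame ids -> d-gaps -> Rice code and back
ids = [3, 7, 8, 15]
gaps = [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]   # [3, 4, 1, 7]
encoded = rice_encode(gaps, k=2)
assert rice_decode(encoded, k=2, count=len(gaps)) == gaps
```

Because d-gaps in a dense posting list are small, most values fit in the k remainder bits plus a one-bit terminator, which is where the two-to-three-fold capacity gain over an uncompressed inverted file comes from.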
Figure 3 [(a) compression ratios; (b) recognition times; (c) recognition performance, each as a function of 1/a]
6. Applications
We built a virtual museum exhibition prototype to demonstrate the usability of the proposed method for wide-area mobile AR applications. In mobile AR-based virtual museum applications, users should be able to walk anywhere they want and observe different virtual objects, such as antiques, calligraphy and paintings, superimposed at different places in the exhibition hall. We built the prototype system using the proposed method on an HTC camera phone (with a 1 GHz CPU) to meet the wide-area localization and tracking requirements of such mobile AR applications.
The prototype system was built in our laboratory, which covers an area of about 80 square meters. We built eight maps, each containing nine to 18 key-frames and 295-833 mapped points; the resulting system thus contains 121 key-frames and 4,045 3D points in total. Twenty random ferns, each with a depth of 12, are used in our system. While the original ferns take about 38.72 MB, the compressed ferns take only about 0.87 MB, which makes them suitable for our low-memory mobile phones. Each map (including key-frames and 3D features) takes 1.3-3.1 MB of memory and is loaded or unloaded according to the user's confirmation. Our system requires about 20 ms to perform scene recognition and about 40 ms to track and augment a single frame, so the application can run at interactive frame rates (12 Hz).
If the user decides to browse a virtual object, he can select a map according to the recognition results returned by the compressed ferns. The selection automatically triggers the loading of the corresponding map into main memory, after which the initialization process starts. With the initial pose obtained, the corresponding virtual object is superimposed using the camera poses provided by the tracking process. If the user decides to browse another
[Figure: panels (a)-(h)]
7. Results
7.1 Preliminary feedback on usability
We recruited eight users (one female, seven male; ages 22-35) with no previous knowledge of AR to test the usability of the proposed prototype application. Each user performed two tests: the first was carried out immediately after the prototype system was built, and the other four hours later, to test the performance of the proposed method under scene-structure and illumination changes. After the tests we conducted an informal interview to collect user feedback.
The results show that 61 of 64 scenes were recognized (95.31 percent) and 59 of 64 virtual objects were correctly augmented (92.19 percent) in the tests carried out immediately after the prototype system was built. The recognition rate was only slightly lower (57 of 64, 89.06 percent) in the tests carried out four hours later. The augmentation rate, however, dropped considerably: only 37 of 64 virtual objects were correctly augmented (57.81 percent), because the illumination and scene-structure changes caused feature-point matching to fail.
All users agreed that the registration is stable and fast. They experienced occasional registration failures when the camera moved out of the scope of the target scene; however, these failures were always recovered once the camera returned to the target scene. Moreover, all users stated that, as they became more familiar with the prototype system, they could avoid nearly all such problems.
Finally, the user interface generally received positive comments, especially the map-selection function. All users agreed that selecting the corresponding map through a simple point-and-click interface does not impair the usability of our prototype system.
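The confirm-and-load behaviour described above can be sketched as a small manager that keeps only the confirmed map resident in memory. This is a hypothetical illustration of the design, not the authors' code; the class, method, and parameter names are ours.

```python
class MapManager:
    """Load a map only after the user confirms the recognized key-frame,
    keeping a single map resident to stay within the phone's memory budget."""

    def __init__(self, loader):
        self.loader = loader          # callable: map_id -> map data
        self.current_id = None
        self.current_map = None

    def confirm(self, map_id):
        if map_id == self.current_id:
            return self.current_map   # already resident, nothing to load
        # Replacing the reference effectively unloads the previous map
        # (each map is on the order of 1.3-3.1 MB in the prototype).
        self.current_map = self.loader(map_id)
        self.current_id = map_id
        return self.current_map
```

Deferring the load until an explicit confirmation avoids thrashing memory when the recognizer briefly proposes the wrong scene.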
Figure 5 Results of illumination changes, camera shaking, view angles and volume changes [panels (a)-(f)]
8. Discussion
References
Further reading