Implicit Measures of Lostness and Success in Web Navigation
Jacek Gwizdka 1 2, Ian Spence 3
2)
School of Communication, Information, and Library Studies,
Rutgers University,
4 Huntington St, New Brunswick, NJ 08901, USA
[email protected]
3)
Department of Psychology
University of Toronto
100 St. George St, Toronto, Ontario M5S 3G3, Canada
[email protected]
This is the authors' version of the manuscript accepted for publication in Interacting with Computers.
1
Corresponding author
Implicit Measures of Lostness and Success in Web Navigation
Abstract
In two studies, we investigated the ability of a variety of structural and temporal measures
computed from a web navigation path to predict lostness and task success. The user’s task was to
find requested target information on specified websites. The web navigation measures were
based on counts of visits to web pages and other statistical properties of the web usage graph
(such as compactness, stratum, and similarity to the optimal path). Subjective lostness was best
predicted by similarity to the optimal path and time on task. The best overall predictor of success
on individual tasks was similarity to the optimal path, but other predictors were sometimes
superior depending on the particular web navigation task. These measures can be used to
diagnose user navigational problems and to help identify problems in web site design.
Keywords
Web navigation; Web navigation graph; Navigation path similarity; Implicit measures; Lostness;
Compactness; Stratum; User studies.
Classification
2240.000 Empirical data; 4200.000 Metrics; 7230.000 User studies; 7520.000 Web & hypertext
navigation
Predicting Lostness and Success in Web Navigation
1.
Introduction and Motivation
Navigating large, complex websites is frequently difficult. The task places a heavy cognitive
burden on users who often become lost or disoriented. Indeed, cognitive overload and lostness
have long been recognized as major barriers experienced by users in hypermedia navigation
(Conklin, 1987). Disoriented searchers seem to have difficulty forming a cognitive model of the
information structure (Kim & Hirtle, 1995; Dieberger, 1997; Boechler, 2001). Since the structure
of the information space is usually not transparent it is often difficult for users to navigate in a
goal-directed way (Dieberger, 1995). Users can become lost because of the non-linear nature of
hypertext systems (Chen & Macredie, 2002) and, if there is considerable cross-referencing
among pages, looping behavior may result (Boechler, 2001). However, despite a long history of
research on hypertext and more recent studies in the area of web navigation, relatively little is
known about the statistical relationships among web navigation patterns, lostness, and success on
information-seeking tasks. In our view, a better understanding of the characteristics of successful
and unsuccessful navigation will be assisted by the computation of structural and temporal
measures that quantify different aspects of navigation behaviors. An important potential benefit
of our structural approach is that it may be possible to suggest strategies for improving the user
experience that do not depend on an analysis of the content of web pages
We present the results of two exploratory studies that examine the relationships among structural
measures that characterize web navigation paths, lostness, and task success. Related previous
work on the structural aspects of web navigation is discussed in section two and our approach is
described in section three. The empirical studies designed to evaluate the various measures—old
and new—are presented in sections four to six. The paper ends with a discussion of the results,
conclusions, and future directions.
2.
2.1
Related Work
Web-navigation Graph
Hypertext is traditionally conceptualized (and visualized) using the traditional node-and-link
model to present both the structure and the use of the web. Graphs that represent user navigation
2 of 31
Gwizdka & Spence
on the web are called web-navigation graphs, or web-usage graphs. Visited web pages are
represented by graph nodes and traversed links are represented by the edges of the graph. The
focus of this paper is on quantifying the structural and temporal aspects of a user’s navigational
history. While clearly also important, we do not study the influence of the content of web pages
on the user’s navigational choices (such influence was studied by, for example, Chi et. al., 2003;
Berendt, 2002).
Recently, the structural properties of web-navigation graphs have been correlated with user task
outcomes. McEneaney (McEneaney, 2001) demonstrated that learning task success (using a task
which required a broad exploration of on-line material) was correlated with shallow and broad
hierarchical navigation (reflected in high compactness1 of the navigation graph), while task
failure was related to a linear style of navigation (high stratum) (Botafogo et al., 1992). Shih and
his colleagues (Shih et al., 2004) used web-based courseware to study navigation behavior,
finding that the navigation paths of people who had greater prior experience with web-based
instructional tools were more linear and less dispersed (high stratum & low compactness). They
also found that stratum and compactness for those more experienced people differed according to
the task phase: (1) exploration, (2) resolution, and (3) completion. Significant relationships
among stratum, compactness and navigation task outcomes have not been found in all studies.
For example, (Herder, 2003) reported that no correlation was found between these two graph
measures and user disorientation. However, the tasks employed in that study were of mixed type;
some were open-ended while others were goal-oriented. It is thus natural to enquire how the
observed relations among stratum, compactness, lostness, and navigation task success change for
different types of information-seeking tasks.
2.2
Navigation Styles
In a study reported in (Juvina & van Oostendorp, 2004; Herder & Juvina, 2004), compactness,
stratum, path density, and average connected distance were used to characterize user navigation
styles. The navigation styles were second-order constructs derived from observable and
measurable user behavior. Factor analysis was used to create aggregate measures and two
1
Compactness and stratum are defined in section 3.1.2.
Predicting Lostness and Success in Web Navigation
navigation styles, flimsy navigation and laborious navigation, were proposed. These two styles
accounted for 27% and 23% of the total variance, respectively In the flimsy style, users often
returned to previously visited pages, including the start page. The preferred return mechanism
was the web browser’s back button, rather than via direct links. In contrast, the laborious
navigation style involved thorough exploration of the website making extensive use of the
navigational mechanisms provided by the site. On page revisits, users typically followed a
different link and thus explored a different branch of the website. It seems likely that this
variation in navigation styles might be helpful in predicting navigation task outcomes.
2.3
Navigation Sequence Similarity
Web pages arranged in the order of user visits form a navigation sequence or surfing path. This
path has been found to be useful in a variety of applications. For example, (Pitkow & Pirolli,
1999) used a longest repeated sequence algorithm to predict user surfing paths. They also used
the similarity of the navigation sequence to the most common sequence to facilitate efficient
caching of web pages. Path similarity has also been used in the analysis and clustering of user
navigation behavior: (Wang & Zaïane, 2002) employed a sequence alignment algorithm to
cluster user web navigation sessions. We use a similar algorithm to assess similarity between the
user navigation path and the optimal navigation path.
2.4
Implicit Measures of User Behavior
Implicit measures of user behavior may be used to predict subjective user preferences. This
approach has a long history in Information Retrieval (IR), where relevance feedback is used to
indicate a user’s information interests and preferences (Kelly & Teevan, 2003; Oard & Kim,
2001). While older systems were based on explicit feedback (Spink & Losee, 1996), more recent
work in IR employs implicit measures. Implicit measures are observable measures of user
behavior that can be used to infer or predict user attitudes, interests, preferences, or user
performance on a task. In the context of web search and navigation, implicit measures can be
used to predict user satisfaction, user lostness, or task success. In a recent paper, (Fox et al.,
2005) used implicit measures of user interest and satisfaction on web search tasks. They found
that time on a web page, clickthrough, and what a user did after visiting a search result or how a
user ended a search session were good predictors of user satisfaction. Some implicit measures
4 of 31
Gwizdka & Spence
seem to be sensitive to task context. For example, several studies found document reading time
(e.g. time on a web page) to be a good indicator of document relevance to the user (Masahiro &
Yoichi, 1994; Oard et al., 2001), but other studies have not confirmed this finding (Kelly &
Belkin, 2001; Kellar et al., 2004). Furthermore, (Herder & Juvina, 2004) found that time spent
on a web page was a good indicator of user lostness on web navigation tasks. While these results
may seem to be contradictory, they likely demonstrate that the nature of the user task and
environment, as well as the website information architecture and its content may have large
effects on predictive models. These results show that the usefulness of one-measure-models is
questionable. Establishing reliable and generalizable relationships between implicit measures
and task outcomes holds the best promise for building predictive models that could be used in
real-time, in a variety of contexts.
2.5
Getting Lost in Hypertext
Getting lost, or disoriented, is known to be one of the most important problems in hypertext
navigation, yet there have been but a few attempts to assess and quantify lostness. Smith (1996)
proposed an objective measure of lostness based the ratios of visited and optimal node counts as
shown in equation (2) (section 3.1.1). Larson and Czerwinski (1998) compared user performance
on information search in three different hypertext hierarchies: 8x8x8 (eight links at three
hierarchy levels), 32x16 (thirty two links at the top level, and sixteen at the bottom level), and
16x32. Using Smith’s measure Larson and Czerwinski showed that users were more lost on a
hypertext with the 8x8x8 hierarchy than on either the 16x32 or the 32x16 hierarchies. Otter &
Johnson (2000) described two measures designed to assess user lostness. The first of their
measures combines previous work by (Smith, 1996) with the effects of different types of links.
Their second measure is concerned with the accuracy of users' mental models of websites. The
authors suggested that to capture lostness in hypertext, a battery of measures was needed,
Herder’s (Herder, 2003) work supported this viewpoint. (Ahuja & Webster, 2001) conducted an
experiment demonstrating that user perceived disorientation in web navigation (assessed by a
questionnaire developed by the authors) is only weakly related to user behavior, and that
perceived disorientation is a better predictor of performance (time) than user behavior (such as
the number of visited web pages, and page revisits). In a study that examined perceived user
disorientation in hypermedia (Herder, 2003) found that perceived disorientation (measured using
Predicting Lostness and Success in Web Navigation
the Ahuja & Webster’s instrument) was correlated with a combined page return rate (average
rate of revisits to pages which were visited at least twice) with median page view times, but not
with the page revisitation ratio. Thus, in contrast to Ahuja & Webster, Herder’s work
demonstrated that user lostness was correlated with diverse measures of user behavior. These
findings suggest that lostness is not a simple unidimensional construct. Since using
questionnaire-based measures of lostness, as proposed by Ahuja & Webster (2001), is difficult,
if not impossible, in most studies conducted in real-world contexts, the successful assessment of
lostness based on observable real-time user behavior would be of great practical value. However,
previous research has not yet provided the basis for deciding on whether lostness can be assessed
in this way. To advance the field, the identification of measures that can predict lostness with
accuracy over a wide variety of search tasks is desirable.
3.
Research Objectives
The discovery of appropriate quantitative measures of navigational behavior is fundamental to
advancing our understanding of the phenomenon of lostness in web navigation. While anecdotes
and informal observations may be valuable and suggestive, we believe that an empirical
comparison of quantitative measures of user behavior will provide an improved characterization
of user navigation paths and that this may, in turn, inform the use of such measures in diagnosing
user web navigation problems and in evaluating web sites. Previous research (Chi et. al., 2003;
Berendt, 2002) has frequently focused on approaches and measures that are informed by the
content of the web pages. Our study (along with, for example, McEneaney, 2001; Herder, 2003;
Herder & Juvina, 2004) complements this work by investigating how navigational efficiency and
success may be assessed by considering the clickstream alone, without reference to the meaning
or content of the web pages and links.
Our first goal is thus to review and select appropriate structural and temporal measures that
characterize user navigation in information-seeking tasks on websites. We present the measures
that we have selected for review later in this section. Our second goal is to improve
understanding of the commonalities and differences among these measures, and to determine
whether we can identify navigation styles similar to those suggested by (Juvina & van
Oostendorp, 2004; Herder & Juvina, 2004). Our third goal is to determine which measures are
6 of 31
Gwizdka & Spence
the best predictors of (i) becoming lost on a website, and (ii) of task success (i.e. success in
finding information). All measures are based on observable user behavior. In one of the studies
reported in this paper, we used these measures to predict lostness, which was assessed in a posttask evaluation of the information-seeking session (details are given in the Methodology
section). We attempted to determine whether lostness and task success can be predicted using
measures such as:
•
•
the time spent on the navigation task and the speed of clicking;
the number of visited pages, the number of re-visited pages, and the ratio of revisited to
visited pages;
the “shape” of the web navigation graph;
the similarity of the user navigation path to the optimal path.
•
•
3.1
The measures
3.1.1
Page-count measures
The logged time-stamped URLs were used to calculate: (i) the number of web pages visited in a
session (N); the number of unique web pages visited (U); the number of web pages on the
optimal path (O) (section 4.3.3); and (ii) the time spent on each web page (Time_per_page); and
the total time on each question (Total_time). The measures were obtained directly from the
recorded web session logs by a Python script running on a user computer. Two derived
measures, “Revisits” (Tauscher & Greenberg, 1997) and “Objective lostness” (Lostness_obj)
(Smith, 1996), were calculated using the ratio of visited and optimal node counts as shown
below:
Revisits
=
1 – U/N ,
Lostness_obj = √ (U/N-1)2 + (O/U-1)2 ,
3.1.2
(1)
(2)
Global properties of the web navigation graph
If we consider the individual web pages visited by searcher to be the nodes of a graph and the
links followed by the searcher to be the edges of the graph, we can compute global properties
Predicting Lostness and Success in Web Navigation
such as stratum and compactness (Botafogo et al., 1992). Stratum and compactness were used
to characterize searcher’s behavior on similar web navigation tasks by McEneaney (2001), Shih
et al. (2004), Herder (2003) and Herder & Juvina (2004) thus a comparison with our results is
possible.
Compactness is a measure of the connectedness of a graph. It varies between zero and one; it is
close to zero for sparsely linked graphs and close to one for highly connected graphs.
Stratum measures how close the navigation path is to a linear ordering. This statistic is based on
the notion of the status, contra-status and prestige of the nodes in the path. Nodes that are hard to
reach, but from which other nodes can be easily reached, have high status. Nodes that are easy to
reach, but from which it is hard to reach other nodes, have high contra-status. A node’s prestige
is the difference between its status and contra-status. The stratum measure is defined as the sum
of the absolute prestige of all nodes divided by the maximum possible value of prestige for a
fully linear ordering. Like compactness, stratum varies between zero and one. A value close to
zero indicates a less linear navigation path; a value close to one indicates a more nearly linear
navigation path.
3.1.3
Similarity to the optimal path
These measures assessed the similarity between the user’s path and the optimal path. The
optimal path is the shortest path leading to the web page containing the sought-for information.
For factual information tasks (section 4.1) the optimal path exists by definition. In our studies,
we also assumed that the optimal path was unique.
Two similarity measures were calculated based on a well-known dynamic programming
procedure by Needleman and Wunsch (1970). The method uses a global sequence alignment
algorithm with a non-zero gap cost and an arbitrary distance function. The non-zero gap cost was
used to apply a penalty for diversion in web navigation. Our use of the N-W algorithm assumed
that (i) nodes were uniquely identified by webpage URLs composed of three parts:
<host>,<path>,<query>, and (ii) the distance between two nodes was calculated based on
similarity between their three-part URLs, where matching was done between each corresponding
URL part (e.g., between <paths>).
8 of 31
Gwizdka & Spence
Two measures—LCSMax and LCSlenMax—were derived using the N-W algorithm. LCSMax is
a measure of similarity between the user path and the optimal path, normalized to the maximum
possible score (an ideal match) for a path length equal to the user path length. LCSlenMax is the
length of the longest common subsequence between the user path and the optimal path divided
by the length of the user path. If no gaps in the matching paths were permitted, LCSlenMax
would be equal to the number of visited pages on the optimal path. Since the N-W algorithm
uses a non-zero gap cost, the common subsequence may contain pages which appear only on one
of the paths, thus interpretation of the LCSlenMax measure is not as straightforward.
Table 1. Characteristics of the navigation measures used.
Measure
Group
Symbol
Short Description
Description / Equation
Lower
Bound
Upper
Bound
O
Optimal path length
section 3.1.1
1
-
U
Number of unique pages
section 3.1.1
0
-
N
Number of total pages
section 3.1.1
1
-
Revisits
Ratio of revisited pages to all
section 3.1.1 / (1)
0
1
Lostness_Obj
Objective Lostness (Smith)
section 3.1.1 / (2)
0
+∞
Web graph
metrics
Compactness
Graph connectedness
section 3.1.2
0
1
Stratum
Graph linearity
section 3.1.2
0
1
Similarity to
optimal
navigation
path
LCSMax
section 3.1.3
-1
1
LCSlenMax
section 3.1.3
0
1
Web page
count
metrics
Web page
revisit
metrics
Our approach is to build predictive models using empirical data collected from web navigation
sessions using large, complex, natural web sites. In the studies presented below, the user
navigation tasks were specified before the search began.
4.
Methodology
We conducted two question-driven, web-based, information-finding studies in a controlled
experimental setting. Both studies used the same type of user task and the same apparatus. We
first present the common elements shared by the two studies and then describe each study
separately.
Predicting Lostness and Success in Web Navigation
4.1
Navigation Task
A factual information task was used. According to Morrison et al. (Morrison et al., 2001) these
types of tasks belong to the most common type on the Web and account for 25% of web search
activity. A factual task is defined as an information finding task where the user seeks a specific
piece of data (e.g., the name of a person or an organization, product information, a numerical
value; a date; an address; etc.).
4.2
Procedures and Apparatus
Participants of the two studies performed question-driven, information-seeking tasks using two
large Canadian government websites (a Government of Canada home page and a Health Canada
home page). The websites were selected based on their complexity (over 10,000 web pages), and
familiarity of their content to Canadian citizens and residents. The studies were conducted in a
university laboratory using a PC running the Microsoft Windows 2000 operating system.
Participants in each study were asked to perform a series of factual information finding tasks.
Within each study, the tasks were the same for all participants and were presented in the same
order. By keeping the order of tasks constant, any differential effects of learning website content
were the same for all participants. At the beginning of each task, the participant started at the
home page. Participants were instructed to find a single web page containing information that
was specified by each task. The information sought was intended to be representative of the set
of common possible information questions that citizens or residents of Canada might ordinarily
ask when visiting these sites. While the formulation of the particular tasks was necessarily
subjective, the list was arrived at after discussion among four members of the Engineering
Psychology Lab at the University of Toronto, all of whom were either native Canadians or had
lived in Toronto for many years. The tasks varied in difficulty, with the range of difficulty
intended to be similar to that experienced by a typical user of the sites. Participants were asked
to navigate in a single browser window, without using a search engine. The URLs of all visited
web pages were logged to a local file, along with timestamps, and the screen coordinates of the
link, or the button, or the graphic, that was clicked.
Here are two examples of the tasks (see Appendix A for the complete list):
10 of 31
Gwizdka & Spence
4.3
•
“Find a listing of addresses for passport offices in Ontario” (Government of Canada site)
•
“Find a page that describes how to deal with stress for women” (Health Canada site)
Navigation Task Outcomes
Task success and user lostness were the two major outcomes (dependent variables) in both
studies. Task success was defined as finding a web page with the information specified in a
question. Lostness can be defined as an objective property (Smith, 1996) or, alternatively, it can
be viewed as the subjective feeling of user disorientation on the web navigation task. We
describe how these measures were operationalized in the descriptions of each study below.
5.
Talk-Aloud Web Navigation Study (TA Study)
The talk-aloud method is a widely used method of studying cognitive processes, such as problem
solving, learning, decision making, human-computer interaction, and cognitive task analysis.
Participants carry out a task, while verbalizing their thoughts (Newell & Simon, 1972; Ericsson
& Simon, 1980; Russo, Johnson, & Stephens, 1989). Most studies have found that the talk-aloud
method does not alter task outcomes although it may increase the time spent on task (see
Krahmer & Ummelen, 2004 for a review). We used the talk-aloud procedure to infer whether
participants felt that they were becoming lost as they worked on the task.
Fourteen adults (six females and eight males; average age group 24-30) took part in 14
individual information-seeking sessions. Participants had an average of 9 years of Internet use
experience. Their current average daily use of the Internet was ranged from 1 to 4 hours.
Each participant was asked to perform ten search tasks. Participants were asked to talk aloud
while they were navigating the websites. Participants received the following instructions:
“While navigating the website please speak your thoughts out loud. This may feel a bit
unnatural at first but please feel assured that we are not judging you but the usability of
the website. Take your time and be thorough but still try to be efficient.”
There was no time limit on finding an answer to each of the ten questions. All sessions were
recorded using the Camtasia screen cam software for capturing computer screens (along with
Predicting Lostness and Success in Web Navigation
ambient sound). In addition, each participant’s talk-aloud session was recorded on a tape
recorder. Participants were paid $20 for their time.
5.1
5.1.1
Navigation Task Outcomes
Task success
Task success was scored true or false. Task success was also evaluated subjectively after the
session. The participant provided a self-assessment of success and this judgment was verified by
the experimenter, who checked the content of the final webpage visited by the participant. In
cases of disagreement, if the participant declared success but the final page did not contain the
required information, the task success score was adjusted by the experimenter to false.
5.1.2
Subjective evaluation of lostness
Lostness can be considered as an objective measure or as a subjective measure. The first
approach was proposed by (Smith, 1996) who calculated lostness from observable user actions,
such as the number of pages visited, the number of unique pages visited and the minimal
(optimal) number of pages that need to be visited to complete the task. The second approach was
advocated by (Ahuja et al., 2001) who measured perceived lostness by a post-task questionnaire.
Similarly, Czerwinski et. al., (2001) demonstrated that subjectively estimated time on task
(relative subjective duration, or RSD) is related to the user’s success on the task; the time spent
on failed tasks tended to be overestimated while the time spent on successful tasks tended to be
underestimated. RSD is not a direct measure of lostness, but it would seem to be strongly
correlated. Both of these subjective measures require the participants to make a judgment after
the task has been completed.
We obtained a simple measure of subjective lostness based on the participant’s verbal behavior
throughout the session. Moreover, we did so without explicitly asking for subjective judgments
of lostness. We used an independent rater to rate lostness in a post-task examination of the user’s
behavior. Participants occasionally expressed feelings of being lost (e.g. “I’m not in the right
place”, “I’m not sure what to do now”). Later, a trained human rater watched the audio-video
record of information finding sessions and assessed, every 30 seconds, how lost the participant
12 of 31
Gwizdka & Spence
appeared to be. Lostness was rated on a 4-point scale: 1-“Definitely Not Lost”, 2-“Probably Not
Lost”, 3-“Probably Lost”, 4-“Definitely Lost”. Average values of subjective lostness were then
calculated for each participant, for all questions. The reliability of the rater was verified by a
second judge, who rated all tasks performed by a randomly selected study participant. Inter-rater
reliability was checked by calculating intraclass correlation coefficient for the average values of
subjective lostness. The obtained intraclass correlation coefficient was 0.94 and we concluded
that ratings assigned by the principal rater were highly reliable. The average values of the
subjective lostness are denoted by Lostness_R, which could range from 1 to 4..
Lostness_R = 1.3
Lostness_R = 2.8
Figure 1. Differently shaped navigation paths with different subjective lostness ratings.
Figure 1 shows navigation graphs for two different participants on the same task. Nodes in these
navigation graphs (rounded rectangles) represent visited web pages, while directed edges (lines
ending with arrows) represent user traversal between web pages (by clicking on links or on the
back button). The graphs were created by processing logged URL information using a version of
Graphviz (North & Koutsofios, 1994; Gansner & North, 1999) with Pathalizer (Open Source,
2005) (the software was modified by us to meet our requirements). The visual representation
Predicting Lostness and Success in Web Navigation
uses annotations and color-coding. The annotations provide temporal information and
information about the link or button that has been clicked. Color-coding denotes speed of
clicking (i.e. light green means quick clicking < 4s; darker green means between 4-8s per click,
black means medium speed (8-13s per click); orange means slow clicking (13-20s); and red
means a very slow clicking rate (>20s). The representation here is necessarily small and cannot
show the level of detail in our originals; however, the reader should note that the shape of the
navigation graph may differ considerably and that the shape seems to be related to feelings of
lostness. In general, navigation paths that have a simple linear shape are associated with low
values of our subjective lostness measure.
5.2
Results
Our overall goal was to build statistical models that would predict lostness and task success.
5.2.1
Prediction of User Lostness
Linear regression was used to help discover which measures best predicted subjective lostness
(n=140). Due to exploratory nature of our study, we adopted a very conservative approach to
regression model building and retained only variables that had p < .001. LCSMax and
Total_time were found to the best predictors of subjective lostness and these two variables
together accounted for over 51% of the total variance in the fitted regression model. The fitted
model with standardized parameter estimates was:
Lostness_R_Predicted = -.46*LCSMax +0.35*Total_time,
14 of 31
(3)
Gwizdka & Spence
Predicted Lostness_R
4
3
2
1
R2 = 0.51
0
0
1
2
3
4
Lostness_R
Figure 2. Predicted vs. actual Lostness_R for the TA study.
4
Lostness_R
3
2
1
R2 = 0.46
0
-0.2
0
0.2
0.4
0.6
0.8
1
LCSMax
Figure 3. Relationship between subjective lostness (Lostness_R) and similarity to the optimal
path (LCSlenMax) for the TA study.
Predicting Lostness and Success in Web Navigation
4
Lostness_R
3
2
1
2
R = 0.37
0
0
200
400
600
800
1000
1200
Total time [s]
Figure 4. Relationship between subjective lostness (Lostness_R) and total time on task
(Tot_time) for the TA study.
The further participants deviated from the optimal path, the more likely they were to be
subjectively lost (Figure 3), and the more time participants spent on a task, the more likely they
were to be subjectively lost (Figure 4).
It is worth noting that “objective” lostness, ratio of revisits and other variables were not retained
in the “best” models. These variables, although occasionally significant at the .05 level when
included in the regression models, contributed very little additional prediction variance.
5.2.2
Prediction of Task Success
Task success was defined as finding a web page with the information specified in a question and
was rated on a binary scale (true/false). Consequently, we used logistic regression to find the
best predictors of task success (n=140). The best regression model that explained the most
variance included LCSMax as the only significant independent variable. The Wald chi-squared
statistic associated with LCSMax was 10.3, p=.0013; 86.8% of predicted successes and failures
agreed with the observed values. The R2 of the model was .28. Thus, the similarity to the optimal
path (LCSMax) was the best predictor of subjective task success.
16 of 31
Gwizdka & Spence
6.
Time-Limit Web Navigation Study (TL Study)
While the Talk-Aloud (TA) study was designed to shed light on the cognitive processes used by
participants, we also wished to study the search behaviors of users who were not required to
vocalize. Forty eight adults (29 females and 19 males; average age 20.5 years) took part in thirty
eight individual information-seeking sessions. The participants used a computer for 21 hours per
week on average, including 18 hours of the Internet use per week.
Each participant was asked to perform eight tasks. The tasks partially overlapped with those used
in the TA study (see Appendix A). In contrast to the TA study, the time allowed for answering
each question was limited to three minutes. If the requested information was not found in that
time, the participant stopped and moved on to the next question. TL study participants were
recruited from an undergraduate psychology class (PSY100) and received course credits for their
time.
6.1
6.1.1
Navigation Task Outcomes
Task success
Task success was scored true or false. Task success was inferred from the time spent on each
question. When participants spent more than three minutes and twenty seconds on a question,
this was considered to be a failure (the time limit to answer each question was three minutes and
we allowed a twenty second period of grace). The validity of this procedure was checked by
examining a random sample (n=15) of the sessions that lasted less than three minutes and twenty
seconds—in 95% of cases a page containing the appropriate information was the user’s final
selection.
6.2
Results
The data analysis was conducted with the two general objectives in mind: (i) to gain an improved
understanding of the commonalities and differences among the various navigational measures;
and (ii) to find predictors of lostness and task success.
Predicting Lostness and Success in Web Navigation
6.2.1
Space of Web Measures – Second-Order Navigation Factors
To gain a better understanding of the structure underlying the space of the selected web
measures, principal component analysis with varimax rotation was applied to the data from the
TL study (n=384).
Table 2. Principal component loadings on the first three factors (after varimax rotation).
Variable
Factor1
Factor2
Factor3
Tot_time
0.84
0.40
0.16
U (unique pages)
0.80
0.03
-0.48
N (total pages)
0.73
0.38
-0.46
Lostness_Obj
0.69
0.41
-0.20
LCSMax
-0.83
-0.27
-0.01
LCSlenMax
-0.85
-0.33
0.11
Compact
0.20
0.96
-0.02
Revisits
0.45
0.79
-0.13
Stratum
-0.30
-0.89
0.12
Time_per_page
-0.08
-0.09
0.96
Figure 5 shows factor loadings of the navigation measures represented in the two-dimensional
space defined by the first two (varimax rotated) factors, which we interpreted as follows:
Factor 1. Navigational inefficiency: characterized by a high number of visited pages, more time
spend on the task, low task success, low similarity with optimal path, higher (objective) lostness.
This factor explained 48% of the variance.
Factor 2. Laborious navigation: characterized by high compactness, high proportion of revisited
web pages, and a low stratum (i.e., low linearity of the user path). This factor is very similar to a
factor discussed by Juvina & Herder (2004). It also bears some similarity to a factor of the same
name identified (Herder & Juvina, 2004). Laborious navigation explained 35% of the variance.
All three components of this factor (compactness, stratum and ratio of revisits) are related to the
shape of the navigation graph, that is, they are related to the user’s navigation pattern.
Factor 3. Navigation speed. The remaining 17% of the variance was explained by the third
factor, on which only one variable loaded highly (Time_per_page).
18 of 31
Gwizdka & Spence
These three factors represent three aspects of user actions on the web navigation task: (1) total
time and amount of clicking (total and “unnecessary” clicks), (2) user navigation patterns (e.g.,
forward paths, loops, rings, (Clark et al., 2006)), and (3) speed of clicking.
1.0
Compact
Revisits
0.5
Tot_time
Laborious Navigation
Lostness_Obj
N (total pages)
U (unique pages)
0.0
Time_per_page
LCSMax
LCSlenMax
-0.5
Stratum
-1.0
-1.0
-0.5
0.0
0.5
1.0
Navigational Inefficiency
Figure 5. Web navigation measures represented in 2D space defined by factor loadings on the
two extracted factors “Inefficiency” and “Laborious Navigation”.
6.2.2
Prediction of Task Success
Task success was scored true or false, similarly as in the TA study. Thus logistic regression was
used to find the best predictors of task success (n=384).
The Wald chi-squared statistic for the regression model (with the following predictors: LCSMax,
Stratum, Compact) was 58.6, p<.0001; 81.3% of predicted successes and failures agreed with the
observed values. The R2 of the model was .30. LCSMax was the best predictor of task success,
Predicting Lostness and Success in Web Navigation
the other two variables (Stratum and Compact) used in the model were not significant (p>.2).
The Wald chi-squared statistic associated with LCSMax in this model was 32.6, p<.0001.
Significant models were also obtained for LCSMax, stratum and compactness individually. The
R2 for these three models was .27, .17, and .15 respectively.
Thus, LCSMax was the best predictor of task success in both the TA study and the TL study.
This result confirmed our a priori intuitions: the higher the similarity to an optimal path, the
better the chances for success on an information finding task. Figure 6 presents the relationship
between average level of task success and similarity to an optimal path calculated for each
question in study TL. Figure 7 and Figure 8 present the relationship between average level of
task success and, respectively, average level of stratum and compactness calculated for each
question in study TL.
Found (Avg)
1
0.5
Q7
Q3
0
-0.1
0
0.1
0.2
0.3
0.4
LCSMax (Avg)
Figure 6. Relationship between task success (Found) and similarity to the optimal path
(LCSMax) for average values calculated for each question in study TL. Points corresponding to
questions Q3 and Q7 are marked.
20 of 31
Gwizdka & Spence
1
Found (Avg)
0.75
Q7
0.5
Q3
0.25
0
0.5
0.6
0.7
0.8
0.9
Stratum (Avg)
Figure 7. Relationship between task success and stratum (for average values calculated for each
question in study TL).
1
Found (Avg)
0.75
Q7
0.5
Q3
0.25
0
0.4
0.5
0.6
Compactness (Avg)
Figure 8. Relationship between task success and compactness (for average values calculated for
each question in study TL).
Predicting Lostness and Success in Web Navigation
6.2.3
Prediction of Task Success on Individual Tasks
All information-seeking tasks used in this study (i.e. each of the eight questions) were of the
fact-finding type. The tasks, however, differed in terms of how difficult it was to find the
requested information. To assess the effect of the task, we examined whether LCSMax was also
the best predictor of task success for each question.
Significant regression models (n=48) were obtained for five out of eight questions. Three of
those models confirmed LCSMax to be the best task success predictor. In two of those five
significant models, however, either compactness or stratum turned out to be slightly better
predictors2. In the first case (question Q3), lower values of compactness (sparsely linked web
usage graphs corresponding to fewer returns to previously visited pages) were related to higher
task success. In the second case (question Q7), higher values of stratum (more linear user
navigation path) were related to higher task success.
2
LCSMax was still a significant predictor if fitted alone.
22 of 31
Gwizdka & Spence
100%
0.4
avgRevisits
avgFound
80%
LCSMax (avg)
60%
40%
0.2
0.0
20%
1
2
3
4
5
6
7
8
QNo
0%
1
2
3
4
5
6
7
8
QNo
Avg. number of revisits and task success per question
-0.2
Average similarity to the optimal path per question
0.9
Stratum
Compactness
0.6
0.5
0.7
0.5
0.4
1
2
3
4
5
6
7
8
1
2
3
4
5
6
7
QNo
Average compactness per question
8
QNo
Average stratum per question
Figure 9. Characteristics of question 3 and 7 (for study TL)
These two questions (Q3 and Q7) were characterized by some of the most extreme values of the
web navigation measures – the highest revisit ratios, some of the lowest task success levels,
some of the lowest similarity levels to an optimal path, the highest compactness, and the lowest
stratum (Figure 9).
6.2.4
Prediction of Task Success by The Second Order Navigation Factors
The two factors (Inefficiency and Laboriousness) established by principal component analysis
(section 6.2.1) were used in another logistic regression model. Both variables were found to be
significant predictors of task success. The Wald chi-squared statistic of the regression model was
64.6, p<.0001; 89.4% of predicted successes and failures agreed with the observed values and
the R2 of the model was .43. The Wald chi-squared statistic for Inefficiency was 61.4, p<.0001,
Predicting Lostness and Success in Web Navigation
while for Laboriousness the Wald chi-squared statistic was 50.1, p<.0001. Lower Inefficiency
and lower Laboriousness predicted a greater chance of task success.
1.0
Q7
0.5
Laborious Navigation
Q8
Q1
Q3
0.0
Q5
Q6
Q4
-0.5
Q2
-1.0
-1.0
-0.5
0.0
0.5
1.0
Navigational Inefficiency
Figure 10. Eight questions from the TL study in 2D space defined by the two extracted factors
Inefficiency and Laborious Navigation.
As can be seen in Figure 10, question Q3 loaded high on Navigational Inefficiency, while Q7
loaded high on the Laborious Navigation factor. Highly Laborious and Inefficient surfing paths
are “far” from the optimal paths, hence similarity to the optimal path may not be differentiating
among them very well. As our results indicate, these navigation paths are indeed better
differentiated by the shape of the navigation graph (expressed by stratum and compactness).
Although LCSMax was, in general, the best single predictor of task success, on particular
questions it was sometimes the case that measures of graph compactness or linearity proved to be
better predictors.
24 of 31
Gwizdka & Spence
7.
DISCUSSION
We first reviewed a variety of structural and temporal measures that may be used to characterize
user web navigation (section 3.1). We then examined the measures by applying them in the
analysis of two empirical studies (sections 5 and 6). Our evaluation was made in the context of
factual information-seeking tasks. We also identified aggregate measures that can be used to
characterize user navigation styles and, by experiment, we appraised each of the measures as a
predictor of lostness and task success.
7.1
Aggregate Web Navigation Measures.
We identified two navigation styles (Navigational inefficiency and Laborious navigation) that
bear some similarities to those suggested in previous studies (Juvina & van Oostendorp, 2004;
Herder & Juvina, 2004; Juvina & Herder 2005). The differences between our aggregate measures
and those identified in the other studies, probably stem from the fact that those studies employed
different sets of first-order measures.
We applied the two identified aggregate measures to the characterization of individual tasks and
showed how combining these measures differentiates among the tasks. A combination of
measures may more successfully characterize user navigational behavior than any single measure
alone. Juvina & Herder (2005) used an aggregate measure (highly similar to our Laborious
Navigation factor) to evaluate a new navigational mechanism (link suggestion). Their aggregate
measure discriminated between the old and the new navigational mechanisms. Aggregate
measures and their combinations have considerable potential in the diagnosis and evaluation of
web navigation behavior.
7.2
Predictive Ability of Web Navigation Measures.
Since the findings of the Talk-Aloud (TA) and Time-Limited (TL) studies are similar—and
compatible—this discussion draws on the results of both studies.
We first consider the predictive capabilities of stratum and compactness (S&C) and we compare
our results with previous studies that also used S&C. We then consider measures of similarity to
the optimal path and their predictive capabilities. We conclude with a discussion of lostness.
Predicting Lostness and Success in Web Navigation
Our results indicate that lower values of compactness and higher values of stratum tend to be
associated with a higher probability of task success. This relation is opposite to the one shown by
(McEneaney, 2001). However, there is no necessary contradiction; McEneaney used a different
navigational task (learning from a hypertext handbook vs. factual information finding). It seems
that the navigational strategies which are successful may be quite different in the two situations.
Shih and his colleagues (Shih et al., 2004) found that S&C differentiated between expert and
novice navigation paths, and that, for experts, S&C differentiated also among the navigation task
phases (exploration, resolution, completion). Results from our study show that, when compared
with other measures (e.g., LCSMax), S&C are good predictors of task success for tasks on which
user behavior tends to be inefficient and more laborious. Since inefficiency and laboriousness are
associated with more difficult tasks, the predictive power of S&C seems to depend on task
difficulty. The findings of the three relevant studies are summarized in Table 3.
Table 3. The predictive ability of stratum and compactness in different contexts.
task attribute
task success
and task type
task phases
task success
context
description
higher stratum was associated with
success in other studies;
n/a
lower stratum was associated with
success in our study;
stratum and compactness differed
expert users
among three task phases
lower stratum (or higher compactness)
difficult tasks
predicted task success
study
learning task–(McEneaney, 2001; Shih
et al., 2004);
factual information finding–this study
task phases: exploration, resolution, and
completion–(Shih et al., 2004)
this study
Direct comparisons may be problematic, since each study used different combinations of
measures to predict task success or lostness. However, the results are sufficient to suggest a
number of plausible reasons for the observed differences among the studies.
1. The effects of different tasks may affect the sensitivity of predictive models; prediction may
only be useful, if specific contextual factors are known and their relationships understood.
The relationships between stratum/compactness and task outcomes are complex and are
likely mediated by contextual factors that vary with the task.
2. Variation in the potency of the same predictors, depending on the particular task (question),
further supports the conclusion that the success of a search strategy is dependent on the
nature of the information-seeking task. In particular, while they are generally good
26 of 31
Gwizdka & Spence
predictors, the stratum and compactness measures may not be as effective in predicting task
success with easier navigational tasks.
We showed that similarity to the optimal path is a good predictor of both lostness and task
success for information-seeking tasks. For other tasks the results may differ. Depending on the
task, the notion of a optimal path may not make sense, or an optimal path may not exist and
therefore the similarity measures would be ill-defined. Success on other tasks may be better
predicted by the shape of the exploration path, using measures like stratum and compactness.
Since we used existing complex websites, we did not control for the differences in website
hierarchies, and thus we cannot compare our results with the work of Larson and Czerwinski
(1998).
Like Herder (2003), we found that lostness can be predicted by observing user actions. However,
the most effective predictors of lostness in our studies (similarity to the optimal path and total
time on task), are different from those found by Herder. Also, it is important to be aware of slight
differences among the definitions of lostness and how the measures of lostness were
operationalized in the two studies.
As Otter and Johnson (2000) have argued, lostness is a complex phenomenon and a diverse set
of quantitative measures is likely needed to characterize lostness in different circumstances and
on different tasks. Since several studies have demonstrated the merits of different measures in
different circumstances, it is critically important to consider the user’s goals and the nature of the
task required to achieve these goals.
8.
CONCLUSIONS & FUTURE WORK
Previous work (Herder, 2003; McEneaney, 2001; Otter & Johnson, 2000; Shih et al., 2004)
has shown that the notion of lostness is useful in predicting success in information-seeking tasks.
Furthermore, these studies showed that a variety of easily computed measures could be useful in
characterizing and predicting lostness, and that lostness, in turn, is strongly associated with task
success. While we strongly endorse this approach to a better understanding of web navigation
Predicting Lostness and Success in Web Navigation
behavior, we believe that more empirical work is required to refine and select the best measures
for a variety of search tasks.
Appropriate measures can provide useful characterizations of user web navigation behavior and
can help to diagnose a variety of problems (such as getting lost) that users encounter when
navigating hypertext documents. Such measures can also help to identify the local web structures
that are conducive to successful navigation. Thus the basic goal in our research is to establish
measures that provide an objective basis for diagnosing and evaluating the information
architecture and information design of websites.
Our results showed that three first-order measures (similarity to the optimal path, navigation
graph linearity and compactness) and two second-order measures can be useful diagnostics of
user web navigation behavior. Further research is needed to determine whether user lostness and
success can be identified on tasks other than those used in our studies.
We have not considered individual differences (such as level of web familiarity, domain
knowledge, gender, verbal ability, spatial ability). While it is reasonable to expect that individual
differences would play a role in the development of feelings of lostness, this aspect was beyond
the scope of the present investigation. Future studies should examine such effects.
One of our next goals is to investigate the possibility of real-time automatic detection of when
users are becoming lost. The ability to predict lostness and task success would be extremely
useful in real-time. The prediction of could be based on behavior of one user or on aggregate
behavior of many visitors to a website. An effective diagnostic capability could be used to help
to build adaptive web structures. For example, user-help could be created dynamically based on
a real-time recognition of increasing lostness.
28 of 31
Gwizdka & Spence
APPENDIX A. Web Navigation Tasks from Study TL and TA
TA
TL
Task Goal
x
Find a listing of addresses for passport offices in Ontario.
x
What is the history of the West Nile virus?
x
x
x
x
Find a listing of documents on the topic of Dealing With Abuse.
Why are foods irradiated? Find the page that describes this process.
x
(Q3)
Find a short description of Ottawa that lists population and area covered, among other information.
x
Find the page that graphs energy consumption in Canada.
x
Find a listing of “Travel Health Advisories” listed by date.
x
Find a page that describes how to deal with stress for women.
x
x
Find a brief (two sentences) listing of Canadian health expenditures for 2000-2001.
x
x
Find the page that discusses “Maternity and Newborn Care”. This page includes a chapter listings for a
book.
x
Find the page that defines and describes Smog.
x
Find the official web-page for Saskatchewan that lists that province’s population.
x
Find the page that describes precautions for “Minimizing your risk” of contracting Hepatitis C.
x
(Q7)
Find the page that lists the key health care issues.
References
Ahuja, J. S. & Webster, J. (2001). Perceived disorientation: an examination of a new measure to assess web design
effectiveness. Interacting with computers, 14, 15-29.
Berendt, B. (2002). Using Site Semantics to Analyze, Visualize, and Support Navigation. Data Min. Knowl. Discov.
6(1): 37-59.
Boechler, P. M. (2001). How spatial is hyperspace? Interacting with hypertext documents: cognitive processes and
concepts. CyberPsychology and Behavior, 4, 23-46.
Botafogo, R. A., Rivlin, E., & Shneiderman, B. (1992). Structural analysis of hypertexts: Identifying hierarchies and
useful metrics. ACM Transactions on Information Systems, 10, 142-180.
Chen, S. Y. & Macredie, R. D. (2002). Cognitive style and hypermedia navigation: development of a learning
model. Journal of the American Society for Information Science and Technology, 53, 3-15.
Chi, E. H., Cousins, S., Rosien, A., Supattanasiri, G., Williams, A., Royer, C. et al. (2003). The Bloodhound
project: automating discovery of Web usability issues using the InfoScent™ simulator. In Proceedings of the ACM
Conference on Human Factors in Computing Systems CHI'2003.
Predicting Lostness and Success in Web Navigation
Clark, L., Ting, I., Kimble, C., Wright, P. & Kudenko, D. (2006) "Combining ethnographic and clickstream data to
identify user Web browsing strategies" Information Research, 11 (2) paper 249 [Available at
http://InformationR.net/ir/11-2/paper249.html]
Conklin, J. (1987). Hypertext: An introduction and survey. Computer, 20, 17-41.
Czerwinski, M., Horvitz, E. and Cutrell, E. (2001). Subjective Duration Assessment: An Implicit Probe for Software
Usability. Proceedings of IHM-HCI 2001, Lille, France, September, 2001, pp. 167-170.
Dieberger, A. (1995). Providing spatial navigation for the world wide web. Spatial Information Theory. In Spatial
Information Theory - Proceedings of COSIT'95 (pp. 93-106). Semmering, Austria: Springer.
Dieberger, A. (1997). A city metaphor to support navigation in complex information spaces. In Spatial Information
Theory - Proceedings of COSIT'95 (pp. 53-67). Springer.
Ericsson, K. & Simon, H. (1980). Verbal Reports as Data. Psychological Review. 87, 215-251.
Fox, S., Karnawat, K., Mydland, M., Dumais, S., & White, T. (2005). Evaluating implicit measures to improve web
search. ACM Transactions on Information Systems, 23, 147-168.
Gansner, E. R. & North, S. (1999). An open graph visualization system and its applications to software engineering.
Software: Practice and Experience, 30, 1203-1233.
Herder, E. (2003). Revisitation Patterns and Disorientation. In Proceedings of the German Workshop on Adaptivity
and User Modeling in Interactive Systems ABIS 2003 (pp. 291-294).
Herder, E. & Juvina, I. (2004). Discovery of Individual Navigation Styles. In Proceedings of Workshop on
Individual Differences in Adaptive Hypermedia at Adaptive Hypermedia 2004 (AH2004).
Juvina, I. & van Oostendorp, H. (2004). Individual Differences and Behavioral Aspects Involved in Modeling Web
Navigation. LECTURE NOTES IN COMPUTER SCIENCE, 3196, 77-95.
Juvina, I. & Herder, E., (2005). The Impact of Link Suggestions on User Navigation and User Perception. UM2005
User Modeling: Proceedings of the Tenth International Conference.
Kellar, M., Watters, C., Duffy, J., & Shepherd, M. (2004). Modeling information content using observable behavior.
In Proceedings of the ASIST Annual Meeting.
Kelly, D. & Belkin, N. J. (2001). Reading time, scrolling and interaction: Exploring implicit sources of user
preferences for relevance feedback. SIGIR Forum (ACM Special Interest Group on Information Retrieval), 408409.
Kelly, D. & Teevan, J. (2003). Implicit feedback for inferring user preference: a bibliography. SIGIR Forum, 37,
18-28.
Kim, H. & Hirtle, S. C. (1995). Spatial metaphors and disorientation in hypertext browsing. Behaviour &
Information Technology, 14, 239-250.
Krahmer, E.& Ummelen, N. (2004). Thinking about Thinking Aloud: A Comparison of Two Verbal Protocols for
Usability Testing, IEEE Transactions on Professional Communication, 47, 105-117.
Larson, K. & Czerwinski, M. (1998). Web page design: Implications of memory, structure and scent for information
retrieval. In Proceedings Form CHI 98. 25-32.
Masahiro, M. & Yoichi, S. (1994). Information filtering based on user behavior analysis and best match text
retrieval. In Proceedings of the 17th annual international ACM SIGIR conference on Research and development in
information retrieval (pp. 272-281). Dublin, Ireland: Springer-Verlag New York, Inc.
30 of 31
Gwizdka & Spence
McEneaney, J. E. (2001). Graphic and numerical methods to assess navigation in hypertext. International Journal of
Human Computer Studies, 55, 761-786.
Morrison, J., Pirolli, P., & Card, S. K. (2001). A taxonomic analysis of what world wide web activities significantly
impact people's decisions and actions. In Proceedings of CHI' 2001. Extended abstracts (pp. 161-162).
Needleman, S. B. & Wunsch, C. D. (1970). A general method applicable to the search for similarities in the amino
acid sequence of two proteins. Journal of molecular biology, 48, 443-453.
Newell, A. & Simon, H. (1972). Human Problem Solving, Englewood Cliffs, NJ: Prentice Hall,
North, S. C. & Koutsofios, E. (1994). Applications of graph visualization. In (pp. 235-246).
Oard, D. & Kim, J. (2001). Modeling information content using observable behavior. In Proceedings of the ASIST
Annual Meeting (pp. 481-488).
Open Source (2005). Pathalizer [Computer software].
Otter, M. & Johnson, H. (2000). Lost in hyperspace: metrics and mental models. Interacting with computers, 13, 140.
Pitkow, J. E. & Pirolli, P. (1999). Mining Longest Repeating Subsequences to Predict World Wide Web Surfing. In
USENIX Symposium on Internet Technologies and Systems The USENIX Association.
Russo, J.E., Johnson, E.J. & Stephens, D.L. (1989). The Validity of Verbal Methods. Memory and Cognition, 17,
759-769
Shih, P.-C., Mate, R., Sanchez, F., & Munoz, D. (2004). Quantifying user-navigation patterns: a methodology
proposal. In Poster presented at the 28th International Congress of Psychology in Bejing Bejing 2004.
Smith, P. A. (1996). Towards a practical measure of hypertext usability. Interacting with computers, 8, 365-381.
Spink, A. & Losee, R. M. (1996). Feedback in Information Retrieval. Annual Review of Information Science and
Technology, 31, 33-78.
Tauscher, L. & Greenberg, S. (1997). How people revisit web pages: Empirical findings and implications for the
design of history systems. International Journal of Human Computer Studies, 47, 97-137.
Wang, W. & Zaïane, O. R. (2002). Clustering Web Sessions by Sequence Alignment. In Proceedings of DEXA
Workshops 2002 (pp. 394-398).