Chapter 1 Web Engineering

Introduction to Information Architecture
Information Architect: 1) the individual who organizes the patterns inherent
in data, making the complex clear; 2) a person who creates the structure or
map of information which allows others to find their personal paths to
knowledge; 3) the emerging 21st century professional occupation
addressing the needs of the age focused upon clarity, human understanding
and the science of the organization of information.
--Richard Saul Wurman, Information Architects, ed. Peter Bradford (Zurich:
Graphis Press Corp, 1996).
2.1. The Role of the Information Architect
Now that you know right from wrong from the web consumer's perspective,
you're in a much better position to develop a web site. But besides needing a
sophisticated knowledge of what works for consumers of the Web, what's
actually involved in creating a web site?
Obviously, you need HTML pages. Maybe you'll grab a good HTML book
or a decent HTML editing package. Maybe a high school kid can do the trick
for peanuts. What about the copy for those pages? It needs to come from
somewhere -- perhaps existing brochures and documentation; perhaps it
needs to be written from scratch. You'll also need some graphic design
expertise to make sure that the pages are laid out with effective use of text,
white space, and attractive images. Of course you'll need a server that is
connected to the Internet; this you can lease, or you can buy one of your
own. If you do, just be sure to hire someone sufficiently technically astute to
administer that server. Perhaps that person should also write the CGI, Perl,
ActiveX, Java, and other scripts that make the site interactive. What's
missing? Maybe a project manager to make sure all these folks work
together to develop the site without running behind schedule and over
budget.
So now you're all set to design your web site, right?
Well, not quite. What's missing from this picture is a definition of what the
site will actually be, and how it will work.
This may sound obvious, but for most web sites, it's true: design and
production storm ahead without any unifying principle to guide the site's
development. A web site essentially can be anything you want it to be and
could cost millions of dollars, take years to complete, and cost thousands of
lives to develop. To avoid such overkill, it will need to be defined somehow:
it will need a definition.
That's the main job of the information architect, who:
• Clarifies the mission and vision for the site, balancing the needs of its
sponsoring organization and the needs of its audiences.
• Determines what content and functionality the site will contain.
• Specifies how users will find information in the site by defining its
organization, navigation, labeling, and searching systems.
• Maps out how the site will accommodate change and growth over
time.
Although these sound obvious, information architecture is really about
what's not obvious. Users don't notice the information architecture of a site
unless it isn't working. When they do notice good architectural features
within a site, they instead attribute these successes to something else, like
high-quality graphic design or a well-configured search engine. Why? When
you read or hear about web site design, the language commonly used
pertains to pages, graphic elements, technical features, and writing style.
However, no terms adequately describe the relationships among the
intangible elements that constitute a web site's architecture. The elements of
information architecture -- navigation systems, labeling systems,
organization systems, indexing, searching methods, metaphors -- are the
glue that holds together a web site and allows it to evolve smoothly. To a
novice, this terminology is not very clear. These elements are extremely
difficult to measure, and therefore even harder to compare. You really have
to spend time using a site and get a feel for it before you can confidently talk
about a site's information architecture.
Yet, we know these things are important. How? Well, consider your
responses to the Boot Camp exercise in Chapter 1, "What Makes a Web Site
Work". How many of the likes and dislikes are not related to technical
issues, copy editing, or graphic design? Remaining issues are probably tied
to information architecture. Although perhaps indirectly, a poorly planned
information architecture will adversely affect those other areas.
Well-planned information architectures greatly benefit both consumers and
producers. Accessing a site for the first time, consumers can understand it
with little effort. They can quickly find the information they need,
thereby reducing the time (and costs) wasted on both finding information
and not finding information. Producers of web sites and intranets benefit
because they know where and how to place new content without disrupting
the existing content and site structure. Perhaps most importantly, producers
can use an information architecture to greatly minimize the politics that
come to the fore during the development of a web site.
2.1.1. The Consumer's Perspective
Consumers, or users as we more commonly refer to them, want to find
information quickly and easily. Contrary to what you might conclude from
observing the architectures of many large, corporate web sites, users do not
like to get lost in chaotic hypertextual webs. Poor information architectures
make busy users confused, frustrated, and angry.
Because different users have varying needs, it's important to support
multiple modes of finding information. Some users know exactly what
they're looking for. They know what it's called (or labeled), and they know it
exists. They just want to find it and leave, as quickly and painlessly as
possible. This is called known-item searching.
Other users do not know what they're looking for. They come to the site with
a vague idea of the information they need. They may not know the right
labels to describe what they want or even whether it exists. As they casually
explore your site, they may learn about products or services that they'd never
even considered. Iteratively, through serendipity and associative learning,
they may leave your site with knowledge (or products) that they hadn't
known they needed.
These modes of finding information are not mutually exclusive. In a
well-designed system, many users will switch between known-item searching and
casual browsing as they explore the site. If you care about the consumer,
make sure your architecture supports both modes. While attractive graphics
and reliable technologies are essential to user satisfaction, they are not
enough.
2.1.2. The Producer's Perspective
Since few organizations are completely altruistic, they usually want to know
the return on their investment for information architecture design. In other
words, what's in it for them? First, a disclaimer. Buying information
architecture services is not like investing in a mutual fund. You can't
calculate hard and fast numbers to show the exact benefit of your investment
over time.
Nonetheless, you can demonstrate the value to the organization through less
scientific means. Depending upon the goals and nature of your site, you may
even be able to defend your investment with some not-so-hard numbers.
Consideration of value to the producer takes us back to the consumer. If
you're producing an external web site, this involves actual and prospective
customers, investors, employees, and business partners, not to mention the
media and senior executives within your organization. Do you really want to
frustrate any of these people? What is the value of quickly and easily helping
them find the information they need?
If you're producing an intranet, the employees of your organization are the
consumers. What is the cost of their time spent to find the information they
need? What is the cost when employees don't find the information they
need?
Finally, we need to consider the actual costs of designing and implementing
the architecture. A well-designed, diplomatic architecture can prevent costly
political battles that can stop a project in its tracks. The cost of time spent by
high-level executives arguing over which department's information belongs
on the main page can skyrocket if you're not careful. A well-designed,
scalable architecture can spare you from doing it all over a year later. Far too many
architectures are crushed under the weight of their own content. Redesign of
the information architecture impacts all other aspects of the web site, from
graphical navigation bars to the content itself, and it can be a very costly
adventure.
Let's illustrate with a real-life example. Recently, we met with about ten
members of a large client's web site development team. Because we were in
the early stages of the planning process, we had just reviewed the client's
likes and dislikes, and were determining their web design philosophy. Now
we were ready to begin defining what their site would be.
In discussing the site's likely users, around seven or eight audiences were
suggested. Five or six major goals of the site were determined. Finally, we
talked about the main areas of content and functionality that the site would
include. This wish list included thirty or forty items. We now had a lot of
useful lists and ideas, but was the web site ready to be designed yet?
At this point, many site designers would happily dive in head first. Their
work would be a site headed by a main page that included thirty or forty
items and links, tried to please seven or eight different audiences, and
ultimately failed at achieving its five or six goals. This is what happens
when the big picture of a site is ignored.
Consider what happens to a site with a single designer who sees only the
trees, not the forest. Now add an order of magnitude: large organizations,
rife with complex goals and messy politics, often have sites designed by ten
individuals, each with their own vision of the site, their own deadlines and goals to
meet, and their own politics to play. Is it any wonder that these sites often
work so poorly, even when huge investments of time and money are made in
them?
Succinctly, information architecture is about understanding and conveying
the big picture of a web site.
Back to our client's committee of ten tree-people. They were still struggling
over what the site would ultimately be. Which goals are the most important?
Should the site be informational, entertaining, or educational? Should there
be one main page for all audiences, or one for each audience? Should we
design an architecture that organizes the site's information by topic, by
function, or in some other way? Who within their organization should own
and maintain the information in the site? What kind of navigation and
wording would make the most sense?
Our last meeting ended in frustration, as the committee members argued but
never resolved these points. They were especially unhappy, as they'd thought
that designing a web site was supposed to be fun, without the haggling over
audience definitions, dredging up of organizational politics, and dealing with
other unpleasantries that had come up in the discussion. Some even
expressed concern that we shouldn't even bother wading into this swamp and
instead should start doing something, like gathering together the site's
content, pushing forward on the graphic design, and so on.
Having exposed so much frustration, we were obviously on the right track.
Why?
Because these thorny and confounding issues of information architecture
must be resolved during the design process, before the site is built. If we
were to avoid answering these questions and the site's development were to
proceed, these issues wouldn't go away. Instead, the burden would be on the
site's users to understand how to use and find information in a confusing,
poorly designed web site. Of course, we know that a frustrated user will
click and leave with a bad memory of the site, likely to never return. Without
a clear information architecture, the site's maintainers wouldn't know where
to locate the new information that the site would eventually include; they'd
likely begin to quarrel over whose content was more important and deserved
visibility on the main page, and so on.
Collaboration and Communication
The information architect must communicate effectively with the web site
development team. This is challenging, since an information architecture is
highly abstract and intangible. Besides communicating the architecture
verbally, the architect must create documents (such as blueprint diagrams) in
ways that can be understood by the rest of the team regardless of their own
disciplinary backgrounds.
In the early days of the Web, web sites were often designed, built, and
managed by a single individual through sheer force of will. This webmaster
was responsible for assembling and organizing the content, designing the
graphics, and hacking together any necessary CGI scripts. The only
prerequisites were a familiarity with HTML and a willingness to learn on the
job. People with an amazing diversity of backgrounds suddenly became
webmasters overnight, and soon found themselves torn in many directions at
once. One minute they were information architects, then graphic designers,
then editors, then programmers.
Then companies began to demand more of their sites and, consequently, of
their webmasters. Simple home pages quickly evolved into complex web
sites. People wanted more content, better organization, greater function, and
prettier graphics. Extensions, plug-ins, and languages proliferated. Tables,
VRML, frames, Shockwave, Java, and ActiveX were added to the toolbox.
No mortal webmaster could keep up with the rising expectations and the
increasing complexity of the environment.
Increasingly, webmasters and their employers began to realize that the
successful design and production of complex web sites requires an
interdisciplinary team approach. An individual cannot be an expert in all
facets of the process. Rather, a team of individuals with complementary
areas of expertise must work together. The composition of this team will
vary, depending upon the needs of a particular project, available budget, and
the availability of expertise. However, most projects will require expertise in
marketing, information architecture, graphic design, writing and editing,
programming, and project management.
Marketing
The marketing team focuses on the intended purposes and audiences
for the web site. They must understand what will bring the right
people to the web site and what will bring them back again.
Information Architecture
The information architects focus on the design of organization,
indexing, labeling, and navigation systems to support browsing and
searching throughout the web site.
Graphic Design
The designers are responsible for the graphic design and page layout
that defines the graphic identity or look of the web site. They strive to
create and implement a design philosophy that balances form and
function.
Editorial
Editors focus on the use of language throughout the web site. Their
tasks may involve proofreading and editing copy, massaging content
to ensure a common voice for the site, and creating new copy.
Technical
The technical designers and programmers are responsible for server
administration and the development or integration of site production
tools and web site applications. They advise the other teams regarding
technology-related opportunities and limitations.
Project Management
The project manager keeps the project on schedule and within budget.
He or she facilitates communication between the other teams and the
clients or internal stakeholders.
The success of a web site design and production project depends on
successful communication and collaboration between these specialized team
members. A linear, black-box, throw-it-over-the-wall methodology just won't
work. Everyone needs to understand the goals, perspectives, and approaches
of the other members of the team. For example, while the marketing
specialist may lead the audience analysis process, he or she needs to
anticipate the types of questions about the audience that the other specialists will
have. Otherwise, each will need to start from scratch in learning about that
audience, wasting substantial time and resources.
For the information architect, communication is a special challenge because
of the intangible nature of the work. Anyone who has played Pictionary
knows that it is much harder to draw an abstract concept such as "science"
than a physical object such as "moon." As an information architect, you face
the daunting challenge of helping others visualize such abstract concepts as a
metaphor-based architecture and indexing systems.
The information architect has to identify both the goals of the site and the
content that it will be built on. This means getting the people who drive the
business, whether bosses or clients, to articulate their vision of the site and
who its users are. Once you've collected the data and developed a plan, you
need to present your ideas for an information architecture and move the
group toward consensus. All in all, this places a significant burden on the
architect to communicate effectively.
This is the point of the rest of this book. The next four chapters introduce the
foundations of information architecture to support your efforts to
communicate an information architecture by providing useful terms,
definitions, and concepts. Chapter 7, "Research" through Chapter 10,
"Information Architecture in Action" provide a framework for these
communications, and for the role of architecture in site development as a
whole.
Organizing Information
Our understanding of the world is largely determined by our ability to
organize information. Where do you live? What do you do? Who are you?
Our answers reveal the systems of classification that form the very
foundations of our understanding. We live in towns within states within
countries. We work in departments in companies in industries. We are
parents, children, and siblings, each an integral part of a family tree.
We organize to understand, to explain, and to control. Our classification
systems inherently reflect social and political perspectives and objectives.
We live in the first world. They live in the third world. She is a freedom
fighter. He is a terrorist. The way we organize, label, and relate information
influences the way people comprehend that information.
As information architects, we organize information so that people can find
the right answers to their questions. We strive to support casual browsing
and directed searching. Our aim is to apply organization and labeling
systems that make sense to users.
The Web provides us with a wonderfully flexible environment in which to
organize. We can apply multiple organization systems to the same content
and escape the physical limitations of the print world. So why are many
large web sites so difficult to navigate? Why can't the people who design
these sites make it easy to find information? These common questions focus
attention on the very real challenge of organizing information.
3.1. Organizational Challenges
In recent years, increasing attention has been focused on the challenge of
organizing information. Yet, this challenge is not new. People have struggled
with the difficulties of information organization for centuries. The field of
librarianship has been largely devoted to the task of organizing and
providing access to information. So why all the fuss now?
Believe it or not, we're all becoming librarians. This quiet yet powerful
revolution is driven by the decentralizing force of the global Internet. Not
long ago, the responsibility for labeling, organizing, and providing access to
information fell squarely in the laps of librarians. These librarians spoke in
strange languages about Dewey Decimal Classification and the Anglo-
American Cataloging Rules. They classified, cataloged, and helped us find
the information we needed.
The Internet is forcing the responsibility for organizing information on more
of us each day. How many corporate web sites exist today? How many
personal home pages? What about tomorrow? As the Internet provides us all
with the freedom to publish information, it quietly burdens us with the
responsibility to organize that information.
As we struggle to meet that challenge, we unknowingly adopt the language
of librarians. How should we label that content? Is there an existing
classification system we can borrow? Who's going to catalog all of that
information?
We're moving towards a world where tremendous numbers of people publish
and organize their own information. As we do so, the challenges inherent in
organizing that information become more recognized and more important.
Let's explore some of the reasons why organizing information in useful ways
is so difficult.
3.1.1. Ambiguity
Classification systems are built upon the foundation of language, and
language is often ambiguous. That is, words are capable of being understood
in two or more possible ways. Think about the word pitch. When you say
pitch, what do I hear? There are actually more than 15 definitions, including:
• A throw, fling, or toss.
• A black, sticky substance used for waterproofing.
• The rising and falling of the bow and stern of a ship in a rough sea.
• A salesman's persuasive line of talk.
• An element of sound determined by the frequency of vibration.
It gets worse. Not only do we need to agree on the labels and their
definitions, we also need to agree on which documents to place in which
categories. Consider the common tomato. According to Webster's dictionary,
a tomato is "a red or yellowish fruit with a juicy pulp, used as a vegetable:
botanically it is a berry." Now I'm confused. Is it a fruit or a vegetable or a
berry?[3]
3.1.2. Heterogeneity
Heterogeneity refers to an object or collection of objects composed of
unrelated or unlike parts. You might refer to grandma's homemade broth
with its assortment of vegetables, meats, and other mysterious leftovers as
heterogeneous. At the other end of the scale, homogeneous refers to
something composed of similar or identical elements. For example, Oreo
cookies are homogeneous. Every cookie looks and tastes the same.
An old-fashioned library card catalog is relatively homogeneous. It
organizes and provides access to books. It does not provide access to
chapters in books or collections of books. It may not provide access to
magazines or videos. This homogeneity allows for a structured classification
system. Each book has a record in the catalog. Each record contains the
same fields: author, title, and subject. It is a high-level, single-medium
system, and works fairly well.
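To see why homogeneity makes life easy, consider a minimal Python sketch of such a catalog. The CatalogRecord class, the find_by helper, and the sample books and subject headings are all hypothetical, invented here for illustration rather than drawn from any real library system. Because every record carries the same three fields, one lookup routine serves the entire collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRecord:
    """One card in a hypothetical card catalog; every book has the same fields."""
    author: str
    title: str
    subject: str


# Sample records invented for illustration.
CATALOG = [
    CatalogRecord("Wurman, Richard Saul", "Information Architects", "Information design"),
    CatalogRecord("Melville, Herman", "Moby-Dick", "Whaling -- Fiction"),
    CatalogRecord("Darwin, Charles", "On the Origin of Species", "Evolution"),
]


def find_by(field: str, value: str) -> list[CatalogRecord]:
    """Return the records whose given field equals value, ignoring case."""
    return [r for r in CATALOG if getattr(r, field).lower() == value.lower()]
```

A call such as find_by("author", "Melville, Herman") works exactly like find_by("subject", "Evolution"); no special cases are needed because no record deviates from the shared shape.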
Most web sites, on the other hand, are highly heterogeneous in two respects.
First, web sites often provide access to documents and their components at
varying levels of granularity. A web site might present articles and journals
and journal databases side by side. Links might lead to pages, sections of
pages, or to other web sites. Second, web sites typically provide access to
documents in multiple formats. You might find financial news, product
descriptions, employee home pages, image archives, and software files.
Dynamic news content shares space with static human resources
information. Textual information shares space with video, audio, and
interactive applications. The web site is a great multimedia melting pot,
where you are challenged to reconcile the cataloging of the broad and the
detailed across many mediums.
The heterogeneous nature of web sites makes it difficult to impose highly
structured organization systems on the content. It doesn't make sense to
classify documents at varying levels of granularity side by side. An article
and a magazine should be treated differently. Similarly, it may not make
sense to handle varying formats the same way. Each format will have
uniquely important characteristics. For example, we need to know certain
things about images such as file format (GIF, TIFF, etc.) and resolution
(640x480, 1024x768, etc.). It is difficult and often misguided to attempt a
one-size-fits-all approach to the organization of heterogeneous web site
content.
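A small Python sketch may make the point concrete. It assumes two hypothetical content types, Article and Image, each with its own descriptive fields; the class names, field names, and describe function are illustrative assumptions, not a prescribed schema. The image record carries the file format and resolution mentioned above, which would be meaningless for an article.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical text item: described by its title and author."""
    title: str
    author: str


@dataclass
class Image:
    """Hypothetical image item: needs format and resolution as well."""
    title: str
    file_format: str  # for example "GIF" or "TIFF"
    width: int
    height: int


def describe(item) -> str:
    """Build a catalog line using the fields that matter for each format."""
    if isinstance(item, Image):
        return f"{item.title} ({item.file_format}, {item.width}x{item.height})"
    if isinstance(item, Article):
        return f"{item.title}, by {item.author}"
    raise TypeError(f"no description rule for {type(item).__name__}")
```

Each new format added to the site would need its own rule in describe, which is one way of seeing why a single rigid record layout rarely survives contact with heterogeneous content.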
3.1.3. Differences in Perspectives
Have you ever tried to find a file on a coworker's desktop computer?
Perhaps you had permission. Perhaps you were engaged in low-grade
corporate espionage. In any case, you needed that file. In some cases, you
may have found the file immediately. In others, you may have searched for
hours. The ways people organize and name files and directories on their
computers can be maddeningly illogical. When questioned, they will often
claim that their organization system makes perfect sense. "But it's obvious! I
put current proposals in the folder labeled /office/clients/red and old
proposals in /office/clients/blue. I don't understand why you couldn't find
them!"
The fact is that labeling and organization systems are intensely affected by
their creators' perspectives. We see this at the corporate level with web sites
organized according to internal divisions or org charts. In these web sites, we
see groupings such as marketing, sales, customer support, human resources,
and information systems. How does a customer visiting this web site know
where to go for technical information about a product they just purchased?
To design usable organization systems, we need to escape from our own
mental models of content labeling and organization.
You must put yourself into the shoes of the intended user. How do they see
the information? What types of labels would they use? This challenge is
further complicated by the fact that web sites are designed for multiple users,
and all users will have different perspectives or ways of understanding the
information. Their levels of familiarity with your company and your web
site will vary. For these reasons, it is impossible to create a perfect
organization system. One site does not fit all! However, by recognizing the
importance of perspective and striving to understand the intended audiences,
you can do a better job of organizing information for public consumption
than your coworker on his or her desktop computer.
3.1.4. Internal Politics

Politics exist in every organization. Individuals and departments constantly
position for power or respect. Because of the inherent power of information
organization in forming understanding and opinion, the process of designing
information architectures for web sites and intranets can involve a strong
undercurrent of politics. The choice of organization and labeling systems can
have a big impact on how users of the site perceive the company, its
departments, and its products. For example, should we include a link to the
library site on the main page of the corporate intranet? Should we call it The
Library or Information Services or Knowledge Management? Should
information resources provided by other departments be included in this
area? If the library gets a link on the main page, then why not corporate
communications? What about daily news?
As an information architect, you must be sensitive to your organization's
political environment. In certain cases, you must remind your colleagues to
focus on creating an architecture that works for the user. In others, you may
need to make compromises to avoid serious political conflict. Politics raise
the complexity and difficulty of creating usable information architectures.
However, if you are sensitive to the political issues at hand, you can manage
their impact upon the architecture.
Organizing Web Sites and Intranets
The organization of information in web sites and intranets is a major factor
in determining success, and yet many web development teams lack the
understanding necessary to do the job well. Our goal in this chapter is to
provide a foundation for tackling even the most challenging information
organization projects.
Organization systems are composed of organization schemes and
organization structures. An organization scheme defines the shared
characteristics of content items and influences the logical grouping of those
items. An organization structure defines the types of relationships between
content items and groups.
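The distinction can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the sample items, the group_by helper, and the nested dictionary standing in for a structure are assumptions made for the example. A scheme is modeled as a key function that picks out the shared characteristic; a structure is modeled as the relationships among the resulting groups.

```python
from collections import defaultdict

# Hypothetical content items: (title, topic, year).
ITEMS = [
    ("Annual report", "finance", 1996),
    ("Product catalog", "products", 1997),
    ("Price list", "products", 1996),
]


def group_by(items, key):
    """Apply an organization scheme: group titles by a shared characteristic."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item[0])
    return dict(groups)


# Two different schemes applied to the same content.
by_topic = group_by(ITEMS, key=lambda item: item[1])
by_year = group_by(ITEMS, key=lambda item: item[2])

# A simple organization structure: a hierarchy relating the topical groups.
STRUCTURE = {"Home": {topic: titles for topic, titles in by_topic.items()}}
```

Note that the same three items fall into different groups under each scheme, while the structure says nothing about how the groups were formed, only how they relate to one another.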
Before diving in, it's important to understand information organization in the
context of web site development. Organization is closely related to
navigation, labeling, and indexing. The hierarchical organization structures
of web sites often play the part of primary navigation system. The labels of
categories play a significant role in defining the contents of those categories.
Manual indexing is ultimately a tool for organizing content items into groups
at a very detailed level. Despite these closely knit relationships, it is both
possible and useful to isolate the design of organization systems, which will
form the foundation for navigation and labeling systems. By focusing solely
on the logical grouping of information, you avoid the distractions of
implementation details and design a better web site.
3.2.1. Organization Schemes
We navigate through organization schemes every day. Phone books,
supermarkets, and television programming guides all use organization
schemes to facilitate access. Some schemes are easy to use. We rarely have
difficulty finding a friend's phone number in the alphabetical organization
scheme of the white pages. Some schemes are intensely frustrating. Trying
to find marshmallows or popcorn in a large and unfamiliar supermarket can
drive us crazy. Are marshmallows in the snack aisle, the baking ingredients
section, both, or neither?
In fact, the organization schemes of the phone book and the supermarket are
fundamentally different. The alphabetical organization scheme of the phone
book's white pages is exact. The hybrid topical/task-oriented organization
scheme of the supermarket is ambiguous.
3.2.1.1. Exact organization schemes
Let's start with the easy ones. Exact organization schemes divide information
into well-defined and mutually exclusive sections. The alphabetical
organization of the phone book's white pages is a perfect example. If you
know the last name of the person you are looking for, navigating the scheme
is easy. Porter is in the P's, which is after the O's but before the Q's. This is
called "known-item" searching. You know what you're looking for and it's
obvious where to find it. No ambiguity is involved. The problem with exact
organization schemes is that they require the user to know the specific name
of the resource they are looking for. The white pages don't work very well if
you're looking for a plumber.
Exact organization schemes are relatively easy to design and maintain
because there is little intellectual work involved in assigning items to
categories. They are also easy to use. The following sections explore three
frequently used exact organization schemes.
3.2.1.1.1. Alphabetical
An alphabetical organization scheme is the primary organization scheme for
encyclopedias and dictionaries. Almost all nonfiction books, including this
one, provide an alphabetical index. Phone books, department store
directories, bookstores, and libraries all make use of our 26-letter alphabet
for organizing their contents. Alphabetical organization often serves as an
umbrella for other organization schemes. We see information organized
alphabetically by last name, by product or service, by department, and by
format. See Figure 3-1 for an example.
Figure 3-1. An alphabetical index supports both rapid scanning for a
known item and more casual browsing of a directory.
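As a rough illustration of how little intellectual work an alphabetical scheme demands, here is a Python sketch. The names and the two helper functions, build_index and contains, are hypothetical and invented for this example. The first groups entries under their initial letter for browsing; the second performs a known-item lookup by binary search.

```python
import bisect
from itertools import groupby

# Hypothetical directory entries, invented for illustration.
NAMES = ["Porter", "Olsen", "Quinn", "Parker", "Adams"]


def build_index(names):
    """Group sorted names under their initial letter, like the white pages."""
    ordered = sorted(names, key=str.casefold)
    return {
        letter: list(group)
        for letter, group in groupby(ordered, key=lambda n: n[0].upper())
    }


def contains(sorted_names, name):
    """Known-item lookup: binary search over an already sorted list."""
    i = bisect.bisect_left(sorted_names, name)
    return i < len(sorted_names) and sorted_names[i] == name
```

Assigning an item to its section requires nothing more than reading its first letter, and the lookup succeeds only when the user supplies the exact name. Searching for a plumber by trade in this index finds nothing, which is the limitation of exact schemes noted earlier.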
3.2.1.1.2. Chronological
Certain types of information lend themselves to chronological organization.
For example, an archive of press releases might be organized by the date of
release (see Figure 3-2). History books, magazine archives, diaries, and
television guides are organized chronologically. As long as there is
agreement on when a particular event occurred, chronological schemes are
easy to design and use.
Figure 3-2. Press release archives are obvious candidates for
chronological organization schemes. The date of announcement
provides important context for the release. However, keep in
mind that users may also want to browse the releases by title or
search by keyword. A complementary combination of
organization schemes is often necessary.
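The idea of complementary schemes over one collection can be sketched in a few lines of Python. The press releases below are made-up sample data, and the field names are assumptions chosen for this example.

```python
from datetime import date

# Hypothetical press releases; titles and dates are invented.
releases = [
    {"title": "New Widget Announced", "released": date(1997, 3, 14)},
    {"title": "Annual Results", "released": date(1997, 1, 30)},
    {"title": "Merger Completed", "released": date(1996, 11, 5)},
]

# Chronological scheme: newest release first.
by_date = sorted(releases, key=lambda r: r["released"], reverse=True)

# Alphabetical scheme over the very same records.
by_title = sorted(releases, key=lambda r: r["title"].lower())
```

Both listings contain the same three releases. Only the ordering key changes, which is why offering a second exact scheme is usually cheap.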
3.2.1.1.3. Geographical
Place is often an important characteristic of information. We travel from one
place to another. We care about the news and weather that affects us in our
location. Political, social, and economic issues are frequently location-
dependent. With the exception of border disputes, geographical organization
schemes are fairly straightforward to design and use. Figure 3-3 shows an
example of a geographic organization scheme.
Figure 3-3. In this example, the map presents a graphical view of the
geographic organization scheme. Users can select a location from
the map using their mouse.
3.2.1.2. Ambiguous organization schemes
Now for the tough ones. Ambiguous organization schemes divide
information into categories that defy exact definition. They are mired in the
ambiguity of language and organization, not to mention human subjectivity.
They are difficult to design and maintain. They can be difficult to use.
Remember the tomato? Do we put it under fruit, berry, or vegetable?
However, they are often more important and useful than exact organization
schemes. Consider the typical library catalog. There are three primary
organization schemes. You can search for books by author, by title, or by
subject. The author and title organization schemes are exact and thereby
easier to create, maintain, and use. However, extensive research shows that
library patrons use ambiguous subject-based schemes such as the Dewey
Decimal and Library of Congress Classification Systems much more
frequently.
There's a simple reason why people find ambiguous organization schemes so
useful: We don't always know what we're looking for. In some cases, you
simply don't know the correct label. In others, you may only have a vague
information need that you can't quite articulate. For these reasons,
information seeking is often iterative and interactive. What you find at the
beginning of your search may influence what you look for and find later in
your search. This information seeking process can involve a wonderful
element of associative learning. Seek and ye shall find, but if the system is
well-designed, you also might learn along the way. This is web surfing at its
best.
Ambiguous organization supports this serendipitous mode of information
seeking by grouping items in intellectually meaningful ways. In an
alphabetical scheme, closely grouped items may have nothing in common
beyond the fact that their names begin with the same letter. In an ambiguous
organization scheme, someone other than the user has made an intellectual
decision to group items together. This grouping of related items supports an
associative learning process that may enable the user to make new
connections and reach better conclusions. While ambiguous organization
schemes require more work and introduce a messy element of subjectivity,
they often prove more valuable to the user than exact schemes.
The success of ambiguous organization schemes depends on the initial
design of a classification system and the ongoing indexing of content items.
The classification system serves as a structured container for content items.
It is composed of a hierarchy of categories and subcategories with scope
notes that define the types of content to be included under each category.
Once this classification system has been created, content items must be
assigned to categories accurately and consistently. This is a painstaking
process that only a librarian could love. Let's review a few of the most
common and valuable ambiguous organization schemes.
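The classification system described above, a hierarchy of categories with scope notes, could be modeled along the following lines. This is only a sketch: the class name, the category names, and the wording of the scope notes are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    """One node in the classification hierarchy."""
    name: str
    scope_note: str  # defines what content belongs under this category
    children: List["Category"] = field(default_factory=list)
    items: List[str] = field(default_factory=list)

    def find(self, name: str) -> Optional["Category"]:
        """Depth-first search for a category by name."""
        if self.name == name:
            return self
        for child in self.children:
            found = child.find(name)
            if found is not None:
                return found
        return None


# Hypothetical scheme; the scope notes settle where the tomato goes.
produce = Category(
    "Produce",
    "Fresh fruits and vegetables",
    children=[
        Category("Fruit", "Sweet produce usually eaten raw"),
        Category("Vegetables", "Savory produce, including tomatoes"),
    ],
)
produce.find("Vegetables").items.append("tomato")
```

The scope note carries the intellectual decision. Whoever indexes the next tomato can read it and file the item consistently.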
3.2.1.2.1. Topical
Organizing information by subject or topic is one of the most challenging
yet useful approaches. Phone book yellow pages are organized topically.
That's why they're the right place to look when you need a plumber.
Academic courses and departments, newspapers, and the chapters of most
nonfiction books are all organized along topical lines.
While few web sites should be organized solely by topic, most should
provide some sort of topical access to content. In designing a topical
organization scheme, it is important to define the breadth of coverage. Some
schemes, such as those found in an encyclopedia, cover the entire breadth of
human knowledge (see Figure 3-4 for an example). Others, such as those
more commonly found in corporate web sites, are limited in breadth,
covering only those topics directly related to that company's products and
services. In designing a topical organization scheme, keep in mind that you
are defining the universe of content (both present and future) that users will
expect to find within that area of the web site.
3.2.1.2.2. Task-oriented
Task-oriented schemes organize content and applications into a collection of
processes, functions, or tasks. These schemes are appropriate when it's
possible to anticipate a limited number of high-priority tasks that users will
want to perform. Desktop software applications such as word processors and
spreadsheets provide familiar examples. Collections of individual actions are
organized under task-oriented menus such as Edit, Insert, and Format.
On today's Web, task-oriented organization schemes are less common, since
most web sites are content rather than application intensive. This should
change as sites become increasingly functional. Intranets and extranets lend
themselves well to a task orientation, since they tend to integrate powerful
applications as well as content. Figure 3-5 shows an example of a task-
oriented site.
Figure 3-5. In this example, General Motors anticipates some of the
most important needs of users by presenting a task-based menu
of action items. This approach enables GM to quickly funnel a
diverse user base into specific action-oriented areas of the web
site.
3.2.1.2.3. Audience-specific
In cases where there are two or more clearly definable audiences for a web
site or intranet, an audience-specific organization scheme may make sense.
This type of scheme works best when the site is frequented by repeat visitors
who can bookmark their particular section of the site. Also, it works well if
there is value in customizing the content for each audience. Audience-
oriented schemes break a site into smaller, audience-specific mini-sites,
thereby allowing for clutter-free pages that present only the options of
interest to that particular audience. See Figure 3-6 for an example.
Figure 3-6. This area of the SIGGRAPH 97 conference web site is
designed to meet the unique needs of media professionals
covering the conference. Other SIGGRAPH audiences with
special needs include contributors and exhibitors.
Audience-specific schemes can be open or closed. An open scheme will
allow members of one audience to access the content intended for other
audiences. A closed scheme will prevent members from moving between
audience-specific sections. A closed scheme may be appropriate if
subscription fees or security issues are involved.
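The difference between open and closed schemes comes down to a single access rule, sketched below in Python. The function name and the audience labels are hypothetical; a real site would tie this rule to its own login or subscription mechanism.

```python
def can_access(visitor_audience, section_audience, scheme="open"):
    """Decide whether a visitor may enter an audience-specific section."""
    if scheme == "open":
        # Open: anyone may wander into any audience's section.
        return True
    if scheme == "closed":
        # Closed: visitors stay within their own audience's section.
        return visitor_audience == section_audience
    raise ValueError(f"unknown scheme: {scheme!r}")
```

For example, `can_access("media", "exhibitors")` is true under the default open scheme, while the same call with `scheme="closed"` is false.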
3.2.1.2.4. Metaphor-driven
Metaphors are commonly used to help users understand the new by relating
it to the familiar. You need not look further than your desktop computer with
its folders, files, and trash can or recycle bin for an example. Applied to an
interface in this way, metaphors can help users understand content and
function intuitively. In addition, the process of exploring possible metaphor-
driven organization schemes can generate new and exciting ideas about the
design, organization, and function of the web site (see "Metaphor
Exploration" in Chapter 8, "Conceptual Design").
While metaphor exploration can be very useful while brainstorming, you
should use caution when considering a metaphor-driven global organization
scheme. First, metaphors, if they are to succeed, must be familiar to users.
Organizing the web site of a computer hardware vendor according to the
internal architecture of a computer will not help users who don't understand
the layout of a motherboard.
Second, metaphors can introduce unwanted baggage or be limiting. For
example, users might expect a virtual library to be staffed by a librarian who
will answer reference questions. Most virtual libraries do not provide this
service. Additionally, you may wish to provide services in your virtual
library that have no clear counterpart in the real world. Creating your own
customized version of the library is one such example. This will force you to
break out of the metaphor, introducing inconsistency into your organization
scheme.
Figure 3-7 shows a more offbeat metaphor example.

Figure 3-7. In this offbeat example, Bianca has organized the contents of
her web site according to the metaphor of a physical shack with
rooms. While this metaphor-driven approach is fun and conveys
a sense of place, it is not particularly intuitive. Can you guess
what you'll find in the pantry? Also, note that features such as
Find Your Friend don't fit neatly into the metaphor.
3.2.1.3. Hybrid schemes
The power of a pure organization scheme derives from its ability to suggest
a simple mental model for users to quickly understand. Users easily
recognize an audience-specific or topical organization. However, when you
start blending elements of multiple schemes, confusion is almost guaranteed.
Consider the example of a hybrid scheme in Figure 3-8. This hybrid scheme
includes elements of audience-specific, topical, metaphor-based, and task-
oriented organization schemes. Because they are all mixed together, we can't
form a mental model. Instead, we need to skim through each menu item to
find the option we're looking for.
Figure 3-8. A hybrid organization scheme
Examples of hybrid schemes are common on the Web. This happens because
it is often difficult to agree upon any one scheme to present on the main
page, so people throw the elements of multiple schemes together in a
confusing mix. There is a better alternative. In cases where multiple schemes
must be presented on one page, you should communicate to designers the
importance of retaining the integrity of each scheme. As long as the schemes
are presented separately on the page, they will retain the powerful ability to
suggest a mental model for users (see Figure 3-9 for an example).
Figure 3-9. Notice that the audience-oriented scheme (contributors,
exhibitors, media) has been presented as a pure organization
scheme, separate from the others on this page. This approach
allows you to present multiple organization schemes on the same
page without causing confusion.
3.2.2. Organization Structures
Organization structure plays an intangible yet very important role in the
design of web sites. While we interact with organization structures every
day, we rarely think about them. Movies are linear in their physical structure.
We experience them frame by frame from beginning to end. However, the
plots themselves may be non-linear, employing flashbacks and parallel
subplots. Maps have a spatial structure. Items are placed according to
physical proximity, although the most useful maps cheat, sacrificing
accuracy for clarity.
The structure of information defines the primary ways in which users can
navigate. Major organization structures that apply to web site and intranet
architectures include the hierarchy, the database-oriented model, and
hypertext. Each organization structure possesses unique strengths and
weaknesses. In some cases, it makes sense to use one or the other. In many
cases, it makes sense to use all three in a complementary manner.
3.2.2.1. The hierarchy: A top-down approach
The foundation of almost all good information architectures is a well-designed
hierarchy. In this hypertextual world of nets and webs, such a
statement may seem blasphemous, but it's true. The mutually exclusive
subdivisions and parent-child relationships of hierarchies are simple and
familiar. We have organized information into hierarchies since the beginning
of time. Family trees are hierarchical. Our division of life on earth into
kingdoms and classes and species is hierarchical. Organization charts are
usually hierarchical. We divide books into chapters into sections into
paragraphs into sentences into words into letters. Hierarchy is ubiquitous in
our lives and informs our understanding of the world in a profound and
meaningful way. Because of this pervasiveness of hierarchy, users can easily
and quickly understand web sites that use hierarchical organization models.
They are able to develop a mental model of the site's structure and their
location within that structure. This provides context that helps users feel
comfortable. See Figure 3-10 for an example of a simple hierarchical model.
Figure 3-10. A simple hierarchical organization model.
Because hierarchies provide a simple and familiar way to organize
information, they are usually a good place to start the information
architecture process. The top-down approach allows you to quickly get a
handle on the scope of the web site without going through an extensive
content inventory process. You can begin identifying the major content areas
and exploring possible organization schemes that will provide access to that
content.
3.2.2.2. Designing hierarchies
When designing information hierarchies on the Web, you should remember a
few rules of thumb. First, you should be aware of, but not bound by, the idea
that hierarchical categories should be mutually exclusive. Within a single
organization scheme, you will need to balance the tension between
exclusivity and inclusivity. Ambiguous organization schemes in particular
make it challenging to divide content into mutually exclusive categories. Do
tomatoes belong in the fruit or vegetable or berry category? In many cases,
you might place the more ambiguous items into two or more categories, so
that users are sure to find them. However, if too many items are cross-listed,
the hierarchy loses its value. This tension between exclusivity and
inclusivity does not exist across different organization schemes. You would
expect a listing of products organized by format to include the same items as
a companion listing of products organized by topic. Topic and format are
simply two different ways of looking at the same information.
Second, it is important to consider the balance between breadth and depth in
your information hierarchy. Breadth refers to the number of options at each
level of the hierarchy. Depth refers to the number of levels in the hierarchy.
If a hierarchy is too narrow and deep, users have to click through an
inordinate number of levels to find what they are looking for (see
Figure 3-11). If a hierarchy is too broad and shallow, users are faced with too many
options on the main menu and are unpleasantly surprised by the lack of
content once they select an option.
Figure 3-11. In the narrow and deep hierarchy, users are faced with six
clicks to reach the deepest content. In the broad and shallow
hierarchy, users must choose from ten options to reach a limited
amount of content.
In considering breadth, you should be sensitive to the cognitive limits of the
human mind. Particularly with ambiguous organization schemes, try to
follow the seven plus-or-minus two rule.[4] Web sites with more than ten
options on the main menu can overwhelm users.
[4]G. Miller, "The Magical Number Seven, Plus or Minus Two: Some Limits
on our Capacity for Processing Information," Psychological Review 63, no.
2 (1956): 81-97.
In considering depth, you should be even more conservative. If users are
forced to click through more than four or five levels, they may simply give
up and leave your web site. At the very least, they'll become frustrated.
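These rules of thumb can be checked mechanically against a draft hierarchy. The sketch below assumes the hierarchy is written as nested Python dictionaries, with each key a menu option and each value its submenu. The limits of ten options and five levels are one reading of the guidance above, not fixed constants, and the sample site is invented.

```python
def depth(tree):
    """Count the levels of menus below this point (an empty tree has none)."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())


def max_breadth(tree):
    """Return the largest number of options offered on any single menu."""
    if not tree:
        return 0
    return max(len(tree), *(max_breadth(children) for children in tree.values()))


def check_hierarchy(tree, max_options=10, max_levels=5):
    """List rule-of-thumb warnings; both limits are assumptions to adjust."""
    warnings = []
    if max_breadth(tree) > max_options:
        warnings.append("too broad: some menu has more than %d options" % max_options)
    if depth(tree) > max_levels:
        warnings.append("too deep: more than %d levels" % max_levels)
    return warnings


# Hypothetical draft site: three top-level options, two levels deep.
site = {
    "Products": {"Hardware": {}, "Software": {}},
    "Support": {"FAQ": {}},
    "About": {},
}
```

Running `check_hierarchy(site)` on this small draft returns an empty list. A chain of six nested single-option menus would trigger the depth warning.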
For new web sites and intranets that are expected to grow, you should lean
towards a broad and shallow rather than narrow and deep hierarchy. This
approach allows for the addition of content without major restructuring. It is
less problematic to add items to secondary levels of the hierarchy than to the
main page, for a couple of reasons. First, the main page serves as the most
prominent and important navigation interface for users. Changes to this page
can really hurt the mental model they have formed of the web site over time.
Second, because of its prominence and importance, companies tend to spend
lots of care (and money) on the graphic design and layout of the main page.
Changes to the main page can be more time consuming and expensive than
changes to secondary pages.
Finally, when designing organization structures, you should not become
trapped by the hierarchical model. Certain content areas will invite a
database or hypertext-based approach. The hierarchy is a good place to
begin, but is only one component in a cohesive organization system.
3.2.2.3. Hypertext
Hypertext is a relatively new and highly nonlinear way of structuring
information. A hypertext system involves two primary types of components:
the items or chunks of information which are to be linked, and the links
between those chunks. These components can form hypermedia systems that
connect text, data, image, video, and audio chunks. Hypertext chunks can be
connected hierarchically, non-hierarchically, or both (see Figure 3-12).
Figure 3-12. In hypertext systems, content chunks are connected via
links in a loose web of relationships.
Although this organization structure provides you with great flexibility, it
presents substantial potential for complexity and user confusion. As users
navigate through highly hypertextual web sites, it is easy for them to get
lost. It's as if they are thrown into a forest and are bouncing from tree to tree,
trying to understand the lay of the land. They simply can't create a mental
model of the site organization. Without context, users can quickly become
overwhelmed and frustrated. In addition, hypertextual links are often
personal in nature. The relationships that one person sees between content
items may not be apparent to others.
For these reasons, hypertext is rarely a good candidate for the primary
organization structure. Rather, hypertext can be used to complement
structures based upon the hierarchical or database models.
Hypertext allows for useful and creative relationships between items and
areas in the hierarchy. It usually makes sense to first design the information
hierarchy and then to identify ways in which hypertext can complement the
hierarchy.
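One way to picture hypertext complementing a hierarchy is to treat the site as a set of chunks joined by links and to count the clicks between them. In this sketch the page names and links are invented, and the hierarchy links run only from parent to child.

```python
from collections import deque


def clicks_from(start, links):
    """Breadth-first search: fewest clicks from start to each reachable chunk."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


def merge_links(first, second):
    """Combine two link maps without modifying either one."""
    merged = {page: list(targets) for page, targets in first.items()}
    for page, targets in second.items():
        merged.setdefault(page, []).extend(targets)
    return merged


# Hypothetical site: parent-to-child links from the hierarchy...
hierarchy_links = {
    "home": ["products", "press"],
    "products": ["widget"],
    "press": ["release-1"],
}
# ...plus one associative hypertext link from a press release to a product.
cross_links = {"release-1": ["widget"]}
all_links = merge_links(hierarchy_links, cross_links)
```

With the hierarchy alone, a reader of the press release has no path to the product page. The single cross link puts it one click away, without changing the hierarchy that gives users their mental model.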
3.2.2.4. The relational database model: A bottom-up approach
Most of us are familiar with databases. In fact, our names, addresses, and
other personal information are included in more databases than we care to
imagine. A database is a collection of records. Each record has a number of
associated fields. For example, a customer database may have one record per
customer. Each record may include fields such as customer name, street
address, city, state, ZIP code, and phone number. The database enables users
to search for a particular customer or to search for all customers with a specific
ZIP code. This powerful field-specific searching is a major advantage of the
database model. Additionally, content management is substantially easier
with a database than without. Databases can be designed to support time-
saving features such as global search and replace and data validation. They
can also facilitate distributed content management, employing security
measures and version control systems that allow many people to modify
content without stepping on each other's toes.
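Field-specific searching is easy to demonstrate with Python's built-in sqlite3 module. The table layout follows the customer fields listed above. The column names and the two sample customers are made up for this sketch.

```python
import sqlite3

# An in-memory database with one record per customer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(name TEXT, street TEXT, city TEXT, state TEXT, zip TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?, ?)",
    [
        # Hypothetical sample records.
        ("Ann Example", "1 Main St", "Ann Arbor", "MI", "48104", "555-0100"),
        ("Bob Sample", "2 Oak Ave", "Ann Arbor", "MI", "48104", "555-0101"),
        ("Cy Placeholder", "3 Elm Rd", "Detroit", "MI", "48201", "555-0102"),
    ],
)


def customers_in_zip(connection, zip_code):
    """Return the names of all customers whose ZIP code field matches."""
    rows = connection.execute(
        "SELECT name FROM customers WHERE zip = ? ORDER BY name", (zip_code,)
    )
    return [name for (name,) in rows]
```

Because every record has the same fields, the same pattern works for a search on city, state, or any other column.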
Finally, databases enable you to repurpose the same content in multiple
forms and formats for different audiences. For example, an audience-
oriented approach might benefit from a context-sensitive navigation scheme
in which each audience has unique navigation options (such as returning to
the main page of that audience area). Without a database, you might need to
create a separate version of each HTML page that has content shared across
multiple audiences. This is a production and maintenance nightmare! In
another scenario, you might want to publish the same content to your web
site, to a printed brochure, and to a CD-ROM. The database approach
supports this flexibility.
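Repurposing amounts to storing the content once and rendering it several ways. The following sketch assumes a single product record with invented field names, and shows it formatted for a web page and for a plain-text brochure.

```python
from html import escape

# One hypothetical content record, stored once.
product = {"name": "Widget <Pro>", "summary": "A sturdy widget."}


def as_html(record):
    """Render the record for the web site, escaping HTML special characters."""
    return "<h1>%s</h1>\n<p>%s</p>" % (escape(record["name"]), escape(record["summary"]))


def as_brochure_text(record):
    """Render the same record as plain text for print."""
    return "%s\n%s" % (record["name"].upper(), record["summary"])
```

A correction to the summary is made in one place and appears in every format the next time the pages are generated.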
However, the database model has limitations. The records must follow rigid
rules. Within a particular record type, each record must have the same fields,
and within each field, the formatting rules must be applied consistently
across records. This highly structured approach does not work well with the
heterogeneous content of many web sites. Also, technically it's not easy to
place the entire contents (including text, graphics, and hypertext links) of
every HTML page into a database. Such an approach can be very expensive
and time consuming.
For these reasons, the database model is best applied to subsites or
collections of structured, homogeneous information within a broader web
site. For example, staff directories, news release archives, and product
catalogs are excellent candidates for the database model.
3.2.2.5. Designing databases
Typically, the top-down process of hierarchy design will uncover content
areas that lend themselves to a database-driven solution. At this point, you
will do well to involve a programmer, who can help not only with the
database implementation but with the nitty-gritty data modeling issues as
well (see Figure 3-13).
Figure 3-13. This entity relationship diagram (ERD) shows a structured
approach to database design. We see that entities (e.g., Resource)
have attributes (e.g., Name, URL). Ultimately, entities and
attributes become records and fields in the database. An ERD
also shows relationships between entities. For example, we see
that each resource is available at one or more locations. The
ERD is used to visualize and refine the data model, before design
and population of the database. (This entity relationship
diagram courtesy of InterConnect of Ann Arbor, a technical
consulting and development firm.)
Within each of the content areas identified as candidates for a database-
driven solution, you will need to begin a bottom-up approach aimed at
identifying the content and structure of individual record types.
For example, a staff directory may have one record for each staff member.
You will need to identify what information will be made available for each
individual. Some fields such as name and office phone number may be
required. Others such as email address and home phone number may be
optional. You may decide to include an expertise field that includes
keywords to describe the skills of that individual. For fields such as this, you
will need to determine whether or not to define a controlled vocabulary.
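The staff directory record described above might be written down as follows. This is a sketch using a Python dataclass. The field names mirror the examples in the text, while the class name and the sample person are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StaffRecord:
    # Required fields: a record cannot be created without them.
    name: str
    office_phone: str
    # Optional fields: left empty when the information is not supplied.
    email: Optional[str] = None
    home_phone: Optional[str] = None
    # Free-text expertise keywords, not yet tied to any fixed list of terms.
    expertise: List[str] = field(default_factory=list)


# Hypothetical staff member.
ann = StaffRecord("Ann Example", "555-0100", expertise=["indexing"])
```

Writing the record type out this way forces the required-versus-optional decision for every field before any data is entered.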
A controlled vocabulary specifies the acceptable terms for use in a particular
field. It may also employ scope notes that define each term.
For example, the table below lists the controlled vocabulary for keywords in
the ecology area of the Argus Clearinghouse web site (see
http://www.clearinghouse.net/). The scope notes explain that ecology is "the
branch of biology dealing with the relation of living things to their
environments." (See Figure 5-2 for an example of scope notes in action.) This
information is useful for the staff who index resources and the users who navigate the
web site.
Controlled Vocabulary
Argus Clearinghouse: Environment: Ecology
biodiversity               coastal zone management
conservation               ecology (general)
environment                environmental health
environmental resources    environmental science
environmental studies      land use
reef conservation          roadkill
water resources            wetlands conservation
wildlife                   wildlife management
wildlife rehabilitation
Use of a controlled vocabulary imposes an important degree of consistency
that supports searching and browsing. Once users understand the controlled
vocabulary, they know that a search on biodiversity should retrieve all
relevant documents. They do not also need to try biological diversity. In
addition, this consistency allows you to automatically generate browsable
indexes. This is a great feature for users, is not very difficult to implement,
and is extremely efficient from a site maintenance perspective (see
Figure 3-14).
Figure 3-14. You can leverage a controlled vocabulary to automatically
generate browsable indexes. In this example, after selecting
Environmental Health from a menu of acceptable terms in the
Ecology category, the user is presented with a list of relevant
resources. These resources have been manually indexed
according to the controlled vocabulary.
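Generating a browsable index from indexed records takes only a few lines. The sketch below uses three terms from the ecology vocabulary. The resource titles are invented, and the function is an illustration, not the Argus Clearinghouse's actual software.

```python
# A small subset of the controlled vocabulary.
VOCABULARY = {"biodiversity", "environmental health", "wetlands conservation"}

# Hypothetical resources, each manually indexed with vocabulary terms.
resources = {
    "Guide to Marsh Habitats": ["wetlands conservation", "biodiversity"],
    "Clean Air Handbook": ["environmental health"],
    "Species Counts Online": ["biodiversity"],
}


def build_browse_index(indexed_resources, vocabulary):
    """Map every vocabulary term to a sorted list of resource titles."""
    index = {term: [] for term in sorted(vocabulary)}
    for title, terms in indexed_resources.items():
        for term in terms:
            if term not in vocabulary:
                # Reject terms outside the controlled vocabulary.
                raise ValueError("%r is not an accepted term" % term)
            index[term].append(title)
    for titles in index.values():
        titles.sort()
    return index


browse_index = build_browse_index(resources, VOCABULARY)
```

Rejecting unknown terms is what keeps the index trustworthy: a resource tagged "biological diversity" would be caught at entry rather than silently lost from the biodiversity page.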
However, creating and maintaining a controlled vocabulary is not a simple
task. In many cases, complementing a simple controlled vocabulary that
divides the items into broad categories with an uncontrolled keyword field
provides a good balance of structure and flexibility. (For more on creating
controlled vocabularies, see Section 5.4.1.3, "Controlled vocabularies and
thesauri" in Chapter 5, "Labeling Systems".)
Once you've constructed the record types and associated controlled
vocabularies, you can begin thinking about how users should be able to
navigate this information. One of the major advantages of a database-driven
approach is the power and flexibility it affords for the design of searching
and browsing systems (see Figure 3-15). Every field presents an additional
way to browse or search the directory of records.
Figure 3-15. A database of organizational resources brings power and
flexibility to the Henry Ford Health System web site. Users can
browse by organizational resource or keyword, or perform a
search against the collection of records. The browsing indexes
and the records themselves are generated from the database.
Site-wide changes can be made at the press of a button. This
flexibility is made possible by a database-driven approach to
content organization and management.
The database-driven approach also brings greater efficiency and accuracy to
data entry and content management. You can create administrative interfaces
that eliminate worry about HTML tags and ensure standard formatting
across records through the use of templates. You can integrate tools that
perform syntax and link checking. Of course, the search and browse indexes
can be rebuilt automatically after each addition, deletion, or modification.
Content databases can be implemented in a variety of ways. The database
management software can be configured to produce static HTML pages in
batch mode or to generate dynamic HTML pages on-the-fly as users
navigate the site. These implementation decisions will be influenced by
technical performance issues (e.g., bandwidth and CPU constraints) and
have little impact upon the architecture.
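The reason this choice has little impact on the architecture is that both modes can share one template. The sketch below assumes a dictionary standing in for the content database. The path, the record, and the function names are all hypothetical.

```python
from html import escape

# Stand-in for the content database: one record per page path.
records = {
    "staff/ann-example": {"name": "Ann Example", "phone": "555-0100"},
}


def render(record):
    """The single template shared by both publishing modes."""
    return "<h1>%s</h1><p>Phone: %s</p>" % (escape(record["name"]), escape(record["phone"]))


def build_static_pages(all_records):
    """Batch mode: produce every page up front, keyed by file name."""
    return {path + ".html": render(record) for path, record in all_records.items()}


def serve_dynamic(path, all_records):
    """On-the-fly mode: produce one page at the moment it is requested."""
    record = all_records.get(path)
    return render(record) if record is not None else None
```

Either route yields identical HTML for a given record, so the decision can be made, and later reversed, on performance grounds alone.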
Creating Cohesive Organization Systems
As you've seen in this chapter, organization systems are fairly complex. You
need to consider a variety of exact and ambiguous organization schemes.
Should you organize by topic, by task, or by audience? How about a
chronological or geographical scheme? What about using multiple
organization schemes?
You also need to think about the organization structures that influence how
users can navigate through these schemes. Should you use a hierarchy or
would a more structured database model work best? Perhaps a loose
hypertextual web would allow the most flexibility? Taken together, in the
context of a large web site development project, these questions can be
overwhelming. That's why it's important to break down the site into its
components, so you can tackle one question at a time. Also, keep in mind
that all information retrieval systems work best when applied to narrow
domains of homogeneous content. By decomposing the content collection
into these narrow domains, you can identify opportunities for highly
effective organization systems.
However, it's also important not to lose sight of the big picture. As with
cooking, you need to mix the right ingredients in the right way to get the
desired results. Just because you like mushrooms and pancakes doesn't mean
they will go well together. The recipe for cohesive organization systems
varies from site to site. However, there are a few guidelines to keep in mind.
In considering which organization schemes to use, remember the distinction
between exact and ambiguous schemes. Exact schemes are best for known-
item searching, when users know precisely what they are looking for.
Ambiguous schemes are best for browsing and associative learning, when
users have a vaguely defined information need. Whenever possible, use both
types of schemes. Also, be aware of the challenges of organizing information
on the Web. Language is ambiguous, content is heterogeneous, people have
different perspectives, and politics can rear its ugly head. Providing multiple
ways to access the same information can help to deal with all of these
challenges.
When thinking about which organization structures to use, keep in mind that
large web sites and intranets typically require all three types of structure.
The top-level, umbrella architecture for the site will almost certainly be
hierarchical. As you are designing this hierarchy, keep a lookout for
collections of structured, homogeneous information. These potential subsites
are excellent candidates for the database model. Finally, remember that less
structured, creative relationships between content items can be handled
through hypertext. In this way, all three organization structures together can
create a cohesive organization system.
Designing Navigation Systems
As our fairy tales suggest, getting lost is often a bad thing. It is associated
with confusion, frustration, anger, and fear. In response to this danger, we
have developed navigation tools to prevent people from getting lost. From
bread crumbs to compass and astrolabe to maps, street signs, and global
positioning systems, people have demonstrated great ingenuity in the design
and use of navigation tools.
We use them to chart our course, to determine our position, and to find our
way back. They provide a sense of context and comfort as we explore new
places. Anyone who has driven through an unfamiliar city as darkness falls
understands the important role that navigation tools play in our lives.
On the Web, navigation is rarely a life or death issue. However, getting lost
in a large web site can be confusing and frustrating. While a well-designed
hierarchical organization scheme will reduce the likelihood that users will
become lost, a complementary navigation system is often needed to provide
context and to allow for greater flexibility of movement within the site.
Navigation systems can be designed to support associative learning by
featuring resources that are related to the content currently being displayed.
For example, a page that describes a product may include see also links to
related products and services (this type of navigation can also support a
company's marketing goals). As users move through a well-designed
navigation system, they learn about products, services, or topics associated
with the specific content they set out to find.
Any page on a web site may have numerous opportunities for interesting see
also connections to other areas of the site. The constant challenge in
navigation system design is to balance this flexibility of movement with the
danger of overwhelming the user with too many options.
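One simple way to strike this balance is to rank candidate see also links by relevance and cap how many are shown. The sketch below assumes each page has been indexed with a set of keywords. The page names, the keywords, and the default limit of three are illustrative choices.

```python
def see_also(page, keywords_by_page, limit=3):
    """Suggest up to `limit` other pages, most shared keywords first."""
    own = keywords_by_page[page]
    scored = [
        (len(own & keywords), other)
        for other, keywords in keywords_by_page.items()
        if other != page and own & keywords
    ]
    # Most overlap first; ties are broken alphabetically by page name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:limit]]


# Hypothetical pages and their keywords.
pages = {
    "widget": {"hardware", "tools"},
    "gadget": {"hardware", "tools", "gifts"},
    "manual": {"tools"},
    "careers": {"jobs"},
}
```

Here `see_also("widget", pages)` suggests the gadget page ahead of the manual and leaves out the unrelated careers page. Lowering `limit` trims the list further when a page is already crowded.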
Navigation systems are composed of a variety of elements. Some, such as
graphical navigation bars and pop-up menus, are implemented on the
content-bearing pages themselves. Others, such as tables of contents and site
maps, provide remote access to content within the organization structure.
While these elements may be implemented on each page, together they make
up a navigation system that has important site-wide implications. A well-
designed navigation system is a critical factor in determining the success of
your web site.
4.1. Browser Navigation Features
When designing a navigation system, it is important to consider the
environment the system will exist in. On the Web, people use web browsers
such as Netscape Navigator and Microsoft Internet Explorer to move around
and view web sites. These browsers sport many built-in navigation features.
Open URL allows direct access to any page on a web site. Back and
Forward provide a bidirectional backtracking capability. The History menu
allows random access to pages visited during the current session, and
Bookmark enables users to save the location of specific pages for future
reference. Web browsers also go beyond the Back button to support a "bread
crumbs" feature by color-coding hypertext links. By default, unvisited
hypertext links are one color and visited hypertext links are another. This
feature helps users understand where they have and haven't been and can
help them to retrace their steps through a web site.
Finally, web browsers allow for a prospective view that can influence how
users navigate. As the user passes the cursor over a hypertext link, the
destination URL appears at the bottom of the browser window, ideally
hinting about the nature of that content (see Figure 4-1). If files and
directories have been carefully labeled, prospective view gives the user
context within the content hierarchy. If the hypertext link leads to another
web site on another server, prospective view provides the user with basic
information about this off-site destination.
Figure 4-1. In this example, the cursor is positioned over the Investor
Info button. The prospective view window at the bottom shows
the URL of the Investor Info page.
Much research, analysis, and testing has been invested in the design of these
browser-based navigation features. However, it is remarkable how
frequently site designers unwittingly override or corrupt these navigation
features. For example, designers often modify the unvisited and visited link
colors with no consideration for the bread crumbs feature. They focus on
aesthetics, attempting to match link colors with logo colors. It's common to
see a complete reversal of the blue and purple standard. This is a classic
sacrifice of usability[5] for aesthetics and betrays a lack of consideration for
the user and the environment. It's like putting up a green stop sign at a road
intersection because it matches the color of a nearby building.
[5]Analysis of a usability test that explored the impact of graphic design on
users' ability to find information led to the following conclusion: "Of all the
graphic design elements we looked at, the only one that is strongly tied to
user success was the use of browser-default link color....Our theory is that
use of the default colors is helpful because users don't have to relearn every
time they go to a new site." Jared Spool et al., Web Site Usability (Andover,
MA: User Interface Engineering, 1997).
Given proper understanding of the aesthetic and usability issues, you can in
fact modify the link colors and create an intelligent balance.[6]
Unfortunately, this convention has been violated so frequently, the standard
may no longer be standard.
[6]For an example, see Michigan Comnet at http://comnet.org/. The link
colors have been modified slightly to match the logo colors, but the
blue:purple/unvisited:visited link standard is maintained.
A second common example of inadvertently disabling valuable browser
navigation features involves prospective view. Image maps have become a
ubiquitous navigation feature on web sites. The graphic navigation bar
allows the aesthetically pleasing presentation of navigation options.
Unfortunately, server-side image maps completely disable the prospective
view feature of web browsers. Instead of the destination URL preview, the
XY coordinates of the image map are presented. This information is
distracting, not useful. Again, a solution that balances aesthetics and
usability is available. Through an elegant use of tables (or by using client-
side image maps), you can present a graphical navigation bar that leverages
the browser-based prospective view feature.
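To make the client-side alternative concrete, here is a minimal sketch in Python that generates the markup for a graphical navigation bar backed by a client-side image map. The function name, the image file, and the region coordinates are all hypothetical. The point is only that each clickable region carries its own destination URL, which is what lets the browser show that URL as a prospective view instead of XY coordinates.

```python
from html import escape


def client_side_image_map(image_src, map_name, regions):
    """Return HTML for a navigation image with a client-side image map.

    regions is a list of (label, url, (left, top, right, bottom)) tuples.
    All names and values here are illustrative assumptions.
    """
    lines = [
        # usemap points the image at the <map> defined below, so the
        # browser (not the server) resolves clicks to destination URLs.
        f'<img src="{escape(image_src)}" usemap="#{escape(map_name)}" '
        f'alt="Navigation bar">',
        f'<map name="{escape(map_name)}">',
    ]
    for label, url, (left, top, right, bottom) in regions:
        coords = f"{left},{top},{right},{bottom}"
        lines.append(
            f'  <area shape="rect" coords="{coords}" '
            f'href="{escape(url)}" alt="{escape(label)}">'
        )
    lines.append("</map>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(client_side_image_map(
        "navbar.gif",
        "globalnav",
        [
            ("Main Page", "/", (0, 0, 100, 30)),
            ("Investor Info", "/investor/", (100, 0, 200, 30)),
        ],
    ))
```

Because every region also has replacement text, the same markup remains navigable when images are turned off.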
Once you are sensitive to the built-in navigation features of web browsers, it
is easy to avoid disabling or duplicating those features. In fact, it is both
possible and desirable to find ways to leverage them. In designing
navigation systems, you should consider all elements of that system. Web
browsers are an extremely common and integral part of the user's navigation
experience. From a philosophical perspective, we might say that web pages
do not exist in the absence of a web browser. So, don't override or corrupt
the browser!
4.2. Building Context
With all navigation systems, before we can plot our course, we must locate
our position. Whether we're visiting Yellowstone National Park or the Mall
of America, the You Are Here mark on fixed-location maps is a familiar and
valuable tool. Without that landmark, we must struggle to triangulate our
current position using less dependable features such as street signs or nearby
stores. The You Are Here indicator can make all the difference between
knowing where you stand and feeling completely lost.
In designing complex web sites, it is particularly important to provide
context within the greater whole. Many contextual clues in the physical
world do not exist on the Web. There are no natural landmarks and no north
and south. Unlike physical travel, hypertextual navigation allows users to be
transported right into the middle of a large unfamiliar web site. Links from
remote web pages and search engine result pages allow users to completely
bypass the front door or main page of the web site. To further complicate
matters, people often print web pages to read later or to pass along to a
colleague, resulting in even more loss of context.
You should always follow a few rules of thumb to ensure that your sites
provide contextual clues. First, all pages should include the organization's
name. This might be done as part of the title or header of the page. As a user
moves through the levels of a site, it should be clear that they are still within
that site. Carrying the graphic identity throughout the site supports such
context and consistency. In addition, if a user bypasses the front door and
directly accesses a subsidiary page of the site, it should be clear which site
he or she is on.
Second, the navigation system should present the structure of the
information hierarchy in a clear and consistent manner and indicate the
location within that hierarchy. See Figure 4-2 for an example.
Figure 4-2. The navigation system for the Argus Clearinghouse clearly
shows the path the user has taken through the hierarchy and
indicates the user's current location. This helps the user to build
a mental model of the organization scheme that facilitates
navigation and helps them feel comfortable.
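One common way to show both the path and the current location is a trail of links from the main page down to the page being viewed. The following Python sketch is only an illustration of that idea, not a description of how any particular site is built. The function name and the sample path are assumptions. Ancestor pages become links, and the current page is shown in bold as the You Are Here marker.

```python
from html import escape


def breadcrumb_trail(path):
    """Render a path through the hierarchy as an HTML trail.

    path is a list of (label, url) pairs ordered from the main page
    down to the current page. The last entry is the current page and
    is rendered as plain bold text rather than as a link.
    """
    if not path:
        return ""
    *ancestors, (current_label, _current_url) = path
    parts = [
        f'<a href="{escape(url)}">{escape(label)}</a>'
        for label, url in ancestors
    ]
    parts.append(f"<b>{escape(current_label)}</b>")
    return " &gt; ".join(parts)


if __name__ == "__main__":
    print(breadcrumb_trail([
        ("Main Page", "/"),
        ("Arts & Humanities", "/arts/"),
        ("Architecture", "/arts/architecture/"),
    ]))
```

A user who arrives at the Architecture page directly from a search engine can still see which site and which branch of the hierarchy the page belongs to.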
4.3. Improving Flexibility
As discussed in the previous chapter, hierarchy is a familiar and powerful
way of organizing information. In many cases, it makes sense for a hierarchy
to form the foundation for organizing content in a web site. However,
hierarchies can be fairly limiting from a navigation perspective. If you have
ever used the ancient information browsing technology and precursor to the
World Wide Web known as Gopher, you will understand the limitations of
hierarchical navigation. In Gopherspace, you were forced to move up and
down the tree structures of content hierarchies (see Figure 4-3). It was not
practical to encourage or even allow jumps across branches (lateral
navigation) or between multiple levels (vertical navigation) of a hierarchy.
Figure 4-3. On a Gopher site, you could only move up or down through
the tree structure of the hierarchy.
The Web's hypertextual capabilities removed these limitations, allowing
tremendous freedom of navigation. Hypertext supports both lateral and
vertical navigation (see Figure 4-4). From any branch of the hierarchy, it is
possible and often desirable to allow users to laterally move into other
branches. For example, as you explore the Programs and Events section of a
conference web site, you may decide to register for that conference. A
hypertext link should allow you to jump to Registration without first
retracing your steps back up the Programs and Events hierarchy.
Figure 4-4. In a hypertext system, navigation links can completely
bypass the hierarchy. You can enable users to get anywhere from
anywhere. However, as you can see from this diagram, things can
get confusing pretty quickly. It begins to look like an architecture
from M.C. Escher.
It is also possible and often desirable to allow users to move vertically from
one level in a branch to a higher level in that same branch (e.g., from a
specific Program back to the main Programs and Events page) or all the way
back to the main page of the web site.
The trick with designing navigation systems is to balance the advantages of
flexibility with the dangers of clutter. In a large, complex web site, the
complete lack of lateral and vertical navigation aids can be very limiting. On
the other hand, too many navigation aids can bury the hierarchy and
overwhelm the user. Navigation systems should be designed with care to
complement and reinforce the hierarchy by providing added context and
flexibility.
4.4. Types of Navigation Systems
A complex web site often includes several types of navigation systems. To
design a successful site, it is essential to understand the types of systems and
how they work together to provide flexibility and context.
4.4.1. Hierarchical Navigation Systems
Although we may not typically think of it this way, the information
hierarchy is the primary navigation system. From the main page to the
destination pages that house the actual content, the main options on each
page are taken directly from the hierarchy (see Figure 4-5). As noted earlier,
the hierarchy is extremely important, but also rather limiting. It is these
limitations that often require additional navigation systems.
Figure 4-5. Global Navigation Systems
4.4.2. Global Navigation Systems
A global or site-wide navigation system often complements the information
hierarchy by enabling greater vertical and lateral movement throughout the
entire site. At the heart of most global navigation systems are some standard
rules that dictate the implementation of the system at each level of the site.
The simplest global navigation system might consist of a graphical
navigation bar at the bottom of each page on the site. On the main page, the
bar might be unnecessary, since it would duplicate the primary options
already listed on that page. On second level pages, the bar might include a
link back to the home page and a link to the feedback facility, as in
Figure 4-6.
Figure 4-6. The MVAC Web site employs a very simple, icon-based
global navigation system.
A slightly more complex global navigation system may provide for area-
specific links on third level pages and below. For example, if a user explores
the products area of the web site, the navigation bar could include Main
Page, Products, and Search. The obvious exception to this rule-based system
is that pages should not include navigation links to themselves. For example,
the main page of the products area should not include a Products link.
However, this is a great opportunity for the site's graphic designer to devise
the navigation bar to show that you are currently on the main page of the
products area. Designers often leverage a folder tab or button metaphor to
accomplish this effect. (On the Argus web site, we use the @ sign from our
corporate logo, as seen in Figure 4-7.)
Figure 4-7. For the Argus web site, graphic designers from Q LTD came
up with a creative and elegant solution to show context within
the navigation system by leveraging the @ sign from our
corporate logo. In this example, the @ sign indicates that the
Publications page is within the What We Do area.
As you can see, this type of rule-based global navigation system can easily
be applied throughout the entire web site. The navigation system and the
graphic design system should be integrated to provide both flexibility and
context. Note that the relative locations of the options should remain the
same from one version of the bar to another and that, since people read from
left to right, Main Page should be to the left of the other options. Both these
factors enhance the context within the hierarchy.
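The rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed implementation. The function names, the option list, and the URLs are hypothetical. The sketch keeps the options in one fixed order with Main Page first, and it never links a page to itself. The current page is instead rendered in bold, standing in for whatever indicator the graphic designer devises.

```python
from html import escape


def global_nav_bar(current_url, options):
    """Apply the global navigation rules to one page.

    options is the site-wide list of (label, url) pairs in a fixed
    order, with Main Page first. The result keeps that order on every
    page; the entry for the current page gets None instead of a URL,
    so it can be shown as an indicator rather than as a link.
    """
    return [
        (label, None if url == current_url else url)
        for label, url in options
    ]


def render_nav_bar(bar):
    """Render the bar as HTML, marking the current page in bold."""
    cells = []
    for label, url in bar:
        if url is None:
            cells.append(f"<b>{escape(label)}</b>")
        else:
            cells.append(f'<a href="{escape(url)}">{escape(label)}</a>')
    return " | ".join(cells)


if __name__ == "__main__":
    options = [
        ("Main Page", "/"),
        ("Products", "/products/"),
        ("Search", "/search/"),
    ]
    # On the main page of the products area, Products is not a link.
    print(render_nav_bar(global_nav_bar("/products/", options)))
```

Since the same option list drives every page, the relative positions of the options stay constant as the user moves through the site.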
4.4.3. Local Navigation Systems
For a more complex web site, it may be necessary to complement the global
navigation system with one or more local navigation systems. To understand
the need for local navigation systems, it is necessary to understand the
concept of a sub-site.[7] The term sub-site was coined by Jakob Nielsen to
identify the recurrent situation in which a collection of web pages within a
larger site invite a common style and shared navigation mechanism unique
to those pages.
[7]Jakob Nielsen, "The Rise of the Sub-Site," September 1996
(http://www.useit.com/alertbox/9609.html).
For example, a software company may provide an online product catalog as
one area in their web site. This product catalog constitutes a sub-site within
the larger web site of the software company. Within this sub-site area, it
makes sense to provide navigation options unique to the product catalog,
such as browsing products by name or format or market.
However, it is also important to extend the global navigation system
throughout the sub-site. Users should still be able to jump back to the main
page or provide feedback. Local navigation systems should be designed to
complement rather than replace the global navigation system (see
Figure 4-8).
Figure 4-8. In this example, the bulleted options are part of a simple
local navigation system that guides users through information
about the Digital Dissertations project. The graphical buttons at
the lower left of the page are part of the global navigation
system.
This integration can be challenging, particularly when the global and local
navigation systems provide too many options. Alone they may each be
manageable, but together on one page, the variety of options may
overwhelm the user. In some cases, you may need to revisit the number of
global and local navigation options. In others, the problem may be
minimized through elegant page design.
4.4.4. Ad Hoc Navigation
Relationships between content items do not always fit neatly into the
categories of hierarchical, global, and local navigation. An additional
category of ad hoc links is more editorial than architectural. Typically an
editor or content specialist will determine appropriate places for these types
of links once the content has been placed into the architectural framework of
the web site. In practice, this usually involves representing words or phrases
within sentences or paragraphs (i.e., prose) as embedded hypertext links.
This approach can be problematic if these ad hoc links are important, since
usability testing shows "a strong negative correlation between embedded
links (those surrounded by text) and user success in finding information."[8]
Apparently, users tend to scan pages so quickly that they often miss these
less conspicuous links. You can replace or complement the embedded link
approach with external links that are easier for the user to see.
[8]Spool et al., 41-42.
Embedded Links
As you can see, embedded links are surrounded by text.
Users often miss these links.
One Solution to the Embedded Link Problem is to give links their own
separate lines within the paragraph.
Another solution is to create a separate menu of ad hoc links at the top or
bottom of the page that point to useful related resources:
 Embedded Links
 Users
 One Solution to the Embedded Link Problem
The approach you use should be determined by the nature and importance of
the ad hoc links. For non-critical links provided as a point of interest,
embedded links can be an elegant, unobtrusive solution.
When using ad hoc links, it's important to consider whether the linked
phrase provides enough context for the user. In Figure 4-9, it's fairly obvious
where the Digital Dissertations Pilot Site link will take you. However, if
1861 or 1997 were underlined, you would be hard pressed to guess where
those links would lead. In designing navigation systems for the Web, context
is king.
Figure 4-9. Moderation is the primary rule of thumb for guiding the
creation of embedded ad hoc links. Used sparingly (as in this
example), they can complement the existing navigation systems
by adding one more degree of flexibility. Used in excess, ad hoc
links can add clutter and confusion.
Web Page Design: Types of Navigation
THE MAIN GOAL OF A USER-FRIENDLY
NAVIGATION SYSTEM is to prevent users from
getting lost. Navigation design means creating interfaces
that help people understand where they are, where they
can go and how to get there. A good navigation design
will create short and simple paths between elements,
minimize travel steps by creating hierarchies with the
fewest possible levels, and minimize redundancy by
creating only the necessary paths (Kristof & Satran,
1995). To design an efficient site, it is necessary to understand the
types of navigation systems and how they relate to each other.
From being lost to ... Knowing where you are.
Hierarchical Navigation
Navigation throughout a web site simply means going from one
place to another by clicking on elements that operate as links.
Those elements can be words or phrases embedded in the text,
graphics such as drawings or pictures, and even different types of
animations. When you click on a link, you can access additional
information that can be contained in the same page, another page
within the same site, or in a completely different site. The main
navigation is defined by the hierarchy of the information. Once the
structure of the information is established, the conceptual
relationships between the structural elements constitute the
navigation. The options to go from the home page to other pages
respond directly to the information hierarchy.
A hierarchical navigational
mechanism helps in organizing
information and mirrors the content's
structure.
Site-wide Navigation
In some instances you might need to access information
within a site that does not follow the information
structure. A site-wide navigational system enables
greater flexibility of lateral and vertical movement
throughout an entire site. This system complements the
information hierarchy when it is necessary to give users
additional options to navigate bypassing the hierarchy.
For instance, this system may provide links on second or
third level pages to go back to the home page and to a
discussion forum. Those links wouldn't be necessary on
the first level or home page where they would be
redundant. A more complex site-wide navigation system
may provide links to lower level pages which contain
site-specific information.
A site-wide navigational scheme improves flexibility by
enabling lateral and vertical movement within a site.
Local Navigation
The need to implement a local
navigation system may stem
from the need to provide a
unique navigational scheme to
a collection of web pages that
belong to a large site. This
navigation system operates
only within those pages. For
instance, a product catalog
embedded in a company's site
may require a complementary
system of navigation specific
to the catalog's pages. The
catalog's pages also require a
site-wide navigation
mechanism to allow users to go
to the company's home page as
well as to leave the site.
Navigation Systems within a Site
4.5. Integrated Navigation Elements
In global and local navigation systems, the most common and important
navigation elements are those that are integrated into the content-bearing
pages of the web site. As users move through the site or sub-site, these are
the elements they see and use again and again. Most integrated navigation
elements fit into one of two categories: navigation bars and pull-down
menus.
4.5.1. Navigation Bars
You can implement navigation bars in many ways and use them for the
hierarchical, global, and local navigation systems. In simplest form, a
navigation bar is a collection of hypertext links grouped together on a page.
Alternatively, the navigation bar may be graphical in nature, implemented as
an image map or as graphic images within a table structure.
The decision to use text versus graphic navigation bars falls primarily within
the realms of graphic design and technical performance rather than
information architecture. Graphic navigation bars tend to look nicer but can
significantly slow down the page loading speed (although, if you're able to
reuse the same global navigation bar throughout the site, loading speed will
only be hurt once, since the image will be cached locally). If you do use
graphic navigation bars, you need to be sensitive to the needs of users with
low bandwidth connections. You should also consider those users with text-
only browsers (there are still quite a few out there) and those users with
high-end browsers who turn off the graphical capabilities to get around more
quickly. Appropriate use of the ALT attribute to define replacement text
for the image will ensure that your site supports navigation for these users.
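As a rough illustration of the replacement-text point, the Python sketch below generates a graphic navigation bar in which every image carries replacement text matching its label. The function name and the image file names are hypothetical, and the markup is deliberately minimal.

```python
from html import escape


def graphic_nav_bar(buttons):
    """Return HTML for a row of linked navigation images.

    buttons is a list of (label, url, image_src) tuples. The label is
    reused as the image's replacement text, so users of text-only
    browsers, or users with images turned off, still see each option.
    """
    return "\n".join(
        f'<a href="{escape(url)}">'
        f'<img src="{escape(src)}" alt="{escape(label)}"></a>'
        for label, url, src in buttons
    )


if __name__ == "__main__":
    print(graphic_nav_bar([
        ("Main Page", "/", "home.gif"),
        ("Search", "/search/", "search.gif"),
    ]))
```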
However, key issues related to the architecture should also influence this
decision. For example, it is usually much easier to add options to a text
menu than a graphic-based menu. If you anticipate substantial growth or
change in a particular area, it may make sense to employ a textual navigation
bar, like the one in Figure 4-10. Cost is also an issue, since graphic
navigation bars require more work to create and change than text-based bars.
In many cases, you might employ a graphic bar for global navigation and a
textual menu for local navigation. A good graphic designer will strike an
elegant balance between form and function in creating these navigation bars.
Figure 4-10. C/Net provides a high-profile example of the use of
text-based navigation options.
It is often best to place the navigation bar towards the top and/or bottom of
the page, rather than at the side.[9] Placement at the top provides immediate
access to the navigation system as well as an instant sense of context within
the site. This supports the scenario in which a user quickly scans the first
paragraph and decides to move on to other areas of the site. Placement at the
bottom assumes navigation once the page has been fully read. Placement at
both the top and bottom should be determined by the length of the content.
[9]One usability study showed that "Sites with navigation buttons or links at
the top and bottom of pages did slightly better than sites with navigation
buttons down the side of the page." Spool et al., 24.
Graphical navigation bars may employ several techniques for conveying
content and context, including textual labels and icons. Textual labels are the
easiest to create and by far most clearly indicate the contents of each option.
Icons, on the other hand, are relatively difficult to create and often fail to
indicate the contents of each option. It's difficult to represent abstract
concepts through images. A picture may say a thousand words, but often
they're the wrong words. Icons can successfully be used to complement the
textual labels. Since repeat users may become so familiar with the icons that
they no longer take the time to read the textual labels, icons are useful in
facilitating rapid menu selection for them. See Figure 4-11 for an example.
Figure 4-11. This navigation bar, which appears at the bottom of the
page, demonstrates an interesting blend of graphic icons (with
labels) and textual options. The global navigation icons provide a
splash of color, while their labels ensure usability. The textual
local navigation options allow for the creation of many footer
navigation bars without restrictive costs.
However, hidden minefields may plague an iconic system. First, the
Internet's global nature introduces the potential for confusion or even anger,
since an image may have very different meanings from one culture to
another. Second, the iconic system may work well for a limited number of
menu options, but if the decision is made to add one or more options,
creating an appropriate icon can be very challenging. While icons certainly
work well sometimes, the skillful use of a color system can facilitate rapid
menu selection without the inherent problems of iconic systems. (For more
about the use of icons, see Chapter 5, "Labeling Systems".)
4.5.2. Frames
Frames present an additional factor to consider in the application of textual
or graphical navigation bars. Frames allow you to define one or more
independently scrollable "panes" within a single browser window.
Hypertextual links within one pane can control the content displayed in other
panes within that same window. This enables the designer to create a static
or independently scrolling navigation bar that appears on every page in that
area of the web site. This frame-based navigation bar will be visible to the
user in the same location in the browser window even while scrolling
through long documents. By separating the navigation system from content
in this way, frames can provide added context and consistency as users
navigate a web site.
However, frames present several serious problems, both from the consumer's
and producer's perspective. Architects should proceed very carefully in
considering frame-based navigation solutions. Let's review a few of the
major considerations.
4.5.2.1. Screen real estate
Static navigation bars implemented through frames often take up significant
portions of valuable screen real estate (see Figure 4-12). No matter how far
the user scrolls, the navigation bar always stays with them. The addition of
winking, blinking banner advertisements into the static navigation bar often
compounds this problem. On a large, high resolution monitor this may be
only a minor irritation. On a standard 640 x 480 monitor, these frames can
be really annoying. If you're going to use a frame-based navigation bar, keep
it relatively small and non-obtrusive. You should also consider a vertical
rather than horizontal frame, since left-to-right reading lends itself to narrow
text columns like those found in newspapers and magazines.
Figure 4-12. The Wall Street Journal's Interactive Edition makes use of
frames. It's a relatively elegant implementation, but it limits
screen real estate and disables basic navigation features.
4.5.2.2. The page model
The Web is built upon a model of pages, with each page having a unique
address or URL. Users are familiar with the concept of pages. Frames
confuse this issue, by slicing up pages into independent panes of content. By
violating the page model, the use of frames frequently disables important
browser navigation features such as bookmarking, visited and unvisited link
discrimination, and history lists. Frames can also confuse and frustrate users
executing simple tasks such as using the back button, reloading a page, and
printing a page. While web browsers have improved in their ability to handle
frames, they can't remove the confusion caused by violating the page model.
4.5.2.3. Display speed
Right off the bat, a web page with multiple panes will take a hit on display
speed. Since each pane is a separate file with its own URL, loading and
displaying each pane requires a separate client-server interaction. In other
words, the user spends a lot of time watching "Host Contacted" messages fly
by at the bottom of the screen. This problem is compounded by heavy
graphics use.
4.5.2.4. Complex design
In theory, there are some compelling reasons to try frames. You can make
global navigation bars or section headers (or advertisements) visible to the
user at all times. However, in practice, designing user-friendly web sites
using frames is quite challenging. Frames add a layer of complexity that
many architects and designers deal with unsuccessfully. You must think
about the multiple ways users will access your frame-based documents.
What if they come from another frame-based document? Then you face the
danger of frames within frames. In addition, while most web browsers now
support frames, different browsers on different computer platforms display
the frames and their contents slightly differently. This requires more testing
and more careful design. Before using frames, make sure you consider the
additional overhead in architecture and design.
4.5.3. Pull-Down Menus
Pull-down menus compactly provide for many navigation options. The user
can expand what appears as a single-line menu to present dozens of options
(as shown in Figure 4-13). The most common pull-down menus on the Web
are implemented using the standard interactive forms syntax. Users must
choose an option from the menu and then hit a Go or Submit button to move
to that destination.
Figure 4-13. This pull-down menu enables users to select a location
without first going to a separate web page. This approach avoids
further cluttering the main page with a long list of locations.
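A forms-based pull-down menu of this kind might be generated as in the following Python sketch. The function name, the form field name, and the /cgi-bin/jump script are assumptions made for illustration. The sketch presumes a server-side script at that address that reads the chosen value and sends the user on to the destination.

```python
from html import escape


def pull_down_menu(action, field_name, destinations):
    """Return HTML for a single-line menu with a Go button.

    action is the (hypothetical) server-side script that receives the
    selection; destinations is a list of (label, value) pairs shown
    as menu options in the order given.
    """
    lines = [
        f'<form action="{escape(action)}" method="get">',
        f'  <select name="{escape(field_name)}">',
    ]
    for label, value in destinations:
        lines.append(
            f'    <option value="{escape(value)}">{escape(label)}</option>'
        )
    lines += [
        "  </select>",
        # The user must press this button to move to the destination.
        '  <input type="submit" value="Go">',
        "</form>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(pull_down_menu(
        "/cgi-bin/jump",
        "location",
        [("Ann Arbor", "/locations/annarbor.html"),
         ("Chicago", "/locations/chicago.html")],
    ))
```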
You can implement a more sophisticated version of the pull-down menu
(also known as the pop-up menu) on the Web by using a programming
language such as Java or JavaScript. As the user moves the cursor over a
word or area on the page, a menu pops up. The user can directly select an
option from that menu.
Use pull-down and pop-up menus with caution. These menus allow
designers to pack lots of options on one page. This is usually what you are
working hard to avoid. Additionally, menus hide their options and force the
user to act before being able to see those options. However, when you have a
very straightforward, exact organization scheme, these menus can work
well.
4.6. Remote Navigation Elements
Remote navigation elements or supplemental navigation systems such as
tables of contents, indexes, and site maps are external to the basic hierarchy
of a web site and provide an alternative bird's-eye view of the site's content.
Increasingly, we are seeing these remote navigation elements displayed
outside of the main browser window, in either a separate target window or in
a Java-based remote control panel. While remote navigation elements can
enhance access to web site content by providing complementary ways of
navigating, they should not be used as replacements or bandages for poor
organization and navigation systems. In many ways, remote navigation
elements are similar to software documentation or help systems.
Documentation can be very useful but will never save a bad product.
Instead, remote navigation elements should be used to complement a solid
internal organization and navigation system. You should provide them but
never rely on them.
4.6.1. The Table of Contents
The table of contents and the index are the state of the art in print navigation.
Given that the design of these familiar systems is the result of testing and
refinement over the centuries, we should not overlook their value for web
sites.
In a book or magazine, the table of contents presents the top few levels of
the information hierarchy. It shows the organization structure for the printed
work and supports random as well as linear access to the content through the
use of chapter and page numbers. Similarly, the table of contents for a web
site presents the top few levels of the hierarchy. It provides a broad view of
the content in the site and facilitates random access to segmented portions of
that content. A web-based table of contents can employ hypertext links to
provide the user with direct access to pages of the site.
You should consider using a table of contents for web sites that lend
themselves to hierarchical organization. If the architecture is not strongly
hierarchical, it makes no sense to present the parent-child relationships
implicit in a structured table of contents. You should also consider the web
site's size when deciding whether to employ a table of contents. For a small
site with only two or three hierarchical levels, a table of contents may be
unnecessary.
The design of a table of contents significantly affects its usability. When
working with a graphic designer, make sure he or she understands the
following rules of thumb:
1. Reinforce the information hierarchy so the user becomes increasingly
familiar with how the content is organized.
2. Facilitate fast, direct access to the contents of the site for those users
who know what they want.
3. Avoid overwhelming the user with too much information. The goal is
to help, not scare, the user.
The Search/Browse area of the Argus Clearinghouse, shown in Figure 4-14,
provides an example of a table of contents.
Figure 4-14. This table of contents allows users to select a category (e.g.,
Arts & Humanities) or jump directly to a subcategory (e.g.,
architecture). Because of the clean page layout, users can quickly
scan the major and minor categories for the topic they're
interested in.
Graphics can be used in the design and layout of a table of contents,
providing the designer with a finer degree of control over the presentation.
Colors, font styles, and a variety of graphic elements can be applied to create
a well-organized and aesthetically pleasing table of contents. However, keep
in mind that a graphic table of contents will cost more to design and
maintain and may slow down the page loading speed for the user. When
designing a navigation tool such as a table of contents, form is less important
than function.
4.6.2. The Index
For web sites that aren't conducive to strong hierarchical organization, a
manually created index can be a good alternative to the more structured table
of contents. Similar to an index found in print materials, a web-based index
presents keywords or phrases alphabetically, without representing the
hierarchy. Unlike a table of contents, indexes generally are flat and present
only one or two levels of depth. Therefore, indexes work very well for users
who already know the name of the item they are looking for. A quick scan of
the alphabetical listing will get them where they want to go.
A major challenge in indexing a web site involves the level of granularity of
indexing. Do you index web pages? Do you index individual paragraphs or
concepts that are presented on web pages? Or do you index collections of
web pages? In many cases, the answer may be all of the above. Perhaps a
more valuable question is: What terms are users going to look for? The
answer should guide the index design. To answer this question, you need to
know your audience and understand their needs. Before launch of the site,
you can learn more about the terms that users will look for through focus
group sessions and individual user interviews. After launch, you can employ
a query tracking tool that captures and presents all search terms entered by
users. Analysis of these actual user search terms should determine
refinement of the index. (To learn more about query tracking tools, see
Chapter 9, "Production and Operations".)
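To make the idea of query tracking concrete, here is a minimal sketch in Python. It assumes the search terms have already been captured as a simple list of strings; the function name top_search_terms and the sample queries are hypothetical, and a real query tracking tool will differ in how it records and reports terms.

```python
from collections import Counter


def top_search_terms(queries, n=10):
    """Normalize the logged queries and return the n most frequent as (term, count) pairs."""
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    return counts.most_common(n)


# Hypothetical queries captured after launch.
logged = ["Maps", "maps ", "schedule", "New Orleans", "maps", "Schedule"]
print(top_search_terms(logged, 2))
```

Terms that appear often in such a report but are missing from the index are natural candidates for new index entries.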
In selecting items for the index, keep in mind that an index should point only
to destination pages, not navigation pages. Navigation pages help users find
(destination) pages through the use of menus that begin on the main page
and descend through the hierarchy. They are often heavy on links and light
on text. In contrast, destination pages contain the content that users are
trying to find. The purpose of the index is to enable users to bypass the
navigation pages and jump directly to these content-bearing destination
pages.
A useful trick in designing an index involves term rotation, also known as
permutation. A permuted index rotates the words in a phrase so that users
can find the phrase in two places in the alphabetical sequence. For example,
in the SIGGRAPH 96 index shown in Figure 4-15, users will find listings for
both New Orleans Maps and Maps (New Orleans). This supports the varied
ways people look for information. Term rotation should be applied
selectively. You need to balance the probability of users seeking a particular
term with the annoyance of cluttering the index with too many permutations.
For example, it would probably not make sense to present Sunday
(Schedule) as well as Schedule (Sunday). If you have the time and budget to
conduct focus groups or user testing, that's great. If not, you'll have to fall
back on your common sense.
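Term rotation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names rotate and build_permuted_index, the sample entries, and the URLs are hypothetical, and the choice of which phrases to rotate is passed in explicitly because, as noted above, rotation should be applied selectively.

```python
def rotate(phrase, lead_word):
    """Move lead_word to the front and place the remaining words in parentheses."""
    rest = [word for word in phrase.split() if word != lead_word]
    return f"{lead_word} ({' '.join(rest)})"


def build_permuted_index(entries, rotations):
    """
    entries:   {phrase: url} for every indexed item.
    rotations: {phrase: lead_word} for the few phrases chosen for rotation.
    Returns (label, url) pairs in alphabetical order.
    """
    index = dict(entries)
    for phrase, lead_word in rotations.items():
        index[rotate(phrase, lead_word)] = entries[phrase]
    return sorted(index.items(), key=lambda item: item[0].lower())


# Hypothetical entries and URLs.
entries = {"New Orleans Maps": "/maps.html", "Registration": "/register.html"}
for label, url in build_permuted_index(entries, {"New Orleans Maps": "Maps"}):
    print(label, url)
```

Both the original and the rotated label point to the same destination page, so the user reaches it from either place in the alphabetical sequence.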
Figure 4-15. The SIGGRAPH 96 index allows for multiple levels of
granularity. Selecting "New Orleans" will take you to a page
that introduces this adventurous city and includes a number of
links. One of those links takes you to a New Orleans map. Since
this map is judged to be an important content item, it is also
presented in the index.
4.6.3. The Site Map
While the term site map is used indiscriminately in general practice, we
define it narrowly as a graphical representation of the architecture of a web
site. This definition excludes tables of contents and indexes that use graphic
elements to enhance the aesthetic appeal of tools that are primarily textual. A
real site map presents the information architecture in a way that goes beyond
textual representation.
Unlike tables of contents and indexes, maps have not traditionally been used
to facilitate navigation through bodies of text. Maps are typically used for
navigating physical rather than intellectual space. This is significant for a
few reasons. First, users are not familiar with the use of site maps. Second,
designers are not familiar with the design of site maps. Third, most bodies of
text (including most web sites) do not lend themselves to graphical
representations. As we discussed in Chapter 3, "Organizing Information",
many web sites incorporate multiple organization schemes and structures.
Presenting this web of hypertextual relationships visually is difficult. These
reasons help explain why we see few good examples on the Web of site
maps that can improve navigation systems.
Figure 4-16 shows a site map from http://www.sgml.net. To learn more
about automatically generated site maps, see
http://www.webreview.com/97/05/16/infoarch/index.html.
Figure 4-16. In this example of an automatically generated site map,
gold bars represent pages within a web site. Users must roll their
cursor over a gold bar to see the title of the page. Do you think
this approach is more useful than a text-based table of contents?
If you decide to try a site map, consider physical versus symbolic
representation. Maps of the physical world do not present the exact
geography of an area. Accuracy and scale are often sacrificed for
representative contextual clues that help us find our way through the maze of
highways and byways to our destination. Often, the higher the level of
abstraction, the more intuitive the map. This rule of thumb holds true for all
of the remote navigation elements of web sites. When consulting a table of
contents or index or site map, a user doesn't need to see every single link on
every single page. They need to see the important links, presented in a clear
and meaningful way.
4.6.4. The Guided Tour
A guided tour serves as a nice tool for introducing new users to the major
content areas of a web site. It can be particularly important for restricted
access web sites (such as online magazines that charge subscription fees)
because you need to show potential customers what they will get for their
money.
A guided tour should feature linear navigation (new users want to be guided,
not thrown in), but a hypertextual navigation bar may be used to provide
additional flexibility. The tour should combine screenshots of major pages
with narrative text that explains what can be found in each area of the web
site. See Figure 4-17 for an example.
Figure 4-17. In this example, the navigation options on each screen
allow users to move through the guided tour in a non-linear
manner.
Remember that a guided tour is intended as an introduction for new users
and as a marketing opportunity for the web site. Many people may never use
it, and few people will use it more than once. For that reason, you might
consider linking to the tour from the gateway page[10] rather than the main
page. Also, you should balance the inevitable big ideas about how to create
an exciting, dynamic, interactive guided tour with the fact that it will not
play a central role in the day-to-day use of the web site.
[10] Web sites sometimes have a gateway page that first-time users encounter
before reaching the main page. This gateway might serve as a splash page
with fancy graphics and animation, as an audience-selection page that sends
users to the appropriate area of a site, or as a preview page that shows users
what they will get if they subscribe to that particular web site.
Designing Elegant Navigation Systems
Designing navigation systems that work well is challenging. You've got so
many possible solutions to consider, and lots of sexy technologies such as
pop-up menus and dynamic site maps can distract you from what's really
important: building context, improving flexibility, and helping the user to
find the information they need.
No single combination of navigation elements works for all web sites. One
size does not fit all. Rather, you need to consider the specific goals,
audience, and content for the project at hand, if you are to design the optimal
solution.
However, there is a process that should guide you through the challenges of
navigation system design. It begins with the hierarchy. As the primary
navigation system, the hierarchy influences all other decisions. The choice
of major categories at the highest levels of the web site will determine
design of the global navigation system. Based on the hierarchy, you will be
able to select key pages (or types of pages) that should be accessible from
every other page on the web site. In turn, the global navigation system will
determine design of the local and then ad hoc navigation systems. At each
level of granularity, your design of the higher-order navigation system will
influence decisions at the next level.
Once you've designed the integrated navigation system, you can consider the
addition of one or more remote navigation elements. In most cases, you will
need to choose between a table of contents, an index, and a site map. Is the
hierarchy strong and clear? Then perhaps a table of contents makes sense.
Does the hierarchy get in the way? Then you might consider an index. Does
the information lend itself to visualization? If so, a site map may be
appropriate. Is there a need to help new or prospective users to understand
what they can do with the site? Then you might add a guided tour.
If the site is large and complex, you can employ two or more of these
elements. A table of contents and an index can serve different users with
varying needs. However, you must consider the potential user confusion
caused by multiple options and the additional overhead required to design
and maintain these navigation elements. As always, it's a delicate balancing
act.
If life on the high wire unnerves you, be sure to build some usability testing
into the navigation system design process. Only by learning from users can
you design and refine an elegant navigation system that really works.
Searching Systems
Searching and Your Web Site
The preceding three chapters were intended to help you create the best
browsing system possible for your web site. This chapter describes when to
use a search engine with your site and demonstrates techniques that will
make searching work best for it.
Throughout this chapter, we use examples of searching systems from major
sites which allow you to search the entire Web, as well as site-specific search
engines. Although these Web-wide tools are different in that they index a
much broader collection of content than your search system will, it is
nonetheless very useful to study them. Of all searching systems, none has
undergone the testing, usage, and investment that Web-wide search tools
have, so why not benefit from their research?
When Not To Make Your Site Searchable
Before we delve into searching systems, we need to make a point: think
twice before you make your site searchable.
What? What's the point of having a web site if people can't find information
in it?
Your site should of course support the finding of its information. But don't
assume a search engine alone will satisfy all users' information needs. While
many users want to search a site, some just want to browse it.
Also, does your site have enough content to merit the use of a search engine?
How much is enough? It's hard to say. It could be five resources or fifty; no
specific number serves as a threshold. Perhaps a site with five long, dense
documents deserves a search engine more than one with a collection of
twenty brief, well-labeled documents. In any case, you'll want to balance the
time necessary to set up and maintain a searching system with the payoff it
brings to your site's users.
Because many site developers see search engines as the solution to the
problems that users are experiencing when trying to find information in their
sites, search engines become bandages for sites with poorly designed
browsing systems. If you see yourself falling into this trap, you should
probably suspend implementing your searching system until you fix your
browsing system's problems.
Search engines are fairly easy to get up and running, but like much of the
Web, they are difficult to set up effectively. As a user of the Web, you've
certainly seen incomprehensible search interfaces, and we're sure that your
queries have retrieved some pretty strange results. This often is the result of
a lack of planning by the site developer, who probably installed the search
engine with its default settings, pointed it at his or her site, and forgot about
it. So, if you don't plan on putting some significant time into configuring
your search engine properly, reconsider your decision to implement it.
Now that we've got our warnings and threats out of the way, we'll discuss
when to implement searching systems, and how you can make them work
better.
When To Make Your Site Searchable
Most web sites, as we know, aren't planned out in much detail before they're
built. Instead, they grow organically. This may be all right for smaller web
sites that aren't likely to expand much, but for ones that become popular,
more and more content and functional features get added haphazardly,
leading to a navigation nightmare.
There's a good analogy in physical architecture. Powell's Books
(http://www.powells.com/), which claims to be the largest bookstore in the
world, covers an entire city block (43,000 square feet) in Portland, Oregon.
We guess that it originally started as a single small storefront on that block,
but as their business grew, they knocked a doorway through the wall into the
next storefront, and so on, until they occupied the whole block. The result is
a hodgepodge of chambers, halls with odd turns, and unexpected stairways.
This chaotic labyrinth is a charming place to wander and browse, but if
you're searching for a particular title, good luck. It will be difficult to find
what you're looking for, although you might serendipitously stumble onto
something better.
Yahoo! once was a Web version of Powell's. Everything was there, but fairly
easy to find. Why? Because Yahoo!, like the Web, was relatively small. At
its inception, Yahoo! pointed to a few hundred Internet resources, made
accessible through an easily browsable subject hierarchy. No search option
was available, something unimaginable to Yahoo! users today. But things
soon changed. Yahoo! had an excellent technical architecture that allowed
site owners to easily self-register their sites, but Yahoo!'s information
architecture wasn't very well-planned, and couldn't keep up with the
increasing volume of resources that were added daily. Eventually, the subject
hierarchy became too cumbersome to navigate, and the Yahoo! people
installed a search engine as an alternative way of finding information in the
site. Nowadays it's a decent bet that more people use Yahoo!'s search engine
than browse through all those hierarchical subject categories,
although the browsable categories remain useful as a supplement to the
searching process (and, in fact, are included in search results).
Your site probably doesn't contain as much content as Yahoo! does, but if it's
a substantial site, it probably merits a search engine. There are good reasons
for this: users won't be willing to browse through your site's structure. Their
time is limited, and their cognitive overload threshold is lower than you
think. Interestingly, sometimes users won't browse for the wrong reasons;
that is, they search when they don't necessarily know what to search for.
Even though they would be better served by browsing, they search anyway.
You should also consider creating a searching system for your site if it
contains highly dynamic content. For example, if your site is a Web-based
newspaper, you could be adding dozens of story files daily. For this reason,
you probably wouldn't have the time each day to maintain elaborate tables of
contents, browsable indices, and other browsing systems. A search engine
can help you by automatically indexing the contents of the site once or many
times per day. Automating this process ensures that users have quality access
to your site's content, and you can spend time doing things other than
manually indexing and linking the story files.
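What a search engine does when it indexes the site automatically can be illustrated with a very small inverted index. This is a sketch only, not a description of any particular product: the function names, the sample story files, and the simple word-splitting rule are all assumptions made for the example.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, word):
    """Return the names of documents containing word, in alphabetical order."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical story files; rebuilding the index picks up new ones automatically.
stories = {
    "story1.html": "City council approves budget",
    "story2.html": "Budget talks stall",
}
index = build_inverted_index(stories)
print(search(index, "budget"))
```

Rerunning build_inverted_index on a schedule, once or many times per day, takes the place of hand-maintaining links to each new story.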
Understanding How Users Search
Assuming you've decided to implement a searching system for your web
site, it's important to understand how users really search before designing it.
We'll try to condense decades of research and experience generated by the
field of information retrieval into the next few paragraphs. But it really boils
down to this point: searching systems can and should vary as much as
browsing systems or any other components of web sites do, because all users
aren't alike, and information retrieval is much harder than most people
realize.
Users Have Different Kinds of Information Needs
Information scientists and librarians have been studying users' information
finding habits for decades. Until recently, these studies usually pertained to
traditional information systems, such as how to ask a library patron the right
questions to learn their information needs, or how to make it easier to search
for information in online library card catalogs or other databases.
Many studies indicated that users of information systems aren't members of
a single-minded monolithic audience who want the same kinds of
information delivered in the same ways. Some want just a little information,
while others want detailed assessments of everything there is to know about
a topic. Some want only the most accurate, highest quality information,
while others don't care much about the reliability of the source. Some will
wait for the results, while others need the information yesterday. Some are
just plain happy to get any information at all, regardless of how much
relevant stuff they're really missing. Users' needs and expectations vary
widely, and so the information systems that serve them must recognize,
distinguish, and accommodate these different needs.
To illustrate, let's look at one of these factors in greater detail: the variability
in users' searching expectations.
Known-item searching
Some users' information needs are clearly defined and have a single, correct
answer. When you check the newspaper to see how your stock in
Amalgamated Shoelace and Aglet is doing (especially since the hostile
Microsoft takeover attempt), you know exactly what you want, that the
information exists, and where it can be found. This is the simplest type of
information need. If it were the only type, the job of the web site architect
would be much easier.
Existence searching
However, some users know what they want but don't know how to describe
it or whether the answer exists at all. For example, you might want to buy
shares in a particular type of mutual fund that invests in Moldovan high-tech
start-ups and that carries no load. You are convinced that this sector is up-
and-coming, but do Fidelity and Merrill Lynch know this as well? You might
check their web sites, call a broker or two, or ask your in-the-know aunt.
This kind of information need is more challenging: it might be hard to
convey exactly what you're looking for ("Moldova? What's that?"),
especially if it's a new and as-yet-unheard-of item. Rather than a clear
question for which a right answer exists, you have an abstract idea or
concept, and you don't know whether matching information exists. The
success of your search depends as much upon the abilities of the brokers, the
web sites, and your aunt to understand your idea and its context as whether
the information (in this case, a particular mutual fund) exists.
Exploratory searching
Some users know how to phrase their question, but don't know exactly what
they're hoping to find, and are really just exploring and trying to learn more.
If you ever considered changing careers, you know what we mean: you're
not sure that you definitely want to switch to a career in chinchilla farming,
but you've heard it's the place to be, so you might informally ask a friend of
a friend who has an uncle in the business. Or you call the public library to
see if there's a book on the subject. Or you write to the Chinchilla
Professionals' Association requesting more information. In any case, you are
not sure exactly what you'll uncover, but you're willing to take the time to
learn more. As with existence searching, you have not so much a question
seeking an answer as an idea that you want to learn more about.
Unlike the next type of searching, you don't need to know everything there
is; a few pieces of good information will do fine for now.
Comprehensive searching (research)
Some users want everything available on a given topic. Scientific
researchers, patent lawyers, doctoral students trying to find unique and
original dissertation topics, and fans of any sort fit into this category. For
example, if you idolize that late great music duo Milli Vanilli, you'll want to
see everything that has anything to do with them -- singles and records,
bootlegs, concert tour posters, music videos, reviews, fan club information,
paraphernalia, interviews, books, scholarly articles, and record-burning
schedules. Even casual mentions of the band, such as someone's incoherent
ramblings in a web page or Usenet newsgroup, are fair game if you're
seeking all there is to know about Milli Vanilli. So you might turn to all sorts
of information sources for help: friends, the library, bookstores, music
stores, radio call-in shows, Ouija boards, and so on.
There are many other ways of classifying information needs, but the
important thing to remember is that not all users are looking for the same
thing. Ideally, you should anticipate the most common types of needs that
your site's users will have and ensure that these needs are met. Minimally,
you should give some thought to the variations and try to design a search
interface that is flexible in responding to them.
Searching and Browsing Are Integrated
One drawback to the literature on information finding is that much of it deals
with testing and improving a single information system (e.g., an online card
catalog). But the truth is that most people, especially those with more
involved information needs, use many information systems for a particular
search. This often means jumping from Infoseek to Magellan to a specific
site to Hotbot and so on, all in the context of one search. Even when using a
single web site, users often alternate between browsing and searching. For
example, when you use Yahoo!, you might first perform a search, find a
useful site, and then, using its Yahoo! category, browse for similarly indexed
sites.
Multiple Iterations Are Commonplace
Additionally, information searching generally doesn't take place within one
clean pass, unless it's of the known-item searching variety. Information
searching and browsing are by nature iterative: users will make a first
attempt at finding information, learn something, refine their query, try
finding some more, learn some more, refine again. This is commonly known
as associative learning. Unfortunately, finding everything you need at once
doesn't happen all that often, because you don't generally know enough
about the topic to articulate your query the right way in the first place.
The Moving Target: A Likely Scenario
A typical example of a search for information might go something like this:
Jan, a budding entrepreneur, wants to get business cards printed for her new
company. She calls her pal Fred to see how he did it and what company he
used. Unfortunately, Fred is not in, and, never one to dawdle, Jan leaves
Fred voice mail and moves on to the yellow pages. She finds nothing under
Business Cards, but does see a number of companies listed under Printers,
and gets a few price quotes, which all seem to be in the same neighborhood.
Not sure which to select, Jan contacts the local chapter of the Better
Business Bureau for their recommendation. The BBB folks refer Jan to their
web site, where she can search a database of companies with dubious
histories. This provides Jan with useful information that helps whittle down
her list of candidate printers. Meanwhile, Fred calls Jan back and tells her
that she really shouldn't have just business cards printed, but that she should
hire a graphic designer to create a full graphic identity package for Jan's new
business, including letterhead, brochures, and so on. So, Jan realizes that she
needs to find an affordable, reputable graphic design firm, and she returns to
the yellow pages. She also goes to the library to do a catalog search to see if
any books describe what it's like to work with a graphic design firm, and
how much she ought to expect to pay. And so on...
As you can see, Jan's initially simple information need becomes a fully
fledged associative learning process, changing at least twice (from a hunt for
a printer to a hunt for a graphic design firm to information on negotiating
and working with a graphic designer), and for all we know, it's not over yet.
It also involves multiple information sources (Fred, the yellow pages, the
library catalog, the bookstore), and utilizes browsing (the yellow pages
directory), searching (the Web database, the library catalog), and even
asking (Fred, the Better Business Bureau). Things aren't always as simple as
they seem! Your challenge, of course, is to design your site's architecture to
support the most common searching and browsing approaches in a smooth
and integrated way.
Designing the Search Interface
With so much variation among users to account for, there can be no single
ideal search interface. Although the literature of information retrieval
includes many studies of search interface design, many variables preclude
the emergence of the right way to design search interfaces. Here are a few of
the variables on the table:
• The level of searching expertise users have: Are they comfortable with
Boolean operators, or do they prefer natural language? Do they need a
simple or high-powered interface? What about a help page?
• The kind of information the user wants: Do they want just a taste, or
are they doing comprehensive research? Should the results be brief, or
should they provide extensive detail for each document?
• The type of information being searched: Is it made up of structured
fields or full text? Is it navigation pages, destination pages, or both?
HTML or other formats?
• How much information is being searched: Will users be overwhelmed
by the number of documents retrieved?
We can, however, provide basic advice that you should consider when
designing a search interface.
Support Different Modes of Searching
Before diving into design, think hard about why users are searching your
site, and what they want to get out of their search. Are they likely to search
for certain types of information, such as specific product descriptions or staff
directory entries? If so, support modes of searching that are delineated by
content types -- use the same interface to allow users to search the product
catalog, or the staff directory, or other content areas (content-delineated
indexing involves the creation of search zones, which we'll cover later in this
chapter). Are non-English speakers important to your site? Then provide
them with search interfaces in their native languages, including language-
specific directions, search commands and operators, and help information.
Does your site need to satisfy users with different levels of sophistication
with online searching? Then consider making available both a basic search
interface and an advanced one.
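One way to picture searching that is delineated by content type is a single search function that accepts an optional content area. The sketch below is an assumption-laden illustration: the sample documents, the zone labels, and the function name search are hypothetical, and matching is plain substring matching rather than anything a real search engine would use.

```python
# Hypothetical documents, each tagged with the content area ("zone") it belongs to.
DOCUMENTS = [
    {"zone": "products", "title": "Deluxe Widget", "text": "A widget for every occasion"},
    {"zone": "staff", "title": "Pat Smith", "text": "Widget product manager"},
    {"zone": "products", "title": "Basic Gadget", "text": "An entry-level gadget"},
]


def search(term, zone=None):
    """Return titles of documents containing term, optionally limited to one zone."""
    term = term.lower()
    return [
        doc["title"]
        for doc in DOCUMENTS
        if (zone is None or doc["zone"] == zone)
        and term in (doc["title"] + " " + doc["text"]).lower()
    ]


print(search("widget"))                # searches the whole site
print(search("widget", zone="staff"))  # searches only the staff directory
```

The same interface serves both cases; the user simply chooses whether to narrow the search to the product catalog, the staff directory, or another content area.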
For example, one of our clients, UMI, sells dissertations to an audience that
includes researchers, librarians, and others who have been using advanced
online information systems for years. We needed an interface that would
accommodate this important expert audience, who were accustomed to
complex Boolean and proximity operators and to the arcane search
languages of other commercial information services. However,
a simple search interface was also required, because at times users wouldn't
need all the firepower of an advanced search interface, especially when
conducting simple, known-item searches. Additionally, because it had
become available via the Web, a whole new audience of novices would
encounter this product for the first time; we assumed that these newbies
wouldn't be comfortable with a complex search interface.
So we created a simple interface that almost anyone could figure out and use
right away, shown above in Figure 6-1. A simple search box is ideal for the
novice or for a user with a pretty good sense of what he or she is looking for.
(We made sure to provide a single search query box; our experience shows
that most users don't care for separate boxes, one for each query term,
divided by Boolean operators.) Minimal filtering options are provided,
including searching for keywords within title and abstract fields, searching
within the author field, or searching within the publication number field.
These filtering options provide the user with more power by allowing more
specific searching. But because the labels Keyword, Author, and Publication
Number are fairly self-explanatory, they don't force the user to think too
much about these options.
Fielded Searching
Author, Keyword, Title, Subject, and ten other fields are searchable. A
researcher could, for example, find a dissertation related to his or her
area of interest by searching the subject field, and learn who that
doctoral student's advisor was by reading the abstract. To find other
related dissertations, the researcher could then search the Advisor field
to learn about other doctoral students who shared the same advisor.
Familiar Query Language
The style "field(search term)" is used (e.g., "keyword(drosophila)").
Because many different query language conventions are supported by
traditional online products, users may be used to an established
convention. The effort to support these users is made by allowing
variant terms. For the field Degree Date, the user can enter either
"ddt," "da," "date," "yr," or "year."
Longer Queries
More complex queries often require more space than the single line
entry box found in the simple search interface. The more complex
interface supports a much longer query.
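The "field(search term)" convention and the variant field names described under Familiar Query Language can be sketched roughly as follows. This is only an illustration, not the product's actual parser: the alias table, the canonical name degree_date, and the function parse_fielded_query are our own assumptions.

```python
import re

# Hypothetical alias table: every variant a user might type maps to one
# canonical field name. "degree_date" is an illustrative name, not the
# product's real internal field name.
FIELD_ALIASES = {
    "ddt": "degree_date",
    "da": "degree_date",
    "date": "degree_date",
    "yr": "degree_date",
    "year": "degree_date",
    "keyword": "keyword",
}

# Matches the style field(search term), tolerating stray spaces.
QUERY_PATTERN = re.compile(r"^\s*(\w+)\s*\(\s*(.+?)\s*\)\s*$")


def parse_fielded_query(query):
    """Split 'field(search term)' into (canonical_field, term)."""
    match = QUERY_PATTERN.match(query)
    if match is None:
        raise ValueError(f"not a fielded query: {query!r}")
    alias = match.group(1).lower()
    term = match.group(2)
    if alias not in FIELD_ALIASES:
        raise ValueError(f"unknown field: {alias!r}")
    return FIELD_ALIASES[alias], term
```

With this table, "yr(1996)" and "ddt(1996)" both resolve to the same field, so users can keep whichever convention they already know.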
Reusable Result Sets
Many traditional online information products allow searchers to build
sets of results that can be reused. In this example, we've ANDed
together the two sets that we've already found, and could in turn
combine this result with other sets during the iterative process of
searching.
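One way to picture reusable result sets is as numbered sets of document identifiers that can be intersected on demand. The sketch below is a minimal illustration under that assumption; the class SearchSession and its method names are hypothetical.

```python
class SearchSession:
    """Keeps numbered result sets so they can be combined later."""

    def __init__(self):
        self.sets = []

    def save(self, doc_ids):
        """Store a result set and return its 1-based set number."""
        self.sets.append(frozenset(doc_ids))
        return len(self.sets)

    def combine_and(self, first, second):
        """AND two saved sets together and save the result as a new set."""
        result = self.sets[first - 1] & self.sets[second - 1]
        return self.save(result)


session = SearchSession()
set_one = session.save({101, 102, 103})    # hypothetical document IDs
set_two = session.save({102, 103, 104})
set_three = session.combine_and(set_one, set_two)
```

Because the combined set is saved like any other, it can in turn be ANDed with later sets as the search is refined.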
Because this advanced interface supports so many different types of
searching, we provided a substantial help page to assist users. For users of
common browsers, the help page launches in a separate browser
window so that users don't need to exit the search interface to get help.
Searching and Browsing Systems Should Be Closely Integrated
As we mentioned earlier, users typically need to switch back and forth
between searching and browsing. In fact, users often don't know if they need
to search or browse in the first place. Therefore, these respective systems
shouldn't live in isolation from one another.
When we redesigned the Argus Clearinghouse, we integrated these two
elements on a single page called Search/Browse, shown in Figure 6-4. This
combined interface to searching and browsing makes it clear to the user
what he or she can do there. The search/browse approach can be extended by
making search and browse options available on the search results page as
well, especially on null results pages, when a user might be at a dead end
and needs to be gently led back into the process of iterative searching and
browsing before frustration sets in.
Searching Should Conform to the Site's Look and Feel
Search engine interfaces, and more importantly, retrieval results, should look
and behave like the rest of your site. This advice may seem painfully
obvious, but because many search engines are packaged as ready-to-go add-
ons to a site, site developers don't bother to customize them. For example,
the interface and results produced by the Excite search engine are easy to
detect. In fact, they look and work so similarly from site to site that it's easy
to forget that they are actually parts of individual sites. A search
interface that hasn't been customized stands out from its host site, while a
customized one can be integrated with the rest of the site's look and feel.
It should be mentioned that some search engines, like AltaVista, don't allow
you to modify search and retrieval results pages.
Search Options Should Be Clear
We all pay lip service to the need for user documentation, but with
searching, it's really a must. Because so many different variables are
involved with searching, there are many opportunities for things to go
wrong. On a Help or Documentation page, consider letting the user know the
following:
1. What is being searched. Users often assume that their search query is
being run against the full text of every page in your site. Instead your
site may support fielded searching (as in the UMI example above), or
another type of selective searching (see "Indexing the Right Stuff"
later in this chapter). If they're curious, users should be able to find
out exactly what they are searching.
2. How they can formulate search queries. What good is it to build in
advanced querying capabilities if the user never knows about them?
Show off the power of your search engine with excellent real life
examples. In other words, make sure your examples actually work and
retrieve relevant documents if the user decides to test them.
3. User options. Can the user do other neat things such as changing the
sorting order of retrieval results? Show them off as well!
4. What to do if the user can't find the right information. It's important to
provide the user with some tricks to handle the following three
situations:
a. "I'm getting too much stuff."
b. "I'm not getting anything."
c. "The stuff I'm getting stinks!"
For case (a), you might suggest approaches that narrow the retrieval
results. For example, if your system supports the Boolean operator
AND, suggest that users combine multiple search terms with an AND
between them (ANDing together terms reduces retrieval size).
If they are retrieving zero results, as in case (b), suggest the operator
OR, the use of multiple search terms, the use of truncation (which will
retrieve a term's variants), and so on.
If they are completely dissatisfied with their searches, case (c), you
might suggest that they contact someone who knows the site's content
directly for custom assistance. It may be a resource-intensive
approach, but it's a far superior last resort to ditching the user without
helping them at all.
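To see why AND narrows a retrieval while OR and truncation broaden it, consider this small sketch built on a toy inverted index. The documents and function names are invented for illustration; a real search engine's operators may behave differently in detail.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_and(index, terms):
    """Documents containing every term (narrows the retrieval)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, terms):
    """Documents containing at least one term (broadens the retrieval)."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


def search_truncated(index, stem):
    """Documents containing any word that starts with the stem."""
    stem = stem.lower()
    return set().union(
        *(ids for word, ids in index.items() if word.startswith(stem))
    )


# Hypothetical three-page site.
documents = {
    1: "library catalog search",
    2: "library hours",
    3: "searching the catalog",
}
index = build_index(documents)
```

Here ANDing library and catalog retrieves a single document, ORing them retrieves all three, and truncating to search picks up both search and searching.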
Choose a Search Engine That Fits Users' Needs
At this point, you ideally will know something about the sorts of searching
capabilities that your site's users will require (not to mention what your
budget will allow!). So select a search engine that satisfies those needs as
much as possible. For example, if you know that your site's users are already
very familiar with a particular way of specifying a query, such as the use of
Boolean operators, then the search engine you choose should also support
using Boolean operators. Does the size of your site suggest that users will
get huge retrieval results? Be sure that your engine supports techniques for
whittling down retrieval sizes, such as the AND and NOT operators, or that
it supports relevance-ranked results that list the most relevant results at the
top. Will users have a problem with finding the right terms to use in their
search queries? Consider building in a thesaurus capability (AltaVista's
SearchWizard (http://altavista.digital.com/av/lt/help.html) is a common
example) or synonym table so that a query for the term car may retrieve
documents with the term automobile. As the market for search engines booms,
more and more interesting options will be packaged with these tools; let your users' needs
be the major factor that guides your choice.
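A synonym table can be as simple as a lookup that expands each query term before the search runs. The sketch below assumes a hand-built table; the SYNONYMS contents and the function expand_query are illustrative only and do not describe how any particular engine implements its thesaurus.

```python
# Hypothetical synonym table; a real one would be built from your site's
# vocabulary and your users' actual queries.
SYNONYMS = {
    "car": {"automobile"},
    "automobile": {"car"},
}


def expand_query(terms, synonyms=SYNONYMS):
    """Return the query terms plus their synonyms, without duplicates."""
    expanded = []
    for term in terms:
        key = term.lower()
        for candidate in [key, *sorted(synonyms.get(key, ()))]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

The expanded terms would then be ORed together, so a query for car also retrieves documents that mention only automobile.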
Finding a Search Engine
Okay, you've decided you want to provide a search engine for your web site.
Where do you get one?
There are several commercial solutions for web site indexing. Lycos licenses
its search engine technology for individual web sites. So does Infoseek.
Excite for Web Servers, or EWS, is a free version of the Excite search engine.
You can get it from http://www.excite.com/navigate/. The only requirement is
that you include a link back to their web site.
Other freeware search engines include Glimpse
(http://glimpse.cs.arizona.edu:1994/) and SWISH (Simple Web Indexing
System for Humans) (http://www.eit.com/software/swish/).
Display Search Results Sensibly
You can configure how your search engine displays search results in many
ways. There is no right way to do it. How you configure your search engine's
results depends on two factors.
The first factor is the degree of structure your content has. What will your
search engine be able to display besides just the titles of retrieved
documents? Is your site's content sufficiently structured so that the engine
can parse out and display such information as an author, a date, an abstract,
and so on?
The other factor is what your site's users really want. What sorts of
information do they need and expect to be provided as they review search
results?
When you are configuring the way your search engine displays results, you
should consider these issues:
1. How much information should be displayed for each retrieved
document?
A simple rule is to display less information per result when you
anticipate large result sets. This will shorten the length of the results
page, making it easier to read. Another rule is to display less
information to users who know what they're looking for, and more
information to users who aren't sure what they want. (Based on your
initial research and assumptions about who will be using your site,
you should be able to make at least an intelligent guess as to which
types of users your site should support.)
When it's hard to distinguish retrieved documents because of a
commonly displayed field (such as the title), show more information
to help the user differentiate the results. Consider allowing the user to
choose how much information should be displayed. The Ann Arbor
District Library, for example, allows users to display retrieval results
in three different modes, thus allowing the same tool to serve users
with varying information needs; see Figure 6-7.
2. What information should be displayed for each retrieved document?
Which fields you show for each document obviously depends on
which fields are available in each document (i.e., how structured your
content is). What your engine displays also depends on how the
content is to be used. Users of phone directories, for example, want
phone numbers first and foremost. So it makes sense to show them the
information from the phone number field on the results page (see
Figures 6-8 and 6-9). Lastly, the amount of space
available on a page is limited: you can't have each field displayed, so
you should choose carefully, and use the space that is available wisely.
3. How many retrieved documents should be displayed?
How many documents are displayed depends on the preceding two
factors. If your engine displays a lot of information for each retrieved
document, you'll want to consider a smaller size for the retrieval set,
and vice versa. Additionally, the user's monitor resolution and browser
settings will affect the amount of information that can be displayed
individually. Your best bet is to provide a variety of settings that the
user can opt to select based on his or her own needs, and always let
the user know the total number of retrieved documents.
4. How should retrieved documents be sorted?
Common options for sorting retrieval results include:
o in chronological order
o alphabetically by title, author, or other fields
o by an odd thing called relevance
Certainly, if your site is providing access to press releases or other
news-oriented information, sorting by reverse chronological order
makes good sense. Chronological order is less common, and can be
useful for presenting historical data.
Alphabetical sorts are a good general purpose sorting approach (most
users are familiar with the order of the alphabet!). Alphabetical sorting
works best if initial articles such as a and the are omitted from the sort
order (certain search engines provide this option). Users will find this
helpful as they are more likely to look for The Naked Bungee Jumping
Guide under N rather than T.
Relevance is an interesting concept; when a search engine retrieves
2,000 documents, isn't it great to have them sorted with the most
relevant at the top, and the least relevant at the bottom? Well,
certainly, if this actually would work. Relevance ranking algorithms
(there are many flavors) are typically determined by some
combination of the following: how many of the query's terms occur in
the retrieved document; how many times those terms occur in that
document; how close to each other those terms occur (e.g., are they
adjacent, in the same sentence, or in the same paragraph?); and where
the terms occur (e.g., a document with the query term in its title is
more likely to be relevant than a document with the query term in its
body).
Relevance ranking is certainly confusing if you're responsible for
configuring the search engine, and probably more so for users. Different relevance
ranking algorithms make sense for different types of content, but with
most search engines, the content you're searching is apples and
oranges. So, for example, a retrieval might rank Document A higher
than Document B, but Document B is definitely more relevant. Why?
Because Document B is a bibliographic citation to a really relevant
work, but Document A is a long document that just happens to contain
many instances of the terms in the search query.
Many search engines use counterintuitive sorting approaches by default,
including when the file was last updated or indexed (a variant of
chronological ordering), or what physical directory the file resides in. Avoid
these defaults; they are obtuse and will confuse the user. Whatever approach
you use, make the ranking order clear to users by making the sort field a
prominent part of each result. Consider shifting the decision about which
sort is most useful to users by letting them select their own sorting
option.
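Omitting initial articles from an alphabetical sort, as suggested above, takes only a small sort key. This sketch assumes English titles; the article list (we have added "an" alongside "a" and "the") and every title except The Naked Bungee Jumping Guide are our own illustrations.

```python
# Initial articles to ignore when sorting (an assumed list for English).
ARTICLES = ("a ", "an ", "the ")


def sort_key(title):
    """Lowercase the title and drop one leading article, if present."""
    lowered = title.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered


titles = [
    "The Naked Bungee Jumping Guide",
    "Origami Basics",        # hypothetical title
    "A Map of Michigan",     # hypothetical title
]
sorted_titles = sorted(titles, key=sort_key)
```

The displayed titles keep their articles; only the ordering ignores them, so The Naked Bungee Jumping Guide files under N.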
More About Relevance
Let's say you're interested in knowing what the New Jersey sales tax is.
Maybe you're driving through on a trip, and want to know if you should stop
at an outlet mall or wait until you get to Pennsylvania, where you know the
sales tax. So you go to the State of New Jersey web site and search on sales
tax (see Figure 6-11).
As you can see, these documents are almost exactly the same. Both have
very similar titles, and neither uses hidden <META> tags to prejudice the
ranking algorithm. Finally, both documents mean essentially the same thing,
differing only in that one deals with businesses and the other with individual
consumers. The only apparent difference? While sales and tax appear within
<TITLE> and <H1> tags of both documents, they appear in the body of only
the first document, not in the second. The search engine probably adds 2% to
the score of the first document for this reason. Probably, because, as the
algorithm isn't explained, we don't know for sure if this is the correct
explanation.
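Since the site's algorithm isn't explained, we can only guess at it, but a toy scoring function shows how terms in the body could nudge one document ahead of a near twin. The weights and the function relevance_score below are purely hypothetical, and the sketch covers only three of the factors listed earlier: how many query terms occur, how often, and whether they occur in the title.

```python
def relevance_score(query_terms, title, body, title_weight=3.0):
    """Score a document; every weight here is an illustrative guess."""
    title_words = title.lower().split()
    body_words = body.lower().split()
    score = 0.0
    for term in {t.lower() for t in query_terms}:
        in_title = title_words.count(term)
        in_body = body_words.count(term)
        if in_title or in_body:
            score += 1.0  # one point for each query term present at all
        # Title occurrences count more than body occurrences.
        score += title_weight * in_title + in_body
    return score


query = ["sales", "tax"]
# Two hypothetical documents with identical titles.
score_a = relevance_score(query, "Sales Tax Guide", "sales tax applies to sales")
score_b = relevance_score(query, "Sales Tax Guide", "information for consumers")
```

Both documents match in their titles, but only the first repeats the terms in its body, so it scores slightly higher.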
Always Provide the User with Feedback
When a user executes a search, he or she expects results. Usually, a query
will retrieve at least one document, so the user's expectation is fulfilled. But
sometimes a search retrieves zero results. Let the user know by creating a
different results page specially for these cases. This page should make it
painfully clear that nothing was retrieved, and give an explanation as to why,
tips for improving retrieval results, and links to both the Help area and to a
new search interface so the user can try again (see Figure 6-14).
Other Considerations
You might also consider including a few easy-to-implement but very useful
things in your engine's search results:
• Repeat back the original search query prominently on the results page.
As users browse through search results, they may forget what they
searched for in the first place. Remind them. Also include the query in
the page's title; this will make it easier for users to find it in their
browser's history lists.
• Let the user know how many documents in total were retrieved.
Users want to know how many documents have been retrieved before
they begin reviewing the results. Let them know; if the number is too
large, they should have the option to refine their search.
• Let the user know where he or she is in the current retrieval set.
It's helpful to let users know that they're viewing documents 31-40 of
the 83 total that they've retrieved.
• Always make it easy for the user to revise a search or start a new one.
Give them these options on every results page, and display the current
search query on the Revise Search page so they can modify it without
reentering it.
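Several of these suggestions (repeating the query, reporting the total, showing the user's position in the set, and flagging zero results) can be combined in a single summary line. The sketch below is one possible wording; the function results_summary and its parameters are hypothetical.

```python
def results_summary(query, total, page, per_page=10):
    """Build the status line shown at the top of a results page."""
    if total == 0:
        # Null results get their own unmistakable message.
        return f'No documents matched "{query}".'
    first = (page - 1) * per_page + 1
    last = min(page * per_page, total)
    return f'Documents {first}-{last} of {total} for "{query}"'
```

For example, results_summary("circulation", 83, 4) reports documents 31-40 of 83 and repeats the query; the same string could also be used in the page's title.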
In an Ideal World: The Reference Interview
Obviously, searching can get pretty complex, and many pitfalls can prevent a
user from achieving success. So how does it get done in the non-Web world,
and can we learn anything from it?
In the real world, reference librarians and other information professionals
often make the difference. In fact, without them, civilization would creak to
a grinding halt. They are better than anyone else at finding information
because they break up what seems to be a huge, complex information need
into simpler, more digestible components by conducting a reference
interview that is designed to learn more about the information need and its
context (unless, of course, you're just looking for the bathroom or the
copiers!).
Before you get spooked by the term reference interview, consider that you
probably have been through quite a few of them yourself. When you go to
the library and ask someone behind the reference desk a question, they'll
probably respond with an open question, such as "Can you tell me a little
more about how you'll be using this information?" The interview will often
continue with more specific questions, such as "Do you need this
information for business (or school, a dissertation, personal enjoyment,
etc.)?" "Do you need it right away (or can we take some time to do some
more involved searching or interlibrary loan for it)?" "Are you looking for
something at no cost (or would you like us to do a literature search in some
commercial databases like LEXIS/NEXIS or DIALOG)?" "Are you looking
for a few items (or do you need all there is)?" and so on. These interactive
iterations help the librarian understand what you're looking for, and
may also help you better understand your own needs by forcing you to
articulate them. In effect, both you and the librarian engage in associative
learning about the information need. Associative learning comes naturally to
humans, but is extremely difficult for software systems to handle.
Can a web site do what a reference librarian does? Well, sort of, but not
quite. We've already covered a sample of the variation found in users and
their information needs, and we know that well-architected sites can largely
address these needs. If we can determine the major needs of our sites' users
and take steps to address them, then perhaps we'll cover 80% of all possible
search queries. That would be wonderful, as most sites probably don't do
half that well. But that other 20%, the really tricky stuff, can't be handled by
automated means like a web site. You really do need humans to help out in
those situations, because only humans are really good at figuring out context
and knowing the right questions to ask. Don't hold your breath for this issue
to be solved by an automated approach, such as with an intelligent agent.
Instead, consider making someone in your organization (maybe the librarian,
if your organization employs one) responsible for handling the tough
queries, and make sure your site actively seeks feedback and directs it to
those human information specialists.
Indexing the Right Stuff
So, let's get back to whether you need a search engine. Let's assume that you
do intend to slap a search engine on top of your web site. Shouldn't be a
problem right? Just point the indexer at the directory where all the pages
live, and, voilà! Searchable site!
Of course, you knew it wasn't that simple. Searching only works well when
the stuff that's being searched is the same as the stuff that users want. This
means you may not want to index the entire site. We'll explain.
6.5.1. Indexing the Entire Site
Search engines are frequently used to index an entire site without regard for
the content and how it might vary -- every word of every page, whether it
contains real content or help information, advertising, navigation menus, and
so on.
However, searching works much better when the information space is
defined narrowly and contains homogeneous content. In other words, the
more you search through indices that combine apples and oranges, the
worse your retrieval results will be. After all, when you search a site, you're
probably looking for apples only, not oranges. As already discussed, a site's
content is usually a mix of apples, oranges, kumquats, bell peppers,
chainsaws, and Barbie dolls to begin with. So, when you tell your search
engine to index your entire site, the site's users will be performing searches
against all kinds of stuff -- navigation, destination, and other kinds of pages
-- all at once. What they retrieve can often be ugly.
Let's try an example to see what happens. Searching Netscape's site for
plug-ins, what do we find? Exactly 100 documents.[16] Of these:
[16] Search done on February 2, 1997.
• 58 documents are Welcome to Netscape Navigator version X.X pages
for just about every version of Netscape Navigator and include
information about plug-ins.
• 16 documents are in German (a language I don't read).
• 6 documents contain the potentially relevant term application in their
titles, but 5 of these 6 have exactly the same title (Netscape
Handbook: Application Features).
• 2 documents actually contain plug-in in their titles.
• 18 other assorted documents may be relevant, but are not labeled in a
way that indicates whether this is the case.
Analyzing these search results, we find two common problems. First, we are
presented with documents that clearly don't belong. If the site had been
selectively indexed with audience differences in mind, 16% of the results
would not have been displayed at all. Second, regarding relevant documents,
it's not clear why we need 58 versions of the same type of document. It
would have been useful to index pages more selectively, such as files
relevant to Windows or Macintosh users, or recent versions versus older
versions of the software. Are very many people still interested in old
Netscape Beta versions? So, our search is less successful than it could have
been; it gave us a lot of irrelevant documents, and too many that could be
relevant.
Our search performed poorly because all the content in the site was indexed
together. By doing so, the site's architects chose to ignore two very important
things: that the information in their site isn't all the same, and that it makes
good sense to respect the lines already drawn between different types of
content. For example, it's clear that German and English content are vastly
different and that their audiences overlap very little (if at all), so why not
create separately searchable indices along those divisions?
The site designers at Netscape are already doing this, in a limited way. They
have put a lot of effort into helping you download the right version of the
software from the nearest location. To download the software, you get asked
several questions (not unlike those in a reference interview). Shown in
Figure 6-15, the site asks the user:
• What operating system does your computer use?
• What language do you speak?
• Which of our products do you need?

Figure 6-15. Three pull-down menus perform a brief reference interview
sufficient to help users download the appropriate software
product.
The result is a list of links to download sites that provide the user the right
information (i.e., software appropriate to the user's platform), taking into
account his or her geographic location and language. Why not apply this
same careful approach to matching users with the right information to the
entire site, instead of just to this specific situation?
6.5.2. Search Zones: Selectively Indexing the Right Content
Search zones are subsets of a web site that have been indexed separately
from the rest of the site's content. When you search a search zone, you have,
through interaction with the site, already identified yourself as a member of
a particular audience or as someone searching for a particular type of
information. The search zones in a site match those specific needs, and the
result is improved retrieval performance. The user is simply less likely to
retrieve irrelevant information.
The Microsoft site has a good example of search zone use. Although this site
suffers from other searching problems, it compares favorably to the
Netscape site when searching for our old stand-by, plug-ins. On the search
page you're asked where you want to search in the Microsoft site, and are
provided with the options on a pull-down menu (Figure 6-16).
Figure 6-16. Microsoft's site employs search zones to help focus the
user's search before submitting a query to the search engine.
You've got many options to review, but you can quickly find the Internet
Explorer area of the site where you'd want to look for plug-ins. Consider
how well the effort the user expends in reviewing and selecting from this
menu compares to the much greater effort of searching the entire site and
then sifting through a tremendously larger retrieval set. Also note the Full
Site Search option; sometimes it does make sense to maintain an index of the
entire site, especially for users who are unsure where to look, who are doing
a comprehensive leave-no-stones-unturned search, or who just haven't had
any luck searching the more narrowly defined indices.
How is search zone indexing set up? It depends on the search engine
software used. Most support the creation of search zones, but some provide
interfaces that make this process easier, while others require you to manually
provide a list of pages to index. In either case, search zone indexing requires
more work on your part than simply pointing the search engine at the entire
site: you'll need to review and mark each page that should be indexed. To
make this easier, you might design your site so that pages that should be
indexed together are located in the same directory; that way, you would
mark for indexing a directory (and, implicitly, its contents) instead of its
individual pages. You may also be working with pages that are generated
from a database. In this case, you could design the database to include a field
for each record denoting which index the generated page should belong to.
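If pages that belong together live in the same directory, assigning them to search zones can be largely mechanical. The following sketch assumes a mapping from top-level directory to zone name; the paths, zone names, and the function build_zones are all hypothetical, and a real search engine would have its own configuration format for this.

```python
from collections import defaultdict


def build_zones(page_paths, zone_by_directory):
    """Group pages into search zones by their top-level directory.

    Every page also goes into a "full_site" zone so that a search of the
    entire site remains available as a fallback.
    """
    zones = defaultdict(list)
    for path in page_paths:
        top_level = path.strip("/").split("/")[0]
        zone = zone_by_directory.get(top_level)
        if zone is not None:
            zones[zone].append(path)
        zones["full_site"].append(path)
    return dict(zones)


# Hypothetical site layout.
pages = [
    "/products/browser.html",
    "/products/plugins.html",
    "/support/faq.html",
    "/index.html",
]
zones = build_zones(pages, {"products": "products", "support": "support"})
```

For database-generated pages, the same grouping could read the zone from a field on each record instead of from the path.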
You can create search zones in many ways. Examples of four common
approaches are:
• by content type
• by audience
• by subject
• by date
Note that these approaches are similar to the organization schemes discussed
in Chapter 3, "Organizing Information". The decisions you made in selecting
your site's organization scheme will often work for determining search zones
as well. You could also try other ways; the most important consideration is
to choose an approach appropriate to your site's audiences and their
information needs.
6.5.2.1. Apples and apples: indexing similar content types
Most web sites contain, at minimum, two major and dissimilar types of
pages: navigation and destination. Destination pages contain the actual
information you want from a web site: sport scores, book reviews, software
documentation, and so on. The primary purpose of a site's navigation pages
is to get you to the destination pages. Navigation pages may include main
pages, search pages, and pages that help you browse a site.
When a user searches a site, he or she is generally looking for destination
pages. If navigation pages are part of the retrieval, they will just clutter up
the retrieval results. In fact, the reason that the user is searching rather than
browsing some other way could be because the navigation system is
performing poorly in the first place. So why keep showing the user
navigation pages that don't work and aren't relevant to the search?
Let's take a simple example: your company sells computer products via its
web site. The destination pages consist of descriptions, pricing, and ordering
information, one page for each product. Also, a number of navigation pages
help users find products, such as listings of products for different platforms
(e.g., Macintosh versus Windows), listings of products for different
applications (e.g., word processing, bookkeeping), listings of business
versus home products, and listings of hardware versus software products. If
the user is searching for Intuit's Quicken, what's likely to happen? Instead of
simply retrieving Quicken's product page, they might get all these pages:
Financial Products Index Page
Home Products Index Page
Macintosh Products Index Page
Quicken Product Page
Software Products Index Page
Windows Products Index Page
The user retrieves the right destination page (i.e., the Quicken Product Page),
but also five more that are purely navigation pages. In other words, 83% of
the retrieval is in the way. And keep in mind that this example is simple;
what if the user had to ignore 83% of a much larger retrieval set, say, 200
documents?
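The arithmetic behind the 83% figure, and the effect of leaving navigation pages out of the index, can be checked with a few lines. The page-type labels below are our own tagging of the six retrieved pages, and the function names are hypothetical.

```python
# The six pages retrieved for the Quicken search, tagged by page type.
results = [
    ("Financial Products Index Page", "navigation"),
    ("Home Products Index Page", "navigation"),
    ("Macintosh Products Index Page", "navigation"),
    ("Quicken Product Page", "destination"),
    ("Software Products Index Page", "navigation"),
    ("Windows Products Index Page", "navigation"),
]


def destination_only(results):
    """Keep only the pages a searcher actually wants to land on."""
    return [title for title, kind in results if kind == "destination"]


def clutter_percent(results):
    """Share of the retrieval made up of navigation pages, in percent."""
    navigation = sum(1 for _, kind in results if kind == "navigation")
    return round(100 * navigation / len(results))


filtered = destination_only(results)
clutter = clutter_percent(results)
```

Five of the six results are navigation pages, which is where the 83% comes from; with those pages excluded, only the Quicken Product Page remains.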
Of course, indexing similar content isn't always easy, because "similar" is a
highly relative term. It's not always clear where to draw the line between
navigation and destination pages. In some cases, a page can be considered
both. For example, we tried the approach described here for the SIGGRAPH
96 Conference web site.[17] We found that some pages didn't really fit the
navigation/destination breakdown. For example, the Exhibition Hall Map
page appears to be navigation. It links to pages for each of the five sections
of the hall. These five pages appear to be destination, presenting detailed
maps of their respective sections, including booth numbers and the names of
exhibitors. But their parent page also provides important information, such
as where the hall entrances are, and where the five sections are in relation to
one another. So isn't the main Exhibition Hall Map page destination as well
as navigation? The best solution, in this particular case, was to index these
hybrid pages, but it wasn't ideal.
[17] This site evolved greatly during the year leading up to SIGGRAPH 96,
and then some after the conference was complete. The fullest version of this
site is archived at http://siggraph.anecdote.com/conferences/siggraph96.
The more important lesson from this experience was to test out the
navigation/destination distinctions before actually applying them. The
weakness of the navigation/destination approach is that it is essentially an
exact organization scheme (discussed in Chapter 3, "Organizing
Information") which requires the pages to be either one thing (in this case
destination) or another (navigation). In the following three approaches, the
organization schemes are ambiguous, and therefore more forgiving of
pages that fit into multiple categories.
6.5.2.2. Who's going to care? Indexing for specific audiences
If you've already decided to create an architecture for your site that uses an
audience-oriented organization scheme, it may make sense to create search
zones by audience breakdown as well. We found this a useful approach for
the original Library of Michigan web site.
The Library of Michigan has three primary audiences: members of the
Michigan state legislature and their staffs, Michigan libraries and their
librarians, and the citizens of Michigan. The information needed from this
site is different for each of these audiences; for example, each has a very
different circulation policy. Why would a state legislator care how long a
citizen can check a book out for?
So we created four indices: one for the content relevant to each audience,
and one unified index of the entire site in case the audience-specific indices
didn't do the trick for a particular search. Here are the results from running a
query on the word circulation against each of the four indices:
Index              Number of Documents Retrieved   Retrieval Reduced By
Unified            40                              -
Legislature Area   18                              55%
Libraries Area     24                              40%
Citizens Area      9                               78%
As with any search zone, less overlap between indices improves
performance. If the sizes of retrieval results were reduced by a very small
figure, let's say, 10% or 20%, it may not be worth the overhead of creating
separate audience-oriented indices. But in this case, much of the site's
content is specific to one of the audiences.
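The "Retrieval Reduced By" figures can be checked with a few lines of Python. This is only a sketch: the function names and the 20% cut-off are our own assumptions (the cut-off is borrowed from the rough "10% or 20%" figure mentioned above), while the document counts are the ones from the circulation query.

```python
UNIFIED_HITS = 40  # documents retrieved from the unified index

AUDIENCE_HITS = {
    "Legislature Area": 18,
    "Libraries Area": 24,
    "Citizens Area": 9,
}


def reduction(zone_hits, unified_hits):
    """Share of the unified retrieval that a search zone filters out."""
    return 1 - zone_hits / unified_hits


def worth_a_separate_zone(zone_hits, unified_hits, threshold=0.20):
    """Hypothetical rule of thumb: keep a zone only if it trims the
    retrieval by more than the threshold."""
    return reduction(zone_hits, unified_hits) > threshold


for name, hits in AUDIENCE_HITS.items():
    saved = reduction(hits, UNIFIED_HITS)
    verdict = "keep" if worth_a_separate_zone(hits, UNIFIED_HITS) else "drop"
    print(f"{name}: {hits} documents, reduced by {saved:.1%} ({verdict})")
```

The Citizens Area works out to 77.5%, which the table rounds up to 78%.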
6.5.2.3. Drilling down: Indexing by subject
If your site uses a strong subject-oriented or topical organization scheme,
you've already distinguished many of the site's search zones. Yahoo! is
perhaps the most popular site to employ subject-oriented search zones.
Every subject category and subcategory in Yahoo! can be searched
individually. For example, let's say you're looking for sites that deal with
science fiction movies. If you search for science fiction against the whole
Yahoo! search index, you'll retrieve a lot of stuff: 35 category and
subcategory matches and 816 site matches. But you're not looking for
science fiction in general; you're looking for science fiction movies. So,
instead you can run the same science fiction search against the index for the
Yahoo! subcategory Movies and Films. This time you'll be happier with your
retrieval: 2 category and subcategory matches and 19 site matches. This is
another excellent example of how hierarchical search zones allow for
increased specificity, and therefore improved retrieval results.
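A hierarchical search zone can be sketched as a search restricted to one branch of the subject tree. The Python below is a toy illustration under our own assumptions: the site titles, the category paths, and the search function are all invented for the example and do not describe how Yahoo! actually implements its indices.

```python
# Each entry pairs a site title with its (hypothetical) category path.
SITES = [
    ("Science Fiction Weekly", ("Arts", "Science Fiction")),
    ("Classic Science Fiction Films", ("Entertainment", "Movies and Films")),
    ("Science Fiction Book Club", ("Arts", "Literature")),
    ("Film Noir Archive", ("Entertainment", "Movies and Films")),
]


def search(term, zone=()):
    """Return titles containing the term, limited to sites whose category
    path starts with the given zone. An empty zone searches everything."""
    term = term.lower()
    zone = tuple(zone)
    return [
        title
        for title, path in SITES
        if term in title.lower() and path[: len(zone)] == zone
    ]


print(search("science fiction"))
print(search("science fiction", zone=("Entertainment", "Movies and Films")))
```

The same query returns three titles against the whole index but only one against the Movies and Films zone, which is the narrowing effect described above.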
6.5.2.4. Yesterday's news: Indexing recent content
Chronologically organized content allows for perhaps the easiest
implementation of search zones. (Not surprisingly, it's probably the most
common example of search zones.) Because dated materials are generally
not ambiguous, indexing them by date is straightforward.
News.Com is a great example (Figure 6-17); it supports highly flexible
chronological searching by:
Date Range (e.g., from 5/20/97 to 6/26/97)
3 Days Back
7 Days Back
14 Days Back
21 Days Back
30 Days Back
60 Days Back
90 Days Back
Figure 6-17. News.com's search interface uses two components (Date
range and Number of days back) to allow for powerful
chronological searching.
Regular users can return to the site and check up on the news depending on
how regularly they use the site (e.g., every week, two weeks, three weeks).
Users who are looking for news during a particular date range can
essentially generate a custom search zone on the fly. The only negative in
News.Com's implementation is that they don't seem to support a search
against all news articles, regardless of age.[18]
[18]There does seem to be a work-around to this problem: leave the pull-down
menu on the default setting of Days back, and the resulting retrieval
seems larger than 90 days. But this is simply a guess...
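The two chronological components, a date range and a number of days back, reduce to the same filter. The following Python sketch assumes a tiny in-memory list of dated articles; the article titles, dates, and function names are hypothetical and say nothing about how News.Com built its own search.

```python
from datetime import date, timedelta

# Hypothetical articles: (title, publication date).
ARTICLES = [
    ("Article A", date(1997, 6, 25)),
    ("Article B", date(1997, 6, 10)),
    ("Article C", date(1997, 5, 21)),
    ("Article D", date(1997, 3, 1)),
]


def by_date_range(articles, start, end):
    """Custom search zone: articles published from start to end, inclusive."""
    return [title for title, published in articles if start <= published <= end]


def days_back(articles, days, today):
    """Preset search zone: articles from the last `days` days up to today."""
    return by_date_range(articles, today - timedelta(days=days), today)


today = date(1997, 6, 26)
print(days_back(ARTICLES, 3, today))
print(days_back(ARTICLES, 30, today))
print(by_date_range(ARTICLES, date(1997, 5, 20), date(1997, 6, 26)))
```

Because "days back" is just a date range that ends today, searching all articles regardless of age would only require a range with no start date.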
To Search or Not To Search?
It's becoming a moot question whether to apply a search engine in your site.
Jared Spool's studies demonstrate how important searching systems are to
users. Although their subjects weren't told to use a site's search engine to
find answers, "about one-third of the people we tested usually tried a search
as their initial strategy, and others resorted to it when they couldn't find an
answer by following links" (browsing).[19] Users generally expect searching
to be available, certainly in larger sites. Yet, we all know how poorly many
search engines actually work. They're easy to set up and easy to forget about.
That's why it's important to understand how users' information needs can
vary so much, and to plan and implement your searching system's interface
and search zones accordingly.
Conceptual Design
Based upon information gathered during the research phase, you must now
create order out of chaos. Is there a metaphor that will drive the organization
of the site? How should the information be organized and labeled at the
highest levels of the hierarchy? What types of navigation systems will be
applied? How will searching work? This is where the fun begins.
Early conceptual design meetings focus on metaphor and high-level
organization. You need to present possible organization schemes, balancing
the desire to reach consensus and move forward with the need to remain
open-minded about alternate approaches. White boards and flip charts, high-
level architecture blueprints, and scenarios are key tools at this stage. After
the major issues have been worked out, later meetings involve the
consideration of more detailed organization, labeling, indexing, and
navigation systems. Detailed blueprints and Web-based prototypes will serve
you well in these discussions.
8.1. Brainstorming with White Boards and Flip Charts
For collaborative purposes, white boards are unparalleled. The ephemeral
nature of white board scribblings permits a creative freedom not found in
other media. The technology disappears and inhibitions fall away.
In early research-oriented meetings, white boards support collaboration
around the definition and refinement of the mission, vision, and goals of the
project. When working with several people from the organization, each with
a different set of experiences, perspectives, and goals, you can use the white
board to help identify issues, resolve differences, and achieve consensus.
White boards are also useful for considering possible information
architectures. Presenting ideas on the white board triggers new
understanding and further brainstorming (see Figure 8-1). The white board,
the architect, and colleagues become connected in a feedback cycle that
moves towards the articulation of an information architecture.
Figure 8-1. Sample white board scribblings
At face value, a major problem with white boards is the difficulty
of recording a white-boarding session. White board scribblings do not leave
a permanent record. Ideas flow. The board fills up. The board is erased.
Eventually, everyone leaves and the scribblings remain trapped on the
surface of the white board, soon to be erased by the participants of the next
meeting.
In reality, you can use this problem to your advantage. Each time consensus
is reached, record the relevant white board scribblings. Differences of
opinion and dead-end discussions are quickly forgotten and only the
agreements remain. Alternatively, if you're not comfortable with this level of
sneakiness, you can assign a designated notetaker to record agreements and
disagreements alike.
We are aware of high-tech white boards that allow you to print or save your
scribbles. While we don't have much direct experience, we're guessing many
of these gadgets are more trouble than they're worth. Sorry for the
skepticism, but what do you expect from librarians?
While the flip chart is a close relative of the white board, several
characteristics distinguish the two. Advantages of using the flip chart during
the research phase include its high portability and intrinsic record-generating
nature: its tearaway sheets can be taken back to the office for study and
transcription. White boards are often anchored to walls and won't fit in
your car.
However, flip charts don't really support iteration and collaboration. Due to
the difficulty of erasing ink on paper and the ugliness of extensively marked-
up pages, flip charts invoke in people a higher fear of error and greater
resistance to change. When working with flip charts, people try to get it right
the first time. Whether or not they succeed, they tend to live with the results
rather than mark up the page. This limits the freedom and creativity of group
collaboration.
While the visible differences between white boards and flip charts are fairly
subtle and seemingly innocent, the ultimate impact upon the collaborative
process can be significant. For collaborative brainstorming, give us a white
board any day.
Metaphor Exploration
Metaphor can be a powerful tool for communicating complex ideas and
generating enthusiasm. By suggesting creative relationships or by mapping
the familiar onto the new, metaphor can be used to explain, excite, and
persuade. In 1992, vice-presidential candidate Al Gore popularized the term
information superhighway. This term mapped the familiar and respected
metaphor of the physical highway infrastructure of the United States onto
the new and unfamiliar concept of a national information infrastructure.
Gore used this term to excite the voters about his vision for the future. While
the term did oversimplify and has since been horribly overused, it succeeded
in helping people to begin learning about and discussing the importance and
direction of the global Internet.
Three types of metaphor can be applied in the design of web sites. These are
organizational, functional, and visual metaphors:
• Organizational metaphors leverage familiarity with one system's
  organization to convey quick understanding of a new system's
  organization. For example, when you visit an automobile dealership,
  you must choose to enter one of the following departments: new car
  sales, used car sales, repair and service, or parts and supplies. People
  have a mental model of how dealerships are organized. If you're
  creating a web site for an automobile dealership, it may make sense to
  employ an organizational metaphor that draws from this model.
• Functional metaphors make a connection between the tasks you can
  perform in a traditional environment and those you can perform in a
  new environment. For example, when you enter a traditional library,
  you can browse the shelves, search the catalog, or ask a librarian for
  help. Many library web sites present these tasks as options for users,
  thereby employing a functional metaphor.
• Visual metaphors leverage familiar graphic elements such as images,
  icons, and colors to create a connection to the new. For example, an
  online directory of business addresses and phone numbers might use a
  yellow background and telephone icons to invoke a connection with
  the more familiar print-based yellow pages.
The process of metaphor exploration can get the creative juices flowing.
Working with your clients or colleagues, begin to brainstorm ideas for
metaphors that might apply to your project. Think about how those
metaphors might apply in organizational, functional, and visual ways. How
would you organize a virtual bookstore or library or museum? Is your site
more like a bookstore or a library or a museum? What are the differences?
What tasks should users be able to perform? What should it look like? You
and your colleagues should cut loose and have fun with this exercise. You'll
be surprised by the ideas you come up with.
After this brainstorming session, you'll want to subject everyone's brilliant
ideas to a more critical review. Start populating the rough metaphor-based
architecture with random items from the expected content to see if they fit.
Try one or two user scenarios to see if the metaphor holds up. While
metaphor exploration is a useful process, you should not feel obligated to
carry all or any of the ideas forward into the information architecture. The
reality is that metaphors are great for getting ideas flowing during the
conceptual design process, but can be problematic when carried forward into
the site itself.
For example, the metaphor of a virtual community has been taken too far in
many cases. Some of these online communities have post offices, town halls,
shopping centers, libraries, schools, and police stations. Figuring out what
types of activities take place in which "buildings" can be a real challenge for
the user. In such cases, the metaphor hampers usability. As an architect, you
should ensure that any use of metaphor is empowering and not limiting (see
Figure 8-2).
Figure 8-2. The Internet Public Library uses visual and organizational
metaphors to provide access to the reference area. Users can
browse the shelves or ask a question. However, the traditional
library metaphor did not support integration of a multi-user,
object-oriented environment, or MOO. Applied in such a strong
way, metaphors can quickly become limiting factors in site
architecture and design.
You should also go into this exercise understanding that people tend to fall
in love with their own metaphors. Make sure everyone knows that this is just
an exercise and that it rarely makes sense to carry the metaphor into the
information architecture design.
Scenarios
While architecture blueprints are excellent tools for capturing an approach to
information organization in a detailed and structured way, they do not tend
to excite people. As an architect who wants to convince your colleagues of
the wisdom of your approach, you need to help them envision the site as you
see it in your mind's eye. Scenarios are great tools for helping people to
understand how the user will navigate and experience the site you design.
They will also help you think through the experience your site will provide
and may generate new ideas for the architecture and navigation system.
To provide a multidimensional experience that shows the true potential for
the site, it is best to write a few scenarios that show how people with
different needs and behaviors would navigate your site. Before beginning the
scenario, you should think about the primary intended audiences. Who are
the people that will use your site? Why and how will they want to use it?
Will they be in a rush or will they want to explore? Try to select three or four
major user types who will use the site in very different ways. Create a
character who represents each type. Give them a name, a profession, and a
reason for visiting your site, as demonstrated in the sidebar. Then, begin to
flesh out a sample session in which that person uses your site. Try to
highlight the best features of the site through your scenario. If you've
designed for a new customization feature, show how someone would use it.
This is a great opportunity to be creative. You'll probably find these scenarios to be easy
and fun to write. Hopefully, they'll help convince your colleagues to invest in your ideas.
Sample Scenario
Rosalind, a tenth grader in San Francisco, regularly visits the LiveFun Web
site because she enjoys the interactive learning experience. She uses the site
in both investigative mode and serendipity mode.
For example, when her anatomy class was studying skeletal structure, she
used the investigative mode to search for resources about the skeleton. She
found the interactive human skeleton that let her test her knowledge of the
correct names and functions of each bone. She bookmarked this page so she
could return for a refresher the night before final exams.
When she's done with homework, Rosalind sometimes surfs through the site
in serendipity mode. Her interest in poisonous snakes led her to articles about
how certain types of venom affect the human nervous system. One of these
articles led her into an interactive game that taught her about other chemicals
(such as alcohol) that are able to cross the blood-brain barrier. This game
piqued her interest in chemistry and she switched into investigative mode to
learn more.
This simple scenario shows why and how users may employ both searching
and browsing within the web site. More complex scenarios can be used to
flesh out the possible needs of users from multiple audiences.
High-Level Architecture Blueprints
The collaborative brainstorming process is exciting, chaotic, and fun.
However, sooner or later, you must hole up away from the crowd and
transform this chaos into order. Blueprints are the architect's tool of choice
for performing this transformation.
The very act of shaping ideas into the more formal structure of a blueprint
forces you to become realistic and practical. If brainstorming takes you to
the top of the mountain, blueprinting brings you back down to reality. Ideas
that seemed brilliant on the white board may not pan out when you attempt
to organize them in a practical manner. It's easy to throw around concepts
such as audience-specific gateways and adaptive information architectures.
It's not so easy to define on paper exactly how these concepts will be applied
to a specific web site.
During the conceptual design phase, high-level blueprints are most useful for
exploring primary organization schemes and approaches. High-level
blueprints map out the organization and labeling of major areas, usually
beginning with a bird's-eye view from the main page of the web site. This
exploration may involve several iterations as you further define the
information architecture. High-level blueprints are great for stimulating
discussions focused on the organization and management of content as well
as the desired access pathways for users. These blueprints can be created by
hand, but we prefer to use diagramming software such as Visio or
NetObjects Fusion. These products not only help you to quickly lay out your
architecture blueprints, but can also help with site production and
maintenance.
Figure 8-3. This high-level blueprint shows pages, components within
pages, groups of pages, and relationships between pages. The
grouping of pages can inform page layout. For example, the
three value-added guides should be presented together, whereas
Search & Browse, Feedback, and News should be presented
separately.
Let's walk through the blueprint in Figure 8-3, as we would when presenting
it to clients or colleagues. The building block of this architecture is the sub-
site. Within this company, the ownership and management of content is
distributed among many individuals in different departments. There are
already dozens of small and large web sites, each with its own graphic
identity and information architecture. Rather than try to enforce one standard
across this collection of sites, this blueprint suggests an umbrella
architecture approach that allows for the existence of lots of heterogeneous
sub-sites.
Moving up from the sub-sites, we see a directory of sub-site records. This
directory serves as a card catalog that provides easy access to the sub-sites.
There is a sub-site record for each sub-site. Each record consists of fields
such as title, description, keywords, audience, format, and topic that describe
the contents of that sub-site.
By creating a standardized record for each sub-site, we are actually creating
a database of sub-site records. This database approach enables powerful
known-item searching and more exploratory browsing. As you can see from
the Search & Browse page, users can search and browse by title, audience,
format, and topic.
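A sub-site record and the two access modes it enables can be sketched in a few lines of Python. The field names follow the list above; everything else, including the SubSiteRecord class, the search and browse functions, and the three sample records, is a hypothetical illustration rather than the system built for this client.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubSiteRecord:
    title: str
    description: str
    keywords: List[str]
    audience: str
    format: str
    topic: str


# Fields exposed on the Search & Browse page.
SEARCHABLE_FIELDS = ("title", "audience", "format", "topic")

# Invented sample records for illustration only.
RECORDS = [
    SubSiteRecord("Benefits Handbook", "Health and retirement plans",
                  ["benefits", "insurance"], "Employees", "Handbook",
                  "Human Resources"),
    SubSiteRecord("Product Catalog", "Current products and prices",
                  ["products", "pricing"], "Customers", "Catalog", "Sales"),
    SubSiteRecord("Travel Policy", "Rules for business travel",
                  ["travel", "expenses"], "Employees", "Policy",
                  "Human Resources"),
]


def _check_field(name):
    if name not in SEARCHABLE_FIELDS:
        raise ValueError(f"cannot search or browse by {name!r}")


def search(records, **criteria):
    """Known-item searching: every criterion must match its field
    (case-insensitive substring match)."""
    for name in criteria:
        _check_field(name)
    return [
        record for record in records
        if all(value.lower() in getattr(record, name).lower()
               for name, value in criteria.items())
    ]


def browse(records, field_name):
    """Exploratory browsing: group sub-site titles by one field's values."""
    _check_field(field_name)
    groups = {}
    for record in records:
        groups.setdefault(getattr(record, field_name), []).append(record.title)
    return groups


print([r.title for r in search(RECORDS, audience="employees", topic="human")])
print(browse(RECORDS, "audience"))
```

Because every sub-site is described by the same fields, a new sub-site becomes searchable and browsable as soon as its record is added.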
We also see three value-added guides. These guides take the form of simple
narratives or stories that introduce new users to the organization and to the
web site. Interwoven throughout the text of these narratives are in-context
links to selected sub-sites. They guide users through the site in an interesting
and friendly way.
Finally, we see a dynamic news billboard (perhaps implemented through
Java or JavaScript) that rotates the display of featured news headlines and
announcements. In addition to bringing some action to the main page, this
billboard provides yet another way to access important content that might
otherwise be buried within a sub-site.
At this point in the discussion of the high-level blueprint, you are sure to
have questions. As you can see, the blueprints don't completely speak for
themselves. This is why it's ideal to present these blueprints in person, so
you can answer questions and explore new ideas.
In addition, your architectural ideas may need selling. Now, we're not
suggesting that you buy a polyester suit, but an element of sales is involved.
You need to excite your clients and colleagues about your approach and
vision for the site. You need to explain the ideas behind your labeling and
organization schemes and describe how this model will support growth over
time. These challenges are difficult to address without a meeting (or at least
a telephone conference call).
However, if a meeting is simply not possible, you can accompany blueprints
with descriptive text-based documents that anticipate and answer the most
likely questions. You can then follow up with a conference call to answer the
questions you didn't anticipate and move the process along.
You should note that these high-level blueprints leave out quite a bit of
information. They focus on the major areas of the site, ignoring navigation
elements and page-level details. These omissions are by design, not by
accident. Shaping the information architecture of a complex web site is a
challenging intellectual exercise. You and your colleagues must be able to
focus on the big picture issues at hand. For these blueprints, as with the web
sites you design, remember the rule of thumb that less is more. Detailed
page-level blueprints come later in the process.
Architectural Page Mockups
Information architecture blueprints are most useful for presenting a bird's-eye
view of the web site. However, they do not work well for helping people
to envision the contents of any particular page. They are also not
straightforward enough for most graphic designers to work from. In fact, no
single format does a perfect job of conveying all aspects of an information
architecture to all audiences. Because information architectures are multi-
dimensional, it's important to show them in multiple ways.
For these reasons, architectural page mockups are useful tools during
conceptual design for complementing the blueprint view of the site.
Mockups are quick and dirty textual documents that show the content and
links of major pages on the web site. They enable you to clearly (yet
inexpensively) communicate the implications of the architecture at the page
level. They are also extremely useful when used in conjunction with
scenarios. They help people to see the site in action before any code is
written. Finally, they can be employed in some basic usability tests to see if
users actually follow the scenarios as you expect. Keep in mind that you
only need to mock up major pages of the web site. These mockups and the
designs that derive from them can serve as templates for the design of
subsidiary pages.
Figure 8-4. In this architectural mockup of a combination
search/browse page, we show an area for entering queries and an
area for browsing. We typically use a word processor like
Microsoft Word to create these mockups quickly. However, you
can also create quick and dirty HTML mockups, and even work
quite interactively with the graphic designer.
In the example in Figure 8-4, you see that mockups are easier to read than
blueprints. By integrating aspects of the organization, labeling, and
navigation systems into one view, they will help your colleagues to
understand the architecture. In laying out the content on a page mockup, you
should try to show the logical visual grouping of content items. In this
example, the search interface and the browsing options are two separate
content groups. You can also indicate prominence in these mockups. Placing
a content group at the top of the page or using a larger font size indicates the
relative importance of that content. While the graphic designer will make the
final and more detailed layout decisions, you can make a good start with
these mockups.
Design Sketches
Once you've developed high-level blueprints and architectural page
mockups, you're ready to collaborate with your graphic designer to create
design sketches on paper of major pages in the web site. In the research
phase, the design team has begun to develop a sense of the desired graphic
identity or look and feel. The technical team has assessed the information
technology infrastructure of the organization and the platform limitations of
the intended audiences. They understand what's possible with respect to
features such as dynamic content management and interactivity. And, of
course, the architect has designed the high-level information structure for the
site. Design sketches are a great way to pool the collective knowledge of
these three teams in a first attempt at interface design for the top level pages
of the site. This is a wonderful opportunity for interdisciplinary user
interface design.
Using the architectural mockups as a guide, the designer begins sketching
pages of the site on sheets of paper. As the designer sketches each page,
questions arise that must be discussed. Here is a sample sketching session
dialog:
Programmer:
I like what you're doing with the layout of the main page, but I'd like
to do something more interesting with the navigation system.
Designer:
Can we implement the navigation system using pull-down menus?
Does that make sense architecturally?
Architect:
That might work, but it would be difficult to show context in the
hierarchy. How about a tear-away table of contents feature? We've had
pretty good reactions to that type of approach from users in the past.
Programmer:
We can certainly go with that approach from a purely technical
perspective. How would a tear-away table of contents look? Can you
sketch it for us? I'd like to do a quick-and-dirty prototype.
As you can see, the design of these sketches requires the involvement of
people from all three teams. It is much cheaper and easier for the group to
work with the designer on these rough sketches than to begin with actual
HTML page layouts and graphics. These sketches allow rapid iteration and
intense collaboration. The final product of a sketching session might look
something like that in Figure 8-5.
Figure 8-5. In this example, Employee Handbook, Library, and News
are grouped together as the major areas of the web site.
Search/Browse and Guidelines/Policies make up the bottom of
the page navigation bar. A news area defines space for a dynamic
Java-based news panel.
