UXPin Guide to Usability Testing
INDEX
1. INTRODUCTION - THE IMPORTANCE OF RESEARCH AND TESTING
2. USABILITY TESTING GOALS
Defining Your Usability Goals
Usability Metrics
Takeaway
3. CHOOSING YOUR TEST AND PARTICIPANTS
Types of Test
Finding Your Target Test Audience
Usability Test Plan
Takeaway
4. SCRIPTED TESTS
Moderated vs. Unmoderated Tests
Tree Testing
Usability Benchmark Testing
Hallway Usability Testing
Takeaway
5. DECONTEXTUALIZED TESTS & HEURISTIC REVIEWS
Card Sorting
User Interviews
Heuristics Evaluations
Takeaway
6. NATURAL & NEAR-NATURAL TESTS
A/B Testing
First Click Testing
Field and Diary Studies
Eye Tracking Test
CHAPTER ONE
Introduction
A quick note from the authors
Test early and test often. Every company and product is different, so there is
no magical usability test that will tell you everything you need to know. Define
your hypothesis, pick several quantitative and qualitative methods, and get
ready to go out of your comfort zone.
In this book, we'll share a wide breadth of expert commentary, theories,
practices, and real-life examples of usability testing. We'll discuss basic concepts
like how to plan your usability test. For more experienced readers, we cover
scripted testing methods, hybrid testing methods, and the differences between
web and mobile usability tests. Our hope is that it helps you see usability
testing as more than just asking people for their opinions on your app or website.
Usability testing helps you see the bottom line of whether your design works or
doesn't. We'll look at how highly successful companies like Apple, MailChimp,
Yahoo, DirecTV, Microsoft, and Buffer, among others, used different usability
testing tactics that suited their own unique needs. We've also included our
own preferences, and outlined how UXPin conducts usability testing.
We'd love your thoughts on what we've written. And feel free to include anyone
else in the discussion by sharing this e-book.
For the love of users,
Chris Bank
(co-written by Jerry Cao)
CHAPTER TWO
Like all significant undertakings, you need to go into usability testing with a
plan. As you'll see, a little extra time planning at the beginning can pay off
in the end. By following a few simple guidelines, you'll know what to expect,
what to look for, and what to take away from your usability testing.
Obviously you'd like to optimize the results of your usability testing, and in
order to do that, you must first know what you're testing for. We'll explain how
to define your testing objectives and set your usability metrics.
Timing and Scope: What time frame are you working with for collecting
your data? When is it due?
Once you've finished your benchmark questions, you can reverse the roles and
have your team write down their questions (that way you'll have identified what
they know, and what they'd like to know). Becky White of Mutual Mobile talks
about a sample exercise to help you narrow down your goals. Gather your team
together and pass out sticky notes. Then, have everyone write down questions
they have about their users and the UX. Collect all the questions and stick them
to a board. Finally, try to organize all the questions based on similarity. You'll
see that certain categories will have more questions than others; these will
likely become your testing objectives.
It also helps to make sure your testing objectives are as simple as possible. Your
objectives should be simple, like "Can visitors find the information they need?",
instead of complex, like "Can visitors easily find our products and make
an informed purchase decision?"
If you like the idea of using usability testing questions as a means to set your
goals, Userium offers a helpful website usability checklist. If you notice you're
lacking in one or more categories, those are where collecting data would be
most helpful (and are good talking points if your team gets stuck during the
initial Q&A).
The simplest usability testing objectives lead to the deepest design insights.
Once you know your goals and what type of data you're looking for, it's time
to begin planning the actual tests. But before we get into that, let's talk a little
about metrics.
Usability Metrics
Metrics are the quantitative data surrounding usability, as opposed to more
qualitative research like the verbal responses and written responses we
described above. When you combine qualitative with quantitative data
gathering, you get an idea of why and how to fix problems, as well as how many
usability issues need to be resolved. You can see how this plays out in the below
diagram from a piece on quantitative versus qualitative data.
Qualitative & quantitative data help you understand what to fix & why, and how many
problems exist.
assigned task? When we tested 35 users for a redesign of the Yelp website,
this was one of the most important bottom-line metrics.
Error Rate: Which errors tripped up users most? These can be divided
into two types: critical and noncritical. Critical errors will prevent a user
from completing a task, while noncritical errors will simply lower the
efficiency with which they complete it.
Time to Completion: How much time did it take the user to complete
the task? This can be particularly useful when determining how your
product compares with your competitors (if you're testing both).
Subjective Measures: Numerically rank a user's self-determined
satisfaction, ease-of-use, availability of information, etc. Surprisingly, you
can actually quantify qualitative feedback by boiling this down to the Single
Ease Question.
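To make these metrics concrete, here is a minimal Python sketch of how you might
tally them for a single task. The record fields (completed, seconds,
critical_errors, noncritical_errors, seq) and the sample values are assumptions
made up for illustration, not data from the Yelp study; seq stands for the
Single Ease Question score on its usual 1-to-7 scale.

```python
from statistics import mean

# Hypothetical session records for one task; field names are illustrative only.
sessions = [
    {"completed": True,  "seconds": 74,  "critical_errors": 0, "noncritical_errors": 1, "seq": 6},
    {"completed": True,  "seconds": 102, "critical_errors": 0, "noncritical_errors": 0, "seq": 7},
    {"completed": False, "seconds": 180, "critical_errors": 1, "noncritical_errors": 2, "seq": 2},
    {"completed": True,  "seconds": 88,  "critical_errors": 0, "noncritical_errors": 3, "seq": 5},
]


def summarize(records):
    """Summarize success rate, error rates, time to completion, and SEQ for one task."""
    completed = [r for r in records if r["completed"]]
    return {
        "success_rate": len(completed) / len(records),
        "critical_errors_per_user": mean(r["critical_errors"] for r in records),
        "noncritical_errors_per_user": mean(r["noncritical_errors"] for r in records),
        # Time to completion is averaged only over users who finished the task.
        "mean_seconds_to_complete": mean(r["seconds"] for r in completed),
        "mean_seq": mean(r["seq"] for r in records),
    }


summary = summarize(sessions)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Averaging time only over successful attempts is one possible design choice; you
may prefer to report time for failed attempts separately.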
Usability metrics are always helpful, but can be a costly investment since you
need to test more people for statistical significance. If you plan on gathering
quantitative data, make sure you also collect qualitative data so you have a system
of checks and balances; otherwise you run the risk of numbers fetishism. You can
actually see how this risk could play out in the real world in a clever explanation
of margarine causing divorce by Hannah Alvarez of UserTesting.
There's a fine line between quant analysis and numbers fetishism. Qualitative data is
your reality check.
Takeaway
In some ways, the planning phase is the most important in usability research.
When it's done correctly, with patience and thought, your data will be accurate
and most beneficial. However, if the initial planning is glossed over or even
ignored, your data will suffer and call into question the value of the whole
endeavor. Take to heart the items discussed in this chapter, and don't move
forward until you're completely confident in your objectives and how to achieve
them.
In the next chapter, we'll start to get into the specifics of the actual test
planning, namely what kind of test will work and whom to choose to
participate. As both the type of test and the type of user can differ greatly, it's
vital to take your time deciding.
For more information about the planning process in particular concerning
user testing, download our free e-book, The Guide to UX Design Process and
Documentation. The Research chapter will help flesh out and reiterate the points
covered here.
CHAPTER THREE
In this chapter we're going to discuss two essential factors in a user test:
the users and the tests. Now that you know what your goals are, you're ready
to hone your test planning to meet those specific goals. There are many tests to
choose from, and many types of people to recruit, so narrowing your focus will
get you closer to the results you want.
Types of Test
Deciding which style of test to administer is a pivotal decision in the entire
process of usability testing, so don't take it lightly. On the bright side, the more
concrete your usability goals are, the more smoothly the selection process will
go.
But no matter what type of test you choose, you should always start with a pilot
test. Many people like to gloss over this, but sacrificing a little extra time for a
pilot test almost always pays off.
I. PILOT TEST
Pilot testing is like a test run of your greater user test. In A Practical Guide to
Usability Testing, Joseph S. Dumas and Janice C. Redish call pilot tests a "dress
rehearsal for the usability test to follow." You will conduct the test and collect
the data in the same way you would a real test, but the difference is that you
don't analyze or include the data. You are, quite literally, testing your test.
Before you test your users, test your test. Always run a pilot test.
That may seem like a waste of time, and you will likely be tempted to just
jump directly into the actual tests, but pilot tests are highly recommended.
The reason is that, in most cases, something will go wrong with your first test.
Whether technical problems, human error, or a situational occurrence, it's rare
that a first test session goes well, or even adequately.
The idea is that these tests should be as scientific and precise as possible. If you
want the most reliable data, run a pilot test or two until you feel you understand
the process and have removed all the kinks.
The second distinction to make when creating tasks is between closed and
open-ended tasks.
Closed: A closed task is one with clearly defined success or failure. These
are used for testing specific factors like success rate or time. For example, in
our Yelp redesign exercise, we gave participants the following task: "Your
friend is having a birthday this weekend. Find a fun venue that can seat up
to 15 people."
Open-ended: An open-ended task is one where the user can complete
it several ways. These are more subjective and most useful when trying
to determine how your user behaves spontaneously, or how they prefer
to interact with your product. For example: "You heard your coworkers
talking about UXPin. You're interested in learning what it is and how it
works."
We'll talk more about tasks in the following chapters, but for now keep these
important distinctions in mind as you come to understand what you want out of
your usability testing.
Knowing your target audience is not really a topic for usability testing; in
theory, this is something you should have already decided in the Product
Definition phase (as discussed in The Guide to UX Design Process &
Documentation).
However, depending on the complexity of your tasks, you may need more
than one user group. For example, when conducting user testing for our
Yelp redesign, we realized we needed two groups of people: those with Yelp
accounts, and those who did not. Once we knew the overall groups, we then
decided that both groups needed to have users who were located in the US,
used Yelp at most 1-2x a week, and browsed mostly on their desktops.
When focusing in on your test group, it's also important not to obsess over
demographics. The biggest differentiator will likely be whether users have prior
experience or are knowledgeable about their domain or industry, not gender,
age, or geography. Once you know whom you're looking for, it's time to get
out there and find them. If you find you have more than one target group, that's
okay; just remember to test each group independently of each other, as that will
make your data more telling.
Don't obsess over demographics. Users' prior experience and knowledge will likely matter
more.
Like all other factors, how you choose to find your participants will depend
on your specific needs. Keep in mind the who and why you're looking for, but
don't neglect the how much. Qualitative tests can be run with as few as 5 people,
while quantitative tests require at least 20 people for statistical significance. For
a full list of user recruiting tips, check out Jakob Nielsen's list of 234 tips and
tricks for recruiting people for usability tests.
If you're conducting later-stage beta testing, you can recruit beta testers from
within your existing user base, as long as it's large enough. If, however, you need
to recruit them elsewhere, Udemy explains the best ways to find them.
Takeaway
We can't stress enough the importance of the pre-planning phases. The type
of test and users you go with will have the biggest impact on your results, and
going with the wrong choices will greatly reduce the accuracy. Having a solid
plan can make all the difference, and ensure that you meet your own personal
needs.
In the next chapter we're going to start getting into the types of tests,
specifically scripted tests. With your usability goals ready, keep an eye out for
the tests that will help accomplish your plan to the fullest.
CHAPTER FOUR
Scripted Tests
More controlled tests for more specific results
A scripted test is the most controlled of the test types, and is recommended
for testing specific usage aspects, like whether or not the user can find or
access a certain feature (or how long it takes to do so). Scripted tests tend to
produce more quantitative data, beneficial for usability metrics, but can also
generate qualitative data, depending on how tight or controlling the
script is.
Before we get into the specific types of scripted tests (tree testing, benchmark
tests, and hallway testing), we'll first discuss a crucial decision in how you
conduct your test: whether to moderate it or not.
I. MODERATED TESTS
Luke Bahl and Bryan Andrew, Moderated Testing Manager and UX
Researcher (respectively) at UserTesting, believe that the payoff can be
significant if you have the time available for a moderated study. A moderator can
help probe the participant to delve deeper, creating data that is fuller and more
complete, plus keep users on track and clarify any confusion. Not only that, but
user reactions and even body language can provide useful data as well, but only
if there's someone present to document and interpret them.
Photo credit: Wikimedia labs testing. Blue Oxen Associates. Creative Commons.
As you can guess, moderated testing is not recommended for all tests. The
experts at UserTesting recommend it for the following situations:
Early stages in the development process: Specifically in the prototyping
phase, where features may be incomplete or not even work, a moderator
can help answer questions and explain the unclear parts.
An advanced, complicated, or high-level product: As with a prototype,
if there is a great chance for confusion or misinterpretation, a moderator
will help keep things on course.
Products with strict security concerns: In these cases, a moderator can
keep the user where they're supposed to be and keep them from accessing
sensitive information.
But even the moderation proponents admit that moderated tests have their
drawbacks, specifically convenience. Moderated tests require a knowledgeable
moderator, their time, and usually a specified location, as opposed to remote
usability testing. Coordinating the schedules of moderated tests can be
problematic, and only one can be done at a time, unless more moderators are
hired. More importantly, moderated tests can take participants out of their
comfort zone, so special care must be taken to avoid the various kinds of biases.
In UXPin, you can actually run remote moderated usability tests quite easily.
Download the Chrome plugin, set up your tasks, and start testing. As you can see in
our testing overview, UXPin generates video clips that let you see every click, hear
users' thoughts, and see their screens and faces.
For a moderated test, you could also let your testers participate from the
comfort of their own home. For example, Evernote actually ran a remote
usability test that was moderated in which the testers were in different
locations, but the moderators were all in the office. This offers the benefits
of moderation at lower cost (since you don't have to worry as much about
equipment setup), but it may not be suitable if you need a controlled lab
environment due to information sensitivity. Nonetheless, this tactic is effective,
and Evernote gained insights that helped them improve user retention by 15%.
If you have any of the special needs listed above, moderation may be the right
choice. If you do choose this route, make sure you follow these 12 tips for being
a perfect moderator to minimize the likelihood of bias.
As you'll see in the above video from our User Testing & Design e-book, you
can get maximum value for minimum cost when the tasks are written as clearly
as possible. Users are encouraged to think out loud, and you record their
on-screen interactions. When the test is done, you can then use the video clips that
are most insightful and present them to your team for design changes.
There are downsides, however. The lack of a moderator means less control, less
personal observation, and a higher risk of confusion. Additionally, conducting
high volume, unmoderated tests using an online tool opens you to the risk of
attracting participants looking only for the incentive without putting effort into
the tasks. On the bright side, such participants can be filtered, especially by
looking at their time-to-completion or open-ended feedback.
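As a rough illustration of that filtering step, the Python sketch below flags
participants who finished far faster than the median or who left almost no
open-ended feedback. The thresholds (30% of the median time, fewer than three
words) and the record fields are assumptions chosen for the example; tune them
to your own study, and review flagged sessions by hand before discarding them.

```python
from statistics import median


def flag_low_effort(results, speed_ratio=0.3, min_feedback_words=3):
    """Return IDs of participants who were suspiciously fast or left almost no feedback."""
    typical_seconds = median(r["seconds"] for r in results)
    flagged = []
    for r in results:
        too_fast = r["seconds"] < speed_ratio * typical_seconds
        too_terse = len(r["feedback"].split()) < min_feedback_words
        if too_fast or too_terse:
            flagged.append(r["participant"])
    return flagged


# Hypothetical results from an unmoderated test.
results = [
    {"participant": "p1", "seconds": 95, "feedback": "The search filters were hard to find at first."},
    {"participant": "p2", "seconds": 12, "feedback": "ok"},
    {"participant": "p3", "seconds": 110, "feedback": "I expected the menu to be on the left side."},
    {"participant": "p4", "seconds": 80, "feedback": "Fine."},
]

flagged = flag_low_effort(results)
print(flagged)
```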
Nonetheless, if you choose unmoderated testing, make sure you know the
criteria for picking the best usability tool. As the Nielsen Norman Group
advises, you'll want something that offers same-day results, audiovisual
recording, and a broad demographic for recruiting testers.
Tree Testing
Tree testing allows you to test the information architecture by stripping out
the visual elements. With a tree test, you examine only the labelling and
hierarchy of your content. Martin Rosenmejer of Webcredible names it as one
of the most important steps early in the web design process. And we all know
the importance of information architecture: if the content isn't structured
logically with a simple flow, it might as well not exist. That's why an early tree
test can help identify and solve the problems before they actually become
problems.
Tree tests help solve IA problems before they become problems.
If your site content doesn't flow with a nice logical structure, it might as well not exist.
If tree testing seems like something that could benefit your project, Jeff Sauro,
Founding Principal of MeasuringU, goes into details about how to properly
run them. He explains that tree testing is used primarily for two reasons:
1. Determining a product's searchability: How well can users navigate the
site, and what areas cause the most problems with navigation?
2. Validating a change: Did a recent update correctly fix the problem, or are
further revisions necessary?
Because tree testing examines the success rate of a specific task, more
participants will give you more accurate results. Check this article from
MeasuringU to find the smallest margin of error within your means (we
recommend aiming for 5% error or better).
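To get a feel for how sample size drives the margin of error on a success rate,
here is a small Python sketch. It uses the simple normal approximation at a 95%
confidence level, which is an assumption on our part and not necessarily the
method used in the MeasuringU article; for small samples or success rates near
0% or 100%, an adjusted interval is a better fit.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(success_rate, participants):
    """Approximate 95% margin of error for an observed success rate."""
    return Z_95 * math.sqrt(success_rate * (1 - success_rate) / participants)


def participants_needed(target_margin, success_rate=0.5):
    """Participants needed to reach a target margin; 0.5 is the worst case."""
    return math.ceil(success_rate * (1 - success_rate) * (Z_95 / target_margin) ** 2)


for n in (20, 50, 100, 400):
    print(f"{n} participants: +/- {margin_of_error(0.5, n):.1%}")
print("Needed for a 5% margin:", participants_needed(0.05))
```

The worst-case success rate of 50% is used as the default because it gives the
widest margin, so any plan based on it errs on the safe side.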
If you're concerned with navigational problems, see our section on card sorting
in the next chapter and compare which, if not both, would benefit you more.
One distinct benefit of tree testing is that you can also test hundreds of items (if
your site is even larger, just prioritize the most used navigation items).
In an essay on his website, bestselling author and speaker Scott Berkun points
out that, while other usability tests focus on specific aspects, the usability
benchmark test measures only user performance without regard to the why. In
fact, participants should even be discouraged from explaining their thoughts,
as explanations would impact the time and focus spent interacting with the
product.
Because benchmarks require more time and effort, Berkun outlines the optimal
times in which to run the test:
The product is stable: To get the most out of the benchmark, make sure
the product is stable, i.e., all the errors you already know about are fixed
and it's running at peak efficiency.
After a major release or update: At this time, a benchmark can test how
effective the changes were, or if unforeseen problems arose in the process.
Preceding a major release or update: In order to understand how the
next change impacts usability, it's best to have a measure from which to
compare. Additionally, you may notice some areas that should be improved
before the next round begins.
When conducting this type of test, there are a few factors to consider. Nadyne
Richmond, Researcher at VM Press, gives 5 tips for planning out your
benchmark test:
1. Select the most important tasks to the product overall: While it's
tempting to select tasks related to the newest or experimental features, this
is not the correct test for that. A benchmark measures usability as a whole,
not in a specific area.
2. Use standard metrics: The most reliable data comes from success rates,
time-to-completion, error rates, and satisfaction ratings.
3. Do not disturb the user: Little to no moderation should be involved
in a good benchmark test. Any distraction will bias the results, so avoid
asking for feedback or explanations of their behavior, or at least wait until
they're completely finished.
4. Using your target audience is essential: This is especially important for
usability benchmark testing since this is a broad assessment of how your
target users perform with your product.
Of course you don't need to literally grab people from the hallway, but the idea
is that any small number of random users (from within your target audience)
will give you a sufficient amount of data for your usability goals.
Hallway testing is the bare minimum for usability testing. Grab 5 coworkers and get to
work.
The test itself doesn't have to be that complex. Corinna Baldauf, Web
Developer and UX Blogger, elaborated on Spolsky's theories. She suggests
setting up a station with your product in a public venue (she used an
office break room, while others suggest Starbucks). When someone comes
by, ask them to test the system, perhaps even adding some incentive (don't
underestimate the power of chocolate). Give them instructions, then step
back and watch. Don't forget to take notes, particularly on what is not going as
expected.
If you do this with five people, that should give you data that's accurate enough.
Why five? Jakob Nielsen, co-founder of the Nielsen Norman Group, created a
formula for the number of usability problems found in a usability test:
$N(1 - (1 - L)^n)$
where N is the total number of usability problems in the design, n is the number
of test users, and L is the proportion of usability problems found by a single
user, typically 31%. You can see the point of diminishing returns in this graph.
With L at 31%, five people uncover roughly 84% of the problems, while each
additional user adds less and less.
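If you want to check the diminishing returns yourself, this short Python sketch
evaluates the share of problems found, 1 - (1 - L)^n, for different numbers of
test users. The default L of 0.31 is the typical value quoted above; your own
product may have a different value, so treat the output as a rough guide.

```python
def share_of_problems_found(n_users, single_user_rate=0.31):
    """Share of all usability problems found by n_users test participants."""
    return 1 - (1 - single_user_rate) ** n_users


for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} users: {share_of_problems_found(n):.0%} of problems found")
```

With the default rate, five users come out at about 84%, and going from five to
ten users adds far less than going from one to five.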
Hallway usability testing is one of the most popular forms due to its simplicity,
low cost, low resources, and high output. If you're interested in conducting
your own hallway usability test, the USAJOBS Team gives these tips:
Choose the right time and place: Choose a location with a lot of foot
traffic, at a time when you're not inconveniencing people.
Come prepared: Make sure you outline your plan ahead of time, and set
up 30 minutes before you'd like to start.
Good greeters: Use greeters who are outgoing and charismatic, and who
can identify your target audience.
Reward your participants: It doesn't need to be much, something like a
free coffee or chocolate, just to show you appreciate their help.
Look for ways to improve: Learn from your experience and keep an eye
out for ways to improve your testing process.
While not recommended for solving specific or complicated problems, hallway
usability testing is the perfect way to go if you're looking for something
simple and easy.
When observing your user test, make sure you also write down what's not going as
expected.
At UXPin, we're big fans of hallway testing. When we were finishing up our
integration with Photoshop and Sketch, our product team was visiting our
California office so hallway testing happened every day. A developer or designer
would set up his computer and ask us to import a static design file and turn it into a
fully layered prototype. The product manager would then take notes and revise the
weekly sprint based on the insights.
Takeaway
After reading this chapter, you are now more aware of the main scripted tests:
tree testing, usability benchmark testing, and hallway usability testing. You
know that tree testing focuses specifically on navigation, usability benchmark
testing determines a product's overall usability, and hallway usability testing is
great for a simple and low-cost usability test. You also learned the difference
between moderated and unmoderated tests, and why unmoderated tests may
be more appropriate, except when your product is incomplete or otherwise
likely to confuse users.
In the next chapter, we'll talk about decontextualized tests, or tests that don't
directly use the product.
CHAPTER FIVE
Decontextualized Tests
& Heuristic Reviews
Delving deeper into your product without it immediately present
Sometimes the best way to test a product doesn't involve the product at all.
Decontextualized tests, or tests that don't involve the product, are generally
geared to testing users' attitudes toward the product, or to generating ideas. But
just because they may be more conceptual doesn't mean they're any less valuable as
a source of data.
In this piece, we'll focus on card sorting and interviews as two popular and
cost-effective decontextualized testing methods. On a related note, we'll also
discuss heuristic reviews. We've included them in this discussion because, while
someone is interacting with the product, it's not the end user; instead, a
group of experts reviews the features based on established best practices.
Sometimes the best way to test a product doesn't involve the product at all.
Card Sorting
The idea is so simple yet so meaningful. You write the different elements of your
product on note cards or Post-It notes, then have your participants organize
them in a way that makes the most sense to them. If you'd like to go paperless,
you can also choose a usability testing tool like OptimalSort for quick analysis of
common groupings. Regardless of analog or digital, the result gives you a solid
understanding of your product's information architecture (IA), a big term that
means simply how you organize the elements of your product.
Card sorting mostly deals with issues of navigation, organization, labelling,
and grouping. This test is similar to tree testing that we learned about in the
last chapter; the main difference is that card sorting helps you understand how
users categorize content while tree testing shows you how they directly interact
with an existing IA to complete tasks.
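If you run a card sort on paper and want a quick look at common groupings, one
simple approach is to count how often each pair of cards ends up in the same
group. The Python sketch below does that; the card names and the three sample
sorts are invented for illustration, and dedicated tools offer far richer
analysis than this pair count.

```python
from collections import Counter
from itertools import combinations


def pair_counts(sorts):
    """Count how many participants placed each pair of cards in the same group."""
    counts = Counter()
    for groups in sorts:          # one participant's sort
        for cards in groups:      # one group within that sort
            for pair in combinations(sorted(cards), 2):
                counts[pair] += 1
    return counts


# Hypothetical results: each participant's sort is a list of groups of cards.
sorts = [
    [["Pricing", "Plans"], ["Blog", "Tutorials", "Docs"]],
    [["Pricing", "Plans", "Docs"], ["Blog", "Tutorials"]],
    [["Pricing", "Plans"], ["Blog"], ["Tutorials", "Docs"]],
]

counts = pair_counts(sorts)
for pair, count in counts.most_common():
    print(pair, f"{count}/{len(sorts)} participants")
```

Pairs that nearly every participant groups together are strong candidates for
sharing a category; pairs with split results deserve a closer look.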
Closed Sorting: As with open sorting, users are given the elements;
however, they are asked to categorize them into predefined groups. This
is recommended if you're working within the restrictions of pre-existing
categories, for example, when updating an already developed website
structure.
The above image is an example of a closed card sort. In this case, you can see the
four categories in blue and the cards below. Users are then asked to place the
cards under whatever category seems best to them. If this were an open card
sort, you'd simply remove the blue categories and ask users to create their own.
Aside from open and closed, other variations include groups or individuals, and
remote or on-location. Groups allow users to work collaboratively, for better
or worse, and can help you learn about multiple users at once; however, group
dynamics might affect your results. Remote testing (for example,
using an online software tool) allows you to test more users in less time,
yet you're unable to directly observe their decision-making processes.
On-location testing gives you a fuller understanding of how your users came to
their decisions, but requires more planning and scheduling.
For card sorting, simpler is better. Avoid unnecessarily complex words and jargon.
User Interviews
If you want to know what users think, sometimes all you have to do is ask.
Interviews directly connect you with your target audience and give you a high
degree of control over what data you collect; however, your research is mostly
qualitative and limited by the participants' self-awareness and articulation.
The nuances of interviews lie in what to say and how you say it. Kate Lawrence,
UX Researcher at EBSCO Publishing, offers some great insights into these
areas. When asking questions, it's best to center around the participant's
perspective of the environment in which your product will exist. Here are a few
great interview questions that apply to any product:
"What are five websites/apps/products that you use on a daily or weekly
basis?" Knowing what similar products people are using will help you
understand their motivations behind using them, and generally what
they're looking for.
"What is your usual process for searching/shopping/evaluating products
like ours?" It's helpful to know how users interact with other similar
products so you can design yours to meet or exceed their expectations.
"What do you like or dislike about the Internet/apps/products in
general?" The answer to this question can be incredibly revealing, but
you may need to read between the lines.
"How would you describe your ideal experience with our product?" A
little on the nose, but the answers will tell you exactly what your users like.
While you may not want to follow their responses word-for-word, try to
notice any commonalities with the answers from other interviews.
With the right questions and the right atmosphere, you can mine a lot of usable
data from interviewees. But you also need to know how to behave in a way that
won't bias the results while putting interviewees at ease. Michael Margolis,
UX Research Partner for Google Ventures, gives 16 practical tips for running
a usability interview. For example, make sure you also write down interviewee
body language and always ask follow-up questions.
When it comes to usability interviews, the same people skills you would use at
a party still apply, just with laser-focused purpose. With the right mood and the
right questions, the interview will be productive and maybe even fun.
Everything the participant says should be fascinating, because even if it might seem
boring, it's still valuable research.
Heuristics Evaluations
Think of heuristic evaluations as a scorecard for your product's usability.
Heuristics breaks down the broad category of usability into manageable
sections so that you can rate each individual section and see where to improve
and where to stay the course.
Once you have a working prototype, a heuristic evaluation (or usability review)
can be a low-cost method of checking against usability best practices. Heuristic
evaluations are also helpful for competitive benchmarking since you can
compare competitors against the same criteria (as you'll see in this image).
Heuristic reviews can even be carried out by people who aren't UX experts,
as long as you've reviewed and walked through the usability scenarios. While
they're cheap and usually only require a day or two, they don't actually tell
you the usability of a system (since you're not testing with real users) and may
suffer from inconsistency and subjectivity (since they're carried out by different
people). That being said, they are still a great reality check since you'll be able
to catch glaring UX violations.
Heuristic reviews don't reveal if the product is actually usable - only that it should be
usable.
1. Plan the evaluation: Establish your usability goals so that you can
communicate them to the evaluators. If you want to know specifically
about the dialogue windows on your website, don't be afraid to mention
that.
2. Choose your evaluators: If you're on a limited budget, even
inexperienced evaluators will find 22-29% of your usability problems, so
a novice evaluator is better than none. Five experienced evaluators, on the
other hand, can uncover up to 75% of usability issues.
3. Brief the evaluators: If you choose not to go with experts, make sure you
brief your evaluators on Nielsen's ten heuristic checkpoints so that they
know what they're looking for. If you're reviewing a website, you can start
with a more concrete 45-point website usability checklist.
4. Conduct the evaluation: While it's recommended that each evaluator
conduct their examination individually so that they can fully explore the
product on their own terms, sometimes group evaluations are better for
time since they can all happen at once. Whether it's performed individually
or together, it's best to have 3-5 people.
(Note: Jakob Nielsen suggests that each evaluation session should last between
one and two hours. If your product is especially complex and requires more time,
it's best to break up the evaluation into multiple sessions.)
5. Analyze the results: Unless you're going with a professional firm, you
may need to compile and analyze your own responses. Remember that a
high score doesn't mean your product is actually usable, only that it should
be usable.
54
To give you a better idea of how this works in real life, we'll explain a few
examples. Oracle uses a streamlined 10-point list of heuristics gauging
everything from application complexity to frequency and helpfulness of
error messages. Usability issues are categorized as either low, medium,
or high severity with supporting notes. The team then isolates the top ten
most important issues for immediate fixing. If you're curious about what a full
heuristic report may look like, check out this full heuristic evaluation of Apple
iTunes.
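If you compile the results yourself, the severity-and-shortlist approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric weights for low, medium, and high, the function name rank_issues, and the sample findings are all hypothetical and not taken from Oracle's process or any other source.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical numeric weights for the low/medium/high categories.
SEVERITY = {"low": 1, "medium": 2, "high": 3}

def rank_issues(findings, top_n=10):
    """Merge findings from several evaluators and rank them.

    findings: iterable of (evaluator, issue, severity_label) tuples.
    Returns up to top_n (issue, mean_severity, evaluator_count) tuples,
    highest mean severity first; ties go to the issue that more
    evaluators reported, then alphabetical order.
    """
    by_issue = defaultdict(dict)
    for evaluator, issue, label in findings:
        # Keep one rating per evaluator per issue (the last one wins).
        by_issue[issue][evaluator] = SEVERITY[label]

    ranked = [
        (issue, mean(ratings.values()), len(ratings))
        for issue, ratings in by_issue.items()
    ]
    ranked.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return ranked[:top_n]

# Made-up findings from three evaluators, for illustration only.
findings = [
    ("ana", "Error message gives no recovery hint", "high"),
    ("ben", "Error message gives no recovery hint", "medium"),
    ("ana", "Inconsistent button labels", "low"),
    ("cy", "No undo after delete", "high"),
]

for issue, severity, reporters in rank_issues(findings):
    print(f"{severity:.1f}  ({reporters} evaluator(s))  {issue}")
```

Averaging ratings per issue is one reasonable choice among several; a team could just as well rank by the highest severity any evaluator assigned.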
Takeaway
In this chapter, you learned about user tests that examine your product without
actually using it. Decontextualized tests tend to focus more on abstract and
conceptual areas, so if those are your focus, one of these tests may
be what you're looking for.
For analyzing a site's navigation from a design perspective, card sorting is
the best usability method (tree testing works better for testing existing IA).
Some people prefer a more human connection with their users, and for this,
interviewing has been the standard in user research since long before the digital
era. Different but related are heuristic evaluations, which put your product's
usability evaluation in the hands of others.
In the next chapter, we'll learn about a more direct testing method: testing the
product as the user would use it naturally.
CHAPTER SIX
Tests in which people use the product naturally (without a script) are the
closest you will get to seeing how your product might perform in the
wild. Natural and near-natural tests minimize the amount of interference from
the observer, who is more interested in what the user does of their own will.
These tests are great for broad data, especially ethnographic, but sacrifice control in exchange for greater data validity.
source: UserTesting
Because the goal is to minimize interference from the study, natural tests are
usually conducted remotely and without a moderator. The most common
natural tests (A/B testing, first click testing, field/diary studies, and eye tracking) are intended to understand user behavior and attitudes as close as
possible to reality.
A/B Testing
In an A/B test, different groups of participants are presented with two choices
or variations of an element. It is generally a scientific test, where only one
variable differs, while the rest are controlled. Mostly conducted with websites to
There are also other usability testing tools like Optimizely (great for everything)
and Unbounce (more landing page focused) that make it extremely easy to
get started with A/B testing. These usability tools handle the distribution and
collection of data for you, so all you have to do is wait for the results. If you're
interested in a comprehensive list of website elements to test, you can also
check out this detailed explanation of 71 things to A/B test.
source: WhichTestWon
Regardless of what you choose to test, make sure you follow these five
guidelines:
1. Run both variations at the same time: Time is a variable you need to control, so running
version A first and then version B later may skew the results. Running both
variations simultaneously and evenly will ensure the most accurate results.
2. Test with enough people for statistical significance: As shown with this
sample size calculator, you should test each variation with enough people
to reach a 95% confidence level.
3. Test new users: Regular users will be confused if they see a new
variation, especially if you ultimately choose not to use it. Plus, there's the
mere-exposure effect, in which people prefer what they're used to.
4. Be consistent with variations on all pages: For example, if you are
testing the placement of a call to action that appears on multiple pages, a
visitor should see the same variation everywhere. Inconsistency will detract
from accurate results, so don't show variation A on page 1 and variation B
on page 2.
5. Tailor your test length to statistical significance: Cancelling the test too
early will reduce accuracy. Decide on the significance level you need, then you can
use this test duration calculator to get a rough timeline. Many paid online
usability tools (especially Optimizely) also have a feature for calculating
optimum time based on the goals.
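To make guidelines 2 and 5 above more concrete, here is a rough Python sketch of what a sample size and test duration calculator might do under the hood. It uses the standard normal-approximation formula for comparing two conversion rates. The 80% power default, the function names, and the example rates and traffic figures are assumptions chosen for illustration; they do not come from any particular calculator.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variation(baseline, expected, confidence=0.95, power=0.80):
    """Visitors needed in EACH variation to detect a change in conversion
    rate from `baseline` to `expected` (both as fractions, e.g. 0.10).

    Normal-approximation formula for a two-sided, two-proportion test.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    pooled = (baseline + expected) / 2
    numerator = (
        z_alpha * sqrt(2 * pooled * (1 - pooled))
        + z_beta * sqrt(baseline * (1 - baseline) + expected * (1 - expected))
    ) ** 2
    return ceil(numerator / (expected - baseline) ** 2)

def days_needed(n_per_variation, daily_visitors, variations=2):
    """Rough test length if traffic is split evenly across variations."""
    return ceil(n_per_variation * variations / daily_visitors)

# Illustrative numbers only: a 10% baseline and a hoped-for 12% rate.
n = sample_size_per_variation(0.10, 0.12)
print("visitors per variation:", n)
print("days at 1,000 visitors/day:", days_needed(n, 1000))
```

The smaller the difference you want to detect, the more visitors you need, which is why stopping a test early tends to produce unreliable winners.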
To see some of these best practices put to use, check out this site containing
hundreds of free A/B test case studies. Hubspot also provides a highly visual
and easily digestible 27-page guide to A/B testing.
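Tools like the ones mentioned earlier handle distribution for you, but if you ever split traffic by hand, one common way to satisfy guidelines 1 and 4 is to derive the variation from a hash of the visitor's ID. The sketch below assumes you already have a stable visitor identifier (for example, from a cookie); the function name and the experiment label are hypothetical.

```python
import hashlib

def assign_variation(visitor_id, experiment, variations=("A", "B")):
    """Deterministically map a visitor to one variation.

    The same visitor_id and experiment always produce the same answer,
    so a visitor sees one variation on every page, and both variations
    run side by side instead of one after the other.
    """
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variations[int(digest, 16) % len(variations)]

# The assignment does not change between page views.
first_page = assign_variation("visitor-42", "cta-placement")
second_page = assign_variation("visitor-42", "cta-placement")
print(first_page, second_page, first_page == second_page)
```

Including the experiment name in the hash keeps assignments independent between tests, so a visitor who lands in variation A of one experiment is not automatically in A for the next.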
The test itself is simple, and can be conducted with online testing tools like
Chalkmark by Optimal Workshop. The software presents the user with a
screenshot and a task, and then records their first click. For example, as we
discuss in User Testing & Design, we asked users to find a local mechanic on
Yelp and found that 24% of them first clicked on the Search bar (suggesting that
the existing information architecture may not be clear enough).
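If you run a first click test by hand instead of through a tool, tallying the results is simple. The Python sketch below assumes each participant's first click has already been labeled with the page area they clicked; the labels and the click log are made up for illustration and are not the data from the Yelp study.

```python
from collections import Counter

def first_click_shares(first_clicks):
    """Return the share of participants whose first click hit each area,
    most common area first. An empty log yields an empty result."""
    counts = Counter(first_clicks)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {area: count / total for area, count in counts.most_common()}

# Hypothetical log: one labeled first click per participant.
clicks = ["search"] * 6 + ["category"] * 14 + ["banner"] * 5
for area, share in first_click_shares(clicks).items():
    print(f"{area}: {share:.0%}")
```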
When it comes to the web, first impressions are oftentimes final impressions.
I. FIELD STUDY
A field study provides data you can't find anywhere else by letting you observe
users in their own environment. Jared M. Spool, Founder of User Interface
Engineering, believes that while standard usability tests can lead to valuable
insights, the most powerful tool in the toolbox is the field study.
A diary study captures the expectations, mindsets, moods, and social contexts
that affect the user experience. A diary study might reveal that a bad mood
or criticism read on the web impacted the user's assessment of the product
experience, independent of the product itself.
Let's say that you're asked to improve a web application that helps product
managers track progress. You could provide tape recorders and/or journals to
five product managers and ask them to document anything odd or frustrating
they experienced when using the application. After a few weeks, you would
analyze the data and make specific recommendations.
While these may make the diary study seem like the perfect usability test, like
all others, it too has drawbacks:
Significance of participant: The quality of results will depend on the
quality of the participant. Because this takes a good deal of effort on their
part, the participant's commitment to the project influences the outcome,
whether positively or negatively. On top of that, the participant's self-awareness, self-expression, and writing skill can all sway the results.
Training sessions: While it may sound like the participant acts
independently, the truth is that a thorough training session is necessary to
ensure the participant understands exactly what is expected before starting.
Analysis: The analysis of an entire diary is time-consuming, especially if
it is hand-written.
Ruth Stalker Firth, HCI Researcher and Lecturer, believes that diary studies
are best used as a means of cultural probing and go beyond the "find out what's
wrong" mentality that can be prevalent in usability testing. To help counter the
downsides, you can follow a few best practices:
1. Provide contextual and open-ended questions: Contextual questions
like "What prompted you to use the app?" give you direct insight, but
open-ended questions like "What would you have done differently in this
situation?" can uncover new solutions.
2. Let users decide how to record themselves: Text, online photo
galleries, voice recording, even Twitter can all work. Giving users the choice also helps the
process feel more natural and makes participants less self-conscious.
3. Keep size in mind: The diary (whatever form) can be as small or large
as needed. On paper, space for forty entries can be overwhelming, while
ten might be more encouraging. That's also why digital methods might be
better, since users can use as much space as they want.
For a more detailed explanation, complete with hypothetical examples, check
out this extensive post by UserTesting and this list of Dos and Don'ts.
Diary studies are a means of cultural probing that go beyond the "find out what's wrong"
mentality.
tests have given us some general rules that apply across all products, not just
yours. Ritika Puri, co-founder of StoryHackers, writes in a post for Crazy Egg
about the five most important lessons eye tracking has taught us so far:
1. Users are predictable: As we can see by the eye tracking patterns above,
people's sight follows similar trends, allowing us to plan our visual layouts
for the masses. In Web UI Best Practices, we explain how to lay out a site in
accordance with the popular F and Z patterns.
2. Users search pages differently depending on goals: A user's eye pattern
will differ depending on why they are searching a screen; for example,
browsing and searching for something in particular are two different
modes.
3. Users are drawn to visuals: Visuals like thumbnails or vibrant colors will
attract a user's attention more than plain text, so use them accordingly.
4. People ignore ads: In a phenomenon that Jakob Nielsen calls "banner
blindness," people will neglect ads habitually, so online advertisers will
have to work harder.
would be too much strain on your end, but you still want them to continue
reviewing the most updated versions of the product.
6. Adding new features resets the beta cycle: It may seem harmless to
add some new tricks near the end of the beta cycle, but these often have
unforeseen consequences. If a new feature is necessary, accept that you'll
need eight more weeks to fully test it.
7. Understand the difference between technical beta and marketing
beta: Finding and fixing bugs is technical beta. Prereleases to preferred
customers or press are marketing beta. The feedback from technical beta is
what helps you make a better product; marketing beta is mostly for sales/
exposure.
Keep in mind that beta testing should be the last usability test conducted
in the development process. Make sure you're at the right stage before you
proceed; otherwise, there will be a lot of wasted effort. To learn more about
beta testing, you can check out Chapter 7 in The Guide to UX Design Process &
Documentation and the many free e-books in Centercode's library.
Takeaway
Tests that observe the users in their natural (or near-natural) environments
provide a specific type of data that other, more controlled tests can't access.
An A/B test lets you make decisions that are informed by more thorough and
statistically significant results (since you have a huge sample size). A hallway
usability test, meanwhile, is a quick and dirty method, but not a very
scientific one.
Similarly, field and diary studies can provide you with unique information about
your target users (namely external factors such as timing, environment, and mood)
that more direct card sorting or tree testing cannot. As for first click and
eye tracking tests, they literally let you see your website as your users do, but
make sure you run other types of tests for the right context. While each of the
different test types has its own advantages and disadvantages, sometimes it's
best to mix and match them to achieve results more specific to you.
CHAPTER SEVEN
Hybrid Tests
If other tests don't meet your needs, try combining them
Tests that incorporate elements from one or more of the previous categories (scripted, decontextualized, natural tests) fall under the label of hybrid tests. These tests tend to lean towards capturing attitudinal and conceptual
feedback, but nonetheless reveal insights that have very specific impact on the
usability of the final design.
Hybrid tests present the user with creative methods for discerning what kind
of experience or features they would want in a product, sometimes even
allowing users to provide direct input for the design. While they may not be
very practical for some of the later stages of product development, the testing
we'll discuss here can make a big difference in the earlier phases by helping you
understand the minds of your target users. Specifically, we'll cover desirability
testing, concept testing, and participatory design.
Desirability Testing
Desirability tests are similar to interviews (covered in Chapter 5) in that the
tester and the participant sit down together and discuss the conceptual aspects
of a product. The difference, and it's a notable one, is in the approach.
The idea is that asking participants directly what they want can bring misleading
results. The approaches in desirability testing seek to circumvent factors
like poor articulation, lack of self-awareness, or the apathy that comes from
answering similar questions one after another.
What users say they want can be completely different from what will actually help
them.
I. TRIADING
In a roundabout way of gauging your participants' emotions, the tester presents
the test-taker with three different but related concepts or ideas (for example,
McDonald's, Burger King, and Wendy's) and asks them to explain how one is
different from the others and why. This line of questioning drives harder
than simply asking "which do you prefer," and challenges the participant to
think critically. It also engages participants more by encouraging open-ended
thinking.
Triading is quite helpful for evaluating the competitive landscape and assessing
different options from an interaction design perspective. Make sure you follow
an iterative process where you encourage participants to continue vocalizing
features that they feel distinguish two concepts from the third until they run
out of ideas. Then, repeat the process with multiple participants (5-6 is a good
sample) and you'll be able to see trends that define segments and personas.
source: FiveSecondTest
Similar to first click testing, this test works well for pinpointing initial
impressions on layout design, information architecture, and content. But
because this test focuses on the user's memory of particular elements instead of
emotional impact, it's best used as a supplementary method. You could run this
test cheaply and manually by showing screenshots and then asking questions, or
use a scalable online service like FiveSecondTest.
Michael Hawley, Chief Design Officer at Mad*Pow, writes about his success
with the adjective card. In his test, he gave participants a card with 118 carefully
selected adjectives, both positive and negative. He would then show the
participant a user interface and ask them to describe it with 3-5 words on the
card. This format allowed the test-taker to better articulate their emotions, and
also allowed the opportunity for the tester to follow up on why they felt as they
did.
Dr. David Travis, Managing Director of UserFocus, has also experienced
success with adjective cards. For him, this method stood out by giving
participants permission to criticize the system. In fact, not only did users select
negative and positive adjectives, they could also reveal negative connotations of
otherwise positive adjectives. For example, a user might select "sophisticated,"
but then explain that the interface was "too sophisticated for my tastes."
You can run this test manually by printing out and cutting out the full list of 118
cards, or use an online service like MojoLeaf to administer the test remotely to
many participants at once.
Concept Testing
In the spirit of looking before you leap, concept testing allows you to discover
your users' reactions to your concepts before you spend the time and resources
on development. These tests follow the same formats as the other usability tests
except they substitute concepts in place of the real product.
I. OVERVIEW
As we discussed in the Guide to Minimum Viable Products, a concept test can
be considered a bare-bones MVP since you're only testing for the viability of
an idea. A concept test could be as simple as a survey sent out to your target
audience or a landing page in which you gauge the concept based on signups
(similar to what Buffer did in the image below).
Scott Smith, Founder of Qualtrics, explains that the main benefits of concept testing
include finding the target users of the product, finding out what features they
care about, and determining how you might promote and price the product.
Simply put, concept tests provide the feedback to turn a deliberately sketchy
idea for a product or service into something that users might actually want.
Because testing an idea with an actual product can be tricky, concept testing
methods gravitate towards surveys, interviews, and landing pages. However, it
is the focus of these methods that sets them apart from more traditional usability
tests.
Concept tests provide the necessary feedback to turn sketchy ideas into desirable products.
There are three main types of concept tests depending on the maturity of
the product:
New Product Concept Tests: These identify the benefits that resonate
most with customers, and the features to create these benefits. Successful
tests let you prioritize your design elements and better schedule the
development process, plus allow you to plan ahead for after the release.
Product Usability and Serviceability Tests: How can you improve
the experience with an existing product or service? This test helps you
understand what direction might make the most sense for updates to
existing products (whether it's ease of use, simpler navigation, etc).
Price and Incentives Tests: These will give you a head start on
marketing and promoting your product since you'll have a better idea of
what people will pay and how you might bundle the conceptual product
with existing products. If you're testing your concept with a landing page,
you can create pricing options and gauge the clickthrough rate on each
option (like Buffer's tactic).
If you're interested in low-cost methods of concept testing, SurveyMonkey
offers tips for concept testing with surveys and landing pages.
If you're looking for a cheaper method, you could do a hallway concept test in
which you draw a few sketches, grab a colleague not associated with the project,
show the sketch for five seconds and then ask for what stood out. You could just
as easily replicate this process with five users or customers for quick feedback
on your concept.
Participatory Design
Sometimes if you want to design a product your users will like, it's best to
involve them directly in the design process. Participatory design is a usability
testing method that falls right within the discipline of user-centered design and
can be a great complement to the collaborative design methods we discussed
in Web UI Best Practices. It's become quite a popular methodology with
companies like Pinterest, who incorporate it into their design process.
Photo credit: Many Passionate Teams in Collaboration. Gaurav Mishra. Creative Commons.
Erin Muntzert and David Sherwin, UX Consultants for Frog Design, point
out how to get the most use out of participatory design. In terms of general
guidelines, it helps to treat the session as a conversation (instead of a classroom
exercise), be crystal clear about the problem space and scenario, and record the
session (or take detailed notes). We'll explain below how to prepare, narrate,
and conduct participatory design sessions.
For the best results, treat participatory design sessions as conversations - not classroom
exercises.
Photo credit: college & art journal ideas zine. Katie. Creative Commons.
source: UXPin
Interface Toolkit: Using a tool like UXPin, give participants various pre-made elements and ask them to build their perfect interface. Not only is
this fun, but it's also ideal for seeing how your users prioritize features.
Fill-in-the-Blanks: In this less-involved and less costly version of the
interface toolkit, you prime users with a narration activity, then provide a
blank set of UI elements (Post-It notes work well) and a canvas (such as a
whiteboard). Then ask them to place and label elements however they
see fit.
Prioritization activities help you determine not what the user wants, but what the user
wants most.
V. CONTEXTUAL ACTIVITIES
By simulating the experience of using the product, users will be better able to
describe their opinions about it. Contextual activities try as best they can to
immerse the participant into what the concept or product might be.
Customizing Scenarios: Through the use of text, storyboards, or comic
strips, the participants are presented with scenarios and asked to give
feedback at each step, and even customize the scenario along their own
personal experiences. This helps bridge the gap between product concepts
and how they fit in the user's real life.
Takeaway
Hybrid tests are a great way to think outside the box and collect insight that
more traditional tests can't reach. Desirability tests go above and beyond in
understanding the target user's psyche. Concept tests can save you a lot of
time by solidifying your plan before you begin development. More than any
other test, participatory design gives the target user a hands-on approach in
designing towards their needs.
We've just examined the most common and most useful usability tests available
today. In the next chapter, we'll close by discussing the differences between web
and mobile usability testing.
CHAPTER EIGHT
We've spent the bulk of this e-book outlining the different types of usability tests and the strategies to use them most effectively. However, the
scope of these tests is vast, and they can be used on any product from cloud payment
systems to next-generation gaming consoles. In this chapter we want to narrow
our focus a little so you can best understand how usability evaluation works individually for websites and mobile devices.
The web is more than just your website. Test your competitors.
The principles for web usability are the same as with other products, except
they are even more important considering that there are over a billion websites
as of September 2014. The bottom line is that there are so many similar websites
that visitors will simply move on to the next site if the first one they visit isn't
usable.
While your website might be your baby, visitors will just move to the next one if it's ugly
and unusable.
3. Test competitors or peer websites. Only testing your own site robs you
of context. Including other websites will help you see the forest and the
trees. Try asking the participant to show you a site they use on their own,
and have them show you how they use it. It's not just about how users
interact with your website; it's about tailoring your website based on how
they use the web.
4. Hide which site you're testing. Users tend to be less honest when they
know they're talking to an employee of the company under scrutiny. Do
your best to not reveal you're testing your site. The user may figure it out by
the end of the session, but the longer you delay it, the more accurate the
first impressions you capture. Try asking them to assess competitor or peer websites
first, since this puts them in the right critical mindset.
As a guiding principle, try not to be too rigid. Keeping an open mind and a
loose attitude will put your test-taker at ease and yield better, more natural
results.
It's not just about how users interact with your website. It's about tailoring your website
based on how they use the web.
For websites, usability is just the bare minimum. Delight is the new standard.
Takeaway
Web usability and mobile usability may be under the single umbrella of
usability, but the approaches can seem like night and day when you think
about all the subtleties. When planning your goals, keep in mind the usability
functions special to whichever one you're designing for, its distinct functionality
criteria, and the tests best used to study it.
Now it's time to get started. Take your time at each step of the way and don't
proceed if you don't understand something. To help standardize the process,
feel free to check out the free usability testing kit created by UXPin CEO
Marcin Treder. As you're testing, remember to always focus on your goals.
Because if you don't know why you're testing, then the methods are irrelevant.
If you don't know why you're testing, then the methods are irrelevant.