A/B Testing Guide
Preface
A/B testing was formerly the preserve of organizations with substantial
technical resources, but the arrival of new tools that simplify test
implementation is democratizing the practice. These tools let anyone create
several versions of their web pages and measure the effectiveness of each
against their objectives, one of the most important of which is the
conversion rate.
Though these tools greatly simplify the process of implementing these tests,
this apparent simplicity should not lead one to forget that obtaining significant
results from A/B testing depends primarily on establishing an appropriate
testing methodology.
Numerous organizations have embarked upon A/B testing in the hope of seeing
their conversion rates increase dramatically, only to end up achieving limited
results. What they very often have in common is that they have rushed into it
without taking time to carefully consider the relevance of their tests and the
contribution of the elements tested to the conversion process.
The purpose of this white paper, aimed primarily at marketing and e-business
teams, is to act as an aid to mastering A/B testing methodologies. Within it,
teams will find practical advice about integrating A/B testing into their conversion
optimization strategy. What can we expect from it? How can we benefit from
it? What are the pitfalls to avoid and what is best practice for achieving good
results?
Rémi Aubert
Co-founder of AB Tasty
Contents
1. Conversion optimization: the new Holy Grail of e-commerce
2. A/B testing, a practice rapidly gaining in popularity
3. The place of A/B testing in conversion optimization
4. Implementing an A/B testing methodology
5. Efficient A/B testing: hints and tips
6. A/B testing in practice: which elements to test?
7. Beyond A/B testing: how can conversion rates be continuously improved?
8. Conclusion
Glossary
Notes
1
Conversion optimization:
the new Holy Grail of e-commerce
Conversion optimization can play a major role in increasing a business's
profitability, yet it remains little used. Though the average conversion rate for
e-businesses is somewhere between one and three percent [1], Forrester Research
estimates that for each $100 spent on traffic acquisition only $1 is dedicated
to conversion optimization [2]. Many e-businesses are therefore focusing on
acquiring traffic yet failing to convert that traffic in more than 97% of cases.
Investing $1 more to increase conversion rates, even by just a few percentage
points, can prove very profitable and can improve the return on investment
provided by traffic acquisition channels. At a time when acquisition costs are
on the increase and the quest for new sources of traffic is becoming more
complex, why not begin by maximizing the potential of your existing traffic?
What conversion optimization promises is essentially simple: generate more
revenue from a consistent level of traffic.
Though the concept is simple to describe, it has to be recognized that for
many businesses the difficulty lies in investing that additional dollar.
Conversion optimization is a practice that businesses find hard to grasp,
because the conversion process is itself a complex mechanism that brings
various factors into play:
[Figure: the factors at play in the conversion process]

[Figure: methods used to improve conversion rates, with A/B testing the most widely used (46%), ahead of cart abandonment analysis, competitor benchmarking, segmentation, event-triggered behavioural email, abandonment email, copy optimisation, multivariate testing, customer journey analysis, expert usability reviews, usability testing, customer feedback and online surveys]
E-tailers are not the only ones for whom A/B testing is relevant. Media sites
and internet service providers can also use the method to optimize their
conversions, whether that means filling out a form, registering for a newsletter
or, where the business model is based on advertising, increasing pageviews. All
players in the web industry can therefore benefit from A/B testing.
2
A/B testing, a practice
rapidly gaining in popularity
A/B testing, a practice that has long been used in direct marketing, involves
submitting several versions of a message, differentiated by a single criterion,
to a sample of consumers and then measuring which version achieves the
best results.
The development of digital marketing has brought new perspectives to the
practice by multiplying the range of tests and performance measurements
possible. When applied to a website, A/B testing effectively permits a practically
unlimited number of versions of a page to be tested and the performance of
each version to be measured using indicators such as visitor engagement or
buying behavior. Advances in technology have also led to the development of
dedicated tools that make implementing these tests and analyzing their results
easier. These tools even make it possible to run multivariate tests, in which
multiple elements within a page are modified simultaneously in order to
identify the best combination.
[Figure: two versions of the same page in competition: the original generates 50 registrations, the variation 85]
Nevertheless, obstacles
to adopting A/B testing exist
The supposed complexity of implementing the tests. Fortunately, new tools
designed for use by marketing teams, such as AB Tasty, have appeared on the
market. Their purpose is to give users the independence to implement their own
tests, without requiring the intervention of technical teams.
The lack of expertise within businesses. Though certain tools make A/B
testing accessible to all, their simplicity must not conceal the fact that a strict
methodological approach must be adopted if a testing program is to be effective.
This white paper is intended to address these obstacles by providing users with
both a methodological framework and practical advice to enable them to get the
most from their A/B testing tool.
3
The place of A/B testing in
conversion optimization
A/B testing is a tool for use as part of a conversion optimization strategy, but
that strategy cannot be reduced to the use of a single tool. A/B testing permits
hypotheses to be statistically validated, but it does not on its own provide all
the keys to understanding web visitors' behavior. Yet it is precisely by
understanding this behavior that impeding factors and conversion problems
can be identified.
Other methods and tools that can provide additional information about web
visitors and indicate the hypotheses to be tested must therefore be used to feed
into the A/B testing strategy. Though a good testing tool is necessary, that alone
will not always be sufficient where conversion difficulties are complex.
[Figure: from diagnosis to possible solutions to testing]
The key to success with an A/B testing strategy is therefore the formulation of
powerful hypotheses that can have a positive impact on conversion. Though
random testing, with no genuine justification provided for the hypotheses tested,
can be justified when learning to use a testing tool, the practice must rapidly be
replaced with a strategy based on solid foundations.
There are numerous sources of information available to help increase your
understanding of web visitors:
Web analytics data. Though this kind of data does not explain
web visitors' behavior, it does permit conversion problems
to be identified (e.g. shopping cart abandonment). This kind
of data also helps when prioritizing the pages to test.
Heuristic evaluation and ergonomic audit. These
methods of analysis are an inexpensive way of discovering
what the website experience is like for the user.
User testing. This kind of qualitative data, though limited by the
sample size, can prove to be a source of very rich information not
otherwise revealed through the use of quantitative methods.
Eye tracking and click tracking. These methods shed light on
the way in which web visitors interact with the elements within
an individual page, not just between the different pages.
Client feedback. Businesses already collect a large amount of
feedback from their clients (e.g. comments and reviews left on the
site; questions asked of customer services). The analysis of this
type of feedback can be supplemented by the use of tools such
as client surveys or live chats to collect additional information.
[Figure: examples of tools used as part of a conversion optimization strategy]
4
Implementing an
A/B testing methodology
Equipping yourself with a rigorous methodological framework is the best
method of obtaining reliable results from a program of A/B testing. In this
chapter we detail the steps to take in implementing such a program.
The procedure is first outlined below then described in detail, one step at a time.
[Figure: outline of the numbered steps of the A/B testing procedure]
Sharing the roadmap will allow the efforts of the participating parties to be
mobilized, aligned and coordinated towards achieving the defined objectives.
Finally, the roadmap will act as a guide for the A/B testing process.
It is also advisable to integrate the tests into a web analytics tool in order to
benefit from complementary metrics and to be able to analyze other dimensions
of your tests' impact.
Results analysis also depends on the objectives defined beforehand and the
KPIs involved. Though there is nothing to prevent the measurement of several
indicators during a test (e.g. add to cart, visitor engagement levels, etc.), it is
important to identify a primary KPI to differentiate between the variations. It is
not rare, in fact, to observe a test affecting two KPIs in opposing ways (e.g. an
increase in the number of purchases but a decrease in average cart value).
Result interpretation thus differs depending on the business's objectives.
                   Visitors   Conversions recorded   Conversion rate
Original version   100        5                      5%
Version 1          100        15                     15%
The statistical tests will indicate a gain of 200% with a confidence index of 98%.
However, with the sample size so small, it is possible that the results will be
substantially altered if the test is left running for a few additional days. This is why
it is advisable to have a sample of a sufficiently large size. There are scientific
methods available to calculate the size of the sample. However, for practical
purposes, it is advisable to have a sample of at least 5,000 visitors and 100
recorded conversions per variation.
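To make these orders of magnitude concrete, here is a minimal sketch in standard-library Python (the function names and the 5% to 6% example below are ours for illustration, not taken from any particular testing tool). It reproduces the confidence figure from the table above with a two-proportion z-test and estimates the sample size needed to detect a given uplift.

```python
import math
from statistics import NormalDist

def confidence(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: 1 minus the two-sided probability that the
    observed difference in conversion rates is due to chance alone."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs(p_b - p_a) / se
    return 1 - 2 * (1 - NormalDist().cdf(z))

def sample_size_per_variation(p_baseline, p_target, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation to detect the uplift
    (normal approximation, two-sided test at the given alpha and power)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_baseline * (1 - p_baseline) + p_target * (1 - p_target)
    return math.ceil((z_alpha + z_power) ** 2 * variance
                     / (p_target - p_baseline) ** 2)

# The table above: 5 vs 15 conversions out of 100 visitors each
print(f"confidence: {confidence(5, 100, 15, 100):.1%}")  # ~98.2%
# Visitors per variation needed to detect a lift from 5% to 6%
print(sample_size_per_variation(0.05, 0.06))             # 8155
```

As the second call shows, reliably detecting a small uplift can require far more than the 5,000-visitor rule of thumb; that figure is a floor, not a target.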
Finally, even if the site's traffic allows a sufficiently large sample to be quickly
obtained, it is advisable to leave the test running for several days to account
for differences in behavior observed on different days of the week, or even
from hour to hour during a single day. A minimum duration of one week is
therefore preferable, two weeks ideally. In some cases, this period can be even
longer, especially where the conversion process concerns products whose
purchase cycles require time to complete (complex or B2B products/services).
There is therefore no standard test duration.
The period during which a test runs also influences its results: if, during the
holiday season, a variation outperforms the original by 10%, the gain may be
less outside holiday periods.
Traffic origin can also affect the gains indicated by a test. A buzz effect, or an
acquisition campaign, can cause a peak in conversions involving web visitors
whose behavior differs from that normally observed.
[Figure: the iterative testing cycle: design, test, measure, analyze, act]
5
Efficient A/B testing:
hints and tips
Our intention here is to describe certain good practices which, we hope, will
enable businesses to avoid some of the pitfalls encountered when implementing
A/B testing. They are born out of the experiences, both positive and negative,
that our clients have had when carrying out their testing.
It is advisable to conduct at least one A/A test to ensure that traffic is randomly
allocated to the different versions. This also provides an opportunity to compare
the indicators reported by the A/B testing software with those from web analytics.
The figures should be compared approximately rather than expecting an exact
match, which is in any case impossible because the methods of calculation are
not identical, just as they differ between web analytics tools. Large
discrepancies, however, warrant further investigation in order to ascertain
whether the two tools are correctly implemented.
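As an illustration, the minimal simulation below (plain Python; the traffic and conversion figures are invented) shows what a healthy A/A test looks like: two identical versions whose measured conversion rates differ only by sampling noise.

```python
import random

def simulate_aa_test(visitors=10_000, conversion_rate=0.05, seed=1):
    """Simulate an A/A test: two identical 'versions' with random 50/50
    allocation. Any difference observed is pure sampling noise."""
    rng = random.Random(seed)
    arms = {"A1": [0, 0], "A2": [0, 0]}  # arm -> [visitors, conversions]
    for _ in range(visitors):
        arm = rng.choice(("A1", "A2"))
        arms[arm][0] += 1
        arms[arm][1] += rng.random() < conversion_rate  # bool counts as 0/1
    for arm, (n, conv) in arms.items():
        print(f"{arm}: {n} visitors, {conv} conversions ({conv / n:.2%})")

simulate_aa_test()
```

If an A/A test declares one of two identical versions a significant winner, or the traffic split drifts far from 50/50, the allocation mechanism or the tracking deserves scrutiny before any real test is trusted.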
Test one variable at a time: this allows its impact to be isolated. If an action
button's placement and caption are both modified simultaneously, it will be
impossible to identify which change produced the effect observed.
22
If there are many variations for a small amount of traffic, the test will take a long
time to produce conclusive results. Where a low amount of traffic is allocated to
a test, a low number of versions must be used, and vice versa.
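The arithmetic can be sketched as follows (illustrative Python; the daily traffic figures are invented), applying the minimum sample advised in the previous chapter:

```python
import math

def days_to_conclusive(daily_traffic, n_versions, min_visitors=5_000):
    """Rough lower bound on test duration: every version, including the
    original, must reach the advised minimum sample of visitors."""
    return math.ceil(n_versions * min_visitors / daily_traffic)

# 2,000 eligible visitors per day on the tested page
print(days_to_conclusive(2_000, 2))  # original + 1 variation: 5 days
print(days_to_conclusive(2_000, 5))  # original + 4 variations: 13 days
```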
It is not advisable to make any decision at all until the test has achieved a level of
statistical reliability of at least 95%. The probability that the differences observed
in the results will be due to chance rather than the modifications introduced
will be too high otherwise. Furthermore, it is possible to see a trend in results
reversed if a test is left running for a longer period of time.
Even if a test rapidly achieves statistical reliability, sample size and the
behavioral differences observed on different days of the week need to be taken
into account. It is advisable to allow a test to run for at least one week,
ideally two, and to have registered a minimum of 5,000 visitors and 100
conversions per version.
23
Where a test takes too long to achieve a reliability rate of 95%, it is likely that
the element tested does not impact the indicator measured, or the modification
is not significant enough. There is no point in prolonging the test: it results in
wasted time and unnecessarily monopolizes a portion of the traffic that could be
used for another test.
In some cases, running a test on all the visitors to a website does not make
sense and may even produce misleading results. Where a test is designed to
measure the impact of different customer benefit packages on the site's user
registration rate, testing the existing registered user base serves no purpose
and may even create dissatisfaction amongst existing users, to whom the
benefits would not apply. It therefore makes sense to subject only new
visitors to the test.
6
A/B testing in practice:
which elements to test?
This is a recurring question, and one that relates directly to the fact that, in
many cases, businesses are unable to explain their conversion rate, be it good
or bad. If a business were aware that its web visitors did not understand its
product, it would not prioritize testing the placement or color of its 'add to
cart' button; it would instead test different formulations of its customer
benefits. Each case is therefore different, and the aim of this chapter is not
to provide an exhaustive list of elements to test, but rather some of the
aspects to consider.
Removing uncertainty
and adding reassurance
This involves all those elements on a website that can give rise to confusion or
raise questions. Very often, it concerns elements missing from the site: the
user needs more information to be convinced that the product or service meets
their needs, but cannot find it on the site. A qualitative study carried out
on a sample of prospects must be used to highlight these elements so that they
can be integrated into the site.
To dispel any other doubts about the features and advantages of the product
or service offered, various marketing tactics can be tested.
7
Beyond A/B testing:
how can conversion rates
be continuously improved?
A/B testing, because of the methodology it imposes and its iterative nature, is
an excellent way of identifying what does and does not work with respect to
different audience segments. Advanced testing tools provide comprehensive
reporting interfaces offering filtering and data recalculation features that allow
you to accurately identify the messages which have been most effective with
respect to each different type of web visitor.
The next stage thus involves taking advantage of what has been learned from
the tests in order to personalize the experience for each user segment. This
means using the right message, with the right visitor, at the right time. By using
an optimized customer journey, the results from A/B testing, and messages
customized for each visitor segment profile, e-businesses maximize their
chances of achieving conversions.
[Figure: the personalization cycle: target (sources, behaviors, characteristics), test, analyze, extend]
Some A/B testing tools, such as AB Tasty, allow you to move from a testing
approach to a personalization approach very easily. They are designed
primarily to provide agility, and this approach extends to content personalization
campaigns. In contrast to other software solutions, based on opaque algorithms
and relying entirely on automation, these types of solutions leave the user with
complete control over the personalization scenarios they envisage.
The user remains in a familiar software environment offering the same features
as for test creation: same interface, same manner of operation, same
indicators. They can create or modify personalizable elements
with ease using interactive tools they are familiar with. They can then define
the types of user to whom the messages are to be addressed. The user has all
the targeting criteria needed to achieve this at their disposal, enabling them to
create personalized content of varying degrees of complexity:
traffic source (e.g. CPC, affiliation, etc.),
web visitor behavior (e.g. visit history, specific actions, etc.),
data generated by the back office (e.g. existing segmentation, etc.),
type of device (e.g. mobile phone, tablet, etc.) and browser,
geographical location and many others.
Messages can be linked to web user segments in just a few clicks.
The e-business therefore benefits from unprecedented flexibility and speed of
execution in personalizing its users' experience, without ever having to call
on the services of technical teams. The possibilities are infinite, limited
only by the business's creativity and capacity for analysis.
8
Conclusion
We hope that the reader, in the course of reading this white paper, will have
gained an awareness of the prerequisites for implementing an effective A/B
testing program. Such a program demands much more than just a high-quality
testing tool. As with many disciplines, success depends on a carefully weighted
blend of people, processes and technology, and A/B testing is no exception to
this rule.
We also encourage the reader who is implementing A/B testing for the first time
to adopt the methodology introduced over the course of these pages as early
as possible. The advice distilled here will be of invaluable help in establishing
the right foundations from the outset. From experience, we know that the first
tests are decisive in terms of generating and sustaining interest in testing within
the business.
The effort really is worth it: though individual tests may prove profitable in
the short term, A/B testing, as a process of continuous improvement, shows its
full potential over the long term. Beyond the improvements recorded in
conversion rates or other KPIs, A/B testing ultimately leads to a better
understanding of web visitors and greatly increases what is known about the
customer: information of inestimable value that can inform everything the
business does and give it a competitive advantage.
Glossary
A
A/B test
An A/B test involves comparing the
performances of several versions of the same
page in terms of the objectives specific to each
business. This could involve the user registration
rate for a service, the number of sales, or the
average sale value. The different versions are
set in competition in a real environment: each
web visitor, unaware of the test, is randomly
assigned to a single variation on arriving at the
site. On subsequent visits, the visitor remains
assigned to the variation first viewed. With
large-scale testing, trends emerge to reveal
which version is the best.
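One common way of obtaining this random-but-sticky assignment is to hash the visitor and test identifiers, as in the sketch below (illustrative Python; this shows the principle rather than how any particular tool implements it).

```python
import hashlib

def assign_variation(visitor_id: str, test_id: str, variations: list) -> str:
    """Hash (test, visitor) to a stable bucket: assignment looks random
    across visitors but is identical on every visit, so each visitor
    keeps seeing the variation they were first served."""
    digest = hashlib.sha256(f"{test_id}:{visitor_id}".encode()).hexdigest()
    return variations[int(digest, 16) % len(variations)]

print(assign_variation("visitor-42", "homepage-cta", ["original", "variation-1"]))
```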
B

Bounce rate
The bounce rate is the percentage of web
visitors who arrive at a web page and then leave
the website without consulting any other pages,
having therefore viewed only a single page on
the site. An elevated bounce rate can indicate
visitor dissatisfaction. It may also indicate,
however, that visitors immediately found what
they were searching for.

C

Chi-squared test
A statistical test used to determine whether the
difference in results observed between two
samples (for example, the conversions recorded
by two versions of a page) is significant or
simply attributable to chance. It is one of the
tests used to calculate the reliability rate.

Conversion funnel
The sequence of steps a web visitor passes
through on the way to a conversion (e.g. product
page, shopping cart, delivery details, payment).
Analyzing the funnel reveals the steps at which
visitors abandon the process.

Conversion rate
The conversion rate, sometimes called the
transformation rate, corresponds to the
percentage of visitors who have effected the
desired conversion (purchase of a product,
sign-up to a newsletter, etc.). If, for example, a
website attracts 100 visitors per month and two
of them make a purchase on the site, the rate of
conversion of visitors into purchasers is 2%
(number of purchasers / total number of
visitors x 100).
H

Heat map
A heat map is a map of the elements of a web
page most frequently scanned (via eye tracking)
or clicked (via click tracking) by users. It
provides a graphical representation in which
warm colors mark the most attractive elements
and cold colors the least attractive.
M

Macro conversion
This is the primary objective as well as the
reason for the site's existence. In the case of
an e-commerce website, it normally involves
generating transactions and, by consequence,
revenue. The conversion rate, also known
as the global conversion rate, is directly
associated with the act of making a purchase.
In the case of non-transactional websites, the
macro conversion may consist of generating
qualified prospects or examining page views
if the economic model is based on advertising
revenue.
Micro conversion
Micro conversions are secondary conversions
that may contribute to the macro conversion.
Essentially, the web visitor is often not ready
to effect a macro conversion immediately
after they arrive at the site. It is therefore a
good idea to offer them alternatives involving
less engagement (e.g. sign up to a newsletter,
request a free demonstration, etc.) in order to
be able to contact them again. Measuring these
intermediate stages is therefore important when
evaluating the site's capacity to maintain the
relationship with the web visitor throughout their
purchase cycle.
Multivariate test
A multivariate test, or MVT, is a test which allows
multiple versions and multiple variables to be
tested simultaneously. The principle consists of
modifying multiple elements simultaneously on
the same page then identifying, amongst all the
combinations possible, the one which has had
the greatest impact on the indicators tracked.
This kind of test permits, in particular, the role
of associations between variables to be tested,
which is not the case when successive A/B (or
A/B/C, etc.) tests are implemented.
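The traffic cost of this approach follows directly from combinatorics, as the hypothetical sketch below shows (the page elements and values are invented):

```python
from itertools import product

# Illustrative elements to vary on a single page
headlines = ("Try it free", "Start now")
button_colors = ("green", "red", "blue")
images = ("product shot", "lifestyle photo")

combinations = list(product(headlines, button_colors, images))
print(len(combinations))  # 2 x 3 x 2 = 12 versions competing for traffic
```

Each of those twelve combinations must reach a sufficient sample on its own, which is why multivariate tests demand far more traffic than simple A/B tests.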
R

Reliability rate
The reliability rate is a statistical indicator that
allows the point at which conclusions can be
drawn from the results provided by the A/B
testing tool to be identified. It is calculated using
different statistical tests, such as the chi-squared
test, and once it reaches a certain threshold (by
convention 95%), it indicates that the differences
in results between two different samples can
justifiably be attributed not to chance but to
the element modified. Below this threshold, it
is hazardous to base decisions on the figures
generated.
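By way of illustration, a minimal chi-squared calculation (standard-library Python, ours rather than any tool's exact formula) applied to the 2x2 table from the chapter 4 example yields a reliability rate above the conventional threshold:

```python
import math

def reliability_rate(conv_a, n_a, conv_b, n_b):
    """Chi-squared test on the 2x2 table (converted / not converted for
    each version): returns 1 minus the p-value at 1 degree of freedom."""
    observed = [[conv_a, n_a - conv_a], [conv_b, n_b - conv_b]]
    total, conv_total = n_a + n_b, conv_a + conv_b
    chi2 = 0.0
    for row, n_row in zip(observed, (n_a, n_b)):
        for cell, col_total in zip(row, (conv_total, total - conv_total)):
            expected = n_row * col_total / total
            chi2 += (cell - expected) ** 2 / expected
    return 1 - math.erfc(math.sqrt(chi2 / 2))  # survival function, 1 dof

# 5% vs 15% conversion on 100 visitors each
print(f"{reliability_rate(5, 100, 15, 100):.1%}")  # ~98.2%, above 95%
```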
S
Split Test
This is the generic term used to designate A/B
type tests, which are not necessarily limited just
to a comparison between two versions. It can
actually also refer to A/B/C tests or A/B/C/D
tests.
V
Variation
In the context of an A/B or multivariate test,
this is a version of the original page on which
one or more elements have been modified in
order to evaluate their impact on the conversion
rate. The performance indicators measured
for that variation are subsequently compared
to those of the original version and statistical
analyses make it possible to confirm whether
the differences observed are significant and not
simply down to chance.
Anthony Brebion
Head of marketing, AB Tasty
About AB Tasty
AB Tasty is the essential SaaS (Software as a Service) A/B testing software
solution. Developed for marketing and e-commerce teams, it simplifies test
creation to the maximum whilst at the same time providing advanced features.
Its graphical editor, in particular, makes it possible to modify a website's pages
without specialist technical knowledge, and to track business indicators specific
to each website (add-to-cart rate, global conversion rate, average cart value,
etc.).
AB Tasty users are therefore rapidly able to turn their optimization ideas into
reality, gaining speed in creating and launching tests that improve the user
journey and the business's profitability. Many organizations of all types and
sizes have already placed their confidence in AB Tasty: Bouygues Telecom,
Photobox, Boulanger, Etam, Microsoft, Axa, France Télévisions, Ouest-France,
Prisma Presse.
Interested in seeing
examples of A/B tests?
Consult our library of case studies.
blog.abtasty.com/en/
Looking for
A/B testing software?
Test the AB Tasty solution for free at:
www.abtasty.com