Which Test Won deserves credit for evangelizing the value and best practices of A/B and multivariate testing. Unfortunately, its never-ending parade of winning results creates the impression that testing is all about giant golden nuggets. This sets false expectations that are followed by big disappointments.

Objects In The Rear View Mirror

‘Which Test Won?’ is a great marketing line. It is very memorable and it creates curiosity and engagement.

At the same time, Which Test Won comes with significant baggage.

First, the ‘won’ part.

Is ‘winning’ the main purpose and value of testing?

Real testing experts will tell you that learning and continuous improvement are the real mission and value of conversion rate optimization (CRO).

Warning: winning results are often misleading (see: Lies, Damned Lies, and eCommerce Statistics) and temporary because visitor preferences are always changing.

Second, the ‘test’ part.

Why would a test ever be a loser?

Strategy: knowledge of and engagement with customers is the only competitive advantage in the age of the customer. Testing is an essential tool of such engagement. By not testing, companies are in essence choosing to lose a competitive race.

With the right approach and with proper instrumentation every test can be a winner.

This article will tell you how.

When We All Think Alike

Getting winning test results is not an easy task. One has to explore many options and ‘kiss many frogs’ before finding a princess.

In other words, website testing is the business of dealing with many negative outcomes.

Can we extract value from a test that failed?

To answer that question, let’s start with a real-life example of a test that did not win.

A Real Life Example:

A sports brand just completed a test of mega drop down elements.

Term: a mega drop down is a main navigation drop down that contains product images.

The main objective of the test was to determine which one of the two best selling products should be in the drop down:

  • Treatment A: ‘Go’ product – lower priced version of the best seller
  • Treatment B: ‘Pro’ product – higher priced version of the best seller

The main hypothesis of the test was the assumption that showing product images in the main navigation drop down should produce a lift.

In the tester’s mind the only question was which product (treatment) would win.

From the very beginning of the test neither treatment was doing well. The overall test results fluctuated within a very small +/- 1% range.


An analysis of the individual treatments showed a similar pattern:


Applying the ‘Which Test Won’ criteria, this test would be treated as a loser and tossed away.

Legacy Solutions

The decision about the performance of the test above is made solely on the basis of Conversion Rate (CR) or Revenue Per Visit (RPV) lift, in combination with the statistical significance of the results.
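To make that decision rule concrete, here is a minimal Python sketch of how CR, RPV, relative lift, and a significance check are commonly computed. The function names are our own, and the two-proportion z-test is an assumption: it is one common significance test for conversion rates, not necessarily the one your testing tool applies.

```python
import math


def conversion_rate(orders, visits):
    """Conversion Rate (CR): share of visits that ended in an order."""
    return orders / visits


def revenue_per_visit(revenue, visits):
    """Revenue Per Visit (RPV): total revenue divided by total visits."""
    return revenue / visits


def relative_lift(treatment_value, control_value):
    """Relative lift of a treatment over the control, as a fraction."""
    return (treatment_value - control_value) / control_value


def two_proportion_z_test(orders_a, visits_a, orders_b, visits_b):
    """Two-sided z-test for the difference between two conversion rates.

    Returns (z, p_value). Assumes large samples and a pooled standard error.
    """
    rate_a = orders_a / visits_a
    rate_b = orders_b / visits_b
    pooled = (orders_a + orders_b) / (visits_a + visits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A test is typically called a winner only when the lift is positive and the p-value falls below a threshold chosen in advance (0.05 is a common convention). Everything else about the visitors is ignored by this rule.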

Are we using the right data for decision making?

Testing introduces intrusive changes to a website, and those changes influence different audiences to behave in different ways.

Understanding these inner workings requires tracking a much broader set of visitor attributes.

Custom Segments And Goals

A common feature of testing solutions is the ability to provision custom segments and goals.

This enables analysis of test results for a specific audience and micro conversion goal.

Challenge: All custom segments and goals must be explicitly defined and provisioned before the start of the test.

At the end of the test you can’t go back and analyze data for a visitor segment that was not configured at the beginning of the test.

Unfortunately this is an impractical and unscalable solution for the much bigger ongoing data collection problem.

Web Analytics Integration

A more promising approach is to combine test and web analytics data.

On the surface this looks like the right approach. Web analytics solutions already track a broad spectrum of visitor attributes and custom goals. The integration between the two is a fairly simple exercise.

Challenges: The integration requires custom tagging of each test separately. Data analysis requires development of custom reports within the web analytics tool.

In practice, use of web analytics for enhanced test result analysis is very cumbersome. Its use is the exception rather than the best practice.

It’s The Most Wonderful Time

There is a new breed of testing solutions that are built on the foundation of a rich web analytics platform.

So, what’s new?

These solutions provide rich test data analytics directly out of the box. There is no need to build custom segments or goals. All tests are automatically integrated into the web analytics data set.

To better understand the real value of a fully integrated testing and web analytics solution, let’s apply its capabilities to the test example from above:

Example (continued):

The test above produced the following RPV lift results:

  • Variation 1 – Featured Go: -2.81%
  • Variation 2 – Featured Pro: +1.12%

An RPV lift of +1.12% is definitely not something worth mentioning.

Let’s now peel back the layers of data in search of persuadable segments: visitors who react positively to the test treatments.
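Peeling back a layer simply means recomputing the RPV lift separately for each value of one visitor attribute. The sketch below shows the idea in Python. It is an illustration under stated assumptions, not the method of any particular tool: the record fields (`variation`, `revenue`, and the attribute name) are hypothetical, and the sample visits are made up.

```python
from collections import defaultdict


def rpv_lift_by_segment(visits, attribute, control="control"):
    """Return {segment: {variation: relative RPV lift vs. control}}.

    `visits` is a list of dicts, one per visit, each holding a
    "variation" name, a "revenue" amount, and the segment attribute.
    Segments with no control visits or zero control RPV are skipped.
    """
    # totals[segment][variation] = [revenue sum, visit count]
    totals = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for visit in visits:
        cell = totals[visit[attribute]][visit["variation"]]
        cell[0] += visit["revenue"]
        cell[1] += 1

    lifts = {}
    for segment, by_variation in totals.items():
        if control not in by_variation:
            continue
        control_revenue, control_visits = by_variation[control]
        control_rpv = control_revenue / control_visits
        if control_rpv == 0:
            continue
        lifts[segment] = {
            name: (revenue / count - control_rpv) / control_rpv
            for name, (revenue, count) in by_variation.items()
            if name != control
        }
    return lifts


# Made-up visits, for illustration only.
sample_visits = [
    {"variation": "control", "channel": "email", "revenue": 10.0},
    {"variation": "control", "channel": "email", "revenue": 0.0},
    {"variation": "pro", "channel": "email", "revenue": 12.0},
    {"variation": "pro", "channel": "email", "revenue": 0.0},
    {"variation": "control", "channel": "search", "revenue": 8.0},
    {"variation": "pro", "channel": "search", "revenue": 6.0},
]

channel_lifts = rpv_lift_by_segment(sample_visits, "channel")
```

The same function can be pointed at any other attribute the analytics platform records, such as browser, day of week, or state, simply by changing the `attribute` argument.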


Marketing channels analysis reveals both the quality of marketing initiatives as well as reactions to the test treatments:


Key finding:

  • e-mail visitors prefer Variation 2 – Featured Pro: +11.88%

Actionable Insight:

  • Target e-mail traffic with Variation 2 – Featured Pro treatment


At the beginning of a test, browser analysis can reveal technical problems with the test treatments. At the end of the test, it can uncover treatment preferences among visitors using different browsers:


Key finding:

  • Visitors who use Chrome prefer Variation 2 – Featured Pro: +9.90%

Actionable Insight:

  • Target Chrome users with Variation 2 – Featured Pro treatment

Days In Week

Visitor preferences differ by day of the week:


Key finding:

  • On Tuesdays, visitors prefer Variation 2 – Featured Pro: +27.94%

Actionable Insight:

  • Target visitors on Tuesdays with Variation 2 – Featured Pro treatment

US States

There are significant regional differences in the preferences for test treatments:


Key finding:

  • Visitors from Georgia prefer Variation 2 – Featured Pro: +85.41%

Actionable Insight:

  • Target visitors from Georgia with Variation 2 – Featured Pro treatment
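The actionable insights above can be turned into simple targeting rules. The Python sketch below is one hypothetical way to express them: the visitor attribute names (`channel`, `browser`, `weekday`, `state`), the rule structure, and the "Control" fallback are all our assumptions for illustration, not the configuration format of any real testing tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TargetingRule:
    name: str
    matches: Callable[[Dict[str, str]], bool]  # predicate on visitor attributes
    treatment: str


# One rule per insight from the example; attribute names are hypothetical.
RULES: List[TargetingRule] = [
    TargetingRule("e-mail traffic", lambda v: v.get("channel") == "email", "Featured Pro"),
    TargetingRule("Chrome users", lambda v: v.get("browser") == "Chrome", "Featured Pro"),
    TargetingRule("Tuesday visits", lambda v: v.get("weekday") == "Tuesday", "Featured Pro"),
    TargetingRule("Georgia visitors", lambda v: v.get("state") == "Georgia", "Featured Pro"),
]


def choose_treatment(visitor, rules=RULES, default="Control"):
    """Return the treatment of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(visitor):
            return rule.treatment
    return default
```

For example, `choose_treatment({"channel": "email"})` selects the Featured Pro treatment, while a visitor who matches none of the rules falls back to the default experience.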

This approach makes traditional testing and associated average results irrelevant.

Instead of risky hit-or-miss testing, brands now know that every test will yield positive outcomes while providing actionable insights that can be used for continuous improvement.


Giant gold nuggets are super rare.

This is not the impression you will get by reading Which Test Won.

They are not telling you that in a pool of tens of thousands of prospectors (testers) running hundreds of thousands of tests, somebody will get lucky and then brag about it. That’s perfectly fine, as long as we all know that this is the exception and not the norm.

We hope that this article will inspire you to take a more systematic approach to testing.

By arming your testing solution with rich web analytics you will be able to extract a few golden pieces from each test and apply what you learn to continually improve the quality of your test treatments.

That way, your every test will be a winner.