One of the biggest misconceptions in the e-Commerce optimization space is the belief that A/B or MVT testing actually works. In reality, the probability that you will increase e-Commerce revenue as a result of traditional A/B or multivariate testing is slim to none.

Key Takeaways:

  • Web visitor behavior is time-varying
  • There is no static solution for dynamic problems
  • To grow revenue, you need an adaptive optimization solution

Is it just me, or have we entered a new industrial age in which innovation and great products are replaced by ‘good’ marketing and fluff?

You don’t have to look far to find examples of this phenomenon in the wild: in recent news, the entire fleet of Boeing’s 787 Dreamliner planes (note the “Dreamliner” spin) was grounded because the planes were not safe to fly. Just before that, we learned that the 7-billion-dollar antivirus industry had a dirty little secret: its products are often not very good at stopping viruses.

“The Emperor’s New Clothes” (Danish: Kejserens nye Klæder) is a short tale by Hans Christian Andersen about two weavers who promise an Emperor a new suit of clothes that are invisible to those unfit for their positions, stupid, or incompetent. When the Emperor parades before his subjects in his new clothes, a child cries out, “…but he isn’t wearing anything at all!”

The Emperor's New Clothes

When it comes to e-Commerce revenue growth strategies, we have no choice but to blow the whistle and declare: A/B and MVT testing does not work!

Visitor behavior is always changing…

In my previous blog post, “What the S&P 500 and e-Commerce have in common,” I illustrated how web visitor behavior is in a state of constant flux. Most e-Commerce professionals intuitively understand that their visitor behavior fluctuates from time to time, but also believe that these changes are small relative to the average value of a performance metric.

If you had access to time-varying reports, you would be surprised by the magnitude of time-varying changes in your online visitors’ behavior:

e-Optimizer® comparison between campaign and baseline

That is why I used the stock market analogy: to compare the dynamics of an e-Commerce site with the variability of the general stock market. In both cases, buyer reactions are not predictable, and they can vary significantly from one time interval to another.
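To make that claim concrete, here is a minimal sketch, assuming visitor behavior follows a bounded random walk rather than a stable mean. The drift model and every parameter below are illustrative assumptions, not measurements from our platform:

```python
import random

random.seed(42)

# Illustrative assumption: the "true" weekly conversion rate drifts as a
# bounded random walk instead of holding a stable mean value.
rate = 0.030             # starting conversion rate (3%)
weekly_rates = []
for week in range(26):   # half a year of weekly intervals
    rate += random.gauss(0, 0.004)        # week-to-week behavioral drift
    rate = min(max(rate, 0.005), 0.080)   # clamp to a plausible range
    weekly_rates.append(rate)

mean_rate = sum(weekly_rates) / len(weekly_rates)
for week, r in enumerate(weekly_rates, start=1):
    deviation = (r - mean_rate) / mean_rate * 100
    print(f"week {week:2d}: rate={r:.4f}  vs. mean {deviation:+6.1f}%")
```

Even with modest week-to-week drift, the deviations from the long-run mean quickly become large relative to the mean itself, which is exactly the kind of variability a time-varying report makes visible.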

…which implies that there is no static solution to this dynamic problem

When you connect the time-variance problem with traditional A/B and MVT methodologies, you can start to see how testing vendors keep you locked in the past.

A/B and MVT solution vendors recommend that you use “best practices” to design a test, and then run it with your live traffic until you get a data sample large enough to reach statistical significance. Upon successful completion of the test, all you have to do is implement the winner: voilà!
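That procedure usually boils down to a standard two-proportion z-test. Here is a brief sketch of what “reaching significance” means in practice; the traffic and conversion numbers are entirely made up:

```python
from math import sqrt, erf

def z_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: is variation B's conversion rate
    significantly different from control A's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-tailed
    return z, p_value

# Hypothetical numbers: control converts 300/10,000 visitors,
# the challenger converts 360/10,000.
z, p = z_test(300, 10_000, 360, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
if p < 0.05:
    print("Statistically significant: the vendor playbook says ship it.")
```

Note what the test does not ask: whether the difference it measured will still exist next month.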

For example, imagine that the results below were outcomes from your current online testing program. At first glance, it appears that Combination 41 is the winner.

e-Optimizer® table displaying RPV lift achieved by different variable combinations

You would then instruct your web developer to permanently implement the winner. This would require making changes to your e-Commerce site:

variable combination makeup and their alternatives

Immediately after the changes were implemented, you might eagerly await your anticipated +55.65% jump in revenue (don’t hold your breath).

What we find, time and again, is that results like these hardly ever translate into future real-world performance.

So, if you had experienced something like the example above, you would most likely conclude that the testing software did not work.

A report that measures the performance of Combination 41 over time is necessary to explain what actually happened: by the time the combination was implemented, it was no longer providing a positive lift. Something in the marketplace changed, which resulted in the baseline jumping and significantly outperforming the “winner”.

Line graph displaying the difference between campaign and baseline results over time

By focusing solely on the cumulative test result and its statistical confidence level, you lose sight of the time-varying nature of your web visitors’ behavior and will most likely make a wrong move.
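A toy calculation shows how the cumulative view hides the shift. The weekly revenue-per-visitor (RPV) figures below are invented, but the pattern mirrors the graph above: an early winner whose advantage evaporates.

```python
# Hypothetical weekly RPV for the baseline vs. the implemented "winner".
# The challenger dominates early; later the market shifts and it lags.
baseline = [1.00, 1.02, 1.01, 1.05, 1.20, 1.25, 1.30, 1.35]
winner   = [1.30, 1.28, 1.32, 1.25, 1.10, 1.08, 1.05, 1.02]

cumulative_lift = (sum(winner) / sum(baseline) - 1) * 100
recent_lift = (sum(winner[-4:]) / sum(baseline[-4:]) - 1) * 100

print(f"cumulative lift:  {cumulative_lift:+.1f}%")  # still looks like a win
print(f"last-4-week lift: {recent_lift:+.1f}%")      # actually losing now
```

Here the cumulative report still shows a positive lift even though the “winner” has been losing to the baseline for a month.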

We often compare the use of online test results to driving a car while looking in the rear-view mirror.

Well-entrenched testing vendors will rebuke the explanation above by saying, “So what if the combination performance varies and goes up and down? It should still produce an overall lift over a period of time.”

By applying a winning version of the page to your e-Commerce site, you are gambling that the number of times the new page outperforms the existing page is greater than the number of times it underperforms the control. The chart below illustrates this point.

page indicating the potential of loss over time by selecting winning combos
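There is a second subtlety to that gamble: even winning most of the time is not enough if the losing periods run deeper than the winning ones. A quick, hypothetical illustration:

```python
# Toy example: the implemented "winner" beats the baseline in 6 weeks out
# of 10, yet still loses money overall, because the down weeks are deeper
# than the up weeks are high (all deltas in $ per 1,000 visitors).
weekly_delta = [+20, +15, +25, +10, +18, +12, -60, -75, -50, -40]

weeks_won = sum(1 for d in weekly_delta if d > 0)
net_impact = sum(weekly_delta)
print(f"weeks won: {weeks_won}/10, net impact: {net_impact:+d} per 1,000 visitors")
# -> weeks won: 6/10, net impact: -125 per 1,000 visitors
```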

We have an abundance of data, collected over a period of years across hundreds of different e-Commerce sites, indicating that evergreen winners (i.e., “best practice” or “winning” page combinations) are very rare.

How did we get to this point?

I’ll admit that when we invented our real-time website optimization method (which we patented last year), we were focused on getting multivariate tests done as quickly as possible while reducing the risks associated with online testing.

Once we launched our optimization software, we were very pleased with our ability to consistently deliver revenue lift during the test. It removed the risk of testing, and gave our customers the freedom to experiment with their e-Commerce site without the fear of losing money during the test.

After observing some top-performing page combinations, clients would get overzealous and rush to implement these “winning” combinations. From time to time, we would field complaints that these winners did not perform as anticipated once they were fully implemented. These clients assumed that our solution wasn’t working.

Originally, we suspected that there was something wrong with the algorithm. What we discovered instead was the time-varying behavior of web visitors. As shown above, it was anything but a stable mean value, and the scale was shocking.

We concluded that there wasn’t anything wrong with our solution: the adaptive algorithm was able to consistently produce positive results during tests because it continually adapted to visitor behavior. Poor performance only occurred when clients stopped the campaign by implementing a winner, thereby forcing a static experience on constantly changing web visitor behavior.
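Our patented algorithm is proprietary, but the adaptive principle it rests on can be illustrated with a textbook-style sketch: an epsilon-greedy policy with a sliding window, so that stale observations stop steering traffic allocation. Everything below, from the window size to the ground-truth revenue model, is an illustrative assumption, not our production logic:

```python
import random

random.seed(7)

WINDOW = 500      # only trust the most recent observations per variant
EPSILON = 0.1     # fraction of traffic always reserved for exploration

history = {"A": [], "B": []}   # recent per-visitor revenue observations

def true_rpv(variant, t):
    """Hypothetical ground truth: variant B starts stronger, then fades."""
    if variant == "A":
        return 1.00
    return 1.30 if t < 3000 else 0.90

def choose():
    # Explore occasionally; otherwise exploit the recent-window leader.
    if random.random() < EPSILON or not (history["A"] and history["B"]):
        return random.choice(["A", "B"])
    means = {v: sum(h) / len(h) for v, h in history.items()}
    return max(means, key=means.get)

total = 0.0
for t in range(6000):
    variant = choose()
    reward = random.gauss(true_rpv(variant, t), 0.3)  # noisy observed revenue
    history[variant].append(reward)
    history[variant] = history[variant][-WINDOW:]     # forget stale data
    total += reward

print(f"adaptive policy revenue per visitor: {total / 6000:.3f}")
```

Because the window keeps sliding, the policy shifts traffic back toward variant A once B’s advantage fades; a one-time “implement the winner” decision made during B’s early streak would have locked in the fading variant indefinitely.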

It’s not about testing; it’s about optimization

We began to realize that we had been viewing our own solution in the wrong way. What we developed was not the market’s fastest testing solution; rather, we had invented the first real-time optimization solution, one that continuously learns and adapts to changes in visitor behavior.

This realization elevated HiConversion to an entirely new level of customer value, and it has led us to build the next generation of optimization technology by leveraging the time-varying nature of e-Commerce site performance.