Elephant In The Room: The Massive Shortcoming Of A/B Testing That Nobody Is Talking About

You can’t be in the online marketing business long without developing an awareness of and appreciation for A/B testing.

There is something profoundly comforting about collecting hard data to guide your marketing efforts, and when those conversion rates tick upward, they provide a measurable, provable, unquestionable ROI.

Test. Rinse. Repeat... the mantra of the A/B tester.

Today, 56% of marketers say they use A/B testing. It has become so popular that numerous tools, services, experts, and platforms have saturated the Conversion Rate Optimization (CRO) space, the largest of which is valued at nearly $600 million.

But with all the hype, case studies, and success stories, there is at least one massive elephant in the glitzy room of A/B testing... and as the idiom suggests, it’s time that we talked about it.

The Difference Between Quantitative & Qualitative Data

Before we look at the major shortcoming of A/B testing, we need to first understand the difference between quantitative and qualitative data.

Quantitative data is information that can be measured numerically or “quantified”. In other words, it is data that can be expressed in numbers.

This type of data is relatively easy to test for, measure, and interpret as numbers are a standardized measuring stick that can easily be compared and measured against other numbers. This is the type of data sought via A/B testing.

For example:

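If 254 out of 1,000 visitors download Lead Magnet A while only 221 out of another 1,000 download Lead Magnet B, we can make a quantitative assessment that Lead Magnet A is likely the higher performing option. (Strictly speaking, a gap of this size clears a 90% confidence threshold on a standard two-sided test but falls just short of the conventional 95%, so a careful tester would keep collecting data.)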

We can do this because each data point is uniform and easily compared against the other data points in this study. Either the visitor converted or the visitor didn’t convert. Customer A converting is the exact same as Customer B or C or D or E converting.
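To make the comparison concrete, here is a minimal Python sketch of a two-proportion z-test, one common way to compare two conversion rates. It assumes each lead magnet was shown to its own group of 1,000 visitors; the function name is hypothetical and only the standard library is used.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed split: 1,000 visitors per lead magnet.
z, p_value = two_proportion_z_test(254, 1000, 221, 1000)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under these assumptions the z-score comes out at roughly 1.7 and the two-sided p-value at roughly 0.08, which is why the size of each group matters as much as the raw difference in downloads.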

Qualitative data, on the other hand, is descriptive information that cannot be expressed via numbers. It is usually described by words but can encapsulate ideas or emotions beyond the words themselves.

This type of data tends to be more difficult to test for, measure, and interpret as it is non-standardized.

For example:

If we send out a survey to 50 people asking them to explain how they feel about Lead Magnets A & B, we will receive 50 unique responses.

This provides us with a wealth of information - potentially far more than our 2,000-person quantitative study - but at the same time, it can be a challenge to compare these answers or turn them into a meaningful, actionable conclusion.

If Tim loves Lead Magnet A because of the picture while Carrie prefers the content of Lead Magnet B, how do we compare these qualitative data points or turn them into actionable insights? How do we translate customer feedback into a defined understanding of what the target audience wants in a lead magnet?

Now that we understand the difference between quantitative and qualitative data, let’s look at why this distinction is so important for A/B testing...

... and how it results in our “elephant in the room”.

Planning An A/B Test Is Essentially Guessing

The massive shortcoming of A/B testing is that A/B testing can’t tell you what to test.

In other words, planning out an A/B test is essentially guessing. You are making an educated guess (hypothesis) about WHAT to test. An A/B test can tell you how Option A performs against Option B, but it can’t tell you what Option A should be. It can’t tell you what you should test as Option B.

THIS is the big elephant in the room of A/B testing - you can waste all your time and money testing a thousand page variations that are all just different shades of mediocre.

Without knowing WHAT to test, the tests themselves might be completely pointless:

  • You can’t tell that half of your visitors are offended by that image you thought was harmless.
  • You can’t tell that the key benefit you are basing your pitch around is actually not that important to your customer base.
  • You can’t tell that the real testimonials next to your CTA are coming across as fake and discouraging trust at a key point in your funnel.
  • You can’t tell that your product doesn’t actually solve the key emotional problems for your target audience.

You can test minor page elements for years and see some minor improvement, but if you want to make significant improvements, you need something more substantial to test than “change CTA button color to red”.

How do we remedy this?

Usually the answer is qualitative testing, but there is a reason that marketers are so wary of investing in qualitative tests.

Why Most Digital Marketers Leave Out The “WHY”

If you need qualitative research to understand “why consumers aren’t converting”, then why aren’t more digital marketers investing in qualitative testing? There are many reasons, but the first is that the “rhythm” of digital marketing is unlike any other marketing function:

Digital marketers work fast. Iteration is the rule. Growth is incremental and achieved by multiple small changes that add up over time. This is very different from other channels that pursue more of a “big bang” campaign, product or brand launch, where massive, immediate results are expected.

Traditional methods of qualitative insight don’t fit the digital marketer’s rhythm

Now that we understand digital marketers, let’s look at the most accessible types of qualitative tests and why they have not been adopted to a greater extent in digital marketing and CRO.

  1. Focus Groups/UX studies
  2. User Surveys
  3. Website Feedback

The results from these methods can offer significant value to the traditional marketer. However, for the digital marketer, for whom time and money are critical concerns, most qualitative methods are too expensive and time-consuming to include in the daily, weekly or monthly testing process across tens or hundreds of web pages.

Focus Groups & 1:1 Usability Studies

Focus groups or in-depth usability studies allow marketers to get a very deep look at how a select number of users are experiencing a landing page or website.

That’s the upside, but there are also some big-time downsides.

For starters, these types of studies can be very costly. A two-session focus group can run upwards of $20,000. Alternatively, a basic usability study runs between $12,000 - $20,000. This is going to be well over budget for many businesses, and even if it is within your budget, it isn’t guaranteed to give you meaningful results for each page’s optimal conversion.

  • Often, usability studies are conducted with existing customers who are already familiar with the company and thus provide biased answers.
  • Despite the depth of information obtained, the small sample size means your test’s statistical confidence will be very low. In other words, the results you obtain from your focus group might not be representative of your user base.
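As a rough illustration of the sample-size point, the sketch below estimates the 95% margin of error on a measured proportion using the standard normal approximation. The sample sizes (12 participants versus 1,000) are assumptions chosen for illustration, not figures from any particular study, and the approximation is itself crude at very small samples.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z_score=1.96):
    """Approximate 95% margin of error for a proportion (normal approximation)."""
    return z_score * math.sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical sample sizes: a small focus group versus a large sample.
for label, n in [("focus group", 12), ("large sample", 1000)]:
    print(f"{label:>12} (n={n}): +/- {margin_of_error(n) * 100:.1f} points")
```

With these assumed numbers, the small group carries an uncertainty of close to 30 percentage points, against about 3 points for the large sample.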

On a positive note, a usability study lets you see what a person does on the page and understand why they made that choice. However, the study is more “rational” than emotional: it is more about the physical experience of the page than the emotional response a new, unfamiliar user might have when engaging with that page.

User Surveys

As with usability studies, many companies tend to use their existing customer or lead database for running surveys, as panels are expensive and most often used by outside consultants (read: very expensive and with long lead times).

Unlike focus groups, user surveys allow businesses to achieve statistically significant results at the required confidence levels. Surveys can be distributed to very large samples of users, giving businesses a representative look at their audience.

Unfortunately, the depth of qualitative feedback that can be acquired through a survey is not at the same level as a focus group. Like A/B tests, surveys live or die by the quality of the questions being asked and the audience that answers them.

Lastly, while participants in a survey may give you answers, you still do not really know whether they will act the way they say they would. The survey format is more rational than emotional and therefore will favor what users think they should do, versus what they would actually do.

Website Feedback

Finally, we have website feedback.

On the plus side, running programs that automatically collect website feedback is typically a low cost option. On the downside, the information collected is usually shallow and it takes a long time to reach statistically significant results.

Plus, you are limiting your participants to a self-selected group who choose to leave feedback - potentially your outliers rather than your average visitor. And as with the surveys discussed above, you are more likely to elicit a rational response than an emotional one.

At the end of the day, the more companies employ the tools listed above, the closer they will be to their customers and prospective customers, and that is always a good thing!

If money were no object, every digital marketer would use every testing and analytics tool available, but as we previously discussed, digital marketers tend to face very stringent budget and time constraints that limit their ability to engage with any tool that doesn’t provide guaranteed ROI.

The Solution: Decoding Human Behavior

Given the shortcomings of A/B testing and the common qualitative testing options, is there any way to efficiently achieve meaningful optimization results?

The real challenge is to find a methodology that moves as quickly as digital marketers do, answers the question “why”, and provides actionable information with which marketers can improve conversion: the best of both quantitative and qualitative, but faster and smarter.

Just as many complex interactions have been made simpler by using technology, we believe that human motivation can be decoded and conversion propensity improved using technology.

Specifically, we believe that a solution that uses both human input and technology can provide the ideal mix of speed, cost and insight (the “technical name” for this is human-augmented artificial intelligence).

To build such a solution, it is important to understand the key conversion drivers. We have tested thousands of pages and unearthed both the top-level drivers and their digital genes (i.e. the elements that make up a driver). Based on this work, we have defined the core digital drivers and discovered the elements, or genes, that yield repeatable patterns of behavior.

The digital drivers we have discovered include Experience, Clarity, Appeal and Credibility. By understanding how visitors score digital assets in these categories across a multitude of services and customers, we are able to predict at least part of a page’s success by algorithm alone. When rapid crowd-sourced human input is added to the mix, the results become highly predictive.

Our solution is called WEVO, and it gives digital marketers a wealth of information critical to conversion improvement:

  • The visitor’s mindset upon reaching the landing page
  • The visitor’s emotional response while experiencing the landing page
  • The visitor’s level of trust in the company or product
  • The “Why” behind the customer’s conversion or decision not to convert
  • What the visitor hoped to find and see on the landing page
  • What, if anything, “turned them off” while viewing the page
  • What, if anything, instigated a motivation to take action and convert

For example, one WEVO client had a landing page that closely followed best practices. The copy heavily featured the primary benefits of the system, touted the strongest features, and spoke to key requirements like “reliability, implementation, and support”.

After undergoing the WEVO test, however, it was discovered that the target consumers - IT Directors - were primarily concerned with the human issues of bringing a new system into their organizations. Yes, the features and benefits mattered, but what REALLY mattered was, “How difficult is it going to be for our users to use this system, and how will that impact the experience our customers have with both our people and our system?”

The client changed their message to speak to the core issues uncovered in the WEVO test and featured it prominently on their home page, along with proof elements (testimonials that spoke to this issue). The result was a significantly higher conversion rate for that product.

In another example, an education client seeking to increase online enrollment discovered through WEVO that prospective students wanted to see students that “looked like them” in the school’s marketing materials. While it was very difficult to identify this as an instrumental factor through quantitative testing, it became immediately apparent in the crowd-sourced element of the test.

Overall, WEVO customers have experienced a 1.8X increase in conversion, all within a few weeks.

Conclusion

The challenge for digital marketers is to find a solution that offers the insight provided by qualitative projects, the analysis and validation provided by quantitative or A/B testing AND actionable recommendations that will provide conversion improvement.

WEVO leverages machine learning and crowdsourcing technology to dramatically reduce the cost and time required to understand and increase customer conversion.  Unlike traditional customer research, which can be inconclusive, or A/B testing, which can be a prolonged process, WEVO provides marketers with customer insights as well as messaging and design that have been proven to deliver superior customer conversion - in days rather than months.

Still have questions? Sign up today for a free demo and see how WEVO can catapult your conversion rates in a fraction of the time at a fraction of the cost.