Best Practices for A-B Testing
Last week we looked at the power and purpose of A-B testing. We defined A-B testing as a simple random experiment with two options, A and B, which serve as the control and treatment in the experiment. As the name implies, two versions (A and B) are compared, which are identical except for one variation that might affect user behavior. Version A might be the currently used version (control), while Version B is modified in some respect (treatment).
A-B testing is used in all kinds of applications, business and otherwise. Given today’s complex marketplace, companies often use it to either determine or validate the best approach for sales and marketing efforts. In marketing, it can be used to test the effectiveness of digital ads, web pages, online tools, web offers, client preferences, and email, among other things. While it is an illuminating approach for assessing and optimizing sales and marketing efforts, A-B testing requires patience, tracking and lots of data analysis. A company’s leadership must be willing to commit the resources and provide the tools to be able to do A-B testing effectively. Let’s look at best practices for A-B testing in sales and marketing and debunk some myths and mistakes related to this powerful tool.
Top Five A-B Testing Best Practices
To do effective A-B testing, it is important to understand best practices. Here are some key dos and don’ts to employ.
1. Do Create a Hypothesis
It’s crucial to have a hypothesis to test. Without at least some idea of the possible outcomes, A-B testing becomes A-B guessing. Shooting in the dark can waste a lot of time and resources. Similarly, without a hypothesis, discerning the true impact of changes or choices can be difficult and could lead to additional (and potentially unnecessary) testing, or missed opportunities that could have been identified had the test been performed with a specific objective in mind. There should be a clear notion of what is expected. Communicate that hypothesis to everyone involved with the testing.
2. Don’t Compare Radically Different Choices
When doing A-B testing, think granular. While it might be tempting and seem logical to test two very different concepts, designs, statements or options, doing so may not yield any actionable data. That’s because the greater the differences between two choices, the harder it is to determine which factor caused the improvement – or decline – in results.
For example, in testing the effectiveness of two subject lines for an eblast invitation to an event, it is more effective to compare a subject line that has specific wording plus the event date against one that has the same wording but omits the event date. Testing two totally different subject lines might not clearly pinpoint why one was more effective than the other.
While it is easy to be seduced by the idea that all variations in an A-B test have to be spectacular, show-stopping transformations, it is actually the subtle changes that can truly give clear results. A subtle change can still have a demonstrable effect, such as changing only a single color between two versions of a landing page’s design.
3. Do Test, Test and Test Some More
In medicine, the results of one experiment are not enough to prove or disprove a hypothesis. For a procedure or drug to be approved by the Food and Drug Administration, multiple experiments must validate the findings. That is an equally wise practice for businesses. Best practice is to evaluate the impact of one variable per test, but that doesn’t mean only one test should be performed overall. On the contrary, A-B testing should be a granular process. Although the results of the first test may not provide any real insight into customer behavior, they could provide insight into how to do additional tests to determine what elements could have a measurable impact on results. In other words, A-B testing can be used to determine what other variables to A-B test. The sooner A-B testing begins, the sooner ineffective choices or poor practices based on incorrect assumptions can be eliminated. While it is often said that less is more, in the case of A-B testing, more is more. So test, test, and keep testing.
4. Don’t Be Impatient
It is normal to feel pressured to draw conclusions and make changes based on preliminary results from A-B testing. Don’t give in to the urge. A-B testing is an important and powerful tool, but meaningful results don’t generally materialize overnight. Patience is a virtue when performing A-B tests. Even when the results seem to point in an obvious direction, acting prematurely could cost money even when it feels like it will do the opposite.
There is a principle known as statistical significance to identify and interpret the patterns behind the numbers. Statistical significance lies at the very heart of A-B testing best practice. Without it, there is a great risk of making business decisions based on bad data.
Statistical significance is a measure of how unlikely it is that an effect observed during an experiment or test arose by mere chance, rather than from changes made to a specific variable. To reach statistically significant results, there must be a sufficiently large data set to draw upon. Not only do larger volumes of data provide more accurate results, they also make it easier to measure the standard deviation – the typical variation from the average result – and so to separate genuine effects from noise. Unfortunately, it takes time to gather this data, even for tests generating large volumes of outcomes.
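To make the idea of statistical significance concrete, here is a minimal sketch of the standard two-proportion z-test often used to compare the conversion rates of versions A and B. The conversion counts and sample sizes are hypothetical, invented purely for illustration; they are not from the article.

```python
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a / n_a : conversions and visitors for version A (control)
    conv_b / n_b : conversions and visitors for version B (treatment)
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)           # pooled rate under H0
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))        # two-sided p-value
    return z, p_value

# Hypothetical example: 200/4000 conversions for A vs 260/4000 for B.
z, p = two_proportion_z_test(200, 4000, 260, 4000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A common convention is to act only when the p-value falls below 0.05; note how the standard error shrinks as the sample sizes grow, which is exactly why larger data sets are needed before a difference can be called significant.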
5. Do Be Open To Unexpected Outcomes
While it is important to start with a hypothesis, it is just as important – perhaps even more important – to keep an open mind to whatever conclusions the data supports. Just because there is an idea of what the outcome of an A-B test should be does not mean the data will reach that conclusion; it may instead show that the original idea was not accurate. That is fine. What isn’t fine is to ignore the data. While it can be tempting, it is vitally important not to hold on to a hypothesis that flies in the face of the test results. Many savvy business leaders have become stuck on a hypothesis. When presented with data that differs significantly from the original idea, they have dismissed the results or the methodologies of the test. Beware of that pitfall. After all, why bother to do A-B testing in the first place if there is such confidence in the assumptions?
Following these best practices, any business leader can test assumptions and identify solid ways to improve results. However, as powerful and useful as A-B testing is, it has its limitations. Here are some common myths.
Myth 1 – A-B Testing will improve sales and/or marketing conversions exponentially.
Expecting stratospheric results is sure to disappoint. Anyone expecting sales to increase by 3,451% due to improvements discovered through A-B testing is in for a let-down. While A-B testing can improve conversions or outcomes, it isn’t likely to improve business so significantly as to make the CFO and Controller fans. While it can and has improved the bottom lines of countless businesses, it does not perform miracles. Keep expectations modest and remember that the point of A-B testing is to learn.
Myth 2 – A-B Testing can be done quickly.
As explained above, for test data to be valid, tests must be repeated numerous times. Deploying a single eblast at two different times of the day – 9 am and 9 pm – may indicate that one time of day – 9 am – has much higher open and conversion rates. However, deploying 10 different eblasts at the same two times of day might show that in eight out of 10 eblasts, the open rate and conversion rate were higher at 9 pm than at 9 am. The initial test might have been skewed because it was deployed before a holiday, which affected the behavior of the recipients. Only by performing the same test over and over can the cumulative data provide clear answers, and that takes time.
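The eblast example above can be sketched as a small simulation. The open rates, list size, and the assumption that 9 pm is genuinely better are all hypothetical, chosen only to show how a single deployment can mislead while repeated deployments converge on the true ordering.

```python
import random

random.seed(7)

TRUE_OPEN_RATE = {"9am": 0.20, "9pm": 0.24}  # assumed true rates; 9 pm is really better
N_RECIPIENTS = 500                            # hypothetical recipients per eblast

def observed_open_rate(slot):
    """Simulate one eblast deployment and return its observed open rate."""
    opens = sum(random.random() < TRUE_OPEN_RATE[slot] for _ in range(N_RECIPIENTS))
    return opens / N_RECIPIENTS

# One deployment per slot: a noisy sample that can easily point the wrong way.
print("single test:", observed_open_rate("9am"), "vs", observed_open_rate("9pm"))

# Ten repeated deployments per slot: the average recovers the true ordering.
avg = {slot: sum(observed_open_rate(slot) for _ in range(10)) / 10
       for slot in TRUE_OPEN_RATE}
print("ten tests  :", avg)
```

The single-deployment comparison fluctuates from run to run, while the ten-test average reliably shows 9 pm ahead, which is the article’s point: only cumulative data provides clear answers.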
Myth 3 – A-B Tests done by one company can apply to another company.
What works for one company will not necessarily work for another company. Just because studies show that certain choices produce certain outcomes at one company does not mean that those will apply to another company. Marketing efforts that work for a major retailer such as Walmart or Target will not necessarily work for another retailer. Copying the efforts of a competitor is likely to waste lots of money. The only time it makes sense to copy a competitor’s efforts instead of doing in-house A-B testing is if:
- Both companies have the exact same target market group (visitor by visitor).
- Both companies sell the EXACT same product.
- Both companies market the exact same way (same AdWords, SEO Keywords, ads, websites, etc.).
- Both companies have the same sales process.
Since that is never the case, using another company’s results in lieu of in-house testing is never a good option.
Armed with these best practices and having debunked the biggest myths, a company can get started doing A-B testing to validate ideas and pinpoint client behavior. Test on!
Quote of the Week
“What we have to do is to be forever curiously testing new opinions and courting new impressions.” Walter H. Pater
© 2014, Written by Keren Peters-Atkinson, CMO, Madison Commercial Real Estate Services. All rights reserved.