Cruising around the blogosphere, this intriguing headline caught my eye:
Learn what produced a massive 300% boost in conversions
and so I clicked over to check out the post at Mind Valley Labs.
They describe a 300% conversion increase from an AdWords copy test. Here's their data:
Converting their conversion rates to conversions, Mind Valley reports their "40 tactics" copy drove 9 conversion events in 150 clicks (6%), versus their control copy, with 2 conversions in 130 clicks (1.5%).
We here at RKG are huge fans of testing. (Shameless plug: our consulting division now offers conversion improvement projects using Google's formidable and free MVT tool.) Winning at online marketing requires regular testing of keywords, bid strategies, destination URLs, and landing page design. But, when testing, you always need to keep a careful eye on statistical significance.
So, is 9/150 all that different from 2/130?
Strictly speaking, at the 95% confidence level, no, 9/150 isn't statistically different from 2/130.
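For readers who want to check that claim themselves, here is a minimal sketch of one common way to compare two conversion rates: a pooled two-proportion z-test. The choice of test and the function name `two_proportion_z_test` are my assumptions for illustration, not necessarily the calculation behind the statement above, and with counts this small the normal approximation is rough at best.

```python
from math import sqrt, erfc


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test (normal approximation).

    Returns the z statistic and the two-sided p-value.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both cells convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# "40 tactics" copy: 9 conversions in 150 clicks; control: 2 in 130 clicks.
z, p = two_proportion_z_test(9, 150, 2, 130)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions z comes out a little under 2 and the two-sided p-value lands just above 0.05, which is consistent with "not significant at the 95% level", if only barely.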
But let's not get hung up on precise p-value cutoffs. The real issue here is that the test is simply way too small.
Different experts give different rules of thumb for sizing tests. In the catalog industry, I was taught the rule that each cell (each mailing list, each catalog version, whatever) needed at least 100 responses (conversions) before you could say much about its observed response rate with any certainty. Doing some quick web searching, some folks agree with this 100 rule (Gordon Bell, Jonathan Mendez), while others suggest 25 to 50 (Jonathan Miller).
Having long since traded my catalog marketing beret for a web marketing propeller beanie, I'll concur with the "25 to 50 conversions" rule of thumb. Why? Unlike the 4-mailings-a-year catalog industry where a bad mistake can sink your ship, the web is more forgiving of mistakes. Even if you roll out the incorrect version after a web test, no worries; once discovered, you can swap in a better version tomorrow. You don't need as much data to act because the price of a mistake is so much smaller. (Yet another opportunity to hat tip to Mike Moran's new book and great mantra: do it wrong quickly.)
Let's be permissive and take the lower end of the "25 to 50 conversions" rule-of-thumb range. Applying that to the Mind Valley data, I'd suggest waiting about 12 times longer -- about 1800 clicks, or 180,000 impressions, before running the victory flag up the flag pole.
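The "about 12 times longer" figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes each cell keeps converting at its observed rate and asks how many clicks it would take to reach 25 conversions; the helper name `clicks_needed` is hypothetical, and this is one plausible reading of the arithmetic rather than a formal power calculation.

```python
def clicks_needed(target_conversions, observed_conversions, observed_clicks):
    """Clicks required to reach the target conversion count,
    assuming the observed conversion rate holds (rounded up)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_conversions * observed_clicks // observed_conversions)


TARGET = 25  # low end of the "25 to 50 conversions" rule of thumb

control_clicks = clicks_needed(TARGET, 2, 130)   # control copy
variant_clicks = clicks_needed(TARGET, 9, 150)   # "40 tactics" copy

# The slower-converting cell decides how long the test has to run.
multiplier = control_clicks / 130

print(control_clicks, variant_clicks, multiplier)
```

The control cell is the bottleneck: at 2 conversions per 130 clicks it needs 1,625 clicks, or 12.5 times its current volume, to reach 25 conversions. That is in line with the "about 12 times longer" estimate.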
If forced to choose today, of course I'd go with the current winner, the "40 Tactics" copy version. But I'd place practically zero faith in that preference.
In fact, I'm comfortable wagering that this initial 300% lift degrades to nothing -- or at best, to a single digit percentage difference -- within two weeks. Mind Valley, up for a friendly lunch bet?
For anyone interested in the statistical significance of marketing tests, we provide a free calculator for sizing marketing test cells. Also, I'd highly recommend Gordon's excellent page on calculating sample size. Math warning: it may take you a bit of time to work through Gordon's page, but it will be time well spent. As an aside, one nice thing about modern web tools (like Google's MVT website optimizer) is that all these significance calcs are baked in, with significance levels presented visually.
Leaving the stats behind and returning to the AdWords angle: we here at RKG believe copy testing is usually a 3rd-order effect. After literally hundreds of formal copy tests, we've nearly always found copy to be a far smaller performance driver than keywords, bidding, and match types. (This assumes your copy is decent -- you probably can achieve meaningful lifts if your control is weak to begin with.)
Perhaps I can cajole my colleague George Michie into expanding his thoughts about the value of PPC copy testing, when it makes sense and when it doesn't. Hint: we typically expect and obtain meaningful copy lifts on copy tests related to free shipping, sales, and alternate payment methods. And I'll mention here an SEL article I wrote in August on Eight Essentials For Crafting Killer Paid Search Ad Copy.
Test fast, test often -- but do keep one eye on significance levels.